Monday 27 April 2009

Multiple years in a Cognos ReportNet Graph

If you plot data in a line chart that contains 'zero values' (or, in my case, weeks that do have values but are excluded because they contain a public holiday) and you use the year as series to distinguish between the years, there will be gaps in the line. See the graph below:

sales20082009

I needed to create a graph starting in week 44 (the start of the impact of the financial crisis) without any gaps. The periods (200844-200943 and 200744-200843) had to have different colours. If you create a chart without series, there is no longer any distinction between the years.

graph_normal

Query:

query1

Tabular data (the data item 'Serie' is defined in Report Studio):

salestable

To place all historical data on one line, I added a week_key column, which is the concatenation of the year and the week number. I added the 'Serie' data item as the series (instead of the year). In the properties of the level item of week_key I set 'Sort' to 'Ascending':

week_key_no_key

chart_adapted
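
The week_key and 'Serie' logic can also be sketched in T-SQL. This is only an illustration: the sample rows and the column names year_nr and week_nr are assumptions, and in my report the 'Serie' item is defined in Report Studio, not in SQL.

```sql
-- Hypothetical sample rows; year_nr and week_nr stand in for the date dimension.
DECLARE @weeks TABLE (year_nr int, week_nr int)

INSERT INTO @weeks
          SELECT 2007, 44
UNION ALL SELECT 2008, 43
UNION ALL SELECT 2008, 44
UNION ALL SELECT 2009, 43

-- week_key = year and week combined into one sortable number (e.g. 200844).
-- serie    = the period a week belongs to; each period gets its own colour.
SELECT w.week_key
     , CASE WHEN w.week_key BETWEEN 200744 AND 200843 THEN '200744-200843'
            WHEN w.week_key BETWEEN 200844 AND 200943 THEN '200844-200943'
            ELSE 'other'
       END AS serie
FROM ( SELECT year_nr * 100 + week_nr AS week_key FROM @weeks ) w
ORDER BY w.week_key
```

Because week_key increases across the year boundary (200852 is followed by 200901), sorting on it keeps all weeks on one continuous axis.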

It didn't work immediately: the 'Key' property of the level item 'week_key' also had to be set to 'Yes':

week_key_key

After this, the result was:

sales200744-

Another usage of this:

20090428salesquarter

Query:

20090428 - Query SalesQuarter

Sunday 26 April 2009

Public holiday indicator in your Date Dimension

For many reasons we want to determine whether a day is a working day or not.
For example, productivity figures are only valid if we exclude bank / public holidays.
Trend analysis reports show representative charts only if we exclude weeks with significantly lower values (because a public holiday falls in that week).

No RDBMS has built-in support for public holidays: neither fixed ones (like New Year's Day) nor movable ones (like the Christian holidays that depend on the date of Easter Day).

Whit Sunday is seven weeks (49 days) after Easter Day, Whit Monday is the day after that, and Ascension Day is 39 days after Easter Day (10 days before Whit Sunday).
So we only have to calculate the date of Easter Day. This is a complex calculation because it is based on an approximation of the Hebrew calendar. Fortunately, there is an exact formula. Hebrew months start at New Moon, so the 14th or 15th day of the month falls immediately after Full Moon. That is why Easter Day was defined as the Sunday after the first Full Moon after the 'Vernal Equinox' in spring (the time of year of the crucifixion of Jesus). For this calculation the 'Vernal Equinox' is fixed on March 21.
The Full Moon that precedes Easter Day is called the 'Paschal Full Moon'. Its date can differ from the date of the real (astronomical) Full Moon. Because the 'Paschal Full Moon' is fixed by rule rather than by observation, Easter Day can be calculated (and hence all other movable Christian holidays).
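
To try the calculation on its own, here is a minimal stand-alone sketch for a single year. It uses the same integer formula from the FAQ cited below; the example year and the variable @Easter_day are my own choices for illustration, and the derived holidays use the offsets described above.

```sql
-- Stand-alone sketch: Easter Day and the derived holidays for one year.
DECLARE @year int, @g int, @c int, @h int, @i int, @j int, @l int
DECLARE @Easter_month int, @Easter_day int, @Easter_date datetime

SET @year = 2009                              -- example year (an assumption)

SET @g = @year % 19                           -- Golden number - 1
SET @c = @year / 100                          -- century
SET @h = (@c - (@c/4) - ((8*@c + 13)/25) + (19*@g) + 15) % 30
SET @i = @h - ((@h/28) * (1 - (29/(@h + 1)) * ((21 - @g)/11)))
SET @j = (@year + (@year/4) + @i + 2 - @c + (@c/4)) % 7
SET @l = @i - @j
SET @Easter_month = 3 + ((@l + 40)/44)
SET @Easter_day   = @l + 28 - 31*(@Easter_month/4)

-- Style 112 (yyyymmdd) does not depend on language or DATEFORMAT settings.
SET @Easter_date = CONVERT(datetime,
        CAST(@year*10000 + @Easter_month*100 + @Easter_day AS char(8)), 112)

SELECT @Easter_date                   AS easter_day      -- 2009: 12 April
     , DATEADD(dd,  1, @Easter_date)  AS easter_monday   -- 2009: 13 April
     , DATEADD(dd, 39, @Easter_date)  AS ascension_day   -- 2009: 21 May
     , DATEADD(dd, 49, @Easter_date)  AS whit_sunday     -- 2009: 31 May
     , DATEADD(dd, 50, @Easter_date)  AS whit_monday     -- 2009: 1 June
```

All divisions are integer divisions, which T-SQL applies automatically to int operands; that is what makes terms such as (@h/28) act as 0/1 switches.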


I used the following calculation in a stored procedure to calculate the Public Holiday Indicator in a date dimension:

Easter calculation
Source: Claus Tøndering, Frequently Asked Questions about Calendars, 25 September 2000


Prerequisites:

CREATE SCHEMA BSM
GO

CREATE TABLE [bsm].[DIM_date]
( [date_id] [int] NOT NULL
, [date] [datetime] NOT NULL
, [day_nr] [int] NOT NULL
, [month_nr] [int] NOT NULL
, [month] [varchar](10) NOT NULL
, [month_key] [int] NOT NULL
, [quarter_nr] [int] NOT NULL
, [quarter] [varchar](2) NOT NULL
, [quarter_key] [int] NOT NULL
, [year] [int] NOT NULL
, [week_nr] [int] NOT NULL
, [week] [varchar](7) NOT NULL
, [week_key] [int] NOT NULL
, [weekday_nr] [int] NOT NULL
, [day] [varchar](9) NOT NULL
, [public_holiday] [varchar](25) NOT NULL
, [weekend] [varchar](1) NOT NULL
, [working_day] [varchar](1) NOT NULL
, [nr_working_days] [int] NOT NULL
, [nr_days] [int] NOT NULL
, CONSTRAINT [PK_DIM_date_id] PRIMARY KEY CLUSTERED
( [date_id] ASC)WITH (IGNORE_DUP_KEY = OFF)
ON [PRIMARY]) ON [PRIMARY]


The function below is needed because SQL Server 2005 does not support ISO 8601 week numbers (only United States week numbers). You do not need it in SQL Server 2008, where DATEPART accepts the isowk (ISO_WEEK) datepart. Oracle 9i and higher support it through the 'IW' format model.

CREATE FUNCTION [bsm].[ISO8601WeekNr] (@Date datetime)
RETURNS INT
AS
BEGIN
DECLARE @ISO8601Week int

DECLARE @day int
DECLARE @month int
DECLARE @year int
DECLARE @a int
DECLARE @y int
DECLARE @m int
DECLARE @JD int -- Julian day number
DECLARE @d4 int
DECLARE @L int
DECLARE @d1 int

SET @year = YEAR(@Date)

SET @month = MONTH(@Date)
SET @day = DAY(@Date)
SET @a = (14-@month)/12
SET @y = @year +4800 -@a
SET @m = @month + (12*@a) - 3
SET @JD = @day + ( ((153*@m)+2)/5 ) + (365*@y) + (@y/4) - (@y/100) + (@y/400) - 32045
SET @d4 = (@JD + 31741 - (@JD % 7)) % 146097 % 36524 % 1461
SET @L = @d4/1460
SET @d1 = ((@d4-@L)%365)+@L
SET @ISO8601Week = (@d1/7) +1
RETURN @ISO8601Week

END
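
A quick way to check the function is to compare it with the built-in ISO week on a few dates around a year boundary. This is only a sketch: it assumes the function above has been created, and the DATEPART(isowk, ...) column requires SQL Server 2008 or later (leave it out on SQL Server 2005). The test dates are my own picks.

```sql
-- Compare the user-defined function with the built-in ISO week number.
SELECT d.test_date
     , bsm.ISO8601WeekNr(d.test_date) AS week_nr_function
     , DATEPART(isowk, d.test_date)   AS week_nr_builtin   -- SQL Server 2008+
FROM (           SELECT CAST('20031229' AS datetime) AS test_date  -- Monday, ISO week 1 of 2004
       UNION ALL SELECT CAST('20050101' AS datetime)               -- Saturday, ISO week 53 of 2004
       UNION ALL SELECT CAST('20090427' AS datetime)               -- Monday, ISO week 18 of 2009
     ) d
```

The first two dates are exactly the cases the load script corrects in week_key: a week 1 that starts in December and a week 53 that runs into January.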

The load script:

CREATE PROCEDURE [bsm].[pLoadDIM_Date] AS
BEGIN
DECLARE @d int -- day counter
DECLARE @date datetime
DECLARE @day_nr int
DECLARE @month varchar(25)
DECLARE @year int
DECLARE @yearvarchar varchar(4)
DECLARE @quarter_nr int
DECLARE @month_nr int
DECLARE @day varchar(9)
DECLARE @g int -- Golden number (-1): Years with the same Golden number have 'New Moon' on the same date

DECLARE @c int -- century (year/100)
DECLARE @h int -- number of days since 'New Moon'
DECLARE @i int -- number of days from March, 21 to the 'Paschal Full Moon'
DECLARE @j int -- weekday of the 'Paschal Full Moon' (0 = Sunday)
DECLARE @l int -- number of days between March, 21 and Sunday on or before 'Paschal Full Moon' (between -6 and 28)
DECLARE @Easter_month int
DECLARE @Easter int
DECLARE @Easter_date datetime
DECLARE @public_holiday int
DECLARE @x int -- working days counter
DECLARE @y int -- indicator in case week 53 from year x is in year x+1 (100=yes;0=no)

DECLARE @z int -- indicator in case week 1 from year x is in year x-1 (100=yes;0=no)

DECLARE @weekday_nr int

truncate table bsm.dim_date

SET DATEFIRST 1 -- First day of week is Monday (Default is Sunday)
SET @d = -5475 -- First date is current date minus 5475 days
SET @x = 0

WHILE @d <5475

BEGIN

SET @date= DATEADD(dd, @d, GETDATE())
SET @year =YEAR(@date)
SET @yearvarchar = CONVERT(VARCHAR,DATEPART(yy,DATEADD(dd, @d, GETDATE())))

SET @quarter_nr = DATEPART(qq,DATEADD(dd, @d, GETDATE()))
SET @month_nr = DATEPART(mm, DATEADD(dd, @d, GETDATE()))
SET @weekday_nr = DATEPART(dw, DATEADD(dd, @d, GETDATE()))
SET @day_nr = DATEPART(dd, DATEADD(dd, @d, GETDATE()))
SET @day = DATENAME(dw,DATEADD(dd, @d, GETDATE()))
SET @month = DATENAME(mm,DATEADD(dd, @d, GETDATE()))
SET @g = @year%19
SET @c = @year/100
SET @h = (@c-(@c/4)-((8*@c+13)/25)+(19*@g)+15)%30
SET @i = @h-((@h/28)*(1-(29/(@h+1))*((21-@g)/11))) -- the factor ((21-@g)/11) belongs inside the (1 - ...) term
SET @j = (@year+(@year/4)+@i+2-@c+(@c/4))%7
SET @l = @i-@j
SET @Easter_month = 3+((@l+40)/44)
SET @Easter = @l+28-31*(@Easter_month/4)
SET @Easter_date = CONVERT(CHAR(10),(CONVERT(DATETIME,CAST(@Easter_month AS CHAR)+'/'+CAST(@Easter AS CHAR)+'/'+CAST(DATEPART(yy, DATEADD(dd,@d, GETDATE())) AS CHAR))),110)

IF @weekday_nr = 6 OR @weekday_nr = 7 OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-1, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-49, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-50, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-39, GETDATE()),110) OR (@day_nr= 1 AND @month_nr = 1) OR (@day_nr = 30 AND @month_nr = 4) OR (@day_nr = 25 AND @month_nr = 12) OR (@day_nr = 26 AND @month_nr = 12)
SET @public_holiday = 1 ELSE select @public_holiday = 0

IF @public_holiday =1 or @weekday_nr in(6,7)
SET @x = @x ELSE
SET @x =@x+1


IF bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) >51 and @month_nr = 1
SET @y=100 ELSE SET @y=0 -- if a week belongs partly to the next year (2004.53 belongs partly to 2005) set the indicator to 100, so that after subtracting 100 from 200553 the key is still 200453.

IF bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) =1 and @month_nr = 12
SET @z=100 ELSE SET @z=0 -- if week 1 starts in the previous year (the first week of 2004 starts in 2003) set the indicator to 100, so that after adding 100 to 200301 the key is 200401.

INSERT INTO bsm.dim_date
SELECT ((DATEPART(yy, DATEADD(dd, @d, GETDATE())) * 100) + DATEPART(mm,DATEADD(dd, @d, GETDATE()))) * 100 + @day_nr AS date_id
, CAST(CONVERT(VARCHAR, @date, 111) AS DATETIME) as date
, @day_nr AS day_nr
, DATEPART(mm, DATEADD(dd, @d, GETDATE())) AS month_nr
, @month AS month
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+ DATEPART(mm, DATEADD(dd, @d, GETDATE())) AS month_key
, DATEPART(qq,DATEADD(dd, @d, GETDATE())) AS quarter_nr
, 'Q'+CONVERT(VARCHAR,DATEPART(qq,DATEADD(dd, @d, GETDATE()))) as quarter
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+ DATEPART(qq,DATEADD(dd, @d, GETDATE())) AS quarter_key
, @year AS year_nr
, bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) AS week_nr
, 'Week '+CONVERT(VARCHAR,bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101))) AS week
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+CAST (bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) AS VARCHAR)-@y+@z AS week_key
, @weekday_nr AS weekday_nr
, @day AS day
, CASE WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d, GETDATE()),110) THEN 'Easter Day'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-1, GETDATE()),110) THEN 'Easter Monday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-49, GETDATE()),110) THEN 'Whit Sunday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-50, GETDATE()),110) THEN 'Whit Monday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-39, GETDATE()),110) THEN 'Ascension day'
WHEN @day_nr= 1 AND @month_nr = 1 THEN 'New Year-s Day' WHEN @day_nr = 30 AND @month_nr = 4 THEN 'Dutch Queen-s Day'
WHEN @day_nr = 25 AND @month_nr = 12 THEN 'Christmas Day' WHEN @day_nr = 26 AND @month_nr = 12 THEN 'Boxing Day' ELSE '-'
END AS public_holiday
, CASE WHEN @weekday_nr in(6,7) THEN 'Y' ELSE 'N' END AS weekend
, CASE WHEN @public_holiday = 1 OR @weekday_nr in(6,7) THEN 'N' ELSE 'Y' END AS working_day
, @x AS nr_working_days
, @d+5476 AS nr_days

SET @d = @d + 1

END

END
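
Once the dimension is loaded, the indicators can be used like this. A minimal sketch, assuming the table and procedure above exist; the date range in the second query is just an example.

```sql
-- (Re)load the date dimension
EXEC bsm.pLoadDIM_Date

-- Weeks that contain a public holiday on a weekday:
-- candidates to exclude from trend analysis charts.
SELECT week_key
     , MIN([date]) AS first_holiday
     , COUNT(*)    AS holidays_on_weekdays
FROM bsm.DIM_date
WHERE public_holiday <> '-'
  AND weekend = 'N'
GROUP BY week_key
ORDER BY week_key

-- Number of working days in a period, e.g. for productivity figures.
SELECT COUNT(*) AS working_days
FROM bsm.DIM_date
WHERE [date] BETWEEN '20090401' AND '20090430'
  AND working_day = 'Y'
```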


Friday 24 April 2009

Monitor / Audit your Data Warehouse load

One of the most important means of monitoring your Data Warehouse load is getting insight into the status of running jobs, the duration of the jobs (over time) and which jobs don't seem to end. Executing a simple update script on an admin table before the start and after the end of each job can result in a table like:

RunID | Process | Action | Object | New | Updated | Deleted | Load_start | Load_end
1 | EDW - HRM | Publish HRM_cube | HRM_Montly.mdc | | | | 4/15/09 8:09:38 AM | 4/15/09 8:10:02 AM
1 | EDW - HRM | FTP HRM cube | HRM_Montly.mdc | | | | 4/15/09 8:09:22 AM | 4/15/09 8:09:38 AM
1 | EDW - HRM | Create HRM cube | HRM_Montly.mdc | | | | 4/15/09 8:09:21 AM | 4/15/09 8:09:22 AM
1 | EDW - HRM | Load new facts into aggregate table in Business Model | BSM.FCT_Hour_Monthly | 11408 | 0 | 0 | 4/15/09 8:09:21 AM | 4/15/09 8:09:21 AM
1 | EDW - HRM | Delete existing facts from aggregate table in Business Model | BSM.FCT_Hour_Monthly | 0 | 0 | 344 | 4/15/09 8:09:03 AM | 4/15/09 8:09:21 AM
1 | EDW - HRM | Load Dimension Supplier into BSM (Refresh) | BSM.DIM_Supplier | 3 | 0 | 0 | 4/15/09 8:09:02 AM | 4/15/09 8:09:03 AM
1 | EDW - HRM | Load Dimension Organisational Unit into BSM (Refresh) | BSM.DIM_OrganisationalUnit | 23 | 0 | 0 | 4/15/09 8:09:00 AM | 4/15/09 8:09:02 AM
1 | EDW - HRM | Load Dimension Employee into BSM (Refresh) | BSM.DIM_Employee | 45632 | 0 | 0 | 4/15/09 7:59:20 AM | 4/15/09 8:09:00 AM
1 | EDW - HRM | Load Date Dimension into Business Model (Refresh) | BSM.DIM_Date | 365 | 0 | 0 | 4/15/09 7:59:18 AM | 4/15/09 7:59:20 AM
1 | EDW - HRM | Load new and changed hour facts into ODS | ODS.FCT_Hour | 45632 | 0 | 0 | 4/15/09 7:59:16 AM | 4/15/09 7:59:18 AM
1 | EDW - HRM | Load new and changed targets into ODS | ODS.FCT_Target | 12 | 0 | 0 | 4/15/09 7:59:15 AM | 4/15/09 7:59:16 AM
1 | EDW - HRM | Load new and changed suppliers into ODS | ODS.DIM_Supplier | 0 | 0 | 0 | 4/15/09 7:59:08 AM | 4/15/09 7:59:15 AM
1 | EDW - HRM | Load new and changed organisational units into ODS | ODS.DIM_OrganisationalUnit | 0 | 1 | 0 | 4/15/09 7:58:58 AM | 4/15/09 7:59:08 AM
1 | EDW - HRM | Load new and changed employees into ODS | ODS.DIM_Employee | 1 | 2 | 0 | 4/15/09 7:58:55 AM | 4/15/09 7:58:58 AM
1 | EDW - HRM | Transformation step: create hierarchy for organisational unit | STG.OrganisationalUnit | 0 | 23 | 0 | 4/15/09 7:58:52 AM | 4/15/09 7:58:53 AM
1 | EDW - HRM | Transformation step: calculate productivity | STG.Hour | 0 | 45632 | 0 | 4/15/09 7:58:49 AM | 4/15/09 7:58:52 AM
1 | EDW - HRM | Lookup surrogate key supplier | STG.Supplier | 0 | 3 | 0 | 4/15/09 7:58:23 AM | 4/15/09 7:58:49 AM
1 | EDW - HRM | Lookup surrogate key organisational unit | STG.OrganisationalUnit | 0 | 23 | 0 | 4/15/09 7:51:56 AM | 4/15/09 7:58:04 AM
1 | EDW - HRM | Lookup surrogate key employee | STG.employee | 0 | 2456 | 0 | 4/15/09 7:51:28 AM | 4/15/09 7:51:56 AM
1 | EDW - HRM | Detect corrected hours | STG.Hour | 0 | 35 | 0 | 4/15/09 7:51:24 AM | 4/15/09 7:51:28 AM
1 | EDW - HRM | Detect changed supplier attributes | STG.Hour | 0 | 0 | 0 | 4/15/09 7:51:22 AM | 4/15/09 7:51:24 AM
1 | EDW - HRM | Detect changed organisational unit attributes | STG.OrganisationalUnit | 0 | 1 | 0 | 4/15/09 7:51:18 AM | 4/15/09 7:51:22 AM
1 | EDW - HRM | Detect changed employee attributes | STG.Employee | 2 | 0 | 0 | 4/15/09 7:51:15 AM | 4/15/09 7:51:18 AM
1 | EDW - HRM | Extract Monthly Targets | STG.Target | 12 | 0 | 0 | 4/15/09 7:50:54 AM | 4/15/09 7:51:15 AM
1 | EDW - HRM | Extract Fact Hours | STG.Hour | 45632 | 0 | 0 | 4/15/09 7:50:49 AM | 4/15/09 7:50:52 AM
1 | EDW - HRM | Extract Dimension Supplier | STG.Supplier | 3 | 0 | 0 | 4/15/09 7:50:17 AM | 4/15/09 7:50:49 AM
1 | EDW - HRM | Extract Dimension Organisational Unit | STG.OrganisationalUnit | 23 | 0 | 0 | 4/15/09 7:49:21 AM | 4/15/09 7:50:17 AM
1 | EDW - HRM | Extract Dimension Employee | STG.Employee | 2456 | 0 | 0 | 4/15/09 7:49:20 AM | 4/15/09 7:49:21 AM



Firstly, create an admin table:

-- Create MetaData layer

create schema mda
GO

-- Create Admin table
-- (the text columns are 255 wide, like the procedure parameters: several action descriptions are longer than 50 characters)

CREATE TABLE [mda].[admin_run](
[Runid] [int] NULL,
[Process] [varchar](255) ,
[Action] [varchar](255) ,
[Object] [varchar](255) ,
[Inserted] [int] NULL,
[Updated] [int] NULL,
[Deleted] [int] NULL,
[LoadStart] [datetime] NULL,
[LoadEnd] [datetime] NULL
)
GO

Secondly, create stored procedures to update the table:

CREATE PROCEDURE [mda].[PUpdate_Admin_Run_Start]

(
@Runid int
, @Process varchar(255)
, @Action varchar(255)
, @Object varchar(255)
)

AS

BEGIN

insert into mda.admin_run
(
RunID
, Process
, Action
, Object
, Inserted
, Updated
, Deleted
, LoadStart
)
values
(
@RunID
, @Process
, @Action
, @Object
, 0
, 0
, 0
, getdate()
)

END
GO

CREATE PROCEDURE [mda].[PUpdate_Admin_Run_End]
( @RunId int
, @Action varchar(255)
, @Records_New int = 0
, @Records_Updated int = 0
, @Records_Deleted int = 0 )
AS
BEGIN
update mda.admin_run
set LoadEnd = getdate()
,Inserted = @Records_new
,Updated = @Records_updated
,Deleted = @Records_deleted
where RunId = @RunId
and LoadEnd is null
and Action = @Action
END
GO

Below you find an example script that updates the admin table:

begin
declare @Records_inserted int
declare @Records_updated int
declare @Records_deleted int
declare @Runid int
declare @Process varchar(255)
declare @Action varchar(255)
declare @Object varchar(255)

-- continue with the highest existing RunID, or start with 1 if the table is empty
select @Runid = isnull(max(Runid), 1) from mda.admin_run

set @Records_updated = 0
set @Records_deleted = 0
set @Process = 'EDW - HRM'
set @Action = 'Extract Dimension Employee'
set @Object = 'STG.Employee'
exec mda.PUpdate_Admin_Run_Start @Runid,@Process,@Action,@Object
--load script (placeholder statement)
select getdate()

set @Records_inserted = @@rowcount
exec mda.PUpdate_Admin_Run_End @Runid,@Action,@Records_inserted, @Records_updated, @Records_deleted
end
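
With the table filled, the monitoring questions from the start of this post become simple queries. A minimal sketch, assuming the mda.admin_run table defined above; the column aliases are my own.

```sql
-- Steps that were started but have not (yet) ended
SELECT Runid
     , Process
     , Action
     , Object
     , LoadStart
     , DATEDIFF(mi, LoadStart, GETDATE()) AS minutes_running
FROM mda.admin_run
WHERE LoadEnd IS NULL
ORDER BY LoadStart

-- Duration of every finished step, to compare runs over time
SELECT Runid
     , Action
     , Object
     , DATEDIFF(ss, LoadStart, LoadEnd) AS duration_seconds
FROM mda.admin_run
WHERE LoadEnd IS NOT NULL
ORDER BY Runid, LoadStart
```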

Tuesday 7 April 2009

Controlled update in SQL Server

Whenever you experience slow updates or timeout errors while updating columns, you can update them incrementally with the script below.
The WAITFOR DELAY is added to give other processes (including a transaction log backup) the chance to run. The script uses 'UPDATE_COLUMN1 IS NULL' to recognise rows that still have to be updated, so it only works if the source column that fills it cannot contain NULL values. If it can, you have to add a marker column (e.g. 'updated') and set it to a certain value (e.g. '1') in the same update. Use this column to determine the remaining row count ('where updated <> 1') and as the filter in the 'where clause' of the update query ('and updated <> 1').

SQL 2005 Syntax:

DECLARE @Batch int
DECLARE @Rownum int

DECLARE @PrevRownum int

SET @Batch = 10000
SET @Rownum = 1

WHILE (@Rownum > 0)

BEGIN

UPDATE TOP(@Batch) UPDATE_TABLE
SET
UPDATE_COLUMN1 = S.SOURCE_COLUMN1
, UPDATE_COLUMN2 = S.SOURCE_COLUMN2
FROM SOURCE_TABLE S
WHERE UPDATE_TABLE.ID = S.ID
AND UPDATE_TABLE.UPDATE_COLUMN1 IS NULL


SET @Prevrownum = @Rownum

SELECT @Rownum = COUNT(*) FROM UPDATE_TABLE
WHERE UPDATE_COLUMN1 IS NULL


IF @Prevrownum = @Rownum
set @ROWNUM = 0

WAITFOR DELAY '000:00:05'

END
RETURN
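
The variant with a marker column, described in the text above, could look like the sketch below. It is an illustration under assumptions: the table and column names are the same placeholders as in the script above, the column 'updated' is assumed to be an int defined as NOT NULL with default 0, and the loop stops on @@ROWCOUNT instead of counting the remaining rows.

```sql
DECLARE @Batch int
DECLARE @Rows int

SET @Batch = 10000
SET @Rows = 1

WHILE (@Rows > 0)
BEGIN
    -- Update one batch and mark the rows as done in the same statement
    UPDATE TOP (@Batch) U
    SET   U.UPDATE_COLUMN1 = S.SOURCE_COLUMN1
        , U.UPDATE_COLUMN2 = S.SOURCE_COLUMN2
        , U.updated = 1
    FROM UPDATE_TABLE U
    INNER JOIN SOURCE_TABLE S
            ON U.ID = S.ID
    WHERE U.updated <> 1

    -- Zero rows updated means nothing is left to do
    SET @Rows = @@ROWCOUNT

    -- Give other processes (e.g. a transaction log backup) room to run
    WAITFOR DELAY '00:00:05'
END
```

Because 'updated' is NOT NULL, the filter 'updated <> 1' also finds rows whose source values are NULL, which is exactly the case the original script cannot handle.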