If you visualize data to a line chart with 'zero-values' (or in my case, they do have values but they are excluded because of the fact that they belong to weeks with a public holiday) and you need to create a distinction between the years, there will be gaps in the line. See the graph below:
I needed to create a graph starting in week 44 (the start of the impact of the financial crises) without any gaps. The periods (200844-200943 and 200744-200843) had to have different to colours. If you create a chart without series, there is no distinction anymore between
the years.
Query:
Tabular data: (the data item 'Serie'' is defined in Report Studio).
To place all historical data on one line, I added a week_key column which is the concatination of the year and the week. I added the 'serie column' as a serie (instead of year). In the properties of the level item of the week_key I set 'Sort' on 'ascending':
It didn't work immediately as the 'key property' of the level item 'week_key' had to be set on 'Yes':
After this, the result was:
Another usage of this:
Query:
Monday, 27 April 2009
Sunday, 26 April 2009
Public holiday indicator in your Date Dimension
For a lot of reasons we want to determine whether a day is a working day or not.
For example: productivity figures are only valid if we exclude the bank / public holidays.
Trend analysis reports show representative charts if we exclude weeks with signifcant low values (as a public holiday is in the week).
No RDBMS supports public holidays, not static ones (like New Year), neither will dynamical holidays be supported (like Christian holidays which depend on the date of Easter Day).
Whit Monday is seven weeks after Easter Day and Ascension day is 10 days before Whit Monday.
So we only have to calculate the date of Easter Day. This is a complex calculation as it is originated on a non-accurate version of the Hebrewian calendar. There is a correct formula: Jewish months start on Full Moon. The 14th or 15th day must follow immediately after Full Moon. That's why the decision has been made that Easter Day is the sunday after the first Full Moon after ‘Vernal Equinox’ in spring (the crusification of Jesus). ‘Vernal Equinox’ is on March, 21.
Full Moon before Easter Day is being called: 'Paschal Full Moon'. This conflicts with the date of the real Full Moon (astronomical). As 'Paschal Full Moon' is not variabel, Easter Day can be calculated (hence all other Christian holidays).
I used the following calculation in a stored procedure to calculate the Public Holiday Indicator in a date dimension:
Source: Claus Tøndering, Frequently Asked Questions about Calendars, 25 september 2000
Prerequisites:
CREATE SCHEMA BSM
CREATE TABLE [bsm].[DIM_date]
( [date_id] [int] NOT NULL
, [date] [datetime] NOT NULL
, [day_nr] [int] NOT NULL
, [month_nr] [int] NOT NULL
, [month] [varchar](10) NOT NULL
, [month_key] [int] NOT NULL
, [quarter_nr] [int] NOT NULL
, [quarter] [varchar](2) NOT NULL
, [quarter_key] [int] NOT NULL
, [year] [int] NOT NULL
, [week_nr] [int] NOT NULL
, [week] [varchar](7) NOT NULL
, [week_key] [int] NOT NULL
, [weekday_nr] [int] NOT NULL
, [day] [varchar](9) NOT NULL
, [public_holiday] [varchar](25) NOT NULL
, [weekend] [varchar](1) NOT NULL
, [working_day] [varchar](1) NOT NULL
, [nr_working_days] [int] NOT NULL
, [nr_days] [int] NOT NULL
, CONSTRAINT [PK_DIM_date_id] PRIMARY KEY CLUSTERED
( [date_id] ASC)WITH (IGNORE_DUP_KEY = OFF)
ON [PRIMARY]) ON [PRIMARY]
The function below is needed as SQL SERVER 2005 doesn't support ISO8601 weeknumber (only United States weeknumbers). You do not need this in SQL SERVER 2008. Oracle 9i and higher support it by using the 'IW'-schema.
CREATE FUNCTION [bsm].[ISO8601WeekNr] (@Date datetime)
RETURNS INT
AS
BEGIN
DECLARE @ISO8601Week int
DECLARE @day int
DECLARE @month int
DECLARE @year int
DECLARE @a int
DECLARE @y int
DECLARE @m int
DECLARE @JD int -- Julian day number
DECLARE @d4 int
DECLARE @L int
DECLARE @d1 int
SET @year = YEAR(@Date)
SET @month = MONTH(@Date)
SET @day = DAY(@Date)
SET @a = (14-@month)/12
SET @y = @year +4800 -@a
SET @m = @month + (12*@a) - 3
SET @JD = @day + ( ((153*@m)+2)/5 ) + (365*@y) + (@y/4) - (@y/100) + (@Y/400) - 32045
SET @d4 = (@JD + 31741 - (@JD % 7)) % 146097 % 36524 %1461SET @L = @d4/1460
SET @d1 = ((@d4-@L)%365)+@L
SET @ISO8601Week = (@d1/7) +1
RETURN @ISO8601Week
END
The load script:
CREATE PROCEDURE [bsm].[pLoadDIM_Date] AS
BEGIN
DECLARE @d int -- day counter
DECLARE @date datetime
DECLARE @day_nr int
DECLARE @month varchar(25)
DECLARE @year int
DECLARE @yearvarchar varchar(4)
DECLARE @quarter_nr int
DECLARE @month_nr int
DECLARE @day varchar(9)
DECLARE @g int -- Golden number (-1): Years with the same Golden number have 'New Moon' on the same date
DECLARE @c int -- year/4
DECLARE @h int -- number of days since 'New Moon'
DECLARE @i int -- number of days since March, 21 and 'Paschal Full Maan'
DECLARE @j int -- 'Paschal Full Moon day'
DECLARE @l int -- number of days between March, 21 and Sunday on or before 'Paschal Full Moon' (between -6 and 28)
DECLARE @Easter_month int
DECLARE @Easter int
DECLARE @Easter_date datetime
DECLARE @public_holiday int
DECLARE @x int -- working days counter
DECLARE @y int -- indicator in case week 53 from year x is in year x+1 (100=yes;0=no)
DECLARE @z int -- indicator in case week 1 from year x is in year x-1 (100=yes;0=no)
DECLARE @weekday_nr int
truncate table bsm.dim_date
SET DATEFIRST 1 -- First day of week is Monday (Default is Sunday)
SET @d = -5475 -- First date is current date minus 5475 days
SET @x = 0
WHILE @d <5475
BEGIN
SET @date= DATEADD(dd, @d, GETDATE())
SET @year =YEAR(@date)
SET @yearvarchar = CONVERT(VARCHAR,DATEPART(yy,DATEADD(dd, @d, GETDATE())))
SET @quarter_nr = DATEPART(qq,DATEADD(dd, @d, GETDATE()))
SET @month_nr = DATEPART(mm, DATEADD(dd, @d, GETDATE()))
SET @weekday_nr = DATEPART(dw, DATEADD(dd, @d, GETDATE()))
SET @day_nr = DATEPART(dd, DATEADD(dd, @d, GETDATE()))
SET @day = DATENAME(dw,DATEADD(dd, @d, GETDATE()))
SET @month = DATENAME(mm,DATEADD(dd, @d, GETDATE()))
SET @g = @year%19 SET @c = @year/100SET @h = (@c-(@c/4)-((8*@c+13)/25)+(19*@g)+15)%30
SET @i = @h-((@h/28)*(1-(29/(@h+1)))*((21-@g)/11))
SET @j = (@year+(@year/4)+@i+2-@c+(@c/4))%7SET @l = @i-@j
SET @Easter_month = 3+((@l+40)/44)
SET @Easter = @l+28-31*(@Easter_month/4)
SET @Easter_date = CONVERT(CHAR(10),(CONVERT(DATETIME,CAST(@Easter_month AS CHAR)+'/'+CAST(@Easter AS CHAR)+'/'+CAST(DATEPART(yy, DATEADD(dd,@d, GETDATE())) AS CHAR))),110)
IF @weekday_nr = 6 OR @weekday_nr = 7 OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-1, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-49, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-50, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-39, GETDATE()),110) OR (@day_nr= 1 AND @month_nr = 1) OR (@day_nr = 30 AND @month_nr = 4) OR (@day_nr = 25 AND @month_nr = 12) OR (@day_nr = 26 AND @month_nr = 12)
SET @public_holiday = 1 ELSE select @public_holiday = 0
IF @public_holiday =1 or @weekday_nr in(6,7)
SET @x = @x ELSE
SET @x =@x+1
IF bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) >51 and @month_nr = 1
SET @y=100 ELSE SET @y=0 -- if a week belongs party to the next year (2004.53 belongs party to 2005) set indicator on 100 so -- if you subtract 100 of 200553, week 200553 is still week 200453.
IF bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) =1 and @month_nr = 12
SET @z=100 ELSE SET @z=0 -- if a week is in last year (first week of 2004 belongs partly to 2003) set indicator on 100 so -- if you add 100 to 200301 week 200301, week 200301 is still 200401
INSERT INTO bsm.dim_date
SELECT ((DATEPART(yy, DATEADD(dd, @d, GETDATE())) * 100) + DATEPART(mm,DATEADD(dd, @d, GETDATE()))) * 100 + @day_nr AS date_id
, CAST(CONVERT(VARCHAR, @date, 111) AS DATETIME) as date
, @day_nr AS day_nr
, DATEPART(mm, DATEADD(dd, @d, GETDATE())) AS month_nr
, @month AS month
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+ DATEPART(mm, DATEADD(dd, @d, GETDATE())) AS month_key
, DATEPART(qq,DATEADD(dd, @d, GETDATE())) AS quarter_nr
, 'Q'+CONVERT(VARCHAR,DATEPART(qq,DATEADD(dd, @d, GETDATE()))) as quarter
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+ DATEPART(qq,DATEADD(dd, @d, GETDATE())) AS quarter_key
, @year AS year_nr
, bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) AS week_nr
, 'Week '+CONVERT(VARCHAR,bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101))) AS week
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+CAST (bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) AS VARCHAR)-@y+@z AS week_key
, @weekday_nr AS weekday_nr
, @day AS day
, CASE WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d, GETDATE()),110) THEN 'Easter Day'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-1, GETDATE()),110) THEN'Easter Monday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-49, GETDATE()),110) THEN 'Whit Sunday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-50, GETDATE()),110) THEN 'Whit Monday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-39, GETDATE()),110) THEN 'Ascension day'
WHEN @day_nr= 1 AND @month_nr = 1 THEN 'New Year-s Day' WHEN @day_nr = 30 AND @month_nr = 4 THEN 'Dutch Queen-s Day'
WHEN @day_nr = 25 AND @month_nr = 12 THEN 'Christmas Day' WHEN @day_nr = 26 AND @month_nr = 12 THEN 'Boxing Day' ELSE '-'
END AS public_holiday
, CASE WHEN @weekday_nr in(6,7) THEN 'Y' ELSE 'N' END AS weekend
, CASE WHEN @public_holiday = 1 OR @weekday_nr in(6,7) THEN 'N' ELSE 'Y' END AS working_day
, @x AS nr_working_days
, @d+5476 AS nr_days
SET @d = @d + 1
END
END
For example: productivity figures are only valid if we exclude the bank / public holidays.
Trend analysis reports show representative charts if we exclude weeks with signifcant low values (as a public holiday is in the week).
No RDBMS supports public holidays, not static ones (like New Year), neither will dynamical holidays be supported (like Christian holidays which depend on the date of Easter Day).
Whit Monday is seven weeks after Easter Day and Ascension day is 10 days before Whit Monday.
So we only have to calculate the date of Easter Day. This is a complex calculation as it is originated on a non-accurate version of the Hebrewian calendar. There is a correct formula: Jewish months start on Full Moon. The 14th or 15th day must follow immediately after Full Moon. That's why the decision has been made that Easter Day is the sunday after the first Full Moon after ‘Vernal Equinox’ in spring (the crusification of Jesus). ‘Vernal Equinox’ is on March, 21.
Full Moon before Easter Day is being called: 'Paschal Full Moon'. This conflicts with the date of the real Full Moon (astronomical). As 'Paschal Full Moon' is not variabel, Easter Day can be calculated (hence all other Christian holidays).
I used the following calculation in a stored procedure to calculate the Public Holiday Indicator in a date dimension:
Source: Claus Tøndering, Frequently Asked Questions about Calendars, 25 september 2000
Prerequisites:
CREATE SCHEMA BSM
CREATE TABLE [bsm].[DIM_date]
( [date_id] [int] NOT NULL
, [date] [datetime] NOT NULL
, [day_nr] [int] NOT NULL
, [month_nr] [int] NOT NULL
, [month] [varchar](10) NOT NULL
, [month_key] [int] NOT NULL
, [quarter_nr] [int] NOT NULL
, [quarter] [varchar](2) NOT NULL
, [quarter_key] [int] NOT NULL
, [year] [int] NOT NULL
, [week_nr] [int] NOT NULL
, [week] [varchar](7) NOT NULL
, [week_key] [int] NOT NULL
, [weekday_nr] [int] NOT NULL
, [day] [varchar](9) NOT NULL
, [public_holiday] [varchar](25) NOT NULL
, [weekend] [varchar](1) NOT NULL
, [working_day] [varchar](1) NOT NULL
, [nr_working_days] [int] NOT NULL
, [nr_days] [int] NOT NULL
, CONSTRAINT [PK_DIM_date_id] PRIMARY KEY CLUSTERED
( [date_id] ASC)WITH (IGNORE_DUP_KEY = OFF)
ON [PRIMARY]) ON [PRIMARY]
The function below is needed as SQL SERVER 2005 doesn't support ISO8601 weeknumber (only United States weeknumbers). You do not need this in SQL SERVER 2008. Oracle 9i and higher support it by using the 'IW'-schema.
CREATE FUNCTION [bsm].[ISO8601WeekNr] (@Date datetime)
RETURNS INT
AS
BEGIN
DECLARE @ISO8601Week int
DECLARE @day int
DECLARE @month int
DECLARE @year int
DECLARE @a int
DECLARE @y int
DECLARE @m int
DECLARE @JD int -- Julian day number
DECLARE @d4 int
DECLARE @L int
DECLARE @d1 int
SET @year = YEAR(@Date)
SET @month = MONTH(@Date)
SET @day = DAY(@Date)
SET @a = (14-@month)/12
SET @y = @year +4800 -@a
SET @m = @month + (12*@a) - 3
SET @JD = @day + ( ((153*@m)+2)/5 ) + (365*@y) + (@y/4) - (@y/100) + (@Y/400) - 32045
SET @d4 = (@JD + 31741 - (@JD % 7)) % 146097 % 36524 %1461SET @L = @d4/1460
SET @d1 = ((@d4-@L)%365)+@L
SET @ISO8601Week = (@d1/7) +1
RETURN @ISO8601Week
END
The load script:
CREATE PROCEDURE [bsm].[pLoadDIM_Date] AS
BEGIN
DECLARE @d int -- day counter
DECLARE @date datetime
DECLARE @day_nr int
DECLARE @month varchar(25)
DECLARE @year int
DECLARE @yearvarchar varchar(4)
DECLARE @quarter_nr int
DECLARE @month_nr int
DECLARE @day varchar(9)
DECLARE @g int -- Golden number (-1): Years with the same Golden number have 'New Moon' on the same date
DECLARE @c int -- year/4
DECLARE @h int -- number of days since 'New Moon'
DECLARE @i int -- number of days since March, 21 and 'Paschal Full Maan'
DECLARE @j int -- 'Paschal Full Moon day'
DECLARE @l int -- number of days between March, 21 and Sunday on or before 'Paschal Full Moon' (between -6 and 28)
DECLARE @Easter_month int
DECLARE @Easter int
DECLARE @Easter_date datetime
DECLARE @public_holiday int
DECLARE @x int -- working days counter
DECLARE @y int -- indicator in case week 53 from year x is in year x+1 (100=yes;0=no)
DECLARE @z int -- indicator in case week 1 from year x is in year x-1 (100=yes;0=no)
DECLARE @weekday_nr int
truncate table bsm.dim_date
SET DATEFIRST 1 -- First day of week is Monday (Default is Sunday)
SET @d = -5475 -- First date is current date minus 5475 days
SET @x = 0
WHILE @d <5475
BEGIN
SET @date= DATEADD(dd, @d, GETDATE())
SET @year =YEAR(@date)
SET @yearvarchar = CONVERT(VARCHAR,DATEPART(yy,DATEADD(dd, @d, GETDATE())))
SET @quarter_nr = DATEPART(qq,DATEADD(dd, @d, GETDATE()))
SET @month_nr = DATEPART(mm, DATEADD(dd, @d, GETDATE()))
SET @weekday_nr = DATEPART(dw, DATEADD(dd, @d, GETDATE()))
SET @day_nr = DATEPART(dd, DATEADD(dd, @d, GETDATE()))
SET @day = DATENAME(dw,DATEADD(dd, @d, GETDATE()))
SET @month = DATENAME(mm,DATEADD(dd, @d, GETDATE()))
SET @g = @year%19 SET @c = @year/100SET @h = (@c-(@c/4)-((8*@c+13)/25)+(19*@g)+15)%30
SET @i = @h-((@h/28)*(1-(29/(@h+1)))*((21-@g)/11))
SET @j = (@year+(@year/4)+@i+2-@c+(@c/4))%7SET @l = @i-@j
SET @Easter_month = 3+((@l+40)/44)
SET @Easter = @l+28-31*(@Easter_month/4)
SET @Easter_date = CONVERT(CHAR(10),(CONVERT(DATETIME,CAST(@Easter_month AS CHAR)+'/'+CAST(@Easter AS CHAR)+'/'+CAST(DATEPART(yy, DATEADD(dd,@d, GETDATE())) AS CHAR))),110)
IF @weekday_nr = 6 OR @weekday_nr = 7 OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-1, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-49, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-50, GETDATE()),110) OR CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-39, GETDATE()),110) OR (@day_nr= 1 AND @month_nr = 1) OR (@day_nr = 30 AND @month_nr = 4) OR (@day_nr = 25 AND @month_nr = 12) OR (@day_nr = 26 AND @month_nr = 12)
SET @public_holiday = 1 ELSE select @public_holiday = 0
IF @public_holiday =1 or @weekday_nr in(6,7)
SET @x = @x ELSE
SET @x =@x+1
IF bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) >51 and @month_nr = 1
SET @y=100 ELSE SET @y=0 -- if a week belongs party to the next year (2004.53 belongs party to 2005) set indicator on 100 so -- if you subtract 100 of 200553, week 200553 is still week 200453.
IF bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) =1 and @month_nr = 12
SET @z=100 ELSE SET @z=0 -- if a week is in last year (first week of 2004 belongs partly to 2003) set indicator on 100 so -- if you add 100 to 200301 week 200301, week 200301 is still 200401
INSERT INTO bsm.dim_date
SELECT ((DATEPART(yy, DATEADD(dd, @d, GETDATE())) * 100) + DATEPART(mm,DATEADD(dd, @d, GETDATE()))) * 100 + @day_nr AS date_id
, CAST(CONVERT(VARCHAR, @date, 111) AS DATETIME) as date
, @day_nr AS day_nr
, DATEPART(mm, DATEADD(dd, @d, GETDATE())) AS month_nr
, @month AS month
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+ DATEPART(mm, DATEADD(dd, @d, GETDATE())) AS month_key
, DATEPART(qq,DATEADD(dd, @d, GETDATE())) AS quarter_nr
, 'Q'+CONVERT(VARCHAR,DATEPART(qq,DATEADD(dd, @d, GETDATE()))) as quarter
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+ DATEPART(qq,DATEADD(dd, @d, GETDATE())) AS quarter_key
, @year AS year_nr
, bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) AS week_nr
, 'Week '+CONVERT(VARCHAR,bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101))) AS week
, (CAST(DATEPART(yy, DATEADD(dd, @d, GETDATE())) AS VARCHAR)*100)+CAST (bsm.ISO8601weekNr(CONVERT(CHAR, (DATEADD(dd, @d, GETDATE())),101)) AS VARCHAR)-@y+@z AS week_key
, @weekday_nr AS weekday_nr
, @day AS day
, CASE WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d, GETDATE()),110) THEN 'Easter Day'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-1, GETDATE()),110) THEN'Easter Monday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-49, GETDATE()),110) THEN 'Whit Sunday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-50, GETDATE()),110) THEN 'Whit Monday'
WHEN CONVERT(CHAR(10),@Easter_date,110) = CONVERT(CHAR(10),DATEADD(dd, @d-39, GETDATE()),110) THEN 'Ascension day'
WHEN @day_nr= 1 AND @month_nr = 1 THEN 'New Year-s Day' WHEN @day_nr = 30 AND @month_nr = 4 THEN 'Dutch Queen-s Day'
WHEN @day_nr = 25 AND @month_nr = 12 THEN 'Christmas Day' WHEN @day_nr = 26 AND @month_nr = 12 THEN 'Boxing Day' ELSE '-'
END AS public_holiday
, CASE WHEN @weekday_nr in(6,7) THEN 'Y' ELSE 'N' END AS weekend
, CASE WHEN @public_holiday = 1 OR @weekday_nr in(6,7) THEN 'N' ELSE 'Y' END AS working_day
, @x AS nr_working_days
, @d+5476 AS nr_days
SET @d = @d + 1
END
END
Friday, 24 April 2009
Monitor / Audit your Data Warehouse load
One of the most important means of monitoring the run of your Data Warehouse load, is getting insight in the status of running jobs, the duration of the jobs (in history) and which jobs doesn't seem to end. An execution of a simple update script of an admin table before the start of a job and after the end of a job can result in a table like:
Firstly, create an admin table:
select @rows = count(*) from mda.admin_run
RunID | Process | Action | Object | New | Updated | Deleted | Load_start | Load_end |
---|---|---|---|---|---|---|---|---|
1 | EDW - HRM | Publish HRM_cube | HRM_Montly.mdc | 4/15/09 8:09:38 AM | 4/15/09 8:10:02 AM | |||
1 | EDW - HRM | FTP HRM cube | HRM_Montly.mdc | 4/15/09 8:09:22 AM | 4/15/09 8:09:38 AM | |||
1 | EDW - HRM | Create HRM cube | HRM_Montly.mdc | 4/15/09 8:09:21 AM | 4/15/09 8:09:22 AM | |||
1 | EDW - HRM | Load new facts into aggregate table in Business Model | BSM.FCT_Hour_Monthly | 11408 | 0 | 0 | 4/15/09 8:09:21 AM | 4/15/09 8:09:21 AM |
1 | EDW - HRM | Delete existing facts from aggregate table in Business Model | BSM.FCT_Hour_Monthly | 0 | 0 | 344 | 4/15/09 8:09:03 AM | 4/15/09 8:09:21 AM |
1 | EDW - HRM | Load Dimension Supplier into BSM (Refresh) | BSM.DIM_Supplier | 3 | 0 | 0 | 4/15/09 8:09:02 AM | 4/15/09 8:09:03 AM |
1 | EDW - HRM | Load Dimension Organisational Unit into BSM (Refresh) | BSM.DIM_OrganisationalUnit | 23 | 0 | 0 | 4/15/09 8:09:00 AM | 4/15/09 8:09:02 AM |
1 | EDW - HRM | Load Dimension Employee into BSM (Refresh) | BSM.DIM_Employee | 45632 | 0 | 0 | 4/15/09 7:59:20 AM | 4/15/09 8:09:00 AM |
1 | EDW - HRM | Load Date Dimension into Business Model (Refresh) | BSM.DIM_Date | 365 | 0 | 0 | 4/15/09 7:59:18 AM | 4/15/09 7:59:20 AM |
1 | EDW - HRM | Load new and changed hour facts into ODS | ODS.FCT_Hour | 45632 | 0 | 0 | 4/15/09 7:59:16 AM | 4/15/09 7:59:18 AM |
1 | EDW - HRM | Load new and changed targets into ODS | ODS.FCT_Target | 12 | 0 | 0 | 4/15/09 7:59:15 AM | 4/15/09 7:59:16 AM |
1 | EDW - HRM | Load new and changed suppliers into ODS | ODS.DIM_Supplier | 0 | 0 | 0 | 4/15/09 7:59:08 AM | 4/15/09 7:59:15 AM |
1 | EDW - HRM | Load new and changed organisational units into ODS | ODS.DIM_OrganisationalUnit | 0 | 1 | 0 | 4/15/09 7:58:58 AM | 4/15/09 7:59:08 AM |
1 | EDW - HRM | Load new and changed employees into ODS | ODS.DIM_Employee | 1 | 2 | 0 | 4/15/09 7:58:55 AM | 4/15/09 7:58:58 AM |
1 | EDW - HRM | Transformation step: create hierarchy for organisational unit | STG.OrganisationalUnit | 0 | 23 | 0 | 4/15/09 7:58:52 AM | 4/15/09 7:58:53 AM |
1 | EDW - HRM | Transformation step: calculate productivity | STG.Hour | 0 | 45632 | 0 | 4/15/09 7:58:49 AM | 4/15/09 7:58:52 AM |
1 | EDW - HRM | Lookup surrogate key supplier | STG.Supplier | 0 | 3 | 0 | 4/15/09 7:58:23 AM | 4/15/09 7:58:49 AM |
1 | EDW - HRM | Lookup surrogate key organisational unit | STG.OrganisationalUnit | 0 | 23 | 0 | 4/15/09 7:51:56 AM | 4/15/09 7:58:04 AM |
1 | EDW - HRM | Lookup surrogate key employee | STG.employee | 0 | 2456 | 0 | 4/15/09 7:51:28 AM | 4/15/09 7:51:56 AM |
1 | EDW - HRM | Detect corrected hours | STG.Hour | 0 | 35 | 0 | 4/15/09 7:51:24 AM | 4/15/09 7:51:28 AM |
1 | EDW - HRM | Detect changed supplier attributes | STG.Hour | 0 | 0 | 0 | 4/15/09 7:51:22 AM | 4/15/09 7:51:24 AM |
1 | EDW - HRM | Detect changed organisational unit attributes | STG.OrganisationalUnit | 0 | 1 | 0 | 4/15/09 7:51:18 AM | 4/15/09 7:51:22 AM |
1 | EDW - HRM | Detect changed employee attributes | STG.Employee | 2 | 0 | 0 | 4/15/09 7:51:15 AM | 4/15/09 7:51:18 AM |
1 | EDW - HRM | Extract Monthly Targets | STG.Target | 12 | 0 | 0 | 4/15/09 7:50:54 AM | 4/15/09 7:51:15 AM |
1 | EDW - HRM | Extract Fact Hours | STG.Hour | 45632 | 0 | 0 | 4/15/09 7:50:49 AM | 4/15/09 7:50:52 AM |
1 | EDW - HRM | Extract Dimension Supplier | STG.Supplier | 3 | 0 | 0 | 4/15/09 7:50:17 AM | 4/15/09 7:50:49 AM |
1 | EDW - HRM | Extract Dimension Organisational Unit | STG.OrganisationalUnit | 23 | 0 | 0 | 4/15/09 7:49:21 AM | 4/15/09 7:50:17 AM |
1 | EDW - HRM | Extract Dimension Employee | STG.Employee | 2456 | 0 | 0 | 4/15/09 7:49:20 AM | 4/15/09 7:49:21 AM |
Firstly, create an admin table:
-- Create MetaData layer
create schema mda
-- Create Admin table
CREATE TABLE [mda].[admin_run](
[Runid] [int] NULL,
[Process] [varchar](50) ,
[Action] [varchar](50) ,
[Object] [varchar](50) ,
[Inserted] [int] NULL,
[Updated] [int] NULL,
[Deleted] [int] NULL,
[LoadStart] [datetime] NULL,
[LoadEnd] [datetime] NULL
)
Secondly, create stored procedures to update the table:
CREATE PROCEDURE [mda].[PUpdate_Admin_Run_Start]
(
@Runid int
, @Process varchar(255)
, @Action varchar(255)
, @Object varchar(255)
)
AS
BEGIN
insert into mda.admin_run
(
RunID
, Process
, Action
, Object
, Inserted
, Updated
, Deleted
, LoadStart
)
values
(
@RunID
, @Process
, @Action
, @Object
, ''
, ''
, ''
, getdate()
)
END
CREATE PROCEDURE [mda].[PUpdate_Admin_Run_End]
( @RunId int
, @Action varchar(50)
, @Records_New int = 0
, @Records_Updated int = 0
, @Records_Deleted int = 0 )
AS
BEGIN
update mda.admin_run
AS
BEGIN
update mda.admin_run
set LoadEnd = getdate()
,Inserted = @Records_new
,Updated = @Records_updated
,Deleted = @Records_deleted
where RunId = @RunId
and LoadEnd is null
and Action = @Action
END
END
Below you find an example script which updates the admin table:
begin
declare @Records_inserted int
declare @Records_updated int
declare @Records_deleted int
declare @Runid int
declare @Process varchar(50)
declare @Action varchar(50)
declare @Object varchar(50)
declare @rows int
select @rows = count(*) from mda.admin_run
if @rows = 0 set @Runid = 1 else select @runid = max(runid) from mda.admi_run
set @Process = 'EDW - HRM'
set @Action = 'Extract Dimension Employee'
set @Object = 'STG.Employee'
exec mda.PUpdate_Admin_Run_Start @RunID,@Process,@Action,@Object
--load script
select getdate()
set @Records_Inserted = @@rowcount
exec mda.PUpdate_Admin_Run_End @RunID,@Action,@Records_Inserted, @Records_updated, @Records_deleted
end
Tuesday, 7 April 2009
Controlled update in SQL Server
Whenever you experience slow updates or timeouts errors on updating columns you can update it incrementally by using the script below.
The WaitForDelay is added to permit the run of other processes (including a transaction log backup.). The script only works if one of the source columns which are used to update, cannot contain NULL values. If that's not the case you have to add a column (i.e. 'updated'). You should update this with a certain value (i.e. '1'). Use this column to determine @Count ('where updated <> 1') and to filter in the 'where clause' the update query ('and updated <> 1').
SQL 2005 Syntax:
DECLARE @Batch int
DECLARE @Rownum int
DECLARE @PrevRownum int
SET @Batch = 10000
SET @Rownum = 1
WHILE (@Rownum > 0)
BEGIN
UPDATE TOP(@Batch) UPDATE_TABLE
SET
UPDATE_COLUMN1 = S.SOURCE_COLUMN1
, UPDATE_COLUM2 = S.SOURCE_COLUMN2
FROM SOURCE_TABLE c
WHERE UPDATE_TABLE.ID = SOURCE_TABLE.ID
AND UPDATE_COLUMN1 IS NULL
SET @Prevrownum = @Rownum
SELECT @Rownum = COUNT(*) FROM UPDATE_TABLE
WHERE UPDATE_COLUMN1 IS NULL
IF @Prevrownum = @Rownum
set @ROWNUM = 0
WAITFOR DELAY '000:00:05'
END
RETURN
The WaitForDelay is added to permit the run of other processes (including a transaction log backup.). The script only works if one of the source columns which are used to update, cannot contain NULL values. If that's not the case you have to add a column (i.e. 'updated'). You should update this with a certain value (i.e. '1'). Use this column to determine @Count ('where updated <> 1') and to filter in the 'where clause' the update query ('and updated <> 1').
SQL 2005 Syntax:
DECLARE @Batch int
DECLARE @Rownum int
DECLARE @PrevRownum int
SET @Batch = 10000
SET @Rownum = 1
WHILE (@Rownum > 0)
BEGIN
UPDATE TOP(@Batch) UPDATE_TABLE
SET
UPDATE_COLUMN1 = S.SOURCE_COLUMN1
, UPDATE_COLUM2 = S.SOURCE_COLUMN2
FROM SOURCE_TABLE c
WHERE UPDATE_TABLE.ID = SOURCE_TABLE.ID
AND UPDATE_COLUMN1 IS NULL
SET @Prevrownum = @Rownum
SELECT @Rownum = COUNT(*) FROM UPDATE_TABLE
WHERE UPDATE_COLUMN1 IS NULL
IF @Prevrownum = @Rownum
set @ROWNUM = 0
WAITFOR DELAY '000:00:05'
END
RETURN
Subscribe to:
Posts (Atom)