10/09/2015

Customization of triggering process of SSRS subscriptions after the completion of data processing


In our process we have more than 600 SSRS reports, on which we have more than 1000 subscriptions, which are triggered on each week day and more than 1500 subscriptions, which are triggered on weekends only.

Each report contains data for multiple sites wise i.e. on each report we are showing data for multiple sites which are categorized under the MajorSiteGroups as:

MajorSiteGroup
SiteName
Kent
JMC
Kent
JMA
Kent
JNS
Kent
JPR
Kent
JOK
Rutland
JRV
Rutland
JRD
Rutland
OMC
Rutland
MMC
Suffolk
AWA
Suffolk
RMC
Suffolk
AH
Suffolk
ETR
Norfolk
RMI
Norfolk
OLE
Norfolk
MSH

 For each site, we have transactional database server which contains updated data from the web applications and we have designed a separate data ware house, which is specially designed for reporting and analysis purpose which is containing one day back historical data from transactional DB servers for reporting purpose.

For integrating data from transactional DB servers, there are several SSIS ETL packages, which are scheduled to run on the daily basis to update the reporting data based on the transactional data.

All ETL packages are scheduled to run and finish before the 8 AM CST for each site’s data because in SSRS reports, all subscriptions are scheduled to trigger after 10 AM CST so that user can get updated data on the report through subscriptions for those sites which they want so as per our observations all ETL packages get successfully finished before 8 AM CST.

But if any ETL package gets delayed due to server slowness then can be finished before 10 AM CST, because we have two hours as buffer time for ETL package execution.

With above description, everything looks good but problem is with ETL package’s failure or if any ETL package doesn’t finished before 10 AM CST than what happened?

All the subscriptions which are scheduled for 10 AM CST or after 10 AM CST will be triggered without updated data due to ETL package’s failure or delay processing of ETL package.

So user will get incorrect data on the reports.

Solution:

So I had to build a solution to check if an ETL package gets failed for any site or package is still in process for sites data then we have to wait to trigger all those subscriptions which are designed for those sites only until the data get successfully updated by the ETL packages.

So first, we have created a table “SSISProcessingLog” in our database to store the information about processing information of each ETL package for each site as:

ID
Package
Status
Date
StatusComments
SiteName
1
ProductSummary
-1
9/18/14 4:58 PM
AF9C2C34-5AB0-4B24-9856-F11D59E1E149
JMC
2
SalesSummary
4
9/18/14 4:58 PM
Processing Complete
JMA
3
ProductSummary
1
9/18/14 4:59 PM
Start Processing
JNS
4
SalesSummary
1
9/18/14 4:59 PM
Start Processing
JPR
5
ProductSummary
1
9/18/14 4:59 PM
Start Processing
JOK
6
SalesSummary
4
9/18/14 4:59 PM
Processing Complete
JRV
7
SalesSummary
1
9/18/14 4:59 PM
Start Processing
JRD

 Here 1 is for start processing of ETL package and 4 is for complete and -1 is for Error.

So basis on this table we can create a custom solution to trigger all subscriptions to validate processing of each site’s data before sending reports to the end user through a subscription.

As we know that all information related to subscription is stored in the Report manager database in below tables:

Ø  Subscriptions

Ø  ReportSchedule

Ø  Schedule

Ø  Executionlog

Each information regarding subscriptions is stored in Subscriptions table of the report server database as:


In Subscriptions table of report server database, we can get the parameter’s information i.e. for what sites they are scheduled? Which I need for doing cross validation with SSISProcessingLog table to check which site is processed successfully so that we can trigger subscription.

Below are useful columns of Subscriptions table:

Ø  SubscriptionID: Unique GUID generated by the system for each subscription.

Ø  Report_OID:  This is report id, which is related to Catalog table for getting report name and other information on which subscription is created.

Ø  ExtensionSettings: Provides email list and rendering format information.

Ø  Description: Description of subscription provided at the creation of subscription.

Ø  Last Status: Last status of the subscription.

Ø  LastRunTime: Last execution date and time of subscription.

Ø  Parameters: This Column contains parameter information of the subscription.

Ø  DataSettings: This column contains SQL query details for data driven subscription.

Ø  DeliveryExtension: Delivery type of report i.e. email or shared file.

To schedule subscription, there are two ways we have in SSRS, first one is to provide a static date time for each subscription manually at the time of its design or use a Share Schedule.

So with Shared Schedule, we can easily identify that on which day it should be trigger , for example I have create a shared schedule “WeeKDaySchedule” where I have scheduled it to trigger on each week day only from Monday to Friday and for Weekends I have created a share schedule “WeekendsSchedule” to trigger on Saturday and Sunday only.

After creating these two shared schedule, I have updated all subscriptions to use these shared schedules for their trigger time and after this I have designed a query where I am fetching the subscription information which need to trigger to get parameter, last run time and schedule information to validate with the processing of site’ data to trigger through the SQL code after the complete processing of data or it will be easy to track.

As we know parameters information is stored in the form of XML tags so first I converted this column into XML and then I parsed the XML code to parse out the site name from the parameter column of Subscriptions table where Schedule name are in “WeeKDaySchedule” or “WeekendsSchedule”.

Below is the query which parses the parameter information for each subscription after converting parameter column into XML to get the site information for each subscription with other infromation:

      Select SubscriptionID,LastRunTime

      ,svc.value('(Name/text())[1]', 'varchar(100)') as Parameter_Name

      ,svc.value('(Value/text())[1]', 'varchar(100)') as Parameter_Value

      ,ScheduleName

      into #Sub_Temp

      from (

            Select Sub.SubscriptionID ,Sub.LastRunTime, Convert(xml,Sub.Parameters) as Parameter,Sch.Name as ScheduleName

            from ReportServer..Subscriptions Sub WITH(NOLOCK)

            Left outer join ReportServer..ReportSchedule RSch WITH(NOLOCK) on Sub.SubscriptionID=RSch.SubscriptionID

            Left outer join ReportServer..Schedule Sch WITH(NOLOCK) on RSch.ScheduleID=Sch.ScheduleID

            Where Sch.Name IN ('WeeKDaySchedule','WeekendsSchedule')

      ------------------------------------------------------------------------

      ) Subs_Temp

      Cross apply Parameter.nodes('/ParameterValues/ParameterValue') as Txn(svc)

      where svc.value('(Name/text())[1]', 'varchar(100)')='SiteName'

      and svc.value('(Value/text())[1]', 'varchar(100)') is not null

After that I came up with an issue that there are two types of subscription

Ø  Data driven Subscription and standard subscription.

Both has their own ways to trigger and definition because in subscriptions table parameters column contains only those values of parameter which are hardcoded mentioned and if parameter value will be passed through a query at run time then we need to parse that query to get site name which will be passed in the parameter at run time from the DataSettings column so below code is fetching information about the data driven subscription only:

      IF OBJECT_ID('tempdb.[dbo].[#DSub_Temp]') IS NOT NULL DROP TABLE [dbo].[#DSub_Temp]

      IF OBJECT_ID('tempdb.[dbo].[#t1]') IS NOT NULL DROP TABLE [dbo].[#t1]

 

      Select row_number() OVER (ORDER BY SubscriptionID) AS RowNumber,

      ' Select '''+ Convert(varchar(100),SubscriptionID) +''' as SubscriptionID , '+

            'FacilityCode '+Substring (a.b.value('Query[1]/CommandText[1]','varchar(max)'),

            Charindex('From',a.b.value('Query[1]/CommandText[1]','varchar(max)')),1000) as Query

      into #DSub_Temp

       from (

       -------------------------------------------------------------------

            Select SubscriptionID

            ,svc.value('(Name/text())[1]', 'varchar(100)') as Parameter_Name

            ,svc.value('(Value/text())[1]', 'varchar(100)') as Parameter_Value

            ,ScheduleName,Parameter,DataSettings

            from (

                  Select Sub.SubscriptionID,Convert(xml,Sub.Parameters) as Parameter,Convert(xml,Sub.DataSettings) as DataSettings,Sch.Name as ScheduleName 

                  from ReportServer..Subscriptions Sub WITH(NOLOCK)

                  Left outer join ReportServer..ReportSchedule RSch WITH(NOLOCK) on Sub.SubscriptionID=RSch.SubscriptionID

                  Left outer join ReportServer..Schedule Sch WITH(NOLOCK) on RSch.ScheduleID=Sch.ScheduleID

                        Where Sch.Name IN ('WeeKDaySchedule','WeekendsSchedule')   

                  )

            ) Subs_Temp 

      Cross apply Parameter.nodes('/ParameterValues/ParameterValue') as Txn(svc)

      where  svc.value('(Name/text())[1]', 'varchar(100)')='FacilityCode'

      and svc.value('(Value/text())[1]', 'varchar(100)') is  null

      ----------------------------------------------------------------------------------

      ) Subs_temp1

      Cross apply Datasettings.nodes('DataSet') a(b)

After that I have stored both queries result into a single temporary table after the removing unnecessary data which is generated by the other parameter information and etc. by below code:

----Query to get unique subscriptionID, with sitename for data driven subscription

Create table #t1 (SubscriptionID varchar(100),SiteName varchar(4))

      Declare @Query nvarchar(Max)  Declare @i int

      Set @i=1

      While(@i<=(Select MAX(Rownumber) from #DSub_Temp))

      Begin

      Select @Query =Query from #DSub_Temp where RowNumber=@i

 

      INSERT INTO #t1(SubscriptionID,SiteName)

      EXEC sp_executesql @Query

 

      Set @i=@i+1

      END

-----Insert Data Driven Subscription information into #Sub_Temp table for combining standard subcription and data driven subscription both

      Insert into #Sub_Temp (SubscriptionID,Parameter_Value,Parameter_Name)

      Select * ,'FacilityCode' from #T1

After combining both type subscriptions information I am validating each site with SSIS log table to get only those subscription detials which are having updated data in the reporting database for all those sites, for which they have designed to provide data.

Select A.SubscriptionID into  #CompleteSubscripiton  from(

            Select SubscriptionID,COUNT(Parameter_Name)as TotalRows

                  from #Sub_Temp group by SubscriptionID

            )A

            Join (

                  Select SubscriptionID,Count(Parameter_Name)as TotoalRows from (

                        Select * from #Sub_Temp S1

                              left outer join MyDB..SSISProcessingLog S2 WITH(NOLOCK)

                                    on S1.Parameter_Value=S2.SiteName

                        where Status=4 --To cehck package has finished the data processing

                        And S2.Package is not null

                        AND S2.Date >=convert(varchar,getdate(),101)

                  ) as B group by SubscriptionID

            )C

      on A.SubscriptionID=C.SubscriptionID

            and A.TotalRows=C.TotoalRows

Now, with the help of above code, we will having all those subscripions which need to trigger today and their data processing is complete but before trigger these subscriptions we need to check if any of them has already triggered or not so for this I have looked into the ExecutionLog table of report server to make sure that how many subscriptions, which are not yet triggered and ready to trigger after complete processing of their site’s data by below code:

---** Query to check how many Subscriptions, which are already Triggered or from the complet listnot **---

      Select * into #FinalSubscriptions

            from #CompleteSubscripiton where SubscriptionID not in (

                  Select Distinct SubscriptionID  from (

                        Select EL.ReportID,RequestType,EL.TimeStart,El.Timeend,Sub.Lastruntime,Sub.SubscriptionID,Sub.Report_OID

                              from Reportserver..Executionlog EL WITH(NOLOCK)

                              inner join ReportServer..Subscriptions Sub WITH(NOLOCK)

                                    on EL.ReportID=Sub.Report_OID

                              where Sub.SubscriptionID in (Select * from #CompleteSubscripiton)

                              AND EL.RequestType=1

                              AND Convert(Varchar,EL.timestart,101)  = Convert(Varchar,GETDATE(),101)

                              AND convert(varchar,Sub.LastRunTime,101)=Convert(Varchar,GETDATE(),101)   -----New Filter

                               -----  Retrigger failed subscriptions

                              and sub.SubscriptionID not in (

                                                                        select SubscriptionID from reportserver..subscriptions Sub WITH(NOLOCK)where subscriptionid in (Select * from #CompleteSubscripiton)

                                    and

                                          (( sub.laststatus LIKE '%Error%'

                                           AND sub.laststatus NOT LIKE '%; 0 Errors%' )

                                            OR sub.laststatus LIKE '%Failure%' )

                              )

                               ---

                  ) as A

      )

 

Here I am also looking those subscriptions which are having ERROR in their last execution status due to server’s reasons.

After this validation I will be having a final list of all those subscriptions which can be triggered now:

As I already mentioned above that both subscriptions are triggered in their own different way, data driven subscription is triggered through EVENT table of report server, whereas standard subscriptions are triggered through Notifications table of report server database so I have write two different insert statement for both type of subscription and the movement, insert query will be inserted the data into these tables with subscription details, a system trigger will be invoked to trigger all subscriptions:

Ø  Query to insert Data Driven subscription details into the Event table of report server:

      Insert into Reportserver..event(EventID,EventType,EventData,TimeEntered)

            Select NEWID() as EventID,

                  'TimedSubscription' as EventType,SubscriptionID ,GETDATE() as TimeEntered

                   from ReportServer..Subscriptions WITH(NOLOCK)

                        where SubscriptionID in

                                    (Select * from #FinalSubscriptions where SubscriptionID not in (Select Distinct SubscriptionID from #t1))

                                    and DataSettings is not null

Ø  Query to insert standard subscriptions details into the Notification table of report server:      Insert into ReportServer..Notifications

                  (

                        NotificationID,SubscriptionID,ReportID,ExtensionSettings,Locale,NotificationEntered,

                        SubscriptionLastRunTime,DeliveryExtension,SubscriptionOwnerID,IsDataDriven,Version,ReportZone, Parameters

                  )

                        Select NEWID()as NotificationID,SubscriptionID,Report_OID,ExtensionSettings,Locale,getdate()

                        ,Isnull(LastRunTime,Dateadd(dd,-1,GETDATE())) ,DeliveryExtension ,OwnerID

                        ,Case when DataSettings is null then 0 else 1 end as IsDatadriven ---for DataDriven

                        ,Version,ReportZone  , Parameters

                        from ReportServer..Subscriptions WITH(NOLOCK)

                        where SubscriptionID in (Select * from #FinalSubscriptions where SubscriptionID not in (Select Distinct SubscriptionID from #t1) )

                                    and DataSettings is null

After the putting all subscription infromation into the Notification and Event tables of report server we need to update LastRunTime columns of Subscriptions table because we are manually executing each subscription so SQL engine will not update the last run time of subscriptions

Ø  Query to update lastruntime of Standard Subscription which have been recently triggered:

 

      Update ReportServer..Subscriptions set LastRunTime=GETDATE()

     

where SubscriptionID in (Select * from #FinalSubscriptions where SubscriptionID not in (Select Distinct SubscriptionID from #t1))

            and DataSettings is  null

All these codes I have put in a SQL Job to run it on the interval of 20 minutes so that after 20 minutes, this code will be fetching the subscription detials which are not yet triggered and validating for their data processing with SSIS log table to trigger them through the SQL code.

 Below is ER diagram for whole process:

No comments:

Post a Comment

Tableau interview questions and answers for experienced professionals

  Tableau Interview Questions and Answers for experienced professional 1. What is TABLEAU? Tableau  is the powerful and fastest visualizing ...