Overview: Discussion of the Data Refresh options available in SharePoint for updating data contained in PowerPivot. If row-level security needs to be applied to the source data being extracted out of a source system into PowerPivot, then Option 1 won’t work.
Level: 101
Where Data Refresh Options Are Located
In the world of Self-Service BI, the producer of the data model (aka power user, author, workbook owner, data modeler) is who typically sets up and schedules the timing of data refreshes based on the business need for updated data. However, that doesn’t mean that the administrator of the system is completely uninvolved!
Data Refreshes are managed within the PowerPivot Gallery (a Silverlight-enabled SharePoint document library). When a PowerPivot model has been uploaded to SharePoint, there are 3 icons visible in the PowerPivot Gallery. The icon on the far right is used to “Manage Data Refresh.” Utilizing SharePoint is the way to automate scheduled data refreshes for PowerPivot models. This does require a special “PowerPivot for SharePoint” environment to be set up – a normal SharePoint document library won’t do the trick.
Data Refresh Options
Following are all of the options available in the “Manage Data Refresh” screen. The middle screen shot – Credentials – is the focus of the following Option 1, 2, 3 sections.
Credentials Used to Refresh Data
For the refresh Credentials, there are 3 options to choose from. Each option influences:
- What account is executing the process on the server?
- What account is being used to query the underlying data source? <—this affects how row level security is handled when the data is extracted
Option 1: Use the data refresh account configured by the administrator
Process executed by:
- The PowerPivot Unattended Account (predefined for the system by the administrator).
Data queried by:
- Retains the credentials in place when the model was uploaded to SharePoint, as follows:
- If Windows Authentication was used to initially populate the PowerPivot model: It will continue to use Windows Authentication, in which case the PowerPivot Unattended Account is considered the “current user.”
- If a specific User ID and Password (SQL Server authentication, for example) was used to initially populate the PowerPivot model: It will continue to use that User ID and Password.
Pros:
- Simple for workbook owner to complete.
- No “special” User ID and Password needs to be shared with the workbook owner.
- Should be few concerns with password changes, terminations & transfers, which simplifies some maintainability.
Cons:
- Not suitable when row level security is implemented on the source system being queried. Since there can be only one PowerPivot Unattended Account per service application, to satisfy all possible data refresh scenarios for all PowerPivot workbooks, the PowerPivot Unattended Account may have broad or unrestricted permissions to the source data. This may be far more liberal security than the original workbook author had & inadvertently result in the data refresh populating the workbook with too much data.
Prerequisites:
- The PowerPivot Unattended Account needs read rights to all possible data sources used in all PowerPivot data refresh operations.
- This user is also required to have read/write (Contribute) permissions to the PowerPivot workbook itself (preferably via permissions at the library level).
Disabling this option:
- Option #1 can be disabled by the system administrator, in which case it would be greyed out for the workbook owner when defining the data refresh schedule. The option to disable it looks like this in Central Administration (PowerPivot Service Application Settings):
Option 2: Connect using the following Windows user credentials
Process executed by:
- The Windows user specified in the User Name field.
Data queried by:
- The Windows user specified in the User Name field. This will override the credentials that were used when the model was uploaded to SharePoint.
Pros:
- If the workbook owner puts in their own credentials here, the likelihood of complications with source systems is minimal - i.e., the workbook owner already has read permissions to the source system.
- Row level security on the source data can be retained (limited to the Windows ID specified – the assumption is made that the workbook is only being shared with users who have the same permissions to see the data).
- The system administrator could provide workbook authors with a set of credentials to use that do not expire & does not have password changes.
Cons:
- It can be challenging to manage terminations, transfers, and password changes when individual user accounts are used.
- Each time the User Password is required to be changed (for example, every 90 days), the workbook owner must go into the Data Refresh Schedule for each PowerPivot model they own and re-key the new password.
Prerequisites:
- The Windows user specified needs read rights to the data source.
- The Windows user specified is required to have read/write (Contribute) permissions to the PowerPivot workbook itself.
Disabling this option:
- Option #2 can be disabled by the system administrator, in which case it would be greyed out for the workbook owner when defining the data refresh schedule. The option to disable it looks like this in Central Administration (PowerPivot Service Application Settings):
Option 3: Connect using the credentials saved in Secure Store Service
Process executed by:
- The user defined by the system administrator that corresponds to the ID provided in the field. This is similar to Option 1 above, except that Option 3 is not limited to just one account per PowerPivot service application.
Data queried by:
- The user defined by the system administrator that corresponds to the ID provided in the field. This will override the credentials that were used when the model was uploaded to SharePoint.
Pros:
- Should be very few or no concerns with password changes, terminations & transfers, which simplifies some maintainability.
- All that is shared with the workbook owner is a Secure Store Service ID (such as 4672) rather than an actual User Name and Password.
- Row level security on the source data can be retained (limited to the ID specified – the assumption is made that the workbook is only being shared with users who have the same permissions to see the data).
Cons:
- Which ID that’s shared with which set of users needs to be managed very carefully (i.e., if different IDs have varying levels of source system permissions).
- The permissions assigned to the Secure Store Service ID may be far more liberal than the original workbook author had & inadvertently result in the data refresh populating the workbook with too much data. This can be mitigated by creating and sharing IDs appropriately.
Prerequisites:
- This user needs read rights to the data source.
- This user is also required to have read/write (Contribute) permissions to the PowerPivot workbook itself.
- The ID used cannot relate to the PowerPivot Unattended Account (i.e., the account used in Option 1).
Managing Data Refresh Failures
The lower section of the PowerPivot Management Dashboard (available only to administrators in SharePoint) displays recent data refresh failures. This data is also available to be queried in the Management Data Sandbox. Possible reasons for failure might be:
- Read permissions to the data source specified do not exist for the user specified in the Data Refresh options
- Contribute permissions to the PowerPivot workbook are not defined for the user specified (because the data refresh operation updates the PowerPivot file)
- Source system is offline at the time the data refresh attempted to execute
- Data source must exist on a server which can be accessed by the data refresh process (i.e., not on a user machine)
- An individual account was used and the employee’s password expired or was changed
- An individual account was used and the individual terminated employment upon which time their source system permissions were revoked
- An individual account was used and the employee transferred to another department upon which time their source system permissions were revoked
- If data refresh would force the workbook to exceed 2GB in size
- If a custom driver or data provider is required to query a particular source system which hasn’t been installed on the PowerPivot server
- If a 32-bit data provider was used when the workbook was set up, but the server has the 64-bit version
- The workbook is checked out or is being actively edited
- The workbook uploaded may be from an older version of PowerPivot than what is running on the server
- Workbook author chose “After business hours” to refresh their workbook, but the system has no valid business hours defined (technically you can set the same start & end time to achieve no business hours, but that causes failures if users select “After business hours” in their refresh schedule – so setting a very small window is better)
- An individual account is from another domain that isn’t trusted
When Data Refreshes are Executed
Although the Data Refresh options are set up by the workbook owner, the time the data refresh is executed is not entirely within their control. The time a data refresh runs depends upon these things:
- Earliest Start Time defined for the individual PowerPivot model. This is set up by the workbook owner after the PowerPivot model is uploaded to SharePoint.
- Business Hours defined for the PowerPivot application. This is set up by the administrator within the PowerPivot options in Central Administration. These hours represent when priority is given to querying versus refreshing data. If you don’t really want to define business hours (perhaps you have an international business), then set this window to be extremely small.
- How often the PowerPivot Refresh Timer Job is scheduled to run. This is managed by the administrator within the Timer Job Definitions in Central Administration.
- How many refreshes are permitted to run concurrently. This is managed by the administrator within the Analysis Services (PowerPivot) Service Settings in Central Administration.
- Sufficient system resources are available to execute the job (for example, if the system is being taxed with a lot of queries, the data refresh process may need to wait).
Termination of a Data Refresh Schedule
By default, SharePoint will stop trying to execute Data Refreshes when one of the two happens:
- Consecutive Failures. Ten data refreshes in a row have failed, or
- Inactive Workbooks. No one has queried the PowerPivot model for ten data refresh cycles
The administrator may change these settings in the PowerPivot options in Central Administration. You may want to consider changing the Inactive Workbooks setting – there are some workbooks that are only used minimally, but it’s still important for them to have current data when they are used.
Finding More Information
TechNet – Different Ways to Update Data in PowerPivot
TechNet – PowerPivot Data Refresh
MSDN – Everything You Always Wanted to Know About PowerPivot Data Refresh but Were Afraid to Ask
Analysis Services & PowerPivot Blog – PowerPivot Data Refresh
Kasper de Jonge’s Blog – PowerPivot Data Refresh, what settings to use and why
TechNet – Troubleshooting PowerPivot Data Refresh