Some Sitecore configuration concepts require up-front planning, especially in larger environments.
Below are parts of the planning document we created for a recent client. The site being considered here is characterized by the following: staged environments (development, QA, production), multiple content delivery servers in QA and production, a large content authoring team, all running Sitecore 6.1.
An early version of this plan was discussed with a contact in Sitecore USA. His feedback has been applied to this document. Sitecore was very helpful, even providing code samples for some of our needs.
1. Preview site:
a. Authors must be able to publish content to a preview site prior to publishing to the live production site. Content must be viewable at this preview site for review without any required access to the Sitecore admin.
b. Authors must be able to revert rejected changes when viewing them in preview.
c. Authors must be able to approve changes in preview for publication to the public production site.
2. Publish on demand:
a. A group of authors must be able to publish the site on demand. This is not required for all publishes. For content and media, this should only be used for content emergencies.
b. Most publishes of content and media will be performed on a scheduled basis. The business has confirmed that they would like content to be published daily at midnight.
3. Publish scheduling:
a. Authors must be able to schedule content for release at a future date and time.
4. Workflow:
a. A workflow process must exist that allows some users to create and edit content and other users to approve/reject those content edits.
5. Cache clearing:
a. Content publishes must clear the related portions of the delivery server Sitecore caches.
b. The ability must exist to clear the Sitecore caches manually for the content delivery servers.
Challenges related to these requirements:
1. Preview site:
a. Challenge: Content in the preview site could fall out of sync with the production site. For example, content could be published to production, but not published to preview. This would make the preview experience inaccurate.
Response: Anything that is published to the production site should also be published to preview. Items may be published to preview without being published to production. However, these items must later either be published to production or be reverted to match the older content that is in production.
b. Challenge: Authors may publish content to preview and then later forget to publish it to production once approved.
Response: A tool should be created to assist authors in knowing what content is in the preview state and pending publication to production.
c. Challenge: Authors could publish content to preview and then have trouble reverting that content should it be rejected in preview.
Response: Authors should have a way to revert preview changes gracefully to have them match the version that is in production.
2. Performance:
a. Challenge: Frequent author publishes could result in frequent cache clear operations, which will diminish performance.
Response: Author publishing frequency should be minimized. At the same time, the business must be able to publish content in an emergency, and often enough to satisfy its desire to keep content fresh.
b. Challenge: Authors could create items with more than 100 children (direct children, not all descendants). This will likely be more common in the media folder, or for data items. Sitecore recommends that no item have more than 100 children, as this affects performance in both admin and delivery.
Response: Authors should avoid adding more than 100 direct children to an item. Ideally, tools should be created to identify items with too many children.
c. Challenge: Sitecore media has been observed to perform poorly under load. The more media items a page references, the worse that page performs and the greater the load on the servers.
Response: Non-author-able media that can be stored outside of Sitecore should be. Also, Sitecore media should be cached in some way (at the load balancer, via a CDN, or via output caching).
Note: Sitecore media output caching requires an upgrade to Sitecore version 6.2 or the application of a Sitecore hotfix previously received from Sitecore support.
3. Workflow:
a. Challenge: Workflow adds steps and burden for content authors.
Response: The workflow should be streamlined to balance the pain of extra steps with the added benefits of closer content management.
b. Challenge: Workflow creates item versions with each content edit. This can quickly grow the number of versions for frequently updated items. Sitecore currently recommends having fewer than 15 versions per content item for admin and publish performance.
Response: Authors should learn when and how to manage content versions. Ideally, tools should be created to help identify items with too many versions.
4. Cache related errors:
a. Challenge: Errors can occur when some items are cached but other items are refreshed. For example, pages can look for data items that Sitecore does not see because the Page cache is cleared, but the data cache is not.
Response: The cache clearing methodology should minimize the possibility of data mismatches. This could be done through full cache clears, more frequent cache clears, or more intelligent partial cache clearing.
Also, the caches need to be large enough to minimize the possibility of cache holes (items not in cache due to cache size). Items in cache holes may refer to items in cache, creating a mismatch.
5. Staging:
a. Challenge: Sitecore staging operations often involve the author server sending service calls to the delivery servers. This could result in errors if delivery servers are down. Also, delivery servers may not be included if they are added to the environment but not configured in the staging service.
Response: The staging service needs to gracefully handle errors due to servers being unavailable. Also, the staging administrator needs to be aware of any server changes (removals and additions) and they should keep the staging configuration true to the server configuration.
Recommendations to achieve these requirements:
1. Publish Targets to Achieve a Preview Site:
a. Sitecore will be configured to use two publish targets (two databases for “web” content).
b. The preview site will use one target database. The public production sites will use the other target database.
c. The preview site will be served from the author web server, and will be viewed without logging in to Sitecore.
d. An additional URL will be created on the author web server for viewing public production content. When this URL is used, the content will be served from the public production target database. This will be useful for troubleshooting cache issues on the public production servers.
2. Publish Restrictions to Safeguard Performance:
a. The Sitecore Client Publishing role will be applied to very few users.
The intent is to limit the number of publishes that occur outside of the scheduled publish process, but to still allow for an emergency publish if required.
b. Users outside of this role will not have the ability to publish content manually.
3. Scheduled Publishing to Handle Common Publishing and Enable Scheduled Content Release:
a. Authors will be able to schedule content publishing using Sitecore publishing restrictions for a content item. This will allow them to set start and end dates during which the content is publishable.
b. The automated publish action will publish content to both publish targets. Content that should not be published to the public production target will be restricted from publication by workflow.
c. The site will use a customized version of the standard publishing agent to regularly trigger publishes. This will cause scheduled items to publish, or un-publish, according to their publishing restrictions. The standard publishing agent will be customized to publish all publishable content in the content and media nodes only. It will not publish content in the layouts, system, or templates nodes.
Scheduled publishing requires a deep (full) publish, as opposed to an incremental publish. Because of this, we are restricting the publisher range as much as possible to make the scheduled publish footprint as small as possible.
d. The publishing agent will be set to run daily at midnight per approved business requirements.
e. The scheduled timing will use the Sitecore scheduler that calls a service every 5 minutes. That service will then call the publishing agent if the server time is sufficiently close to the scheduled times above.
The scheduled times for the publish process will be manageable in the web.config.
Note: We considered creating an external console application that would ping the website to trigger the publish agent. This console application could be called through a Windows scheduled task. This option was rejected due to the desire to avoid external dependencies. Also, one of the typical benefits of this approach is that it removes the dependency on the author site worker process being alive. However, that benefit is not required in this case because the author site should be kept alive by the load balancer.
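The timing check described in item e can be sketched as follows. This is only an illustration of the logic in Python, not Sitecore code (Sitecore agents are .NET classes). The names SCHEDULED_TIMES, POLL_INTERVAL, and is_publish_due are hypothetical, as is the idea of tracking the last run time to avoid publishing twice for the same scheduled time.

```python
from datetime import datetime, time, timedelta

# Hypothetical settings; in the plan these would be managed in the web.config.
SCHEDULED_TIMES = [time(0, 0)]          # daily at midnight
POLL_INTERVAL = timedelta(minutes=5)    # how often the scheduler calls the service


def is_publish_due(now, last_run, scheduled_times=SCHEDULED_TIMES, window=POLL_INTERVAL):
    """Return True if a scheduled time fell within the last `window` and has not run yet."""
    for scheduled in scheduled_times:
        # Most recent occurrence of this time of day at or before `now`.
        occurrence = datetime.combine(now.date(), scheduled)
        if occurrence > now:
            occurrence -= timedelta(days=1)
        recently_due = now - occurrence < window
        already_ran = last_run is not None and last_run >= occurrence
        if recently_due and not already_ran:
            return True
    return False
```

Using a window equal to the polling interval means each scheduled time is seen by exactly one poll under normal conditions; the last-run check is a guard in case polls arrive closer together than expected.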
4. Workflow to Assist with Content State Management:
a. A workflow will be created and applied to all content items for the site (page and data items under the content node in Sitecore).
b. Workflow will not be applied to media, layouts, sub-layouts, renderings, templates or other items outside the content node.
c. The workflow will include the following stages:
i. Draft – with the option to Submit to the next stage.
ii. Internal Review – with the options to Approve (send to next stage) or Reject (send back to Draft).
iii. Preview – with the options to Approve (send to next stage) or Reject (send back to Draft)
d. The Preview state will include an auto-publish action to the Preview site target. This will require us to override workflow restrictions against publishing. Sitecore has provided us with some code that should be helpful to do this.
Any rejection of this published preview content will require an un-publication of that item version. To facilitate this, a publish action will be added to the Reject action on the Preview state that will republish the previous version of the given item, or un-publish the item if it is in its first version.
e. The Preview Approval action will not include an auto-publish action to the production site target, because auto-publishing on each approval could lead to performance issues from cache clears. Instead, this content will be published by the scheduled agent. In an emergency, the content could also be manually published by users with publish rights.
f. Author users that do not have approval/rejection rights will be notified when content they have sent for approval has been approved/rejected. The notification will include what content was updated and to what state. Authors with approval/rejection rights will not receive these notifications, as they may have approved/rejected their own content.
This is being considered a nice-to-have and may or may not be added prior to site launch.
g. Users with approval/rejection rights will be notified every 3 hours when content exists that is pending approval/rejection. The notifications for Internal Review and Preview will be easily distinguishable. The notifications will include the number of items pending review and when the last item was added.
This is being considered a nice-to-have and may or may not be added prior to site launch.
h. Workflow will be used to help manage the content preview by creating the ability to:
i. Approve and Reject changes
ii. See what content is in Preview, but not yet in production, via the Sitecore workbox
i. Non-workflow content (media, templates, layouts…) will always be published to both publishing targets. This limits the ability of the preview server, but it makes content states more explicit. We will update the web.config settings for the publish dialog so that the publish dialog defaults to publish to both publishing targets.
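The workflow stages in item c, together with the publish actions in items d and e, amount to a small state machine. The sketch below illustrates it in Python rather than as a Sitecore workflow definition. Several details are assumptions: the final state name "Published" (the plan does not name it), the action tuples, and the idea that the "previous version" is simply the current version number minus one.

```python
# (state, command) -> next state. "Published" is a hypothetical name for the
# state reached after Preview approval; the plan does not name it.
TRANSITIONS = {
    ("Draft", "Submit"): "Internal Review",
    ("Internal Review", "Approve"): "Preview",
    ("Internal Review", "Reject"): "Draft",
    ("Preview", "Approve"): "Published",
    ("Preview", "Reject"): "Draft",
}


def apply_command(state, command, version):
    """Return (next_state, actions), where actions lists the publishes to run."""
    next_state = TRANSITIONS.get((state, command))
    if next_state is None:
        raise ValueError(f"{command!r} is not allowed in state {state!r}")

    actions = []
    if next_state == "Preview":
        # Entering Preview auto-publishes this version to the preview target.
        actions.append(("publish", "preview", version))
    elif state == "Preview" and command == "Reject":
        if version > 1:
            # Assumption: the previous version is version - 1.
            actions.append(("publish", "preview", version - 1))
        else:
            # First version: nothing to fall back to, so un-publish.
            actions.append(("unpublish", "preview", version))
    # Approving from Preview triggers no publish: the scheduled agent handles it.
    return next_state, actions
```

For example, rejecting version 3 of an item in Preview yields the Draft state plus an action to republish version 2 to the preview target.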
5. Staging Module for Delivery Server Cache Management:
a. The Sitecore Staging Module will be installed and configured to handle cache clearing on delivery servers.
Note: No Sitecore managed media is currently being stored in the file system. So, there is no need to stage media items via FTP or SOAP to delivery servers.
We also considered using the Shared Source Sitecore Stager Module. In our discussions with Sitecore, they indicated that the Stager Module is very good and may use methodologies similar to what Sitecore will use in future versions. For example, the Stager Module may have the delivery servers call back to the authoring server to see if changes are required.
However, we lacked project time to adequately explore the two modules, through testing and code review. Also, we did have a potential security concern with the idea of the delivery servers contacting the author server, if the Stager Module does that. This may not work in future firewall scenarios. So, we defaulted to using the Sitecore supported module.
b. The Staging Module will be configured to perform full cache clears on the delivery servers.
Note: Partial cache clears were considered, as they are better for performance. However, given the likely infrequency of publishing (mostly, daily at midnight), the performance impact of full cache clears will be minimal and is outweighed by the reduced risk of the cache mismatches that could occur with partial cache clears. It has recently been found that multiple queued partial cache clears may lead to cache mismatches when a partial clear is triggered while others are still pending.
If the solution is ever modified to use partial cache clears, the Staging Module would be overridden with a process that first counts the staging operations. Should the number of staging operations be more than 5, all scheduled partial cache clears would be deleted and replaced with a single full cache clear (using the cache clear web service). This process would run every 30 seconds. The number of operations required before the switch is made could be managed in the web.config.
A better solution would be if we could combine the separate partial cache clears into a single, larger, partial cache clear. The feasibility of that is currently unknown.
Regarding the 30 second timing for the staging module and partial cache clears, this timing should be explored further. Having the time too long could result in cache mismatches, as one cache is cleared and another is still waiting. Having the timing too short could result in the rapid fire of staging cache clears to the delivery servers. These repeated cache clears could stress the delivery server.
Whatever approach is taken, limiting the number of publish operations is definitely ideal.
c. The Staging Module will be overridden to delete redundant cache clears when more than one is queued. When using full cache clears, there is no advantage to clearing the cache multiple times when one cache clear will suffice.
d. A further scheduled process will be created to delete failed staging operations. These are operations that have likely failed due to the inability to find servers. This would be required if the staging solution maintained a list of errors indefinitely.
This process would run every 10 minutes and delete staging operations that are older than one hour. These times can be managed in the web.config. The key thing is that the deletion agent doesn’t delete valid, pending staging operations.
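The queue handling described in items b through d can be sketched as two small functions. This is a Python illustration of the logic only; the StagingOperation type and both function names are hypothetical and do not reflect the Staging Module's actual data model. It also assumes that one full cache clear makes any other pending clear redundant.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

FULL, PARTIAL = "full", "partial"


@dataclass(frozen=True)
class StagingOperation:
    kind: str            # FULL or PARTIAL
    created: datetime    # when the operation was queued


def collapse_queue(queue, now, max_partial=5):
    """Reduce a pending queue to the smallest set of cache clears that covers it."""
    if not queue:
        return []
    has_full = any(op.kind == FULL for op in queue)
    # One full clear covers everything; too many partial clears are also
    # replaced by a single full clear (threshold of 5 per the plan).
    if has_full or len(queue) > max_partial:
        return [StagingOperation(FULL, now)]
    return list(queue)


def purge_stale(queue, now, max_age=timedelta(hours=1)):
    """Drop operations older than max_age, which have likely failed."""
    return [op for op in queue if now - op.created <= max_age]
```

Keeping the one-hour age limit well above the 30-second processing interval is what protects valid, pending operations from the purge.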
6. Manual Cache Clearing:
a. The Staging Module will be used to trigger full cache clears as needed. As it will be used in full cache clear mode, any publication of content will trigger a full cache clear. This creates an easy mechanism for a manual cache clearing on the delivery servers, if desired.
b. If the Staging Module is ever changed to use partial cache clears, it may still be used to apply full cache clears on an on demand basis. This would be done by creating an additional publishing target that is configured in full cache clear mode. That publishing target would not be available for all users, but only for users that should have the ability to trigger the full cache clear. Then, any item published to this target would trigger a full cache clear.
c. As a future nice-to-have, an additional cache clearing web service could be set up on all servers.
i. This utility would be made available in the Sitecore admin to clear the caches of the servers within that environment.
ii. The list of servers within each environment would be configurable in the Sitecore admin as Sitecore items (similar to how the staging module manages servers).
iii. The cache clearing service would also be configured to clear page output caches and as many application level caches as can be conveniently added.
This is the main reason for having an additional cache clearing utility to supplement the Staging Module solution.
iv. The cache clearing admin utility would include the ability to select which servers are affected and which caches are cleared on those servers.
v. Should a server be un-reachable by the cache clearing service call, the error would be reported, but handled gracefully. If multiple servers were requested to have their caches cleared, all available servers would be cleared. It is likely that if the cache clearing service is not available, the server is down, and the cache does not need to be cleared (as it doesn’t exist).
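The error handling in item v can be sketched as a loop that records failures instead of stopping at the first unreachable server. This is a Python illustration only; clear_caches is a hypothetical name, and clear_fn stands in for whatever web service call would clear a single server's caches.

```python
def clear_caches(servers, clear_fn):
    """Call clear_fn(server) for every server; collect failures instead of stopping."""
    cleared, failed = [], {}
    for server in servers:
        try:
            clear_fn(server)
            cleared.append(server)
        except Exception as exc:
            # An unreachable server is reported but must not block the rest.
            failed[server] = str(exc)
    return cleared, failed
```

The admin utility could then display the failed entries to the user while confirming that every reachable server was cleared.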
7. Presentation Details Preview to Enable a Fuller Preview Experience:
a. Sitecore support has confirmed that presentation details can be versioned without major issues. This allows for the previewing of changes to presentation details.
Item presentation details are not versioned by default. This makes it difficult to make and preview presentation detail changes. This is especially true when the publishing agent is in place, and presentation detail changes may be published automatically (and without the author realizing it).
This level of preview may be required in the future. We are not recommending it for launch. If it is ever used, we recommend that it be tested thoroughly to explore any possible side effects.
b. Alternatively, we may consider a more advanced solution:
i. An additional Sitecore device could be created to allow for previewing presentation detail changes.
ii. This preview device would be the default device used for the preview site. The public production site would continue to use the existing default device.
iii. Sitecore item commands would be created that quickly enable the following:
1. Copy device settings from the Default Device to the Preview Device.
2. Copy device settings from the Preview Device to the Default Device.
iv. A publishing pipeline process would be created. On publish to the public production target, if the Preview Device settings differ from the Default device settings for the Item, or its template standard values, the Preview device settings will be copied to Default device settings.
This process may be a nice-to-have for the future. It is not recommended for launch.
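The publish-time check in item iv can be sketched as follows. This is a Python illustration of the comparison only. It models an item's presentation details as a dictionary keyed by device name, which is an assumption made for the example; the keys "preview" and "default", the target name "production", and the function name are all hypothetical. The template standard values fallback mentioned above is not modelled.

```python
import copy


def sync_preview_to_default(layout, target):
    """On a production publish, copy Preview Device settings over Default Device settings."""
    if target != "production" or "preview" not in layout:
        return layout
    if layout["preview"] == layout.get("default"):
        return layout  # already in sync, nothing to copy
    synced = dict(layout)
    # Deep copy so later edits to the preview settings do not leak into default.
    synced["default"] = copy.deepcopy(layout["preview"])
    return synced
```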
8. Performance Guideline Monitoring:
a. A Sitecore Notification role will be created. This role will receive notifications from the system when appropriate.
b. A scheduled task will be run nightly to ensure that Sitecore performance guidelines are properly observed.
i. Number of Versions:
1. Sitecore admin performance guidelines state that an item should have no more than 15 versions. Workflow creates versions of items when edits are made. Without proper management, the number of versions grows quickly.
2. Should items with more than 15 versions exist, a notification will go out to the Notification Role. The notification will state which items have too many versions and the number of versions each has.
ii. Number of children:
1. Sitecore performance guidelines state that no item should have more than 100 children.
2. Should items with more than 100 children exist, a notification will go out to the Notification Role. The notification will state which items have too many children and the number of children each has.
These are not recommended for launch, but they would be nice-to-haves for the future.
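The nightly guideline check can be sketched as a single walk over the content tree. This is a Python illustration only; the Item type here is a hypothetical stand-in for a Sitecore item and does not reflect the Sitecore API. The thresholds are the guideline values quoted above.

```python
from dataclasses import dataclass, field

MAX_VERSIONS = 15    # guideline: no more than 15 versions per item
MAX_CHILDREN = 100   # guideline: no more than 100 direct children per item


@dataclass
class Item:
    path: str
    version_count: int = 1
    children: list = field(default_factory=list)


def find_violations(root):
    """Walk the tree and return two dicts mapping item path to the offending count."""
    too_many_versions, too_many_children = {}, {}
    stack = [root]
    while stack:
        item = stack.pop()
        if item.version_count > MAX_VERSIONS:
            too_many_versions[item.path] = item.version_count
        if len(item.children) > MAX_CHILDREN:
            too_many_children[item.path] = len(item.children)
        stack.extend(item.children)
    return too_many_versions, too_many_children
```

The two dictionaries carry exactly what the notifications need: which items exceed a guideline and by how much.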
Next steps related to these recommendations:
1. Preview site configuration:
a. Set up additional web databases.
b. Set up publish targets in each environment.
c. Configure the workflow (see below) as recommended for preview approval/rejection.
2. Publishing configuration:
a. Configure Sitecore security roles appropriately to minimize publishing rights.
b. Modify the web.config settings to set the default publish targets in the publish dialog to both publish targets. Sitecore has told us where to find this setting.
c. Create a custom publish agent that can be called to publish content and media items only. This can be based on the Sitecore publishing agent (but it will crawl less of the tree).
d. Create the timing agent that will allow us to call our custom publish agent at specified times.
3. Workflow:
a. Create the workflow, assign it to templates, and apply security roles to the workflow.
b. Create a publish action that overrides the workflow restrictions when sending content into Preview. Sitecore has given us code to help with this.
c. Create a related publish action that overrides the workflow restrictions when content is rejected in preview.
4. Staging Module:
a. Install and configure the Staging Module in each environment.
b. Override the module to delete redundant cache clear operations when several are queued.
c. Override the module to delete failed cache clear operations.
5. Cache clearing:
a. No additional steps are required to configure full cache clearing with the planned staging configuration.
6. Future nice-to-haves:
a. Workflow Notifications:
i. Create a notification for authors to tell them when their changes have been approved/rejected.
ii. Create a notification for approvers to know when changes are pending approval.
b. Performance Monitoring:
i. Create a user notification security role.
ii. Create an API utility that can be called to send notifications to that user role.
iii. Create a process to crawl the tree looking for too many item versions and reporting to the notification role.
iv. Create a process to crawl the tree looking for too many children and reporting to the notification role.
c. Cache Clearing Upgrade:
i. Create the service that will allow us to clear the cache remotely.
ii. Hook in other, non-Sitecore caches (page output cache, application caches…).
iii. Create the admin interface that will allow admin users to trigger the cache clear and include any options for that cache clearing (which servers, which caches…).
d. Presentation Details Preview:
i. Consider implementing a presentation detail preview device, and set up the associated utilities to assist with managing the presentation list.