Change management best practices for a Splunk deployment
Change management (CM) is the end-to-end process that governs the life cycle of a change, from the initial request for change to the final deployment and communication of the change. As organizations consider which changes in Splunk warrant CM oversight, CM guidelines help them define the process steps to include, the activities to complete, and the participants in the CM process and their responsibilities. Because every organization is unique, CM implementations vary. CM guidelines give organizations a best practices framework and tools to help them define and develop the CM processes that fit their needs and requirements. Using CM to govern changes in your Splunk environment has many benefits, such as the following:
- Establishes a documented, defined process to authorize and deploy changes in your Splunk environment
- Provides awareness and a common and visible method to address changes
- Significantly improves the ability to respond to issues by providing a record of change to support platform health and troubleshooting
- Executive sponsor
- Program manager
- Project manager
Differentiate change management from change control
CM encompasses the entire change life cycle, whereas change control is the execution of a specific subset of process steps, with a core focus on ensuring that releases or deployments do not conflict with other production components. Critical and time-sensitive platform changes, such as a security patch, often go directly to change control. As a result, change control typically carries a higher level of risk, and that risk is often acceptable.
Change management framework
The CM framework is a set of components that help you design and implement CM. The framework provides both structure and flexibility so that you can build a new CM process or incorporate the framework into an existing CM process. However, with flexibility comes the opportunity for an endless cycle of variations. Consider a moderate or limited approach that can quickly evolve through usage and experience. The CM framework includes guidelines to define the following:
- CM scope
- CM process
- Responsibility assignment matrix (RACI)
- CM pathways
- Change guidance examples
Change management scope
Because a CM process deals with the management of changes, you must have a common understanding of which changes are in the scope of the process. This section describes a sample scope and includes areas that are both within and outside of the CM process scope. Use this sample to start conversations that define the scope that is best for your organization.
The scope of the CM process covers all production systems and platforms of an organization. The primary functional components covered in the CM process may be similar to the following examples:
- Software development life cycle: Changes handled through the formal software development life cycle.
- Hardware: Installation, modification, removal or relocation of computing equipment.
- Software: Installation, patching, upgrade or removal of software products including operating systems, access methods, commercial off-the-shelf (COTS) packages, internally developed packages and utilities.
- Database: Changes to databases or files such as additions, reorganizations and major maintenance.
- Application: Application changes being promoted to production and the integration of new application systems and the removal of obsolete elements.
- Moves, adds, changes and deletes: Changes to system configuration.
- Schedule changes: Requests for creation, deletion, or revision of job schedules, backup schedules, or other regularly scheduled jobs managed by the company's IT organization.
Out of scope
Many activities are performed on a routine basis as part of the operations and administration of your Splunk environment that are outside the scope and governance of the CM process. It's a best practice to identify and document these activities as "out of scope" for CM to maintain operational efficiency. Examples include the following:
- Contingency or disaster recovery
- Changes to non-production elements or resources such as changes in Dev or Test environments
- Changes under the governance of large-scale projects or initiatives that have a separate CM process
- Changes made within the daily administrative process
Examples of daily administrative tasks are:
- Password changes or resets
- User adds or deletes
- Adding, deleting, or revising security groups
- Rebooting machines when there is no change to the configuration of the system
- File permission changes
Consider defining a catch-all change pathway or a standard operating procedure where submissions for change are channeled when the requester is unsure if the change is in or out of scope.
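The scope decision, including the catch-all route, can be sketched in code. The following is a minimal illustration only, not part of any Splunk product or of the change management process template. The category labels are hypothetical shorthand for the sample lists above, and the function name `classify_change` is an assumption made for this example.

```python
from enum import Enum


class Scope(Enum):
    IN_SCOPE = "in scope"
    OUT_OF_SCOPE = "out of scope"
    CATCH_ALL = "catch-all"


# Hypothetical labels based on the sample scope lists; replace them with
# the categories your organization agrees on.
IN_SCOPE_CATEGORIES = {
    "hardware", "software", "database", "application", "schedule change",
}
OUT_OF_SCOPE_CATEGORIES = {
    "disaster recovery", "non-production change", "password reset",
    "user add or delete", "file permission change",
}


def classify_change(category: str) -> Scope:
    """Return the scope decision for a change category.

    Anything that is not explicitly listed goes to the catch-all
    pathway instead of being silently accepted or rejected.
    """
    key = category.strip().lower()
    if key in IN_SCOPE_CATEGORIES:
        return Scope.IN_SCOPE
    if key in OUT_OF_SCOPE_CATEGORIES:
        return Scope.OUT_OF_SCOPE
    return Scope.CATCH_ALL
```

Routing unknown categories to a catch-all keeps requesters from having to guess, and the catch-all queue shows which categories to add to the documented scope over time.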
Change management process
CM is a sequence of process steps and associated activities that you follow to govern a change and ensure the intended outcome. The primary goal of the change management process is to ensure that standardized methods, processes, and procedures are used to facilitate efficient and prompt handling of all changes. The CM process is grouped into two distinct phases: the approval phase and the execution phase. The approval phase consists of the process steps that support the decision-making and authorization activities for the change. The execution phase addresses the process steps that relate to the actual work needed to build and deploy the change.
As you define the process steps and associated activities for your change management process, include every process step, and the activities associated with each step, that you might use. Then use the change pathways to select which process steps and which activities to implement for a given change scenario. Think of the CM process as the checklist for defining each change pathway.
The approval phase has steps and activities that support the decision-making and authorization activities for the change. The activities in the following table are examples that you can reference, build on, and evolve to fit the needs of your organization.
|Process step|Example activities|
|---|---|
|Identify change|Identify the reason for a change such as correction, new, or modification in requirements|
|Change request|Documents and describes the change based on predefined criteria|
The execution phase has steps and activities that relate to the actual work needed to build and deploy the change. The activities in the following table are examples that you can reference, build on, and evolve to fit the needs of your organization.
|Process step|Example activities|
|---|---|
|Analysis|Gather, validate, and document both business requirements and technical requirements|
|Design|Document specifications, features and operations to satisfy the functional requirements of the change|
Change management RACI
The RACI matrix defines the participation requirements for various roles to complete tasks or deliverables in a CM pathway. RACI is an acronym derived from the four key responsibilities most typically used: Responsible, Accountable, Consulted, and Informed.
- Responsible - Those who do the work to complete the task. There must be at least one role with a participation type of responsible, although others can be delegated to assist in the work required.
- Accountable - The one ultimately answerable for the outcome. This role ensures that the prerequisites of the task are met and delegates the work to those responsible. In other words, the accountable role must sign off to approve the work that the responsible role provides. There must be only one accountable role specified for each task or deliverable.
- Consulted - Those whose opinions are sought. Typically, these are subject matter experts, with whom there is two-way communication.
- Informed - Those who are kept up to date. People in this role are informed of the latest progress, often only on completion of the task or deliverable, and there is just one-way communication with them.
There is a distinction between a role and individually identified people. A role is a descriptor of an associated set of tasks. Many people can perform one role, and one person can perform many roles. For example, an organization may have ten people who can perform the role of system administrator, and one person who can perform the role of system administrator may also be able to perform the role of developer. It's a best practice to assign only one participation type for each task or deliverable per role in the CM pathway. Where more than one participation type is shown, this generally implies that participation has not yet been fully resolved, which can impede the value of this technique in clarifying the participation of each role on each task. See Role-based data management best practices for a Splunk deployment.
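The RACI rules described in this section (at least one responsible role, exactly one accountable role, and one participation type per role for each task) can be checked mechanically. The following is a minimal sketch under stated assumptions: the matrix layout, the function name `validate_raci`, and the role name "change manager" are hypothetical and chosen only for illustration.

```python
from collections import Counter

VALID_TYPES = {"R", "A", "C", "I"}


def validate_raci(matrix: dict[str, dict[str, str]]) -> list[str]:
    """Return a list of problems found; an empty list means the matrix passes.

    The matrix maps each task to {role: participation type}. Because a
    role is a dictionary key, each role can hold only one participation
    type per task.
    """
    problems = []
    for task, assignments in matrix.items():
        counts = Counter(assignments.values())
        unknown = set(counts) - VALID_TYPES
        if unknown:
            problems.append(f"{task}: unknown participation type(s) {sorted(unknown)}")
        if counts["R"] < 1:
            problems.append(f"{task}: needs at least one Responsible role")
        if counts["A"] != 1:
            problems.append(
                f"{task}: needs exactly one Accountable role, found {counts['A']}"
            )
    return problems


# Hypothetical example: the "Design" task has no Accountable role.
example = {
    "Change request": {"developer": "R", "change manager": "A"},
    "Design": {"system administrator": "R", "developer": "C"},
}
print(validate_raci(example))
```

Running a check like this whenever a pathway's RACI is edited helps catch tasks that nobody is accountable for before the pathway is used.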
Change management pathways
A set of defined change pathways is the outcome of the CM framework and is used to facilitate the CM process. Because not all changes are alike, having multiple defined change pathways provides the flexibility needed to best accommodate changes. The change pathway identifies the steps and associated activities from the CM process and the who-does-what-when from the CM RACI to properly govern a change request. For example, changing a source type definition can have a significant impact on the ingestion of data, can affect many business processes, and can need analysis and approval from multiple parties. In contrast, a change to a dashboard panel may not require the same level of analysis or approval. Change pathways provide the flexibility to address various scenarios of change, improve overall efficiency, and avoid the one-size-fits-all approval pitfall. It's a best practice to define several CM pathways to accommodate change categories.
It's important to keep the number of change pathways manageable. It's better to start with a smaller set and expand as you learn more about how your organization addresses change. It's a best practice to define default pathways as catch-alls to use for multiple scenarios.
Use the following items to define your change pathways:
- Change guidance examples to define a set of change categories from which to develop your change pathways
- CM process to identify the process steps to apply to each change category
- CM process to select the appropriate activities within each process step to complete for each change category
- CM RACI to define the roles and select the responsibilities of the participants for each change category
- Deployment requirements, such as Dev-Test-Prod promotion and segregation of duties, to identify constraints that apply to each change category
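One way to picture how these inputs combine is to treat the CM process as a master checklist and each pathway as a selection from it. The following is a minimal sketch, assuming a simple in-memory representation. The class `ChangePathway`, the function `select_pathway`, and the pathway names are hypothetical. The step names come from the sample process tables, and the two change categories come from the source type and dashboard panel examples in this section.

```python
from dataclasses import dataclass, field

# Master checklist: the sample process steps, grouped by phase.
CM_PROCESS = {
    "approval": ["Identify change", "Change request"],
    "execution": ["Analysis", "Design"],
}


@dataclass
class ChangePathway:
    name: str
    categories: list[str]          # change categories this pathway handles
    steps: list[str]               # process steps selected from CM_PROCESS
    variations: list[str] = field(default_factory=list)

    def unknown_steps(self) -> list[str]:
        """Return any selected steps that are missing from the master checklist."""
        known = {step for steps in CM_PROCESS.values() for step in steps}
        return [step for step in self.steps if step not in known]


def select_pathway(category, pathways, default):
    """Return the first pathway that handles the category, else the default."""
    for pathway in pathways:
        if category in pathway.categories:
            return pathway
    return default


# Hypothetical pathways: a heavier one for source type changes and a
# lighter one for dashboard panel changes, plus a catch-all default.
FULL_REVIEW = ChangePathway(
    "full review", ["source type change"],
    ["Identify change", "Change request", "Analysis", "Design"],
)
LIGHTWEIGHT = ChangePathway(
    "lightweight", ["dashboard panel change"],
    ["Identify change", "Change request"],
)
DEFAULT = ChangePathway("default", [], ["Identify change", "Change request", "Analysis"])
PATHWAYS = [FULL_REVIEW, LIGHTWEIGHT]
```

Keeping the master checklist separate from the pathways means that adding a step to the CM process does not change any pathway until someone deliberately selects the new step for it.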
Identify deployment requirements
Organizations have special situations, or pathway variation considerations, that they need to address from time to time. In keeping with a flexible framework, it's a best practice to define a "variations" option for the affected pathway as opposed to creating specific pathways for each variation. This allows more flexibility over time as new conditions or constraints are identified. In environments that support a formal development life cycle, it is typical to see both a promotion process (manual or automated) and requirements for segregation of duties (coders, testers, and promoters are all different people). In these cases, you must consider deployment priorities and constraints to implement a change. Deployment priority levels may be similar to the following examples:
- Emergency: Significant risk if not implemented immediately, such as applying a security patch
- High: Implement soon to prevent significant negative impact to ability to conduct business
- Routine: Implement to gain benefit from the changed service
- Low: Not pressing, but implement to provide a positive outcome
There may be other events or circumstances that are not known until they appear. You can use these variation options to handle one-off or edge case events.
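The sample priority levels can be modeled as an ordered type so that pending changes sort consistently. The following is a minimal sketch. The numeric values, the function name `order_for_deployment`, and the sample change names are assumptions made for illustration.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Lower numbers deploy first; the numeric values are an assumption.
    EMERGENCY = 1
    HIGH = 2
    ROUTINE = 3
    LOW = 4


def order_for_deployment(changes):
    """Sort (name, priority) pairs so that the most urgent change comes first.

    sorted() is stable, so changes with the same priority keep the
    order in which they were submitted.
    """
    return sorted(changes, key=lambda change: change[1])


queue = [
    ("dashboard panel update", Priority.LOW),
    ("security patch", Priority.EMERGENCY),
    ("source type change", Priority.HIGH),
]
for name, priority in order_for_deployment(queue):
    print(priority.name, name)
```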
Change guidance examples
Download the change management process template and use it as a tool to plan and scope the full impact of changes as you define your change management process for Splunk changes. This helps you gain insight into how a change impacts people, service offerings, and business processes in the organization. Think about the reason for the change to identify the actions to complete in Splunk to support the change, and consider the following examples to fully understand the impact and create a plan for the change:
- Downstream impacts: Potential outcomes to understand and evaluate
- Change consideration: Lists optional items such as governance documents, training, or testing to support the change
- Risk score: A number from 1 to 5 that corresponds to the level of difficulty, complexity, or significance of the change
- Scenario category: Type of change
- Scope of impact: A rating of High, Medium, or Low that estimates the impact in each of the following areas:
  - Service: Domain such as your IT Ops, SOC, NOC
  - Platform: Technology used in a domain such as Splunk
  - Audience: Impact to the community such as executives, external customers, or partners
  - Business process: Scope of a specific business process such as point of sale, security vulnerability, and so on
  - Change difficulty: Complexity, level of effort, and skill set
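The guidance fields above lend themselves to a small validated record. The following is a minimal sketch, not the layout of the downloadable template, which is not reproduced here. The class name `ChangeGuidance`, its field names, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass

RATINGS = ("Low", "Medium", "High")  # ordered from least to most impact
IMPACT_AREAS = (
    "service", "platform", "audience", "business process", "change difficulty",
)


@dataclass
class ChangeGuidance:
    scenario_category: str
    risk_score: int                  # 1 to 5
    scope_of_impact: dict[str, str]  # impact area -> High, Medium, or Low

    def __post_init__(self):
        if not 1 <= self.risk_score <= 5:
            raise ValueError("risk score must be a number from 1 to 5")
        for area in IMPACT_AREAS:
            if self.scope_of_impact.get(area) not in RATINGS:
                raise ValueError(f"{area!r} needs a rating of High, Medium, or Low")

    def highest_impact(self) -> str:
        """Return the highest rating across the five impact areas."""
        ratings = (self.scope_of_impact[area] for area in IMPACT_AREAS)
        return max(ratings, key=RATINGS.index)


# Hypothetical entry for a source type change.
guidance = ChangeGuidance(
    scenario_category="source type change",
    risk_score=4,
    scope_of_impact={
        "service": "Medium",
        "platform": "High",
        "audience": "Low",
        "business process": "Medium",
        "change difficulty": "Medium",
    },
)
print(guidance.highest_impact())
```

A record like this can feed pathway selection. For example, an organization might decide that any change with a High rating in any area follows its most thorough pathway.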
This documentation applies to the following versions of Splunk® Success Framework: ssf