Deploy Splunk
This documentation does not apply to the most recent version of Splunk.
Splunk is a bit paradoxical. On the one hand it is absolutely true that with Splunk you can get a simple installation up and running very quickly and get real value from it -- unlike, say, an Oracle RAC. On the other hand, if you want to deploy Splunk at the enterprise level, you absolutely need to plan your deployment and do some hard work to get Splunk configured optimally for your environment. At each stage of the process, someone in your organization needs to take ownership of the relevant high-level considerations.
Resources
- The Admin Manual gives details about setting up and deploying Splunk.
- Splunk Education provides a sequence of fee-based classes -- Using Splunk, Administrating Splunk, and Deploying Splunk -- that together give an interactive introduction to planning and implementing a Splunk deployment. Additional classes are available, including a Developer class that is very useful when extending Splunk with dashboards and custom views specific to your application environment.
- The Deployment topics on the Community Wiki, many of them written by Splunk technical staff, provide information about the components of a Splunk deployment, your options when deploying, what choices you have with respect to high availability, and information about tuning factors.
- Splunk Professional Services can help design, implement, and optimize large-scale enterprise deployments. If you need to set up an architecture that helps you get the most out of Splunk and that will scale as your Splunk deployment grows, you should consider Professional Services.
Position Splunk in your organization
In addition to the technical issues, you need to consider organizational issues that can arise. The technical silos created by most tools are mirrored in organizational silos. Because it crosses silos and provides better application integration, Splunk requires better human integration.
You need to consider the following:
- Who’s allowed to use which services?
- How is access controlled and audited?
- Who pays for the additional use?
Whatever strategy you develop, it needs flexibility to grow and change as adoption spreads across your organization. For instance, because Splunk makes it so easy to troubleshoot problems and find patterns in data, Splunk often jumps the wall from operations or an application management group to security.
Distribute indexing and searching
For mid-to-enterprise scale needs, indexing is typically split out from the data input function and sometimes from the search function as well. You may want to set up multiple indexers (index-only machines) or search heads (search-only machines) or implement load balancing and redundancy. See Advanced indexing strategy in the Admin manual for deployment considerations and Capacity planning for a larger Splunk deployment in the User manual for sizing and availability.
Plan for data acquisition
Data acquisition is a huge part of your Splunk deployment. The recommended best practice for getting data into Splunk, especially in a production environment, is to install a Splunk forwarder -- a version of Splunk with a smaller footprint -- on each machine that collects data. You can use Splunk's deployment server to manage configurations on your forwarders; many standard methods for enterprise software deployment also work. The Admin Manual has an extensive discussion of how to deploy and set up forwarders, starting at About forwarding and receiving.
For devices that cannot run forwarders, you can send data over TCP or UDP, write data to a centralized log server, or use scripted inputs.
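As a rough illustration of the TCP/UDP option, the following Python sketch formats an event as a single timestamped line and sends it to a network listener. It assumes a network input is already listening on the receiving side. The function names, the event layout, and any host or port you pass in are hypothetical and not part of Splunk itself.

```python
import socket
from datetime import datetime, timezone
from typing import Tuple


def format_event(host: str, message: str, when: datetime) -> bytes:
    """Render one newline-terminated event with a leading UTC timestamp.

    The layout (timestamp, host, message) is an assumption for this sketch.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} {host} {message}\n".encode("utf-8")


def send_udp(payload: bytes, receiver: Tuple[str, int]) -> None:
    """Send one datagram; UDP gives no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, receiver)


def send_tcp(payload: bytes, receiver: Tuple[str, int], timeout: float = 5.0) -> None:
    """Open a TCP connection, send the payload, and close the connection."""
    with socket.create_connection(receiver, timeout=timeout) as sock:
        sock.sendall(payload)
```

UDP is simpler for the sending device but can silently drop events, so TCP is usually the safer choice where the device supports it.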
Splunk's capacity to eat massive amounts of heterogeneous data and search on it scalably solves a lot of problems, but brings its own challenges. A Splunk deployment cuts across operational groups and geographical or logical separations. To ensure your mission-critical data comes in quickly and reliably in a large deployment, you need to solve a number of issues:
- firewall issues
- authentication and authorization requirements
- data ownership -- will the owners of the data you need allow you to install a forwarder or use syslog on their machines? In some deployments, you may have to settle for a batch upload every hour or so.
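Firewall issues in particular can be checked before you roll out forwarders. The sketch below is a minimal reachability test to run from a machine that will send data. It only confirms that a TCP connection can be opened; it says nothing about authentication or about whether the listener is actually a Splunk receiver. The function name is hypothetical.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, calling `can_reach("indexer1.example.com", 9997)` from each data-owning machine (the host name and port here are placeholders) gives a quick list of the firewall rules you still need to request.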
Determine logging policies
Different types of data also need different types of handling. You need to answer questions like the following for each data source:
- Who will access this data?
- How long should this data be retained?
Access control and retention policies can be set in a number of ways, most commonly via multiple indexes. See Considerations for access control on the Community Wiki for an overview of how to determine retention and access policy. See Set up multiple indexes and Set a retirement and archiving policy in the Admin Manual for detailed information on implementation.
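One way to turn the answers to these questions into an index layout is to group data sources that share the same readers and the same retention period, and give each group its own index. The Python sketch below does only that planning step. The source names, reader groups, and output fields are hypothetical. The retention value is converted to seconds because index retention settings such as `frozenTimePeriodInSecs` in indexes.conf are expressed in seconds; confirm the exact settings in the Admin Manual for your version.

```python
from collections import defaultdict

SECONDS_PER_DAY = 24 * 60 * 60


def plan_indexes(sources):
    """Group sources that share (readers, retention_days) into one planned index.

    `sources` is an iterable of (name, readers, retention_days) tuples.
    Returns a list of dicts, ordered by the first source name in each group.
    """
    groups = defaultdict(list)
    for name, readers, retention_days in sources:
        groups[(frozenset(readers), retention_days)].append(name)

    plan = []
    for (readers, retention_days), names in groups.items():
        plan.append({
            "sources": sorted(names),
            "readers": sorted(readers),
            "retention_secs": retention_days * SECONDS_PER_DAY,
        })
    plan.sort(key=lambda entry: entry["sources"][0])
    return plan
```

Sources with identical answers end up in the same planned index, so the number of indexes grows with the number of distinct policies rather than with the number of data sources.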
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8.