Hunk®(Legacy)

Hunk User Manual

Install Hunk on Amazon Web Services with hourly pricing

Use Hunk on Amazon Web Services to:

  • Attach to Elastic MapReduce clusters using designated Hunk AMIs.
  • Run searches on HDFS and S3 data stores.

Clusters configured this way are billed on an hourly basis rather than a per-node basis.

Step 1: Set up an AWS account

1. Open http://aws.amazon.com.

2. Click Sign Up.

3. Follow the on-screen instructions.

4. If you are new to EMR, see the Amazon EMR documentation to learn more.

Step 2: Launch an EMR Cluster

Launch an EMR cluster using the EMR Console

1. Log in to your AWS account and open your EMR console.

2. Click Create Cluster.

3. Enter information in the Cluster Configuration section:

a. Enter a Cluster Name.

b. Accept the default Yes as the Termination protection setting, to prevent termination of the cluster.

c. Select your Logging and Debugging options.

4. In the Tags section, enter the tags that describe your EMR nodes.

5. In the Software Configuration section, select Amazon as the Hadoop Distribution and select AMI version 3.0.0 or later.

6. For Additional applications, select "Hunk" and follow the directions in the pop-up window. "Hive" and "Pig", which are selected by default, are not required by Hunk.

7. Click Enable.

8. In the File System Configuration section, configure your preferred settings.

9. In the Hardware Configuration section:

a. Select whether to launch the cluster on a VPC or EC2-Classic.

b. Select your EC2 availability zone. Choose an availability zone that is in the same region as your S3 data, or as close to it as possible.

c. Select the number and instance type of nodes: Master, Core, and Task nodes. These settings govern the size and compute power of your cluster.

Note on cluster sizing: Allocate your compute and storage resources according to your needs. (You can select different numbers of core and task instances.) If your data is in S3, task instances are probably the best fit. If you plan to keep data in HDFS or move data to it, select a sufficient number of core instances. The instance type, specifically its CPU count, also determines the number of jobs and mappers that you can run on your cluster. For more information about EMR cluster sizing, see the AWS guide to EMR cluster sizing.

10. For Security and Access, select your EC2 key pair and IAM user access settings.

11. For IAM Roles, select your EMR Role setting and EC2 instance profile for your cluster. Hunk requires that you run your EMR cluster using IAM roles.

12. (Optional) For Bootstrap Actions, select your preferred bootstrap setting.

13. (Optional) For Steps, select the Add step and Auto-terminate settings.

14. Click Create Cluster.
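
The cluster sizing note in step 9 ties the CPU count of the instance type to the number of mappers the cluster can run. As a rough sketch of that arithmetic, the helper below multiplies the node count by the vCPUs per node. The function name is hypothetical, you supply the vCPU count yourself from the EC2 instance type listing, and treating one vCPU as roughly one concurrent mapper is a simplifying assumption rather than a Hunk or EMR rule.

```shell
#!/bin/sh
# Hypothetical helper: total vCPUs across a group of identical nodes.
# Assumption: one vCPU is roughly one concurrent mapper (rule of thumb only).
total_vcpus() {
    node_count=$1       # number of core or task instances
    vcpus_per_node=$2   # vCPUs of the chosen instance type (4 for c3.xlarge)
    echo $(( node_count * vcpus_per_node ))
}

# Example: 10 core nodes of a 4-vCPU instance type.
# total_vcpus 10 4
```

Treat the result as an upper bound for planning; actual mapper concurrency also depends on memory and on the Hadoop configuration of the cluster.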

Launch an EMR cluster using the CLI

To launch an EMR cluster using the AWS Command Line Interface (CLI), you must have CLI version 1.4.4 or later. To use Hunk with hourly pricing, name Hunk in the --applications option. For example:

Launch the cluster:

aws emr create-cluster \
    --applications Name=Hunk \
    --ami-version 3.2.1 \
    --instance-groups \
        InstanceGroupType=MASTER,InstanceCount=1,InstanceType=c3.xlarge \
        InstanceGroupType=CORE,InstanceCount=10,InstanceType=c3.xlarge \
    --no-auto-terminate \
    --region us-west-2 \
    --use-default-roles \
    --ec2-attributes KeyName=my-emr-key-pair

Get Cluster info:

aws emr describe-cluster --cluster-id <cluster_id> 
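
The describe-cluster command returns a large JSON document, so it can help to extract only the fields you need. The sketch below wraps it in small shell functions that use the CLI's global --query and --output options. The function names, the REGION default, and the polling interval are illustrative assumptions, not part of Hunk or the AWS CLI.

```shell
#!/bin/sh
# Sketch only: function names and the REGION default are illustrative assumptions.
REGION=${REGION:-us-west-2}

# Print the cluster state, for example STARTING, BOOTSTRAPPING, RUNNING, or WAITING.
cluster_state() {
    aws emr describe-cluster --cluster-id "$1" --region "$REGION" \
        --query 'Cluster.Status.State' --output text
}

# Print the public DNS name of the master node.
master_dns() {
    aws emr describe-cluster --cluster-id "$1" --region "$REGION" \
        --query 'Cluster.MasterPublicDnsName' --output text
}

# Poll until the cluster is usable (RUNNING or WAITING); fail if it terminates.
wait_for_cluster() {
    cluster_id=$1
    interval=${2:-30}   # seconds between polls
    while :; do
        state=$(cluster_state "$cluster_id") || return 1
        case "$state" in
            RUNNING|WAITING) return 0 ;;
            TERMINAT*)       echo "cluster ended in state: $state" >&2; return 1 ;;
            *)               sleep "$interval" ;;
        esac
    done
}

# Usage (replace <cluster_id> with the ID printed by create-cluster):
#   wait_for_cluster <cluster_id> && master_dns <cluster_id>
```

Polling keeps the sketch independent of the CLI version; recent AWS CLI releases also provide `aws emr wait cluster-running` for the same purpose.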

Step 3: Provision a Hunk Instance

Provision a Hunk Instance using CloudFormation template

You can launch the Hunk instance by using Amazon CloudFormation. Use the links provided in the table below to launch a Hunk instance in the region of your choice. You need to provide an instance type and a key pair to access the instance.


EC2 Region Name             EC2 Region Id     CloudFormation template
US East, N. Virginia        us-east-1         Launch Hunk (button)
US West, N. California      us-west-1         Launch Hunk (button)
US West, Oregon             us-west-2         Launch Hunk (button)
Europe, Ireland             eu-west-1         Launch Hunk (button)
Europe, Frankfurt           eu-central-1      Launch Hunk (button)
Asia Pacific, Tokyo         ap-northeast-1    Launch Hunk (button)
Asia Pacific, Singapore     ap-southeast-1    Launch Hunk (button)
Asia Pacific, Sydney        ap-southeast-2    Launch Hunk (button)
South America, São Paulo    sa-east-1         Launch Hunk (button)


You can create the Hunk instance using the CloudFormation CLI create-stack command. For example:

aws cloudformation create-stack \
    --region us-west-2 \
    --stack-name MyHunkInstance \
    --template-url https://s3.amazonaws.com/splunk-emr-public/cfn/hunk_cf_template.txt \
    --parameters \
        ParameterKey=InstanceType,ParameterValue=c3.xlarge \
        ParameterKey=KeyName,ParameterValue=valid-key-pair \
        ParameterKey=StorageSize,ParameterValue=128
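
The create-stack command returns before the stack has finished building. A minimal sketch for checking progress follows, assuming the stack name MyHunkInstance from the example above. The function names and the REGION default are hypothetical; the describe-stacks and wait subcommands are standard AWS CLI commands.

```shell
#!/bin/sh
# Sketch only: function names and the REGION default are illustrative assumptions.
REGION=${REGION:-us-west-2}

# Print the stack status, for example CREATE_IN_PROGRESS or CREATE_COMPLETE.
stack_status() {
    aws cloudformation describe-stacks --stack-name "$1" --region "$REGION" \
        --query 'Stacks[0].StackStatus' --output text
}

# Succeed only when the stack has been created completely.
stack_is_ready() {
    [ "$(stack_status "$1")" = "CREATE_COMPLETE" ]
}

# Block until creation finishes, using the CLI's built-in waiter.
wait_for_stack() {
    aws cloudformation wait stack-create-complete --stack-name "$1" --region "$REGION"
}

# Usage:
#   wait_for_stack MyHunkInstance && stack_status MyHunkInstance
```

If the waiter fails, the stack events in the CloudFormation console show which resource could not be created.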

Provision a Hunk Instance using the EC2 Console

Alternatively, you can start Hunk from the EC2 console.

1. Switch to your EC2 console, staying in the same region as your EMR cluster, and click Launch Instance.

2. Select an Amazon Machine Image (AMI). Search for the AMI ID for your region, listed in the table below, under Community AMIs, or under My AMIs with Shared with me selected.

Alternatively, click the AMI ID for your region in the table below to open the launch page with the correct AMI selected.

EC2 Region Name             EC2 Region Id     AMI Id
US East, N. Virginia        us-east-1         ami-c891d2a0
US West, N. California      us-west-1         ami-9a6f74df
US West, Oregon             us-west-2         ami-ebc49fdb
Europe, Ireland             eu-west-1         ami-0dc44d7a
Europe, Frankfurt           eu-central-1      ami-acb586b1
Asia Pacific, Tokyo         ap-northeast-1    ami-b2958fb3
Asia Pacific, Singapore     ap-southeast-1    ami-2a9cb678
Asia Pacific, Sydney        ap-southeast-2    ami-712c5b4b
South America, São Paulo    sa-east-1         ami-13b50a0e


3. Select an instance type. Use "Compute optimized" instances. Instances with 8 vCPUs (for example, c3.2xlarge) provide a good starting point for optimal performance.

4. Enter information for Configure Instance Details:

a. Type "1" for Number of instances.

b. Select the appropriate Network setting. Use the same network as the EMR nodes.

c. Select the appropriate Availability Zone. Use the same availability zone as the EMR nodes.

d. Select an IAM role. This is required. Use the same IAM role as the EMR nodes.

e. Select the preferred settings for the rest of the fields:

  • Shutdown behavior
  • Enable termination protection
  • Monitoring
  • EBS-optimized instance
  • Advanced Details

5. For Add Storage, ensure that the instance has sufficient storage. The instance is not part of the cluster and does not hold any raw data, but it needs space for search results and related artifacts produced by users and apps. 100 GB should provide enough room for normal workloads.

6. For Tag Instance, provide the tags to describe your Hunk instance.

7. For Configure Security Group, provide the settings that let Hunk communicate with the EMR cluster nodes and let users reach the Splunk Web port. Assign two security groups to the instance:

  • Group Name: ElasticMapReduce-master, which is created when the EMR cluster runs.
  • A new group with port 8000 open inbound for user access, and optionally port 22 for admin tasks.

8. Review your instance details, and click Launch to assign a key pair and complete the process.
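
The console steps above can also be approximated from the command line. The sketch below encodes the region-to-AMI table from step 2 as a lookup and passes the result to `aws ec2 run-instances`. The function names and argument order are hypothetical, the subnet argument assumes a VPC launch, and the sketch does not set the storage size from step 5, so adjust it before relying on it.

```shell
#!/bin/sh
# Print the Hunk AMI ID for a region (values copied from the table in step 2).
hunk_ami_for_region() {
    case "$1" in
        us-east-1)      echo ami-c891d2a0 ;;
        us-west-1)      echo ami-9a6f74df ;;
        us-west-2)      echo ami-ebc49fdb ;;
        eu-west-1)      echo ami-0dc44d7a ;;
        eu-central-1)   echo ami-acb586b1 ;;
        ap-northeast-1) echo ami-b2958fb3 ;;
        ap-southeast-1) echo ami-2a9cb678 ;;
        ap-southeast-2) echo ami-712c5b4b ;;
        sa-east-1)      echo ami-13b50a0e ;;
        *) echo "no Hunk AMI listed for region: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: launch one Hunk instance and print its instance ID.
# Arguments: region, key pair, subnet ID, instance profile name, then one or
# more security group IDs (the EMR master group and the port 8000 group).
launch_hunk_instance() {
    region=$1; key_pair=$2; subnet_id=$3; instance_profile=$4
    shift 4   # the remaining arguments are security group IDs
    ami=$(hunk_ami_for_region "$region") || return 1
    aws ec2 run-instances \
        --region "$region" \
        --image-id "$ami" \
        --count 1 \
        --instance-type c3.2xlarge \
        --key-name "$key_pair" \
        --subnet-id "$subnet_id" \
        --iam-instance-profile "Name=$instance_profile" \
        --security-group-ids "$@" \
        --query 'Instances[0].InstanceId' --output text
}
```

Use the same subnet and instance profile as the EMR nodes, as steps 4b through 4d require. The c3.2xlarge type follows the 8 vCPU starting point suggested in step 3.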

Step 4: Log in to the Hunk Instance

1. Wait for a minute or two until the Hunk instance is running. When it is available, note its user-facing address (this is usually the Public DNS address) from the EC2 Console.

2. Copy and paste the user-facing address into your browser. Use port 8000. For example:

http://<instance address>.amazonaws.com:8000

3. Log in using the instructions on the screen, and take the tour that guides you through the Hunk, EMR, and S3 experience.

4. Check the cluster connectivity status.

5. When the cluster is ready, create a Virtual Index that points to a data set location of your choice (HDFS or S3).
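
Steps 1 and 2 above can be scripted if you prefer not to copy the address from the EC2 Console. The sketch below looks up the public DNS name of the instance and prints the Splunk Web URL on port 8000. The function names are hypothetical, and the sketch assumes the instance has a public DNS name; the CLI prints "None" or an empty string when it does not.

```shell
#!/bin/sh
# Print the public DNS name of an instance (arguments: instance ID, region).
hunk_public_dns() {
    aws ec2 describe-instances --instance-ids "$1" --region "$2" \
        --query 'Reservations[0].Instances[0].PublicDnsName' --output text
}

# Print the Splunk Web URL for the instance, or fail if it has no public DNS name.
hunk_web_url() {
    dns=$(hunk_public_dns "$1" "$2") || return 1
    if [ -z "$dns" ] || [ "$dns" = "None" ]; then
        echo "instance $1 has no public DNS name" >&2
        return 1
    fi
    echo "http://$dns:8000"
}

# Usage (the instance ID is a placeholder):
#   hunk_web_url <instance_id> us-west-2
```

Open the printed URL in a browser once the instance has finished starting; the port 8000 security group rule from Step 3 must allow your address.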

Last modified on 27 January, 2016

This documentation applies to the following versions of Hunk®(Legacy): 6.2, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11

