Elasticsearch Deployment on AWS EC2 Instance

amitsingh · 27 Mar 2023

Deploy Master & Data Node on AWS-EC2 instances

We use three instances: one master-eligible node that serves as both master and data node, and two that serve as data nodes only. Any number of EC2 instances can be used; this walkthrough uses three. The instances have the following configuration:

AMI ID : ami-08153220276a5d89b (this AMI runs RHEL 8)

EC2 Instance type : r5.xlarge

Attached EBS : 1500 GiB (Note: the EBS volume size depends on the volume of data to be indexed. As a guide, use 1.5 TB for 80 million, 10 TB for 500 million, and 20 TB for 1,000 million.)

Security Group : Use the same SG as ACS, or another SG whose open ports allow connections to and from ACS

Private IPs of the three launched instances: ["10.0.2.85", "10.0.2.81", "10.0.2.68"]
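
The EBS sizing note above works out to roughly 20 GB per million, which can be sketched as a small helper. The function name estimate_ebs_gb and the 20 GB ratio are assumptions inferred from the 500 million and 1,000 million figures, and the counts are assumed to refer to indexed documents. The 80 million figure in the note is slightly lower (1.5 TB rather than 1.6 TB), so treat the result as a starting point only.

```shell
#!/usr/bin/env bash
# Rough EBS sizing helper (hypothetical, not part of Elasticsearch or AWS tooling).
# Assumption: about 20 GB of EBS per million indexed documents.
estimate_ebs_gb() {
  local docs_mln="$1"      # document count, in millions
  local gb_per_mln=20      # assumed ratio inferred from the sizing note
  echo $(( docs_mln * gb_per_mln ))
}

# Usage: estimate_ebs_gb 500
```

Calling estimate_ebs_gb with 500 or 1000 reproduces the 10 TB and 20 TB figures from the note (expressed in GB).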

Install & Configure Elasticsearch on Instances

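Of the three EC2 instances, one acts as both master and data node and the other two act as data nodes. Note that in Elasticsearch 7.5 node.master defaults to true, so the configuration below leaves the two data nodes master-eligible; add node.master: false to their elasticsearch.yml if they must be data-only.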

1st EC2 Instance which acts As Both Master And Data Node

Let’s assume its Private IP = 10.0.2.68

Install Elasticsearch by using the following command

sudo rpm -i https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.5.1-x86_64.rpm
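
Because the same package is installed on all three instances, it can help to keep the version in one place. The following is a minimal sketch; the variable ES_VERSION and the function es_rpm_url are hypothetical names introduced here, and the URL pattern is simply the one used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the RPM download URL from a single version variable
# so that every node installs the same Elasticsearch release.
ES_VERSION="7.5.1"

es_rpm_url() {
  local version="${1:-$ES_VERSION}"
  echo "https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-${version}-x86_64.rpm"
}

# Usage on each node: sudo rpm -i "$(es_rpm_url)"
```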

Reload systemd so that it picks up the newly installed Elasticsearch unit file. This reruns all generators, reloads all unit files, and recreates the entire dependency tree.

sudo systemctl daemon-reload

Use the following command to enable the Elasticsearch service so that it starts at boot

sudo systemctl enable elasticsearch.service

Open the elasticsearch.yml file by using the following command

sudo vi /etc/elasticsearch/elasticsearch.yml

Add or modify the following settings in the elasticsearch.yml file

cluster.name: my-application
node.name: node-1
network.host: 10.0.2.68
http.port: 9200
discovery.seed_hosts: ["10.0.2.85", "10.0.2.81","10.0.2.68"]
cluster.initial_master_nodes: ["10.0.2.68"]
node.master: true
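
The three nodes share almost the same settings and differ only in node.name, network.host, and the node.master line. As a sketch, the file fragment could be generated rather than typed by hand; render_es_config is a hypothetical helper, and the cluster name, port, and IP list are copied from this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the elasticsearch.yml settings used in this article.
# Arguments: node name, private IP, and an optional node.master value.
render_es_config() {
  local node_name="$1" node_ip="$2" node_master="${3:-}"
  cat <<EOF
cluster.name: my-application
node.name: ${node_name}
network.host: ${node_ip}
http.port: 9200
discovery.seed_hosts: ["10.0.2.85", "10.0.2.81", "10.0.2.68"]
cluster.initial_master_nodes: ["10.0.2.68"]
EOF
  # Emit node.master only when a value was passed, mirroring the article,
  # where only the first node sets it explicitly.
  if [ -n "$node_master" ]; then
    echo "node.master: ${node_master}"
  fi
}

# Usage: render_es_config node-1 10.0.2.68 true
```

The output is meant to be reviewed and then merged into /etc/elasticsearch/elasticsearch.yml by hand; the sketch deliberately does not write to the file itself.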

Use the following command to start Elasticsearch

sudo systemctl start elasticsearch.service

Use the following command to check the Elasticsearch status

sudo systemctl status elasticsearch.service

Expected Response:

elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-12-06 12:50:25 UTC; 22h ago
     Docs: http://www.elastic.co
 Main PID: 5528 (java)
    Tasks: 84 (limit: 201139)
   Memory: 28.1G
   CGroup: /system.slice/elasticsearch.service
           ├─5528 /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF>
           └─5622 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

Dec 06 12:50:14 ip-10-0-2-68.eu-west-2.compute.internal systemd[1]: Starting Elasticsearch...
Dec 06 12:50:15 ip-10-0-2-68.eu-west-2.compute.internal elasticsearch[5528]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a fut>
Dec 06 12:50:25 ip-10-0-2-68.eu-west-2.compute.internal systemd[1]: Started Elasticsearch.
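
In the sample log above the service takes roughly ten seconds to go from "Starting" to "Started", so a status check run immediately after the start command may still fail. A minimal sketch of a retry wrapper follows; the function name retry and its argument order are assumptions made for illustration, while systemctl is-active --quiet is a standard systemctl subcommand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <attempts> times, sleeping <delay>
# seconds between tries, and succeed as soon as the command succeeds.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Usage: retry 30 2 systemctl is-active --quiet elasticsearch.service
```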

2nd EC2 Instance which acts As Data Node

Let’s assume its Private IP = 10.0.2.81

Install Elasticsearch by using the following command

sudo rpm -i https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.5.1-x86_64.rpm

Reload systemd so that it picks up the newly installed Elasticsearch unit file. This reruns all generators, reloads all unit files, and recreates the entire dependency tree.

sudo systemctl daemon-reload

Use the following command to enable the Elasticsearch service so that it starts at boot

sudo systemctl enable elasticsearch.service

Open the elasticsearch.yml file by using the following command

sudo vi /etc/elasticsearch/elasticsearch.yml

Add or modify the following settings in the elasticsearch.yml file

cluster.name: my-application
node.name: node-2
network.host: 10.0.2.81
http.port: 9200
discovery.seed_hosts: ["10.0.2.85", "10.0.2.81","10.0.2.68"]
cluster.initial_master_nodes: ["10.0.2.68"]

Use the following command to start Elasticsearch

sudo systemctl start elasticsearch.service

Use the following command to check the Elasticsearch status

sudo systemctl status elasticsearch.service

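Expected Response (the sample below was captured on the first instance; the hostname, PIDs, and timestamps will differ on this node):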

elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-12-06 12:50:25 UTC; 22h ago
     Docs: http://www.elastic.co
 Main PID: 5528 (java)
    Tasks: 84 (limit: 201139)
   Memory: 28.1G
   CGroup: /system.slice/elasticsearch.service
           ├─5528 /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF>
           └─5622 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

Dec 06 12:50:14 ip-10-0-2-68.eu-west-2.compute.internal systemd[1]: Starting Elasticsearch...
Dec 06 12:50:15 ip-10-0-2-68.eu-west-2.compute.internal elasticsearch[5528]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a fut>
Dec 06 12:50:25 ip-10-0-2-68.eu-west-2.compute.internal systemd[1]: Started Elasticsearch.

3rd EC2 Instance which acts as Data Node

Let’s assume its Private IP = 10.0.2.85

Install Elasticsearch by using the following command

sudo rpm -i https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.5.1-x86_64.rpm

Reload systemd so that it picks up the newly installed Elasticsearch unit file. This reruns all generators, reloads all unit files, and recreates the entire dependency tree.

sudo systemctl daemon-reload

Use the following command to enable the Elasticsearch service so that it starts at boot

sudo systemctl enable elasticsearch.service

Open the elasticsearch.yml file by using the following command

sudo vi /etc/elasticsearch/elasticsearch.yml

Add or modify the following settings in the elasticsearch.yml file

cluster.name: my-application
node.name: node-3
network.host: 10.0.2.85
http.port: 9200
discovery.seed_hosts: ["10.0.2.85", "10.0.2.81","10.0.2.68"]
cluster.initial_master_nodes: ["10.0.2.68"]

Use the following command to start Elasticsearch

sudo systemctl start elasticsearch.service

Use the following command to check the Elasticsearch status

sudo systemctl status elasticsearch.service

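Expected Response (the sample below was captured on the first instance; the hostname, PIDs, and timestamps will differ on this node):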

elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-12-06 12:50:25 UTC; 22h ago
     Docs: http://www.elastic.co
 Main PID: 5528 (java)
    Tasks: 84 (limit: 201139)
   Memory: 28.1G
   CGroup: /system.slice/elasticsearch.service
           ├─5528 /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF>
           └─5622 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

Dec 06 12:50:14 ip-10-0-2-68.eu-west-2.compute.internal systemd[1]: Starting Elasticsearch...
Dec 06 12:50:15 ip-10-0-2-68.eu-west-2.compute.internal elasticsearch[5528]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a fut>
Dec 06 12:50:25 ip-10-0-2-68.eu-west-2.compute.internal systemd[1]: Started Elasticsearch.

The cluster is now complete: it has three data nodes, one of which also acts as the master node.
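
To confirm which nodes are actually master-eligible, the _cat/nodes API can list each node's roles; in its node.role column the letter m marks a master-eligible node. The sketch below filters that output. The function name master_eligible_nodes is hypothetical, and the column order relies on the h=ip,node.role,master,name parameter shown in the usage comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "_cat/nodes" rows with the columns
# ip, node.role, master, name and print the names of master-eligible nodes
# (those whose node.role column contains the letter "m").
master_eligible_nodes() {
  awk '$2 ~ /m/ { print $4 }'
}

# Usage:
#   curl -s 'http://10.0.2.68:9200/_cat/nodes?h=ip,node.role,master,name' | master_eligible_nodes
```

If the output lists more than node-1, the other nodes are still master-eligible, which is the Elasticsearch 7.5 default when node.master is not set to false.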

Create the alfresco index with the desired number of primary and replica shards (here 24 primary shards and no replicas) using the following curl command

curl -XPUT 'http://10.0.2.68:9200/alfresco?pretty' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 24,
    "number_of_replicas": 0
  }
}'
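
With 24 primary shards, no replicas, and three data nodes, an even allocation places 8 shards on each node. The arithmetic can be sketched as follows; shards_per_node is a hypothetical helper, and it assumes Elasticsearch spreads shards evenly across data nodes, which is the usual outcome but not guaranteed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate how many shard copies land on each data node.
# Arguments: primary shards, replicas per primary, number of data nodes.
shards_per_node() {
  local primaries="$1" replicas="$2" data_nodes="$3"
  local total=$(( primaries * (1 + replicas) ))   # primaries plus all replica copies
  # Round up so that an uneven split reports the most heavily loaded node.
  echo $(( (total + data_nodes - 1) / data_nodes ))
}

# Usage: shards_per_node 24 0 3
```

Raising number_of_replicas to 1 would double the total to 48 shard copies, or 16 per node on this cluster.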

Now run the following curl command from the bastion host to check the Elasticsearch cluster health

curl -X GET 'http://10.0.2.68:9200/_cluster/health?pretty'

Expected Response:

{
  "cluster_name" : "my-application",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 24,
  "active_shards" : 24,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
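
When the health check is scripted, the fields of interest can be pulled out of the pretty-printed response. The following is a minimal sketch that assumes the one-field-per-line layout shown above; health_field is a hypothetical helper, and a JSON-aware tool would be more robust if one is available on the bastion.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract one top-level field from the pretty-printed
# _cluster/health response read on stdin. Assumes one "key" : value per line.
health_field() {
  local field="$1"
  sed -n "s/^ *\"${field}\" *: *\"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\} *$/\1/p"
}

# Usage:
#   curl -s 'http://10.0.2.68:9200/_cluster/health?pretty' | health_field status
```

A deployment script could then compare the status field with green and the number_of_data_nodes field with 3 before starting to index.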

Once the above steps are complete, the cluster is ready to index the metadata, content, and paths of files that are already in the repository or that will be uploaded later. This index is then used to serve search results.
