Kubernetes ElasticSearch Indices Snapshot And Restore Using AWS S3

Arnob
5 min read · Sep 26, 2023

Elasticsearch is deployed in Kubernetes, so here I will describe how to snapshot and restore indices using AWS S3.

On the AWS S3 side, we need to grant access with an IAM policy.

Here the policy is named AmazonS3BucketAccess:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:*"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::es-aws-s3-bucket"
      ]
    },
    {
      "Action": [
        "s3:*"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::es-aws-s3-bucket/*"
      ]
    }
  ]
}
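s3:* works but is broader than necessary. A tighter sketch, based on the permissions the Elasticsearch repository-s3 documentation calls for (verify against the docs for your version):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::es-aws-s3-bucket"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::es-aws-s3-bucket/*"
      ]
    }
  ]
}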

The AWS S3 configuration is done. Now we need to configure Elasticsearch in Kubernetes.

First, we need to add the AWS access key and secret key to the Elasticsearch keystore. You can't manage the keystore from the Kibana UI; it has to be set up in the deployment, using init containers.

We need to install the repository-s3 plugin in the Elasticsearch pods:

- name: install-plugins
  command:
  - sh
  - -c
  - |
    bin/elasticsearch-plugin install --batch repository-s3

Now add the AWS access key and secret access key to the Elasticsearch keystore:

- name: add-aws-keys
  env:
  - name: AWS_ACCESS_KEY_ID
    value: <AWS_ACCESS_KEY_ID>
  - name: AWS_SECRET_ACCESS_KEY
    value: <AWS_SECRET_ACCESS_KEY>
  command:
  - sh
  - -c
  - |
    echo $AWS_ACCESS_KEY_ID | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
    echo $AWS_SECRET_ACCESS_KEY | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key
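Putting the raw keys in the manifest is risky, since anyone who can read it sees the credentials. A safer variant (a sketch, assuming a Kubernetes Secret named aws-s3-credentials with keys access-key-id and secret-access-key) pulls them from a Secret:

env:
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: aws-s3-credentials   # hypothetical Secret name
      key: access-key-id
- name: AWS_SECRET_ACCESS_KEY
  valueFrom:
    secretKeyRef:
      name: aws-s3-credentials
      key: secret-access-key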

Here is the Elasticsearch deployment YAML.
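A minimal sketch, assuming the cluster is managed by the ECK operator (the image-less init containers above follow its podTemplate convention); the version and node count here are placeholders to adapt to your cluster:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.17.13            # assumed version; match your cluster
  nodeSets:
  - name: default
    count: 1                  # assumed node count
    podTemplate:
      spec:
        initContainers:
        # Install the S3 repository plugin before the main container starts.
        - name: install-plugins
          command:
          - sh
          - -c
          - |
            bin/elasticsearch-plugin install --batch repository-s3
        # Write the S3 credentials into the Elasticsearch keystore.
        - name: add-aws-keys
          env:
          - name: AWS_ACCESS_KEY_ID
            value: <AWS_ACCESS_KEY_ID>
          - name: AWS_SECRET_ACCESS_KEY
            value: <AWS_SECRET_ACCESS_KEY>
          command:
          - sh
          - -c
          - |
            echo $AWS_ACCESS_KEY_ID | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
            echo $AWS_SECRET_ACCESS_KEY | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key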

After the deployment, you need to wait 1–2 minutes for the pods to come into service.
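To confirm the cluster is up before continuing, you can check cluster health from Kibana Dev Tools; a green or yellow status means the cluster is serving requests:

GET _cluster/health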

Snapshot

Register the repository with a _snapshot PUT request from Kibana Dev Tools, giving the AWS S3 bucket name:

PUT /_snapshot/s3_repository?verify=false&pretty
{
  "type": "s3",
  "settings": {
    "bucket": "es-aws-s3-bucket",
    "region": "ap-northeast-3"
  }
}

Response

{
  "acknowledged": true
}

After creating the repository, check its verification status.

If it shows Connected, you're done. If not, check the AWS S3 access key, the secret access key, and the AWS S3 IAM policy.
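Since the repository was registered with verify=false, you can also trigger verification explicitly from Dev Tools; a successful response lists the nodes that could reach the bucket:

POST _snapshot/s3_repository/_verify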

Now we can back up the indices to the AWS S3 bucket:

PUT _snapshot/s3_repository/backup
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "partial": false
}

Response

{
  "accepted": true
}
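"accepted": true means the snapshot has started, not that it has finished. You can watch its progress from Dev Tools:

GET _snapshot/s3_repository/backup/_status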

Also, you can create a snapshot policy from Kibana (Stack Management → Snapshot and Restore → Policies).

Create the policy and fill in the snapshot details:

- Give a policy name, e.g. snapshot-policy.
- Give a snapshot name, e.g. <snapshot-{now/d}>. This date-math expression names each snapshot after the present day. Note: if you want the snapshot name to reflect the previous day instead, use <snapshot-{now/d-1d}>.
- Select the repository; the s3_repository created in the first step will be shown here.
- Set the snapshot schedule.
- Choose your snapshot settings (which data streams and indices to include, and so on).
- Snapshot retention: you can configure snapshots to be deleted after a given number of days.

If the settings are good, the policy is created. Alternatively, you can create the policy from Dev Tools:

PUT /_slm/policy/snapshot-policy
{
  "schedule": "0 30 1 * * ?", // when the snapshot should be taken, for example 01:30 UTC
  "name": "<snapshot-{now/d-1d}>",
  "repository": "s3_repository",
  "config": {
    "indices": "<logstash-{now/d-1d}>",
    "ignore_unavailable": true,
    "include_global_state": false
  }
}
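To try the policy without waiting for the schedule, you can trigger it immediately; the response contains the name of the snapshot it creates:

POST _slm/policy/snapshot-policy/_execute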

If you open the policy, you can see those snapshot settings: the selected data streams and indices reflect the index patterns configured on the settings page, and the policy details view shows the full configuration.

Next, check the objects in the AWS S3 bucket to confirm whether the indices were backed up.

The indices are backed up in AWS S3.

When the scheduled snapshot is taken, it appears in the snapshot history.
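You can also inspect the policy's run history from Dev Tools; the response includes the last success and last failure of the policy:

GET _slm/policy/snapshot-policy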

Restore

Here I show you how to restore the backup.

The restore is a REST call, issued from the Kibana Dev Tools console:

POST _snapshot/s3_repository/backup/_restore
{
  "indices": "logstash-2023.09.26"
}
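Note that a restore fails if an open index with the same name already exists in the cluster. In that case, close or delete the existing index first, for example:

POST logstash-2023.09.26/_close

Alternatively, rename the indices on restore, as shown next.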

To rename on restore, give a rename pattern and replacement:

POST _snapshot/s3_repository/backup/_restore
{
  "indices": "logstash-2023.09.26",
  "rename_pattern": "logstash-(.+)",
  "rename_replacement": "logstash-ret-$1"
}

Here rename_pattern is a regular expression matched against the names of the selected indices, and rename_replacement builds the new name, with $1 referring to the first capture group of that expression. The (.+) matches one or more of any character, so for logstash-2023.09.26 the group captures 2023.09.26 and the restored index becomes logstash-ret-2023.09.26.

Response

{
  "acknowledged": true
}

After restoring, if you list the indices

GET _cat/indices

you will see that logstash-2023.09.26 has been restored as logstash-ret-2023.09.26.

Restoring Done.

If a PUT _snapshot/s3_repository/backup request responds with this message:

{
  "error": {
    "root_cause": [
      {
        "type": "repository_exception",
        "reason": "[s3_repository] Could not determine repository generation from root blobs"
      }
    ],
    "type": "repository_exception",
    "reason": "[s3_repository] Could not determine repository generation from root blobs",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Exception when listing blobs by prefix [index-]",
      "caused_by": {
        "type": "amazon_s3_exception",
        "reason": "The request signature we calculated does not match the signature you provided. Check your key and signing method. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: EPKD1F7TRQEFYRJQ; S3 Extended Request ID: x9cP0DSRbIzQlUExCjnmTItFJlkhEXx3wFMMVKUTMH+Nxeu6aQAApWmEJdUGwaH6bX9p8VWAAhs=; Proxy: null)"
      }
    }
  },
  "status": 500
}

it means your Elasticsearch is not authenticating with AWS S3.
Check the AWS S3 access keys stored in the keystore. If that does not solve the problem, generate a new access key pair in IAM and add it instead; that should resolve the issue.
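After updating the keystore, the S3 client credentials can be picked up without restarting the nodes, since they are reloadable secure settings:

POST _nodes/reload_secure_settings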

Thank you.

Happy Learning ….
