Using AWS S3 snapshot repository for Elasticsearch

AWS, CLI, automation, DevOps

18 August 2021

Sergii Dolgushev

Lead Developer

Contextual Code

This post is also available in German and in French.

Contextual Code specializes in enterprise-level projects for state government agencies. We routinely tackle difficult web content management implementations, migrations, integrations, customizations, and operations. We know what it takes to get a project off the ground and onto the web.

Upsun is our primary hosting platform; it’s incredibly flexible, and it provides a vast list of services that can be set up with minimal configuration.

There are several scenarios, such as creating additional backups or syncing data to local development environments, in which you may need to extract data from these services. In many cases this is simple on Upsun. For example, you can get a MariaDB/MySQL dump with a single Upsun CLI command.

But in some cases, extracting the data you need is more complex and requires more advanced tools. We covered one such case in our Backup Solr on Upsun blog post. Today we’ll cover another: how to use an AWS S3 snapshot repository for Elasticsearch on Upsun.

Getting started

First, let's make sure we have an Elasticsearch service in the .upsun/config.yaml configuration file, under the services key:

services:
  elasticsearch:
    type: elasticsearch:7.2

Then let’s inject the service into the application via the elasticsearch relationship in .upsun/config.yaml:

relationships:
  elasticsearch:

Also, in the AWS Management Console, we need to:

Create a new AWS S3 bucket
Use AWS IAM to create a new user with read and write permissions for the newly created bucket
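
If you prefer the command line to the console, the same two steps can be scripted with the AWS CLI. The following is only a sketch: the function names, the policy name, and the bucket, user, and region values are hypothetical, and the list of S3 actions is an assumption based on the permissions the repository-s3 plugin documentation recommends, so review it against your own security requirements.

```shell
#!/usr/bin/env bash
# Print an IAM policy document granting read/write access to one bucket.
# The action list is an assumption (taken from the permissions recommended
# for the repository-s3 plugin); tighten it if your setup allows.
s3_snapshot_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Create the bucket, the user, its inline policy, and an access key.
# Bucket, user, and region are placeholders supplied by the caller;
# "elasticsearch-snapshots" is a hypothetical policy name.
create_snapshot_bucket_and_user() {
  local bucket="$1" user="$2" region="$3"
  aws s3 mb "s3://${bucket}" --region "${region}"
  aws iam create-user --user-name "${user}"
  aws iam put-user-policy --user-name "${user}" \
    --policy-name "elasticsearch-snapshots" \
    --policy-document "$(s3_snapshot_policy "${bucket}")"
  aws iam create-access-key --user-name "${user}"
}

# Example call (hypothetical values):
#   create_snapshot_bucket_and_user my-es-snapshots es-snapshot-user us-east-1
```

The last command prints the access key ID and secret access key for the new user; these are the values needed when registering the snapshot repository.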

Registering the Elasticsearch S3 snapshot repository

The Elasticsearch S3 plugin is easy to enable on Upsun. We just need to add repository-s3 to configuration.plugins for the elasticsearch service in .upsun/config.yaml:

elasticsearch:
  type: elasticsearch:7.2
  configuration:
    plugins:
      - repository-s3

After we deploy this change, we need to SSH to the application container and register a new snapshot repository by running the following commands:

# SSH to the Upsun app container
upsun ssh

# Replace the value for these variables
AWS_BUCKET_NAME="<YOUR_AWS_BUCKET_NAME>"
AWS_ACCESS_KEY_ID="<YOUR_AWS_ACCESS_KEY_ID>"
AWS_SECRET_ACCESS_KEY="<YOUR_AWS_SECRET_ACCESS_KEY>"

# Extract Elasticsearch host and port from relationships
ES_HOST=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].host')
ES_PORT=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].port')

# Register the snapshot repository
curl -X PUT "http://${ES_HOST}:${ES_PORT}/_snapshot/aws-s3?pretty" -H 'Content-Type: application/json' -d'{
 "type": "s3",
 "settings": {
   "bucket": "'"${AWS_BUCKET_NAME}"'",
   "client": "default",
   "access_key": "'"${AWS_ACCESS_KEY_ID}"'",
   "secret_key": "'"${AWS_SECRET_ACCESS_KEY}"'"
 }
}'

Once that is done, every snapshot created in the aws-s3 repository will be stored in the AWS S3 bucket.
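
Before relying on the repository, it can be worth checking that Elasticsearch can actually reach the bucket with the supplied credentials. Elasticsearch exposes a verify endpoint for this. The sketch below wraps it in a small helper; the function name is hypothetical, and it assumes the same ES_HOST and ES_PORT variables extracted above.

```shell
#!/usr/bin/env bash
# Ask the cluster to verify that its nodes can access the repository.
# Arguments: Elasticsearch base URL and repository name.
# --fail makes curl return a non-zero exit code on HTTP 4xx/5xx responses.
verify_snapshot_repository() {
  local es_url="$1" repository="$2"
  curl --silent --show-error --fail \
    -X POST "${es_url}/_snapshot/${repository}/_verify?pretty"
}

# Example call inside the app container:
#   verify_snapshot_repository "http://${ES_HOST}:${ES_PORT}" aws-s3 \
#     || echo "Repository verification failed" >&2
```

A successful response lists the nodes that verified the repository; an error usually points to a wrong bucket name or insufficient IAM permissions.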

Creating the new Elasticsearch snapshots

We’ll use a simple bash script, make-elasticsearch-snapshot.sh, that lives in the root of your project and is executed in the app container:

# Extract snapshot parameters
SNAPSHOT_ID=$(date +"%Y%m%d-%H%M%S")
SNAPSHOT_NAME=$(echo "${PLATFORM_PROJECT}-${PLATFORM_BRANCH}-${SNAPSHOT_ID}")
SNAPSHOT_DATE=$(date +"%Y-%m-%d %H:%M:%S")

# Extract Elasticsearch host and port from relationships
ES_HOST=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].host')
ES_PORT=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].port')

# Create a new snapshot
curl -X PUT "http://${ES_HOST}:${ES_PORT}/_snapshot/aws-s3/${SNAPSHOT_NAME}?wait_for_completion=true&pretty" -H 'Content-Type: application/json' -d'{
 "ignore_unavailable": true,
 "include_global_state": false,
 "metadata": {
   "taken_by": "Upsun cron",
   "taken_on": "'"${SNAPSHOT_DATE}"'",
   "taken_because": "Daily backup"
 }
}'
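
One thing to watch for: the snapshot name is built from the branch name, and Elasticsearch rejects snapshot names that contain uppercase letters, whitespace, or characters such as / , and #, as well as names that start with an underscore. A branch like feature/My-Branch would therefore break the request. A possible way around this, sketched here with a hypothetical helper function, is to normalize the name before using it.

```shell
#!/usr/bin/env bash
# Turn an arbitrary string into a safe snapshot name:
# lowercase it, replace anything outside [a-z0-9._-] with "-",
# and strip leading "_" or "-" characters.
sanitize_snapshot_name() {
  printf '%s' "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C sed -e 's/[^a-z0-9._-]/-/g' -e 's/^[_-]*//'
}

# Example use in make-elasticsearch-snapshot.sh:
#   SNAPSHOT_NAME=$(sanitize_snapshot_name "${PLATFORM_PROJECT}-${PLATFORM_BRANCH}-${SNAPSHOT_ID}")
```

Replacing rather than removing the invalid characters keeps the project, branch, and timestamp parts of the name visually separated.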

Add this script as elasticsearch_snapshot to the cron jobs in your application in .upsun/config.yaml:

crons:
  ....
  elasticsearch_snapshot:
    spec: '15 23 * * *'
    commands:
      start: bash make-elasticsearch-snapshot.sh

And deploy it:

git add .upsun/config.yaml make-elasticsearch-snapshot.sh
git commit -m "Added Elasticsearch snapshot cron job"
git push

After this is deployed, we can run the script in the app container:

# SSH to the Upsun app container
upsun ssh

# Run the newly deployed script
bash make-elasticsearch-snapshot.sh

The new snapshot will be created and stored in our AWS S3 bucket.
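
Because the script calls the API with wait_for_completion=true, the response already reports how the snapshot ended: its state field is SUCCESS when every shard was stored, and values such as PARTIAL or FAILED otherwise. The sketch below shows one way the script could turn that into an exit code, so a failed snapshot does not go unnoticed. The function names are hypothetical, and the plain-text extraction is kept deliberately simple; jq, which is already used above, would be a more robust choice.

```shell
#!/usr/bin/env bash
# Read a create-snapshot (or get-snapshot) JSON response on stdin and
# print the value of its first "state" field, e.g. SUCCESS or PARTIAL.
# Prints nothing if the response has no such field (for example an error).
snapshot_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
    | head -n 1 \
    | sed -e 's/.*"\([A-Z_]*\)"$/\1/'
}

# Succeed only if the response on stdin reports a fully successful snapshot.
snapshot_succeeded() {
  [ "$(snapshot_state)" = "SUCCESS" ]
}

# Example use in make-elasticsearch-snapshot.sh:
#   RESPONSE=$(curl -X PUT "http://${ES_HOST}:${ES_PORT}/_snapshot/aws-s3/${SNAPSHOT_NAME}?wait_for_completion=true&pretty" ...)
#   echo "${RESPONSE}"
#   echo "${RESPONSE}" | snapshot_succeeded || exit 1
```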

Using Elasticsearch snapshots

We can get a list of available snapshots by running the following commands:

# SSH to the Upsun app container
upsun ssh

# Extract Elasticsearch host and port from relationships
ES_HOST=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].host')
ES_PORT=$(echo "$PLATFORM_RELATIONSHIPS" | base64 --decode | jq -r '.elasticsearch[0].port')

# Get the list of available snapshots
curl -X GET "http://${ES_HOST}:${ES_PORT}/_cat/snapshots/aws-s3?v"

Our next steps would be:

Register the same s3 snapshot repository for our local Elasticsearch
Choose the snapshot we want to restore on our local installation
Restore the snapshot on our local Elasticsearch:

curl -X POST "http://%LOCAL_ELASTICSEARCH%/_snapshot/aws-s3/%SNAPSHOT_NAME%/_restore"
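
The last two steps can also be scripted. The following is a rough sketch rather than a finished tool: the function names and the local URL are hypothetical, and it assumes the aws-s3 repository has already been registered on the local Elasticsearch. It picks the most recent snapshot with status SUCCESS from the _cat/snapshots listing and restores it without the global cluster state.

```shell
#!/usr/bin/env bash
# Read "_cat/snapshots?h=id,status,start_epoch" output on stdin and print
# the id of the SUCCESS row with the highest start_epoch.
# Exits non-zero if no successful snapshot is listed.
latest_successful_snapshot() {
  awk '$2 == "SUCCESS" && $3 + 0 >= newest { newest = $3 + 0; id = $1 }
       END { if (id != "") print id; else exit 1 }'
}

# Restore all indices of one snapshot and wait for the restore to finish.
# Arguments: Elasticsearch base URL, repository name, snapshot name.
restore_snapshot() {
  local es_url="$1" repository="$2" snapshot="$3"
  curl --silent --show-error --fail \
    -X POST "${es_url}/_snapshot/${repository}/${snapshot}/_restore?wait_for_completion=true&pretty" \
    -H 'Content-Type: application/json' \
    -d '{ "include_global_state": false }'
}

# Example use on a local machine (hypothetical URL):
#   LOCAL_ES="http://localhost:9200"
#   SNAPSHOT=$(curl -s "${LOCAL_ES}/_cat/snapshots/aws-s3?h=id,status,start_epoch" \
#     | latest_successful_snapshot)
#   restore_snapshot "${LOCAL_ES}" aws-s3 "${SNAPSHOT}"
```

Keep in mind that Elasticsearch refuses to restore over an open index with the same name, so existing local indices may need to be closed or deleted first.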

Once these steps are done, the data from the Upsun Elasticsearch service is available in our local installation. We can repeat these steps whenever we need to.

Now it’s your turn

I hope you found this post interesting and useful. Hopefully, it illustrates how flexible and extensible the Upsun platform is. Feedback and comments are appreciated. Happy snapshotting!

(Reprinted with permission.)
