Elasticsearch Log Rotation
Warning
These instructions apply only to Kibana/Elasticsearch versions 7.4 or higher. Earlier versions of Elasticsearch and Kibana did not provide all of the UI features mentioned in this tutorial. Instead, for versions 6.8 or earlier, you should refer to our aptible/elasticsearch-logstash-s3-backup application.
If you're using Elasticsearch to hold log data, you'll almost certainly be creating new indexes periodically: by default, Logstash or our Log Drains will do so daily. As time passes, this means you'll need more and more disk space and, less obviously, more and more RAM. Elasticsearch allocates RAM on a per-index basis, and letting your log retention grow unchecked will almost certainly lead to fatal issues when the database runs out of RAM or disk space.
Components
We recommend combining the following native Elasticsearch features to back up your indexes to S3 in your own AWS account, and to ensure you do not accumulate too many open indexes:
- Index Lifecycle Management can be configured to delete indexes over a certain age
- Snapshot Lifecycle Management can be configured to back up indexes on a schedule, for example to S3
- The Elasticsearch S3 Repository Plugin, which is installed by default
Configuring a snapshot repository in S3
The first thing you will want to do is create an S3 bucket. For this example, we will use "aptible-logs" as the bucket name (S3 bucket names cannot contain underscores).
Then, Elasticsearch recommends creating an IAM policy with the minimum access level required. They provide a recommended policy in their documentation for the S3 repository plugin. Creating a dedicated user for this access is recommended, to minimize the permissions of the access key which will be stored in the Database.
Finally, because the Kibana UI does not provide a way to specify your IAM key pair, you will need to register the snapshot repository using the Elasticsearch API directly. In this example we'll call the repository "s3_repository", and configure it to use the "aptible-logs" bucket created above:
curl -X PUT "https://username:password@host:9200/_snapshot/s3_repository?pretty" -H 'Content-Type: application/json' -d'
{
"type": "s3",
"settings": {
"bucket" : "aptible-logs",
"access_key": "AWS_ACCESS_KEY_ID",
"secret_key": "AWS_SECRET_ACCESS_KEY",
"protocol": "https",
"server_side_encryption": true
}
}
'
Be sure to provide the correct username, password, host and port needed to connect to your Database, likely as provided by the Database Tunnel, if you're connecting that way.
The full documentation of available options is here: https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-s3-usage.html
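Once the repository is registered, it can be worth confirming that the cluster can actually reach the bucket. The sketch below uses Elasticsearch's repository verification endpoint. The ES_URL value is a placeholder, and the function names are made up for this example; substitute the connection details for your own Database.

```shell
# Placeholder connection string: substitute your own username, password, host and port.
ES_URL="${ES_URL:-https://username:password@host:9200}"

# Print the URL of the verification endpoint for the repository named in $1.
verify_url() {
  printf '%s/_snapshot/%s/_verify?pretty\n' "$ES_URL" "$1"
}

# Ask the cluster to confirm that its nodes can access the repository.
verify_repository() {
  curl -sS -X POST "$(verify_url "$1")"
}
```

Running `verify_repository s3_repository` should return a list of nodes if the bucket and credentials are usable, or an error describing what went wrong.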
Backing up your indexes
To back up your indexes, use Elasticsearch's Snapshot Lifecycle Management to automate daily snapshots. In Kibana, you'll find these settings under Elasticsearch Management > Snapshot and Restore.
Snapshots are incremental, so you can set the schedule as frequently as you like, but at least daily is recommended.
You can find the full documentation for creating a policy here: https://www.elastic.co/guide/en/kibana/7.x/snapshot-repositories.html#kib-snapshot-policy
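If you prefer the API to the Kibana UI, a snapshot policy can also be created through the Snapshot Lifecycle Management endpoint. This is a minimal sketch rather than a recommended configuration: the policy ID "daily-snapshots", the snapshot name pattern, the 01:30 UTC schedule and the ES_URL placeholder are all assumptions made for illustration.

```shell
# Placeholder connection string: substitute your own username, password, host and port.
ES_URL="${ES_URL:-https://username:password@host:9200}"

# Print the JSON body of a policy that snapshots the logstash-* indexes
# to the "s3_repository" repository every day at 01:30 UTC.
slm_policy_body() {
  cat <<'EOF'
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "s3_repository",
  "config": {
    "indices": ["logstash-*"]
  }
}
EOF
}

# Create or update the policy. The policy ID "daily-snapshots" is made up.
put_slm_policy() {
  slm_policy_body | curl -sS -X PUT "$ES_URL/_slm/policy/daily-snapshots?pretty" \
    -H 'Content-Type: application/json' -d @-
}
```

Calling `put_slm_policy` sends the request; the policy should then appear in Kibana under Snapshot and Restore.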
Limiting the live retention
Now that you have a Snapshot Lifecycle policy configured to back up your data to S3, the final step is to ensure you are deleting indexes from Elasticsearch after a certain time period. Deleting indexes will ensure both RAM and disk space requirements are relatively fixed, given a fixed volume of logs. For example, you may keep only 30 days of indexes in Elasticsearch, and if you need older indexes, you can retrieve them by restoring a snapshot from S3.
From Elasticsearch Management > Index Lifecycle Policies, create a new policy. Under "Hot phase" disable rollover - we're already creating a new index daily, and that should be sufficient. Enable the "Delete phase", and set it for 30 days from index creation (or to your desired live retention).
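The same policy can be sketched against the Index Lifecycle Management API. This example assumes the policy is named "rotation" (the name referenced by the index setting used in this guide) and that a delete phase alone is enough, since rollover is not used; the ES_URL placeholder and function names are hypothetical.

```shell
# Placeholder connection string: substitute your own username, password, host and port.
ES_URL="${ES_URL:-https://username:password@host:9200}"

# Print the JSON body of a policy that deletes an index 30 days after creation.
# Without rollover, min_age is measured from the index creation time.
ilm_policy_body() {
  cat <<'EOF'
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
EOF
}

# Create or update the lifecycle policy named "rotation".
put_ilm_policy() {
  ilm_policy_body | curl -sS -X PUT "$ES_URL/_ilm/policy/rotation?pretty" \
    -H 'Content-Type: application/json' -d @-
}
```

Change "30d" to your desired live retention before calling `put_ilm_policy`.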
Next, we'll need to tell Elasticsearch which new indexes this policy should automatically apply to. In Kibana, go to Elasticsearch Management > Index Management, and then click Index Templates. Create a new template using the index pattern logstash-*. You can leave all other settings at their defaults, apart from the index settings shown here, which attach the lifecycle policy (named "rotation" in this example). This will ensure all new daily indexes get the lifecycle policy applied to them.
{
"index.lifecycle.name": "rotation"
}
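For reference, an equivalent template could be created through the API. This sketch assumes the legacy `_template` endpoint, which is available throughout the 7.x series; the template name "logstash-rotation", the "order" value and the ES_URL placeholder are illustrative choices, not values taken from this guide.

```shell
# Placeholder connection string: substitute your own username, password, host and port.
ES_URL="${ES_URL:-https://username:password@host:9200}"

# Print the JSON body of a template that attaches the "rotation" lifecycle
# policy to every new index matching logstash-*.
template_body() {
  cat <<'EOF'
{
  "index_patterns": ["logstash-*"],
  "order": 1,
  "settings": {
    "index.lifecycle.name": "rotation"
  }
}
EOF
}

# Create or update the template. The name "logstash-rotation" is made up.
put_template() {
  template_body | curl -sS -X PUT "$ES_URL/_template/logstash-rotation?pretty" \
    -H 'Content-Type: application/json' -d @-
}
```

The template only affects indexes created after `put_template` has run; existing indexes need the policy applied separately.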
Finally, you'll need to apply the lifecycle policy to any existing indexes. Under Elasticsearch Management > Index Management, select each logstash-* index one by one, click Manage, and then Apply Lifecycle Policy. Choose the policy you created earlier. If you want to apply the policy in bulk, you'll need to use the update settings API directly.
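A bulk update through the update settings API might look like the following sketch. It assumes the policy is named "rotation" and that every existing index matching logstash-* should receive it; the ES_URL placeholder and function names are hypothetical.

```shell
# Placeholder connection string: substitute your own username, password, host and port.
ES_URL="${ES_URL:-https://username:password@host:9200}"

# Print the JSON body that attaches the "rotation" lifecycle policy to an index.
apply_policy_body() {
  cat <<'EOF'
{
  "index.lifecycle.name": "rotation"
}
EOF
}

# Apply the setting to every existing index matching logstash-*.
apply_policy_to_existing() {
  apply_policy_body | curl -sS -X PUT "$ES_URL/logstash-*/_settings?pretty" \
    -H 'Content-Type: application/json' -d @-
}
```

Bear in mind that once `apply_policy_to_existing` has run, indexes already older than the delete phase's age become eligible for deletion, so confirm your snapshots exist first.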
Snapshot Lifecycle Management as an alternative to Deploy Backups
Aptible Database Backups allow for the easy restoration of a backup to an Aptible Database using a single CLI command. However, the data retained with Snapshot Lifecycle Management is sufficient to restore the Elasticsearch Database in the event of corruption, and can be configured to take much more frequent backups.
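Restoring from a snapshot is done through the snapshot API rather than the Aptible CLI. The sketch below assumes the repository is named "s3_repository" as in this guide; the ES_URL placeholder, the function names and the index name used in the usage note are hypothetical.

```shell
# Placeholder connection string: substitute your own username, password, host and port.
ES_URL="${ES_URL:-https://username:password@host:9200}"

# List the snapshots stored in the repository.
list_snapshots() {
  curl -sS "$ES_URL/_snapshot/s3_repository/_all?pretty"
}

# Print the JSON body that limits a restore to the index named in $1.
restore_body() {
  printf '{"indices": "%s"}\n' "$1"
}

# Restore one index ($2) from one snapshot ($1).
restore_index() {
  restore_body "$2" | curl -sS -X POST "$ES_URL/_snapshot/s3_repository/$1/_restore?pretty" \
    -H 'Content-Type: application/json' -d @-
}
```

For example, after picking a snapshot name from the output of `list_snapshots`, `restore_index <snapshot-name> logstash-2020.01.01` would request that one index. An open index with the same name must not already exist in the cluster, or the restore will be rejected.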