Missing logs in Elasticsearch logs at midnight

Case: of the Missing logs

I was debugging a curious case of my Elasticsearch instance on my vagrant dev box going to RED state every night at 00:00:00.  Consistently as far back as I can remember.

Right the obvious thing to do is look at the logs right? Except for this set of rotated logs there are no lines between 23:40hrs to 00:00:05.  Not in the current un-rotated log or the previous set.

At First Pass:

  1. Elasticsearch rotates its own log.  Could it be this process causing the missing Elasticsearch log lines?
  2. Marvel Creates new daily indices at 00:00:00.  Could it be this causing the missing Elasticsearch log lines?

What was the real was causing the missing logs

Well By default Elasticsearch uses log4j.  However, instead of the standard log4j.property file you get with log4j Elasticsearch is using a translated format to YAML format excluding all of the log4j pre-fix giveaways.  Another closer look at the configuration lead to the curious investigation of the type of rolling appender ; DailyRollingFile. This lead to this revelation :

DailyRollingFileAppender extends FileAppender so that the underlying file is rolled over at a user chosen frequency. DailyRollingFileAppender has been observed to exhibit synchronization issues and data loss. The log4j extras companion includes alternatives which should be considered for new deployments and which are discussed in the documentation for org.apache.log4j.rolling.RollingFileAppender.

Source :  Apache’s DailyRollingFileAppender Documentation

Missing Elastic logs Root Cause:

The sync issue with the DailyRollingFileAppender must be the cause to the missing Elasticsearch log lines around midnight.

Missing Elastic logs fix:

Use a log4j alternatives to DailyRollingFileAppender.  In this case a RollingFileAppender, changing my rolling strategy to roll my logs when they reach a certain file size. Replace DailyRollingFileAppender with RollingFileAppender and removing the  datePattern which was for the DailyRollingFileAppender.

Example:

file:
    type: rollingFile
    file: ${path.logs}/${cluster.name}.log
    maxFileSize: 10000000
    maxBackupIndex: 10
    layout:
        type: pattern
        conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"

Note: YAML is particular about tabs!

Happy Ending

Marvel turns out to be the cause of the my Elasticsearch cluster going into RED state at mid-night on new .Marvel*** Index creation.  Which makes sense as there will be a few milliseconds-seconds when this new index will have been created with shards, replicas etc missing.

Lean Maven Release

The Lean Maven Release (AKA Maven Release on Steroids)
Simply put to get rid of Maven Release Plugin’s repetitive and time wasting inefficient builds and multiple checkins to SCM script this process to:
mvn clean
mvn versions:set
mvn deploy
mvn scm:tag
This can be setup in both Jenkins and Team-city.  I have been able configured this within a few minutes replacing my teams Maven Release Plugin.  This is a huge time safer and more people really need to do this especially within true Continuous Delivery team or any type of setting with frequent need for builds.

Benefits

 So how big exactly was the improvement of Releases On Steroids over the Release Plugin?

See for yourself

Releases on Steroids Maven Release Plug-in
Clean/Compile/Test cycle 1 3
POM transformations 0 2
Commits 0 2
SCM revisons
1 (binary source tag)
3

See more at Axel Fontaine’s blog where I first came across this piece of treasure after manager tipped me about it.

 

A Typical Fix

Typically The following is all I ever need to add on any project to getting maven on steriods pattern working

Add the Properties

...
<maven.compiler.plugin.version>3.1</maven.compiler.plugin.version>
<maven.release.plugin.version>2.5</maven.release.plugin.version>
<maven.source.plugin.version>2.2.1</maven.source.plugin.version>
<maven.javadoc.plugin.version>2.9.1</maven.javadoc.plugin.version>
<maven.gpg.plugin.version>1.5</maven.gpg.plugin.version>
...

Deployment path settings

... Local deployment
<distributionManagement>
    <repository>
        <id>internal.repo</id>
        <name>Internal repo</name>
        <url>file:///${user.home}/.m2/repository/internal.local</url>
    </repository>
</distributionManagement>
...
... or Remote deployment
<distributionManagement>
    <repository>
      <uniqueVersion>false</uniqueVersion>
      <id>corp1</id>
      <name>Corporate Repository</name>
      <url>scp://repo/maven2</url>
      <layout>default</layout>
    </repository>
</distributionManagement>

Apache maven plugins

<pluinManagement>
<plugins>
...
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>${maven.compiler.plugin.version}</version>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-release-plugin</artifactId>
    <version>${maven.release.plugin.version}</version>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-source-plugin</artifactId>
    <version>${maven.source.plugin.version}</version>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-javadoc-plugin</artifactId>
    <version>${maven.javadoc.plugin.version}</version>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-gpg-plugin</artifactId>
    <version>${maven.gpg.plugin.version}</version>
</plugin>
...
<plugins>
...
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-release-plugin</artifactId>

    <configuration>
        <useReleaseProfile>false</useReleaseProfile>
        <releaseProfiles>release</releaseProfiles>
        <goals>deploy</goals>
    </configuration>
</plugin>
<plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <configuration>
        <outputDirectory>${project.build.directory}/releases/</outputDirectory>
        <descriptors>
            <descriptor>${basedir}/src/main/assemblies/plugin.xml</descriptor>
        </descriptors>
    </configuration>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>3.3</version>
    <configuration>
        <source>${java.version}</source>
        <target>${java.version}</target>
    </configuration>
</plugin>
<plugin>
    <artifactId>maven-clean-plugin</artifactId>
    <version>2.6.1</version>
    <configuration>
        <filesets>
            <fileset>
                <directory>overlays</directory>
                <includes>
                    <include>**/*</include>
                </includes>
                <followSymlinks>false</followSymlinks>
            </fileset>
        </filesets>
    </configuration>

</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-deploy-plugin</artifactId>
    <version>2.8.2</version>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-source-plugin</artifactId>
    <version>2.4</version>
</plugin>
...

The Release Profile

The Maven Release profile will be infered to for the maven deploy option

<profiles>
...
    <profile>
        <id>release</id>
        <properties>
            <activatedProperties>release</activatedProperties>
        </properties>
        <build>
            <pluginManagement>
                <plugins>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-compiler-plugin</artifactId>
                        <version>${maven.compiler.plugin.version}</version>
                    </plugin>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-release-plugin</artifactId>
                        <version>${maven.release.plugin.version}</version>
                    </plugin>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-source-plugin</artifactId>
                        <version>${maven.source.plugin.version}</version>
                    </plugin>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-javadoc-plugin</artifactId>
                        <version>${maven.javadoc.plugin.version}</version>
                    </plugin>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-gpg-plugin</artifactId>
                        <version>${maven.gpg.plugin.version}</version>
                    </plugin>
                </plugins>
            </pluginManagement>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-source-plugin</artifactId>
                    <executions>
                        <execution>
                            <id>attach-sources</id>
                            <goals>
                                <goal>jar</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-javadoc-plugin</artifactId>
                    <executions>
                        <execution>
                            <id>attach-javadocs</id>
                            <goals>
                                <goal>jar</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-gpg-plugin</artifactId>
                    <executions>
                        <execution>
                            <id>sign-artifacts</id>
                            <phase>verify</phase>
                            <goals>
                                <goal>sign</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </profile>
</profiles>
...


Assemly description

And the plugin assemly description xml.  Location: src/main/assemblies/plugin.xml

<?xml version="1.0" encoding="UTF-8"?>
<assembly>
    <id>plugin</id>
    <formats>
        <format>zip</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <dependencySets>
        <dependencySet>
            <outputDirectory>/</outputDirectory>
            <useProjectArtifact>true</useProjectArtifact>
            <useTransitiveFiltering>true</useTransitiveFiltering>
            <excludes>
            </excludes>
        </dependencySet>
    </dependencySets>
</assembly>

Note if you intend to sign the package with the GPG plugin you’ll need to further configure this for your environment. I will write a seperate blog for this later. You may skip it with the bellow:

mvn deploy -Prelease -Dgpg.skip=true

Thats It!

Intelli j Tweaks

Hot deploy/swap to Servlet Server

  1. File –> Settings –> Debugger –> HotSwap
    1. enable class reload classed on the background : true
    2. enable class reload classed after compilations : always
  2. Run/Debug Configurations 
    1. select “Update resources” in the drop-down “On frame deactivation”.

More here

 

Git Alias Configuration Example

Edit your .gitconfig file in your $HOME directory for some serious time saving git shortcuts

[core]
 excludesfile = /Users/moses.mansaray/.gitignore_global
 autocrlf = input

[user]
 name = moses.mansaray
 email = moses.mansaray@domain.com

[push]
 default = simple

[alias]
 co = checkout
 cob = checkout -b
 cod = checkout develop
 ci = commit
 st = status

 save = !git add -A &amp;&amp; git commit -m
 br = branch

 rhhard-1 = reset --hard HEAD~1
 rhhard-o = reset head --hard

 hist = log --pretty=format:\"%h %ad | %s%d [%an]\" --graph --date=short

 type = cat-file -t
 dump = cat-file -p

 llf = log --pretty=format:"%C(yellow)%h%Cred%d\\ %Creset%s%Cblue\\ [%cn]" --decorate --numstat
 lld = log --pretty=format:"%C(yellow)%h\\ %ad%Cred%d\\ %Creset%s%Cblue\\ [%cn]" --decorate --date=short

 amend = commit -a --amend

ElasticSearch Curator Short Guide

Elasticsearch Curate Features

Curate, or manage your Elasticsearch indices
  • Alias Management – add, remove
  • Shard routing allocation
  • Indices Management – Close , Delete indices, Open closed indices, Optimize indices and modify number of replicas
  • Snapshot(backups management) – Show, Backup, Restore.
  • Change the number of replicas per shard for indices
  • Pattern Matching for statements (e.g delete all indices matching .marvel*)

 

Example Elasticsearch Curate Use Cases

Automatically/manually

  • Snapshots Index creation
    • All indices except Marvels regex support
  • Restore from snapshots (using the Elasticsearch end point via curl ….. Not with curator currently)
    • All indices in the back up
    • Specific index from the backup
    • Keep cluster state as is
  • Delete indices
    • Older than a date range
    • By a regex matching pattern

Background

  • Started off as clearESindices.py for a simple goal to delete indices.
  • It then became logstash_index_cleaner.py
  • Then moved under logstash repository as “expire_logs”.
  • After Jordan Sissel was hired by Elastic it then became Elasticsearch Curator and is now hosted at https://github.com/elastic/curator
  • Today : Curator now performs many operations on your Elasticsearch indices, from delete to snapshot to shard allocation routing.

 

Curator all-in Command Line

  • Curator 2.0 underlying feature was the first attempt to detach the popular Elasticsearch API from the CLI.
  • These allows scripting to carry out any of the tasks featured above meaning you can for instance:
    • Automate your back up and restore procedures.
  • Also with version 2.0, Curator ships in an API along side the wrapper scripts/ entry points. This API allows you to roll out your own scripts to perform similar or totally different tasks to Curator with the same underlying code that curator uses…
  • Documentation can be found here

 

Curator installation for Mac


# you may ignore the python installations if you already have this
wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py

wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py

wget https://pypi.python.org/packages/source/u/urllib3/urllib3-1.8.3.tar.gz
pip install urllib3-1.8.3.tar.gz

wget https://pypi.python.org/packages/source/c/click/click-3.3.tar.gz -O click-3.3.tar.gz
sudo pip install click-3.3.tar.gz

# install elasticsearch-py
wget https://github.com/elastic/elasticsearch-py/archive/1.6.0.tar.gz -O elasticsearch-py.tar.gz
sudo pip install elasticsearch-py.tar.gz

# install elasticsearch-curator
wget https://github.com/elastic/curator/archive/v3.3.0.tar.gz -O elasticsearch-curator.tar.gz
sudo pip install elasticsearch-curator.tar.gz

# To test : Verify version
curator --version
# should echo: "curator, version 3.3.0"

# To example Commands try the help menu
curator --help
# should echo : help menu options

For more installation guides please see the official Elasticsearch Guides

 

Curator Command Line Flags

 

Index and Snapshot Selection

–newer-than

–older-than

–prefix

–suffix

–time-unit

–timestring

–regex

–exclude

Index selection only

–index

–all-indices

Snapshot selection only

–snapshot

–all-snapshots

–repository

 

Elasticsearch Cheat Sheet and Short Examples

Quick short, Elasticsearch cheat API End-Point calls that takes a while to remember.  If I have missed your favourite or want to make a recommendation to add in please do leave comment

States


# Show all indices
GET /_cat/indices?v-
  
# cluster health state
GET /_cluster/health
  
# Show all nodes
GET /_cat/nodes?

# Show largest Index. Leverages the _CAT api
curl 'localhost:9200/_cat/indices?bytes=b' | sort -rnk8 | grep -v marvel,kibana

Index

Indexing


# Bulk Indexing Example
POST /factory/_bulk
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"swift","make":"suzuki", "mark":1, "release_year":"1998-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"swift","make":"suzuki", "mark":2, "release_year":"2003-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"baleno","make":"suzuki", "mark":1, "release_year":"2000-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"focus","make":"ford","mark":1, "release_year":"2001-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"focus","make":"ford","mark":2, "release_year":"2007-01-01"}
	  {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"rs","make":"ford","mark":2, "release_year":"2011-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"rav4","make":"toyota","mark":3, "release_year":"2009-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"mondeo","make":"ford","mark":2, "release_year":"2007-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
	 { "model":"st","make":"ford","mark":1, "release_year":"2007-01-01"}
	 {"index":{"_index":"factory", "_type":"cars"}}
 { "model":"5 series","make":"bmw","mark":3, "release_year":"2009-01-01"}

 

Index Management


PUT /my_index/_settings
 
{
  "index": {
    "number_of_replicas": 4
  }
}
 
# Add a single alias
PUT /lmg_sem_v4/_alias/lmg
 
 
# Move Shard to another node
POST /_cluster/reroute
{
    "commands" : [ {
        "move" :
            {
              "index" : "amg_sem_v12", "shard" : 0,
              "from_node" : "UK-SEARCH-STG-02", "to_node" : "UK-SEARCH-STG-01"
            }
        }
    ]
}


Index Cloning

From ElasticSearch 2.3 you you may now use the built in _reindex API to re-index data


POST /_reindex
{
  "source": {
    "index": "my-index"
  },
  "dest": {
    "index": "my-new-index"
  }
}

 

Cloning with a filter/query


POST /_reindex
{
  "source": {
    "index": "my-index",
    "query": {
      "term": {
        "has-index-cloning-with-filter-on": true
      }
    }
  },
  "dest": {
    "index": "my-new-index"
  }
}

 


# Show cluster-wide Recovery state
GET /_recovery?pretty&amp;amp;amp;amp;amp;amp;amp;amp;human
GET /_recovery?pretty&amp;amp;amp;amp;amp;amp;amp;amp;human&amp;amp;amp;amp;amp;amp;amp;amp;active_only=true
 
# show tabular cluster-wide status summary
GET /_cat/recovery?v
 
# Show me all snapshots
GET /_snapshot/_all
 
# Show settings details of snapshot repo "my_backup"
GET /_snapshot/my_backup
 
# Show all snapshot details of repo "my_backup"
GET /_snapshot/my_backup/_all
 
# Delete snapshot "snapshot_2015_09_07-13_50_48" from repo "prod-0009"
DELETE /_snapshot/prod-0009/snapshot_2015_09_07-13_50_48/
 
# Register Repo + no need to verify permission on path location
PUT /_snapshot/prod-0009?verify=false
{
   "type": "fs",
   "settings": {
      "location": "/vagrant/prod-0009",
      "compress": true,
      "max_snapshot_bytes_per_sec": "200000000",
      "max_restore_bytes_per_sec": " 500mb"
      }
}
 
# Take Snapshot of just "cmg_sem_v6" index
PUT /_snapshot/one-off-repo?wait_for_completion=true
{
  "indices": "cmg_sem_v6",
  "ignore_unavailable": "true",
  "include_global_state": false
}
 
 
# Restore Snapshots of all index + global state
POST /_snapshot/prod-0009/snapshot_2015_09_11-10_23_29/_restore
 
# Restore Snapshots of only "log_river" index
POST /_snapshot/prod-0009/snapshot_2015_09_11-10_23_29/_restore
{
  "indices": "log_river",
  "rename_pattern": "index_pattern",  
  "rename_replacement": "restored_pattern" 
  "ignore_unavailable": "true",
  "include_global_state": false
}
 
# Speed up Recovery Speed
PUT /_cluster/settings
{
   "persistent": {
      "cluster.routing.allocation.node_concurrent_recoveries": "5",
      "indices.recovery.max_bytes_per_sec": "200mb",
      "indices.recovery.concurrent_streams": 5
   }
}
 

Having trouble With .Marvel* index creation?


# You can view the current settings template with :
curl -XGET localhost:9200/_template/marvel
 
# Modify settings with:
PUT /_template/marvel_custom
{
    "order" : 1,
    "template" : ".marvel*",
    "settings" : {
        "number_of_replicas" : 0,
        "number_of_shards" : 5
    }
}
 

 

More here

Move/Route shards to another elasticsearch node


POST /_cluster/reroute
{
    "commands" : [ {
        "move" :
            {
              "index" : "amg_sem_v12", "shard" : 0,
              "from_node" : "UK-SEARCH-STG-02", "to_node" : "UK-SEARCH-STG-01"
            }
        }
    ]
}

Top Elasticsearch plugins

These are my top must have Elasticsearch plugins, from monitoring clusters to moving indices and managing Elasticsearch snapshots.  If you are here you may already know of Elasticsearch’s marvel plugin, with combination with Sense you mostly have all you will need to manage Elasticsearch.  If like me Marvel and Sense are not enough for your workflow or just curious about what plugins by the Elasticsearch community can offer you please read on.

NOTE: Some of these plugins are not being maintained anymore since site plugins got removed from new version of Elasticsearch.  I will soon update this list with workaround/alternatives.  Meanwhile you may follow the links to the plugins landing pages of for the latest update/news/alternative.

Head

 

Elasticseach head cluster OverviewBy default the first thing I will always install.  Most times even before I install the official elastic search Marvel plugin suits.

  • Not quite polished for aesthetics but does what it needs to do very very good and that is quick snap via of the status of your cluster, nodes, shards etc.
  • Quick handy drop-down buttons to drop an index, alias or set and alias.
  • You can also easily write up your own queries and view them as tables or json.
  • Follow/contribute to the project on GitHub – elasticsearch-head

kopf

 

elasticsearch kopf cluster view

Another lovely tool aimed more at administering your elasticsearch cluster.  Very light weight and covers commonly peformed tasks, by no means comprehensive but its getting better and better.  Best thing this have over HQ is the REST client which you can leverage to explore most of what Elasticsearch API exposes.

Note this can also be run locally without being installed as a plugin albeit some browser limitations:

git clone git://github.com/lmenezes/elasticsearch-kopf.git
cd elasticsearch-kopf
git checkout <a given branch final version>
open _site/index.html

Note this can also be run locally without being installed as a plugin albeit some browser limitations:

 

WhatsOn

 

elasticsearch whatson

Aimed at elasticsearch cluster monitoring, visualisation and inspection with lots of stats. Can be used without installing at Whatson Git Hub Page.  Most Useful in large clusters.

 

ElasticSearch-Hq

 

HQ Elastic clusterhealth

For

  • Performance metrics reports
  • Monitoring
  • Charts
  • Open-source project.
  • Elastic search management tool.
  • Available as a Plug/Hosted/Self-Hosted war
  • Query functionality (limited, currently not customisable)
  • Follow/contribute to the project on GitHub – elasticsearch-HQ

For ElasticHQ users usage Stats  go here

 

Big Desk

 

bigdesk cpu

In short Big Desk pulls data from Elasticsearch via REST API calls in convert this to numerous charts of statistics about your elasticsearch cluster.

  • can be installed as a plugin
  • download and run locally or
  • run from the web
  • follow/contribute to the project on GitHub-bigdesk

tokenizers

 

 

 

 

 

 

  • Priceless tool for tuning results relevancy.  Exposes easy means to testing your Analysers, mappings and queries anatomy.
  • Fully dedicated tab for testing Tokenisers, Analysers and most importantly your Queries
  • This is a a tool I will highly recommend for learning as it quickly helps you learn how elasticsearch interpret, parses and match/not queries
  • Shame this plugin has not been updated in recent time but feel free to contribute at Git-hub elasticsearch-inquisitor