Missing lines in Elasticsearch logs at midnight

The Case of the Missing Logs

I was debugging a curious case of my Elasticsearch instance on my vagrant dev box going into a RED state every night at 00:00:00.  Consistently, as far back as I can remember.

Right, the obvious thing to do is to look at the logs, right? Except in this set of rotated logs there are no lines between 23:40 and 00:00:05.  Not in the current un-rotated log, nor in the previous set.

At First Pass:

  1. Elasticsearch rotates its own log.  Could this process be causing the missing Elasticsearch log lines?
  2. Marvel creates new daily indices at 00:00:00.  Could this be causing the missing Elasticsearch log lines?

What was really causing the missing logs?

Well, by default Elasticsearch uses log4j.  However, instead of the standard log4j.properties file you get with log4j, Elasticsearch uses a configuration translated into YAML, with all of the tell-tale log4j. prefix giveaways stripped out.  A closer look at the configuration led to a curious investigation of the type of rolling appender in use: DailyRollingFile. This led to this revelation:

DailyRollingFileAppender extends FileAppender so that the underlying file is rolled over at a user chosen frequency. DailyRollingFileAppender has been observed to exhibit synchronization issues and data loss. The log4j extras companion includes alternatives which should be considered for new deployments and which are discussed in the documentation for org.apache.log4j.rolling.RollingFileAppender.

Source: Apache's DailyRollingFileAppender documentation

Missing Elasticsearch logs: root cause

The synchronization issue with DailyRollingFileAppender must be the cause of the missing Elasticsearch log lines around midnight.

Missing Elasticsearch logs: the fix

Use one of the log4j alternatives to DailyRollingFileAppender.  In this case RollingFileAppender, changing my rolling strategy to roll logs when they reach a certain file size. Replace DailyRollingFileAppender with RollingFileAppender and remove the datePattern setting, which only applies to DailyRollingFileAppender.

Example:

file:
    type: rollingFile
    file: ${path.logs}/${cluster.name}.log
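    # roll once the file reaches ~10 MB (maxFileSize is in bytes) and keep 10 rolled files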
    maxFileSize: 10000000
    maxBackupIndex: 10
    layout:
        type: pattern
        conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"

Note: YAML is particular about indentation; use spaces, not tabs!
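For comparison, the appender block being replaced looked roughly like the stock logging.yml below (a sketch from memory; the exact defaults vary slightly between Elasticsearch versions):

file:
    type: dailyRollingFile
    file: ${path.logs}/${cluster.name}.log
    datePattern: "'.'yyyy-MM-dd"
    layout:
        type: pattern
        conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"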

Happy Ending

Marvel turns out to be the cause of my Elasticsearch cluster going into a RED state at midnight, on creation of the new daily .marvel* index.  Which makes sense, as there will be a brief window of milliseconds to seconds when this new index has been created but its shards and replicas are not yet allocated.
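If you want to see this happening on your own cluster, the standard health and cat APIs are enough to catch the brief unassigned-shard window around midnight (the .marvel* index pattern is assumed here; adjust it for your Marvel version):

# Cluster health broken down per index; the new Marvel index shows up briefly non-green
GET /_cluster/health?level=indices

# The daily Marvel indices and the allocation state of their shards
GET /_cat/indices/.marvel*?v
GET /_cat/shards/.marvel*?v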

Elasticsearch Cheat Sheet and Short Examples

Quick, short Elasticsearch API end-point calls that take a while to remember.  If I have missed your favourite, or you want to recommend an addition, please do leave a comment.

States


# Show all indices
GET /_cat/indices?v

# Cluster health state
GET /_cluster/health

# Show all nodes
GET /_cat/nodes?v

# Show largest index. Leverages the _cat API
curl 'localhost:9200/_cat/indices?bytes=b' | sort -rnk8 | grep -vE 'marvel|kibana'

Index

Indexing


# Bulk Indexing Example
POST /factory/_bulk
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"swift","make":"suzuki", "mark":1, "release_year":"1998-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"swift","make":"suzuki", "mark":2, "release_year":"2003-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"baleno","make":"suzuki", "mark":1, "release_year":"2000-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"focus","make":"ford","mark":1, "release_year":"2001-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"focus","make":"ford","mark":2, "release_year":"2007-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"rs","make":"ford","mark":2, "release_year":"2011-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"rav4","make":"toyota","mark":3, "release_year":"2009-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"mondeo","make":"ford","mark":2, "release_year":"2007-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"st","make":"ford","mark":1, "release_year":"2007-01-01"}
{"index":{"_index":"factory", "_type":"cars"}}
{ "model":"5 series","make":"bmw","mark":3, "release_year":"2009-01-01"}

 

Index Management


PUT /my_index/_settings
{
  "index": {
    "number_of_replicas": 4
  }
}
 
# Add a single alias
PUT /lmg_sem_v4/_alias/lmg
 
 
# Move Shard to another node
POST /_cluster/reroute
{
    "commands" : [ {
        "move" :
            {
              "index" : "amg_sem_v12", "shard" : 0,
              "from_node" : "UK-SEARCH-STG-02", "to_node" : "UK-SEARCH-STG-01"
            }
        }
    ]
}
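Before (or after) issuing the reroute above, the _cat/shards endpoint shows which node each shard of the index currently sits on, so you can confirm the move took effect (same index name as in the example):

# Where do the shards of amg_sem_v12 live right now?
GET /_cat/shards/amg_sem_v12?v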


Index Cloning

From Elasticsearch 2.3 you may now use the built-in _reindex API to re-index data:


POST /_reindex
{
  "source": {
    "index": "my-index"
  },
  "dest": {
    "index": "my-new-index"
  }
}

 

Cloning with a filter/query


POST /_reindex
{
  "source": {
    "index": "my-index",
    "query": {
      "term": {
        "has-index-cloning-with-filter-on": true
      }
    }
  },
  "dest": {
    "index": "my-new-index"
  }
}
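For large indices the same call can, if I recall the 2.3 docs correctly, be run asynchronously and monitored through the task management API (index names as in the examples above):

# Kick off the reindex without blocking the client
POST /_reindex?wait_for_completion=false
{
  "source": {
    "index": "my-index"
  },
  "dest": {
    "index": "my-new-index"
  }
}

# ...then poll the running reindex task(s)
GET /_tasks?detailed=true&actions=*reindex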

 


Snapshots & Recovery


# Show cluster-wide recovery state
GET /_recovery?pretty&human
GET /_recovery?pretty&human&active_only=true
 
# show tabular cluster-wide status summary
GET /_cat/recovery?v
 
# Show me all snapshots
GET /_snapshot/_all
 
# Show settings details of snapshot repo "my_backup"
GET /_snapshot/my_backup
 
# Show all snapshot details of repo "my_backup"
GET /_snapshot/my_backup/_all
 
# Delete snapshot "snapshot_2015_09_07-13_50_48" from repo "prod-0009"
DELETE /_snapshot/prod-0009/snapshot_2015_09_07-13_50_48/
 
# Register Repo + no need to verify permission on path location
PUT /_snapshot/prod-0009?verify=false
{
   "type": "fs",
   "settings": {
      "location": "/vagrant/prod-0009",
      "compress": true,
      "max_snapshot_bytes_per_sec": "200000000",
      "max_restore_bytes_per_sec": " 500mb"
      }
}
 
# Take a snapshot (named snapshot_1 here) of just the "cmg_sem_v6" index
PUT /_snapshot/one-off-repo/snapshot_1?wait_for_completion=true
{
  "indices": "cmg_sem_v6",
  "ignore_unavailable": "true",
  "include_global_state": false
}
 
 
# Restore all indices from a snapshot (cluster/global state is not restored by default)
POST /_snapshot/prod-0009/snapshot_2015_09_11-10_23_29/_restore
 
# Restore only the "log_river" index from the snapshot
POST /_snapshot/prod-0009/snapshot_2015_09_11-10_23_29/_restore
{
  "indices": "log_river",
  "rename_pattern": "index_pattern",
  "rename_replacement": "restored_pattern",
  "ignore_unavailable": "true",
  "include_global_state": false
}
 
# Speed up recovery
PUT /_cluster/settings
{
   "persistent": {
      "cluster.routing.allocation.node_concurrent_recoveries": "5",
      "indices.recovery.max_bytes_per_sec": "200mb",
      "indices.recovery.concurrent_streams": 5
   }
}
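To check what is currently applied (and later to back these overrides out), read the cluster settings back:

# Show persistent and transient cluster-level settings
GET /_cluster/settings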
 

Having trouble with .marvel* index creation?


# You can view the current settings template with :
curl -XGET localhost:9200/_template/marvel
 
# Modify settings with:
PUT /_template/marvel_custom
{
    "order" : 1,
    "template" : ".marvel*",
    "settings" : {
        "number_of_replicas" : 0,
        "number_of_shards" : 5
    }
}
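The override only applies to Marvel indices created after it is installed; you can confirm the custom template is registered with:

GET /_template/marvel_custom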
 

 


Notes from Elastic {on} Tour London 3rd November 2015

Whilst the Elastic team are preparing material and editing the Tour's videos, here are my brief MVP notes from the elastic{on} tour conference.

New features of ES 2.0 (lots…)

  1. Elasticsearch migration plugin
    1. A migration plugin to help detect any issues that may occur when upgrading to Elasticsearch 2.0. This can be installed and run before upgrading.
  2. Compatible with indices created in version 0.90 and above
  3. Faster fs to disk
  4. More use of the kernel for caching
  5. Better indexing
  6. Problem-free upgrading from version to version
    1. Inverted index plus the actual data for faster analysis
  7. New plug-ins and connectors are all targeting this version as a minimum
  8. Removed:
    1. Rivers
    2. Facets –> replaced by Aggregations
    3. Delete by query – now moved out to a plugin
    4. Shutdown API – removed

For more on these, see the docs.

 

Admin Features

  1. Reindexing of an index to:
    1. The same cluster
    2. A different cluster
    3. Modified destination index settings (shards, replicas, etc.)

Yes, Elastic are now competing with Becchi Niccolo's index cloner and my Spring Boot front-end app for it.  So if you are not moving onto Elastic v2.0 any time soon, you can make use of these to move indices around your cluster(s) and, yes, modify destination index settings too (a short sketch of the settings-change case follows).
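As a minimal sketch of that last point (index names are hypothetical): create the destination index with the settings you want, then copy the data across with _reindex as in the cheat sheet above.

# Destination gets its own shard/replica counts...
PUT /my-new-index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

# ...then reindex the data into it
POST /_reindex
{
  "source": {
    "index": "my-index"
  },
  "dest": {
    "index": "my-new-index"
  }
}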

ELK

Kibana has its own server and can be scaled as needed, with configuration in a centralised place too.

Monitoring

  1.   Official tool:
    1. Packetbeat <— acquired by Elastic
  2.   Alternatives that I have played with – HQ, HEAD
  3.   There are multiple Beats
    1. Filebeat | Topbeat | Packetbeat
      1. Check out the demo by the creators
    2. Topbeat analyses "top"-style stats and pushes them to Elasticsearch – "top" as in the Unix command
      1.   Can then be visualised in Kibana
      2.   Can be run from multiple OSs

Extensions:

Shield security features: access can be restricted by users/roles/groups to:

  1. Individual fields
  2. Individual docs
  3. Specific index/indices
  4. Type of queries? (I might have mis-heard this!)

Marvel 2

  •    Built on top of Kibana 4
  •    Easier to use
  •    Streamlined metrics

Use cases

Excelian:

A consulting company that used Elasticsearch as part of their solution to build a grid for a finance firm:
  1.  40,000 cores | scalable to 100,000 cores
  2.  2 regions, one load balancer, one cluster in each region, one master node in each region (this is the holistic view)
  3.  Secure | monitor-able | LDAP integration for login with SHIELD
  4.  They used Ansible (alternatives shared: Puppet/Chef) – an open-source project on packaging dependencies in one app
    1.  Use case: running in a banking environment with no internet access
    2.  Everything installed from a single server (vs. Puppet/Chef master and clients)

Pipeline Aggregation Talk:

Moving averages | date histogram | historical aggregations

  1. Supports multiple scripting languages, e.g. Lucene expressions, etc.
  2. Example data to play with (a rough query sketch follows this list):
    1.  NASA data sets of launches
    2.  London property sales data (London property prices)
      1. March 2012 has an anomaly that the presenter was not sure of.
      2. The underlying data for this point seemed okay.
      3. Quick googling: there was a new tax levied on properties around this date!
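A minimal sketch of a pipeline aggregation of the kind shown in the talk: a moving average over a date histogram of monthly average prices. The index and field names (property_sales, sale_date, price) are made up for the example; the moving_avg/buckets_path syntax is the Elasticsearch 2.x one.

POST /property_sales/_search?size=0
{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "sale_date",
        "interval": "month"
      },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        },
        "price_moving_avg": {
          "moving_avg": { "buckets_path": "avg_price" }
        }
      }
    }
  }
}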

Goldman Sachs Search:

(A consistent search experience in no time) – Reuben Tonna, Vice President
Requirements
  1.  Consistent user experience – different data, same tool
  2.  Zero UI dev effort
  3.  Self-service on-boarding
  4.  Operate on large data sets quickly – search and filtering
  5.  Enable devs to focus on modelling data
  6.  Facilitate adoption of new UI technologies
  7.  Support for various data-source technologies
Elastic benefits on top
  1.  Great performance – improves the UX
  2.  Scales to very large data sets
  3.  Aggregations provide a way to slice and dice the data
  4.  Quality documentation – lowered the entry barrier
  5.  Less development time
Configurations
  1.  Each dataset is configured for “entitlement” as part of the on-boarding process
  2.  All users have access to the same UI but see only the data they are allowed to see
  3.  Different datasets from multiple sources – Elasticsearch, OData and SQL via their respective adapters to the UI (GS Search UI and Services)
  4.  Lots of commonality with Kibana; could this have been done with Kibana?

Goldman Sachs Search:

(Building a firm-wide single task list) – Stephen Coster, Vice President
Requirements
  1.     Develop a web-based single task list manager for the whole firm (.NET GUI, but the back-end is all Linux at Goldman's)
  2.     Five million tickets
  3.     Distributed to 38,000 users around the globe
  4.     Latency in the region of 4-10 secs
  5.     Data to be sourced from multiple production instances
  6.     Live-updating site as users update the data sources
Architecture
  1.    In-memory Elastic index
  2.    Index a sequence stream of data
  3.    No need for the server to maintain any user-specific session state
  4.    Use an offset into the returned data set to enable infinite-scroll functionality (see the sketch after this list)
  5.    Server-side facet calculation
  6.    6 production cluster instances (“sequence sources”)
    1.    These are then aggregated to a single endpoint via a software load balancer
    2.    Camunda BPM – BPMN 2.0 engine
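A rough sketch of what points 4 and 5 of the architecture look like as a single Elasticsearch query: from/size gives the offset-based paging for infinite scroll, and a terms aggregation gives the server-side facet counts. The index and field names (tasks, assignee, status) are invented for illustration.

POST /tasks/_search
{
  "from": 200,
  "size": 50,
  "query": {
    "term": { "assignee": "some.user" }
  },
  "aggs": {
    "by_status": {
      "terms": { "field": "status" }
    }
  }
}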

Between Two Ferns – fireside chat

Richard Owens – Senior Systems Engineer at Huddle, interviewed by Marty Messer – VP Customer Care | Elastic

  1. Example use case was to query logs
  2. Monitor different events from tons of applications
  3. Indexing different docs – PDFs, logs, etc.
    1. Used their own service for extracting text from various documents before moving to Elasticsearch
    2. Metadata indexed in Elasticsearch
    3. Web UI hits –> files API, which then hits –> Elasticsearch and returns
  4. Historically coming from a monolith to a micro-services architecture
  5. Great use of the ELK stack – Marvel seems to help them in production
  6. Initially with self-implemented security, then moved to Shield
  7. Shield was the main driver for their enterprise subscription, as they wanted SSL/HTTPS protection, etc.

Questions for Shay Banon – Founder & CTO of Elastic

  ES 3 new features? Nothing official here yet, but:

  1.  Trunk currently has new changes happening
  2.  Consistency of data improvements
  3.  Multiple cluster replication
  4.  Ability to deploy a plugin that can coherently plug itself across the stack… not just for Kibana or Elasticsearch
 

Others:

  1.  For integrations, just stream your immutable data into Elastic from the source; there is no plan for a direct integration with any SQL DB such as SQL Server 20XX

Future of Elastic as a company?

  1.   Not static… looking forward to new problems… 300 clever, diverse people.
  2.   Lots of innovation driven by people, e.g. Mark with Graph on Elastic
  3.   Learn to expect the unexpected and embrace it, especially from clever people.
  4.   Built on top of open source – making huge investments, 300-400 commits to open-source projects (a week, a month?… I don't remember)
    1.   Approximately the top 8 developers for Lucene are employed by Elastic
    2.   The commercial aspect is added on top of the open source… but the open source is never stagnant.
  5.   Lots of excitement around Found and Kibana, and the notion of double-clicking and having all this good stuff available with great smart defaults

Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License. Elasticsearch is the second most popular enterprise search engine after Apache Solr. Wikipedia | Elastic Blog