Personal notes

Install and setup

Orchestrated with fabric

`fabfile.py`


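from fabric.api import run, sudo


def install_elastic_search():
    # Download and install the Elasticsearch 1.2.1 .deb package, then start the service.
    sudo("wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.2.1.deb")
    sudo("dpkg -i elasticsearch-1.2.1.deb")
    sudo("service elasticsearch start")
    # Single quotes keep $PATH literal in ~/.bashrc.
    run("echo 'export PATH=$PATH:/usr/share/elasticsearch/bin/' >> ~/.bashrc")


def es_init():
    # changing permission
    # see document
    pass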

Integration with td-agent

The log server that hosts elasticsearch receives logs from the various servers. For conditional branching, I recommend using the td-agent (fluentd) forest plugin or extracting information from the tags.

`td-agent.conf`



<source>
  type forward
  port 24224
</source>

<match **>
 type forest
 subtype elasticsearch
 <template>
   host localhost
   port 9200
   index_name ${tag_parts[2]}
   type_name ${tag_parts[1]}

   buffer_type memory
   flush_interval 3s
   retry_limit 17
   retry_wait 1.0
   num_threads 1
   flush_at_shutdown true
 </template>
</match>
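
To make the template above concrete, here is a minimal Python sketch of how the two placeholders pick the index and type out of a tag. It assumes the tag is split on dots and the parts are numbered from zero; the tag `web.info.fluentd` and the function name `es_target` are made up for illustration and are not part of td-agent.

```python
def es_target(tag):
    """Mimic ${tag_parts[N]}: split the tag on dots and pick parts by position."""
    tag_parts = tag.split(".")
    if len(tag_parts) < 3:
        raise ValueError("tag needs at least three dot-separated parts: %r" % tag)
    return {
        "index_name": tag_parts[2],  # ${tag_parts[2]}
        "type_name": tag_parts[1],   # ${tag_parts[1]}
    }


# Hypothetical tag: <sender>.<type>.<index>
print(es_target("web.info.fluentd"))
```

With a tag shaped like this, records would land in the `fluentd` index with type `info`, which is the same index and type the integration test script in these notes writes to.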

The flush_interval of 3s is a bit questionable. In production, flushing this often may affect performance.

By the way, if you set the td-agent interval to a bad value, data that should have been PUT to ElasticSearch will be slow to show up, so if the data seems slow to appear, try adjusting it.
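
The trade-off can be put into rough numbers. The sketch below is only a back-of-the-envelope model, not a measurement: it assumes one flush per interval while the buffer has data, and that a document becomes searchable at most one Elasticsearch refresh (1s by default) after it is flushed. Both function names are made up for illustration.

```python
def flushes_per_hour(flush_interval_sec):
    """How many flushes a single output thread performs per hour."""
    return 3600.0 / flush_interval_sec


def worst_case_delay_sec(flush_interval_sec, es_refresh_sec=1.0):
    """Rough upper bound on the time before a record becomes searchable."""
    return flush_interval_sec + es_refresh_sec


# Compare the 3s setting from the config with two longer intervals.
for interval in (3, 10, 60):
    print(interval, flushes_per_hour(interval), worst_case_delay_sec(interval))
```

A shorter interval means more requests to Elasticsearch, a longer one means a longer wait before the data shows up, so the right value depends on which of the two hurts more.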

Test

Existence test

`fabfile_test.py`



from fabric.api import env, run


def all():
  availability_test("td-agent")
  availability_test("elasticsearch")

def availability_test(name):
  if name == "td-agent":
    version_checker = name + " --version"
  elif name == "elasticsearch":
    version_checker = "export PATH=$PATH:/usr/share/elasticsearch/bin/ && " + name + " -v"
  else:
    raise ValueError("unknown package: " + name)

  # Do not abort when the command fails; inspect its output instead.
  env.warn_only = True
  try:
    if "command not found" in run(version_checker):
      print(name + " hasn't been installed")
    else:
      print(name + " has been installed")
  finally:
    env.warn_only = False

In hindsight, it would be better to use envassert, a testing tool for fabric, so I should refactor this.

envassert is easy to set up and is not as heavyweight as serverspec. To be fair, being able to write tests in rspec-like notation is very nice (I think mocha is good in JS for the same reason), but serverspec needs too much preparation, and I don't want to build out an environment just to test infrastructure. I tried it briefly and backed off. So although I'm a rubyist, I'm in the fabric camp, not the chef camp.

Log system integration test

It's messy. I plan to rewrite it in something like python.

`bash`



#! /bin/bash
###############################################
# function
###############################################

initializing () {
  # Fall back to 0 when the count could not be parsed as a number.
  if ! [[ $before =~ ^[0-9]+$ ]]; then
    before=0
  fi

  if [ -z "$num" ]; then
    num=0
  fi

  # Use a document ID above the current count.
  if [ "$num" -le "$before" ]; then
    num=$((before + 1))
  fi
}

buffering () {
  waiting=$*
  for i in `seq 1 $waiting`
  do
    left=$(($waiting - $i))
    echo $left sec
    sleep 1
  done
}

log_into_td_es () {
  fab -u <your_name> -i <your_pem> -H <your_domain> all
  buffering 1
}

diff_check () {
  diff=$(($after - $before))
  if [ $diff -eq 1 ]; then
    echo diff: $diff
  else
    echo not changed
    exit 1
  fi
}

# delete_all () {
#   curl -XDELETE "http://$*:9200/*" 1> /dev/null 2> /dev/null
# }




###############################################
# main
###############################################
host=$1
before=$(curl -XGET "http://$host:9200/fluentd/_count" 2> /dev/null | cut -d "," -f 1 | cut -d ":" -f 2)
echo before_count: $before
initializing
# refresh=true makes the new document visible to the count right away.
curl -XPUT "http://$host:9200/fluentd/info/$num?refresh=true" -d '{ "test" : "hoge" }' 1> /dev/null 2> /dev/null
after=$(curl -XGET "http://$host:9200/fluentd/_count" 2> /dev/null | cut -d "," -f 1 | cut -d ":" -f 2)
echo after_count: $after
diff_check
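
As for the python rewrite mentioned at the top of this section, a minimal sketch using only the standard library could look like the following. It is not the finished rewrite: the function names are made up, it assumes Python 3 and the same `fluentd` index and `info` type as the bash script, and it has only been reasoned through against the Elasticsearch 1.x REST API, not run against a live cluster.

```python
import json
import urllib.error
import urllib.request


def parse_count(body):
    """Return the "count" field of a _count response, or 0 if it cannot be read."""
    try:
        return int(json.loads(body)["count"])
    except (ValueError, KeyError, TypeError):
        return 0


def next_doc_id(before, num=None):
    """Same rule as initializing(): use an ID greater than the current count."""
    if num is None or num <= before:
        return before + 1
    return num


def count_increased_by_one(before, after):
    """Same rule as diff_check(): exactly one new document must be visible."""
    return after - before == 1


def http(method, url, data=None):
    """Send a request and return the response body as text, even on HTTP errors."""
    headers = {"Content-Type": "application/json"} if data is not None else {}
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    try:
        with urllib.request.urlopen(req) as res:
            return res.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.read().decode("utf-8")


def run_check(host):
    """PUT one test document and report whether the count went up by one."""
    base = "http://%s:9200/fluentd" % host
    before = parse_count(http("GET", base + "/_count"))
    doc_id = next_doc_id(before)
    # refresh=true asks Elasticsearch to make the document searchable right away.
    http("PUT", "%s/info/%d?refresh=true" % (base, doc_id),
         data=json.dumps({"test": "hoge"}).encode("utf-8"))
    after = parse_count(http("GET", base + "/_count"))
    return count_increased_by_one(before, after)
```

Calling `run_check("<your_domain>")` would play the role of the main part of the bash script. Parsing the response with `json` instead of `cut` is the main gain, since it no longer depends on the order of the fields in the response.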

Using the logs

You should use `elasticsearch-py`. I'm a rubyist, but python is often used for infrastructure work, so I go with what the team uses. I'm not particular about it.

You can also check with curl:

curl -X GET 'http://hoge.com:9200/_index/_type/_search?pretty=true&size=1000&sort=desc'
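
For the `elasticsearch-py` route, a rough sketch might look like this. It assumes an older client release that matches Elasticsearch 1.x, where `search()` still accepts `doc_type` and `body`; newer client releases changed that signature. The sort field `@timestamp` and both function names are assumptions for illustration, so replace them with whatever your documents actually contain.

```python
def build_search_body(size=1000, sort_field="@timestamp"):
    """Build a match-all query that returns the newest documents first."""
    return {
        "size": size,
        "sort": [{sort_field: {"order": "desc"}}],
        "query": {"match_all": {}},
    }


def fetch_logs(host, index, doc_type):
    """Fetch log documents through elasticsearch-py (third-party package)."""
    # Imported here so build_search_body() works without the client installed.
    from elasticsearch import Elasticsearch

    es = Elasticsearch([{"host": host, "port": 9200}])
    result = es.search(index=index, doc_type=doc_type, body=build_search_body())
    # Each hit carries the original log record under "_source".
    return [hit["_source"] for hit in result["hits"]["hits"]]
```

Something like `fetch_logs("hoge.com", "fluentd", "info")` would then return the stored log records as a list of dicts.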

Checking the state of ES

Grep by eye

curl -X GET 'http://hoge.com:9200/_stats?pretty'

Thousands of lines of statistics come out. I usually look at the number of documents (and the storage size). I haven't set up sharding or replication so far, so I'll do that later.
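
Instead of scanning the whole output by eye, the two numbers can be pulled out of the JSON. This is a small sketch that assumes the usual `_stats` layout with an `_all.primaries` section; the function name and the sample values are made up for illustration.

```python
def summarize_stats(stats):
    """Pick the document count and store size out of a parsed _stats response."""
    primaries = stats["_all"]["primaries"]
    return {
        "docs": primaries["docs"]["count"],
        "store_bytes": primaries["store"]["size_in_bytes"],
    }


# Made-up sample containing only the fields the function reads.
sample = {
    "_all": {
        "primaries": {
            "docs": {"count": 1234, "deleted": 0},
            "store": {"size_in_bytes": 567890},
        }
    }
}
print(summarize_stats(sample))
```

In practice the dict would come from `json.loads()` applied to the body returned by the `_stats` request.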

Liveness monitoring

elasticsearch-head
elasticsearch-paramedic
SPM for ElasticSearch

These tools fetch ES statistics on a regular basis and visualize them.

Something like that.

Tips for using ElasticSearch in a good way

fabfile.py