7rep --Insert Dataframe To Elasitcsearch

Code sample to put TSV (CSV) into Elasic search

from elasticsearch import Elasticsearch
from elasticsearch import helpers
import pandas as pd
import datetime
import time
import json
import random
from pandas.io.json import json_normalize

# Elasticsearch
es = Elasticsearch("{ES_IP}")
INDEX = "{ES_Index_Name}"

fname="{FileName}"
reader = pd.read_csv(fname, chunksize=1000, sep='\t',low_memory = False)
df_all = reader.get_chunk() #chunk to dataframe

# json
df_lines = df_all.to_json(force_ascii=False, orient='records', lines=True)

# Bulk inser
actions = []
for i in iter(df_lines.split("\n")):
    v_json = json.loads(i)
    actions.append({
        "_index": INDEX,
        "_type": "{ES_Type}",
        "_source": v_json
    })

helpers.bulk(es, actions)

"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||yyyyMMdd||epoch_millis" If you connect two pipes, you can specify other date formats, and even dates with different shapes can be picked up from Kibana.

PUT hoge
{
  "mappings": {
    "books": { 
      "properties": {
        "hoge1":     { "type": "integer"  },
        "hoge2":    { "type": "text"  }, 
        "hoge3":     { "type": "text"  },
        "hoge4":     { "type": "text"  },
        "hoge5":     { "type": "integer"  },
        "hoge6":     { "type": "text"  },         
        "hoge7":     { "type": "integer"  },
        "hoge8":     { "type": "text"  },
        "hoge9":     { "type": "text"  },
        "create_date":  {
          "type":   "date", 
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||yyyyMMdd||epoch_millis"
        }
      }
    }
  }
}

Recommended Posts

7rep --Insert Dataframe To Elasitcsearch
I want to INSERT a DataFrame into MSSQL
Export pandas dataframe to excel
Convert list to DataFrame with python
Bulk Insert Pandas DataFrame with psycopg2