Elasticsearch is an open source, distributed full-text search engine, and it is widely regarded as the most popular enterprise search engine. It is used by many notable companies, such as Facebook, GitHub and Quora. Kibana, in turn, lets us visualize and analyze the data that resides in Elasticsearch. Logstash handles data collection and log parsing, and we can use it to import data into Elasticsearch from various sources. Together, Elasticsearch, Kibana and Logstash make up the Elastic Stack, which lets us search, analyze and visualize both structured and unstructured data in real time. All three are free. I am planning to make a couple of video tutorials on using the Elastic Stack, and this is the first one.
In this part of the blog post series, we will analyze and visualize New York City 311 service requests. You can download the data from NYC OpenData. The dataset covers 2010 to the present, has more than 15 million records and is about 10 GB in size. The Logstash configuration is shown below. It is also available on GitHub.
This tutorial is split into four parts:
Using Kibana to Visualize New York City 311 Service Requests - A
Using Kibana to Visualize New York City 311 Service Requests - B
Using Kibana to Visualize New York City 311 Service Requests - C
Using Kibana to Visualize New York City 311 Service Requests - D
input {
  file {
    path => "C:/fish/elasticSearch_Kibana/tutorials/NYC311calls/NYC_311_Service_Requests_from_2010_to_Present.csv"
    # Read the file from the top instead of tailing it
    start_position => "beginning"
    # Do not persist the read position, so the file is re-read on every run.
    # The path above is a Windows path, so use "NUL" (on Linux/macOS use "/dev/null").
    sincedb_path => "NUL"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Unique Key", "Created Date", "Closed Date", "Agency", "Agency Name", "Complaint Type", "Descriptor", "Location Type", "Incident Zip",
                "Incident Address", "Street Name", "Cross Street 1", "Cross Street 2", "Intersection Street 1", "Intersection Street 2", "Address Type",
                "City", "Landmark", "Facility Type", "Status", "Due Date", "Resolution Description", "Resolution Action Updated Date", "Community Board",
                "Borough", "X Coordinate (State Plane)", "Y Coordinate (State Plane)", "Park Facility Name", "Park Borough", "School Name", "School Number",
                "School Region", "School Code", "School Phone Number", "School Address", "School City", "School State", "School Zip", "School Not Found",
                "School or Citywide Complaint", "Vehicle Type", "Taxi Company Borough", "Taxi Pick Up Location", "Bridge Highway Name", "Bridge Highway Direction",
                "Road Ramp", "Bridge Highway Segment", "Garage Lot Name", "Ferry Direction", "Ferry Terminal Name", "Latitude", "Longitude", "Location"]
  }
  date {
    locale => "en"
    # "hh" is the 12-hour clock, which is what the AM/PM marker ("aa") needs
    match => ["Created Date", "MM/dd/yyyy hh:mm:ss aa", "ISO8601"]
    target => "Date"
    remove_field => ["Created Date"]
  }
  # Convert the coordinates to floats, then nest them under "location"
  # so that the field can be mapped as a geo_point
  mutate { convert => { "Latitude" => "float" } }
  mutate { convert => { "Longitude" => "float" } }
  mutate { rename => { "Latitude" => "[location][lat]" } }
  mutate { rename => { "Longitude" => "[location][lon]" } }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "nyc311calls"
    document_type => "calls"
    user => "elastic"
    password => "changeme"
  }
  # stdout { codec => rubydebug { metadata => true } }
  stdout { codec => dots }
}
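To make the filter stage more concrete, here is a minimal Python sketch of what the date and mutate filters do to a single parsed row. It is only an illustration and not part of the pipeline: the `transform` function and the sample values are made up, and the sketch ignores the ISO8601 fallback, time zones and rows with empty coordinates.

```python
from datetime import datetime


def transform(row: dict) -> dict:
    """Mimic the date and mutate filters on one parsed CSV row (hypothetical helper)."""
    event = dict(row)

    # date filter: parse "Created Date" (12-hour clock with AM/PM),
    # store the result in "Date" and drop the source field.
    created = event.pop("Created Date")
    event["Date"] = datetime.strptime(created, "%m/%d/%Y %I:%M:%S %p")

    # mutate convert + rename: turn the coordinates into floats and
    # nest them under "location" as lat/lon.
    event["location"] = {
        "lat": float(event.pop("Latitude")),
        "lon": float(event.pop("Longitude")),
    }
    return event


# Illustrative row; the values are invented for this example.
sample = {
    "Unique Key": "1",
    "Created Date": "01/15/2016 02:30:00 PM",
    "Complaint Type": "Noise",
    "Latitude": "40.7128",
    "Longitude": "-74.0060",
}

event = transform(sample)
print(event["Date"].isoformat(), event["location"])
```

Note that the afternoon timestamp ends up at hour 14, which is why the 12-hour pattern matters when the data carries an AM/PM marker.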
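Before starting Logstash, create the following index template (for example, from the Kibana console) so that the location field built in the filter above is mapped as a geo_point:
PUT /_template/nyc311calls
{
  "order": 0,
  "template": "nyc311calls*",
  "mappings": {
    "_default_": {
      "properties": {
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}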
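Once location is mapped as a geo_point, Elasticsearch can answer geographic queries on it. As a rough sketch, the Python snippet below builds the body of a geo_distance filter. The helper name `nearby_calls_query`, the coordinates and the default distance are assumptions chosen for illustration. The snippet only prints the JSON and does not contact a cluster.

```python
import json


def nearby_calls_query(lat: float, lon: float, distance: str = "1km") -> dict:
    """Build a geo_distance filter on the geo_point field "location" (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


# Illustrative coordinates, roughly lower Manhattan.
body = nearby_calls_query(40.7128, -74.0060)
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of a search request against the nyc311calls index, for instance from the Kibana console.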