One-off Log Analysis with ELK

How to Use ELK to Solve Your One-off Log Analysis Problems

John Moran

June 23, 2016 - Posted by John Moran to Security Insight

Performing log analysis with divergent data sets can be the stuff nightmares are made of. If you are lucky, your organization may have only a few dozen different log types throughout your environment. If you perform log analysis as a service, forget about it. There are many fantastic log management solutions on the market today, including our own ActiveGuard service. These solutions have robust log collection, analysis, and search capabilities. For a comprehensive, enterprise log analysis solution they are ideal; however, they require substantial implementation and tuning for your specific environment and are intended for long-term log aggregation and monitoring.

It is not always feasible to stand up one of these solutions on short notice or for a one-off project.

So where does that leave you? Manual log normalization and analysis? Manual techniques do have their place, but as those who have been there can attest, they do not always scale well. Let me try to save you some of the pain and suffering I have endured and offer you an alternative solution: ELK.

ELK is actually a combination of three products from Elastic designed to work together: Elasticsearch, Logstash, and Kibana. Elasticsearch provides a flexible and scalable backend database, Logstash provides the log ingestion and normalization, and Kibana provides a customizable front end with all of the search and visualization features you need to perform in-depth analysis. The purpose here is not to walk you through installing and setting up ELK; that is what YouTube is for. Suffice it to say you can have a modest installation of ELK up and running in a virtual machine in less than an hour. Instead, the purpose here is to highlight ELK’s ability to provide a free one-off solution to your log analysis needs.

The real power of ELK’s flexibility in creating custom, one-off log analysis solutions comes from Logstash. Logstash uses configuration files to define how the data comes in, how the data goes out, and, most importantly, how the data is transformed. Data can come in through a variety of plugins, including plugins for syslog, Windows event logs, SQLite, network connections, and various text file formats. Similarly, data can go out through a multitude of plugins, although for our purposes we will be using the Elasticsearch plugin to forward our data to an Elasticsearch database.
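To make that concrete, here is a minimal configuration sketch that reads a plain text file and forwards each event to a local Elasticsearch instance. The file path, host, and index name are placeholders, not recommendations:

          input {
            file {
              path => "/path/to/access.log"    # hypothetical log file to import
              start_position => "beginning"    # read the file from the top
              sincedb_path => "/dev/null"      # do not remember read position, handy for one-off imports
            }
          }

          output {
            elasticsearch {
              hosts => ["localhost:9200"]      # assumes a local Elasticsearch instance
              index => "weblogs"               # hypothetical index name
            }
          }

Saved as something like import.conf, this would be run with bin/logstash -f import.conf.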

Most of the Logstash magic takes place during the data transformation using the various filter plugins. Filter plugins include aggregate, alter, csv, date, dns, geoip, json, kv, mutate, split, and urlencode. Information on these plugins and more can be found here, although the basic function of many of these plugins can be derived from their names.
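As a quick illustration of the syntax, the sketch below assumes a comma-separated log and a field holding an IP address (the column and field names are made up), then enriches and normalizes the data with a few of those plugins:

          filter {
            csv {
              columns => ["timestamp", "src_ip", "action"]   # hypothetical column names for a comma-separated log
            }
            geoip {
              source => "src_ip"                             # add geographic fields for the IP in 'src_ip'
            }
            mutate {
              lowercase => ["action"]                        # normalize the 'action' field to lowercase
            }
          }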

One of the most powerful filter plugins in the Logstash arsenal is the grok plugin. grok is a pattern-matching language, similar to regex, that can be used to pull meaningful data from otherwise unstructured or unusable log files. Logstash includes over 100 default grok patterns, such as EMAILADDRESS, IP, WINPATH, URI, and DATESTAMP_RFC2822 (a full list can be found on their GitHub page). If none of the default patterns meet your needs, you can define your own within Logstash. grok patterns are used in the format:

          %{SYNTAX:semantic}

For example, %{IP:ip_address} would extract a string matching the IP address grok pattern and place it into the ‘ip_address’ field.
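If the bundled patterns do not fit your data, one approach is to point grok at a directory of your own pattern files. The pattern name, regex, and directory below are invented purely for illustration:

          # contents of ./patterns/custom - one pattern per line: a name followed by its regex
          SESSIONID [A-F0-9]{16}

          # referenced from the Logstash configuration
          filter {
            grok {
              patterns_dir => ["./patterns"]                                # directory holding custom pattern files
              match => { "message" => "session %{SESSIONID:session_id}" }   # use the custom pattern like any built-in one
            }
          }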

Take the following data:

          2016-04-04T14:25:45.12+0500 179.2.61.56 GET /index.html 16321 0.021
          2016-04-04T14:25:45.36+0500 179.2.61.56 GET /styles.css 564 0.034
          2016-04-04T14:27:12.87+0500 49.55.3.145 GET /contact.html 14225 0.064
          2016-04-04T14:30:56.64+0500 62.41.123.54 GET /secret.html 56981 0.031
          2016-04-04T14:32:37.50+0500 9.38.148.125 GET /about.html 8945 0.044

You have a timestamp in ISO 8601 format, followed by an IP address, a single word, a URI, and finally two numbers, all separated by single spaces. Or, in grok format:

          %{TIMESTAMP_ISO8601:date_time} %{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}

Congratulations, you have just written your first grok pattern! Instead of grep’ing through gigabytes of web log data, you are just some import time away from instant searching and visualization of your logs. Feels good, doesn’t it?
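Putting the pieces together, a complete one-off import for the sample data above might look something like the sketch below. The file path and index name are assumptions, and the date and mutate filters are optional touches that make the timestamp and numeric fields behave properly in Kibana:

          input {
            file {
              path => "/var/log/sample/access.log"   # hypothetical location of the sample web log
              start_position => "beginning"
              sincedb_path => "/dev/null"            # re-read from the start on every run
            }
          }

          filter {
            grok {
              match => { "message" => "%{TIMESTAMP_ISO8601:date_time} %{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
            }
            date {
              match => ["date_time", "ISO8601"]      # use the log's own timestamp as the event timestamp
            }
            mutate {
              convert => {
                "bytes"    => "integer"              # store byte counts as numbers, not strings
                "duration" => "float"
              }
            }
          }

          output {
            elasticsearch {
              hosts => ["localhost:9200"]            # assumes a local Elasticsearch instance
              index => "weblogs"                     # hypothetical index name
            }
          }

Once the import finishes, point Kibana at that index and the searching and dashboards are all point-and-click.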

Of course, it can get far more complicated than this, but even with this simple example you should begin to see the power of ELK. For a slightly more complex example, see a sample configuration used to import Windows Security event logs exported from ActiveGuard on my GitHub page. If you plan on writing your own grok patterns, I highly recommend a grok debugger, such as the one found here, to save yourself some headaches.

Happy hunting!

References:

https://www.elastic.co/products/logstash

https://www.elastic.co/guide/en/logstash/current/filter-plugins.html

https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html

https://github.com/logstash-plugins/logstash-patterns-core/blob/939210be0635200ee44418f9af55de254a1ddeb3/patterns/grok-patterns

https://www.elastic.co/products/elasticsearch

https://www.elastic.co/products/kibana

https://github.com/jtmoran/logstash

https://grokdebug.herokuapp.com
