splogTASH - SPL to Lucene translator

Recently I took part in a migration from Splunk to ELK (Elasticsearch/Logstash/Kibana). One of the main roadblocks with this migration was the change in query syntax, since the new system uses a different query language.

Splunk uses a proprietary language called Search Processing Language, or SPL for short. Kibana and the ELK stack use the Apache Lucene query syntax. Neither language is difficult by itself, but converting queries from SPL to Lucene is not always a trivial task.
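
For example, the same search expressed in both languages might look like this (a made-up query; field names will vary by setup):

    SPL:    source="web" status=404
    Lucene: source:web AND status:404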

I wanted to find an easy way of converting Splunk queries over to Lucene, so I started working on a translator.

What is splogTASH?

splogTASH is a Sinatra application that translates queries based on the definitions provided. The main definitions are stored in a JSON-formatted file, /helpers/lucene.json:

{
    "lucene": {
        "/data/syslog/current/web/info.log": "web_info_log AND",
        "/data/syslog/current/web/access.log": "web_access_log AND",
        "/data/syslog/current/web/error.log": "web_error_log AND",
        "source=": "type:",
        "uri=": "request_uri:",
        "uri_path=": "request_uri:",
        "http_response=": "reponse:",
        "=": ":"
    },
    "pre_lucene": {
        "| where eventcount >": "min_doc_count:",
        "transaction ": " agg:terms field:",
        "stats count by ": " agg:stats field:"
    }
}

As you can see, there are two objects in the schema: lucene and pre_lucene. Since splogTASH takes the whole input query and tokenizes each matching term, I added pre_lucene to make conversions easy before tokenization. This comes in handy when dealing with complex queries, for example ones that perform aggregations.
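
Here is a minimal sketch of what such a pass looks like (an illustration under the definitions above, not the exact splogtash.rb code):

    require 'json'

    # Rewrite multi-word SPL constructs in the raw query string
    # before it gets split into tokens.
    defs  = JSON.parse(File.read('helpers/lucene.json'))
    query = 'transaction host | where eventcount > 5'
    defs['pre_lucene'].each { |spl, lucene| query = query.gsub(spl, lucene) }
    # query is now " agg:terms field:host min_doc_count: 5"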

Under the lucene object we see an example for a web info log:

"/data/syslog/current/web/info.log": "web_info_log AND"

Here we are saying that any time we match "/data/syslog/current/web/info.log", we want "web_info_log AND" in return. It's as simple as that. To add your own definitions, follow the pattern of "term": "definition".
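
For example, to map another hypothetical log source to its Logstash type, you would add a line like this under the lucene object (the path and type here are made up):

    "/data/syslog/current/db/slow.log": "db_slow_log AND",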

Lastly, I would like to point out the main splogtash.rb file (lines 39-48):

        # post processing
        # split the input on whitespace into individual tokens
        inputs.split.each do |query|
          queries << query
        end
        queries.each do |q|
          temp_var = q.dup
          # apply every matching lucene definition to the token
          lucene.each { |f, u| temp_var.gsub!(f, u) }
          case temp_var
          when /AND/
            # drop any leftover double quotes from a translated source
            temp_var = temp_var.delete('"')
          when /!/
            # turn an SPL exclusion into Lucene's leading "-"
            temp_var = temp_var.delete("!").insert(0, "-")
          when /\//
            # escape forward slashes so Lucene does not read them as a regex
            temp_var = temp_var.gsub("/", "\\/")
          end
          lucenyze << temp_var
        end
        rtg << lucenyze.join(' ')
        return rtg
      end

In the post-processing section there is a case statement which handles global translations. In this example I'm cleaning up the query by escaping "/" and replacing "!" with "-" (exclusions).
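
To see the whole pipeline in action, here is what a made-up query would produce with the definitions above:

    SPL input:     source=/data/syslog/current/web/error.log !debug
    Lucene output: type:web_error_log AND -debug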

After a query has been converted, we can click on the translated query to test it in Kibana. You might want to customize the URL to fit your environment: look at ../views/search.erb and replace "kibana.initech.com" with your own Kibana host.
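
The relevant link in the view might look roughly like this (a minimal sketch; the translated_query variable and the Kibana URL format are assumptions that depend on your setup and Kibana version):

    <!-- sketch only: adjust the host and query parameter for your Kibana -->
    <a href="http://kibana.initech.com/#/discover?query=<%= ERB::Util.url_encode(translated_query) %>">
      <%= translated_query %>
    </a>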


Try it yourself!

It's really simple to get started. The source can be found on my code page: http://dyurk.com/code/dyurk/splogTASH

$ git clone http://dyurk.com/code/dyurk/splogTASH.git
$ cd splogTASH/ && rackup -p 8080

Now go to http://localhost:8080

Patches welcome!!


Tagged under: splunk, logstash, kibana, lucene