How to parse CouchPotato logs using Logstash

This is a quick Logstash configuration share for parsing CouchPotato logs for display in the Kibana interface. This guide assumes that you already have an ELK stack installed; if you don't, check out my guide on how to get up and running with the ELK stack here.

Creating Pattern Variables

Pattern variables are named regular expressions that grok can reference. To define a pattern, all you need to do is create a text file and tell Logstash where to find your extra patterns (more on that later).

To parse my CouchPotato logs, I've decided to create a couple of patterns for use with the grok filter. These patterns are as follows:

# parses strings such as: 07-31 09:56:08
DATESTAMP_CP [0-9]{2}-[0-9]{2} %{TIME}
# parses strings such as: '\u001b\[0m' or '^[[0m' or '\e[0m'
METACHAR_CP ((\\u001b|\^\[|\e)\[\d+m)?
# assigns regular expression that matches Java classes to a new variable name.
FACILITY_CP %{JAVACLASS}

Now that you have the required patterns, you should add them to the /etc/logstash/patterns/extras file by running:

sudo editor /etc/logstash/patterns/extras

and pasting in the patterns mentioned above.
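If you want to sanity-check these patterns outside Logstash, here is a rough Python equivalent. Note the hand-expansions: %{TIME} is replaced with a simple HH:MM:SS expression, and \e (which Python's re module doesn't support) is replaced with \x1b.

```python
import re

# Rough stand-ins for the grok patterns defined above.
DATESTAMP_CP = r"[0-9]{2}-[0-9]{2} \d{2}:\d{2}:\d{2}"   # e.g. 07-31 09:56:08
METACHAR_CP = r"((\\u001b|\^\[|\x1b)\[\d+m)?"           # e.g. '^[[0m', optional

# Examples from the pattern comments:
print(bool(re.fullmatch(DATESTAMP_CP, "07-31 09:56:08")))  # True
print(bool(re.fullmatch(METACHAR_CP, "^[[0m")))            # True
```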

Logstash Configuration File

Now, let's start adding all the necessary pieces to get this working.

Input Block

First things first: we need to define an input so Logstash knows what to read. This is done by adding the following block:

input {
	file {
		path => "/path/to/your/CouchPotato.log"
		type => "couchpotato"
	}
}

What the block above does is the following:

  • Reads the file at the path defined by the path => parameter
  • Adds a type tag to each log line via the type => parameter

While that is adequate most of the time, sometimes you need to treat multiple lines as a single log entry (e.g. a stack trace).
Luckily for us, we can add the following block of code to make Logstash look for a pattern and, based on the result, either append the line to the previous log item or start a new one:

codec => multiline {
	patterns_dir => "/etc/logstash/patterns/"
	pattern => "^%{DATESTAMP_CP}"
	negate => true
	what => previous
}

To read more about the multiline codec, check out Logstash's documentation.
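To make the negate => true / what => previous combination concrete, here is a small Python sketch (a hypothetical helper, not part of Logstash) of the grouping logic: any line that does not start with our timestamp is appended to the previous event.

```python
import re

# A line matching this pattern starts a new event; with negate => true and
# what => previous, every non-matching line is glued onto the previous event.
EVENT_START = re.compile(r"^\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def group_events(lines):
    events = []
    for line in lines:
        if EVENT_START.match(line) or not events:
            events.append(line)           # starts a new log item
        else:
            events[-1] += "\n" + line     # continuation (e.g. stack trace body)
    return events

lines = [
    "07-31 09:56:08 ERROR [couchpotato.core] Something broke",
    "Traceback (most recent call last):",
    '  File "plugin.py", line 1, in run',
    "07-31 09:56:09 INFO [couchpotato.core] Recovered",
]
print(len(group_events(lines)))  # 2
```

The three traceback-free lines collapse into two events: the whole traceback rides along with the ERROR entry that produced it.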

Filter Block

Next, we need to tokenize our log item. This is done by the filter block like so:

filter {
	if [type] == "couchpotato" {
		grok {
			patterns_dir => "/etc/logstash/patterns/"
			match => [ "message", "(?m)%{DATESTAMP_CP:date}%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}%{METACHAR_CP}\[%{FACILITY_CP:facility}\]%{GREEDYDATA:msg}" ]
		}
 
		mutate {
			gsub => [ "msg", "\e\[\d+m", " " ]
			strip => [ "msg" ]
		}
	}
}

The block above is responsible for:

  • Tokenizing the log message
    • The leading (?m) tells the grok filter that we are dealing with a multi-line entry
    • Every occurrence of %{} references a grok pattern variable, i.e. a named regular expression
    • %{DATESTAMP_CP:date} means that we are matching the pattern DATESTAMP_CP and capturing its result under the name date
  • Replacing every substring that matches the regular expression \e\[\d+m with a space (" ")
  • Removing all leading and trailing whitespace from msg

So, the parsed result for a log line: 07-31 19:01:32 INFO ^[[0m[hpotato.core.plugins.base] Opening url: get https://www.binsearch.info/index.php?q=tt12345678&minsize=5000&maxsize=20000&adv_nfo=on&adv_age=1900&max=400&m=n&adv_sort=date&adv_col=on, data: []^[[0m would become:

{
  "date":"07-31 19:01:32",
  "logLevel":"INFO",
  "facility":"hpotato.core.plugins.base",
  "msg":"Opening url: get https://www.binsearch.info/index.php?q=tt12345678&minsize=5000&maxsize=20000&adv_nfo=on&adv_age=1900&max=400&m=n&adv_sort=date&adv_col=on, data: []"
}
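The tokenization above can be approximated with a plain Python regular expression. The named groups below are hand-expanded stand-ins for the %{...} captures (JAVACLASS is simplified to dotted identifiers, \e becomes \x1b, and the URL is shortened), so treat this as a sketch of the behaviour rather than grok's exact semantics.

```python
import re

# Hand-expanded stand-in for the grok expression used in the filter block.
GROK = re.compile(
    r"(?P<date>\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<logLevel>[A-Z]+)\s+"
    r"((\\u001b|\^\[|\x1b)\[\d+m)?"          # METACHAR_CP
    r"\[(?P<facility>[A-Za-z0-9_.$]+)\]"     # simplified JAVACLASS
    r"(?P<msg>.*)",
    re.DOTALL,  # grok's (?m) lets . span newlines; DOTALL is the Python analogue
)

line = ("07-31 19:01:32 INFO ^[[0m[hpotato.core.plugins.base] Opening url: "
        "get https://www.binsearch.info/index.php?q=tt12345678, data: []^[[0m")
parsed = GROK.match(line).groupdict()
# Mimic the mutate block: gsub the escape sequences away, then strip.
parsed["msg"] = re.sub(r"(\\u001b|\^\[|\x1b)\[\d+m", " ", parsed["msg"]).strip()
print(parsed["date"], parsed["logLevel"], parsed["facility"])
```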

Output Block

Finally, we need to send this parsed log entry to our Elasticsearch instance. This is done with an output block like so:

output {
	elasticsearch {
		bind_host => "127.0.0.1"
		cluster => "elasticsearch"
		host => "127.0.0.1"
	}
}

Putting it all together

Here is the complete configuration file that puts all the building blocks together.

input {
	file {
		path => "/path/to/your/CouchPotato.log"
		type => "couchpotato"
		codec => multiline {
			patterns_dir => "/etc/logstash/patterns/"
			pattern => "^%{DATESTAMP_CP}"
			negate => true
			what => previous
		}
	}
}
 
filter {
	if [type] == "couchpotato" {
		grok {
			patterns_dir => "/etc/logstash/patterns/"
			match => [ "message", "(?m)%{DATESTAMP_CP:date}%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}%{METACHAR_CP}\[%{FACILITY_CP:facility}\]%{GREEDYDATA:msg}" ]
		}
 
		mutate {
			gsub => [ "msg", "\e\[\d+m", " " ]
			strip => [ "msg" ]
		}
	}
}
 
output {
	elasticsearch {
		bind_host => "127.0.0.1"
		cluster => "elasticsearch"
		host => "127.0.0.1"
	}
}

After you've placed your CouchPotato configuration file in the /etc/logstash/conf.d folder, restart the Logstash service by running:

sudo service logstash restart
