Input, Filter, Output

Basic File Input

To make Logstash reload the config file automatically whenever it changes, start it with the option:

  • --config.reload.automatic

Conf example:

input {
    file {
        path => "/etc/logstash/local/data/apache_access.log"
        start_position => "beginning"
    }
}
output {
    stdout {
        codec => rubydebug
    }
}

Run:

/usr/share/logstash/bin/logstash -f /etc/logstash/local/basic2/file_input.conf --config.reload.automatic

Note that Logstash stores its read position for each file (the sincedb, similar to Splunk's fishbucket) in:

  • /usr/share/logstash/data/plugins/inputs/file/.sincedb_*

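If you want Logstash to re-read a file from the beginning, stop Logstash and remove the matching sincedb file first. The setting start_position => "beginning" only applies to files Logstash has not seen before.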
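
As a rough sketch of that cleanup step, the following Python helper deletes the sincedb files in a given directory. The helper name remove_sincedb is hypothetical, the directory to pass in is the default location shown above, and Logstash is assumed to be stopped while it runs.

```python
import glob
import os


def remove_sincedb(data_dir):
    """Delete every .sincedb_* file in data_dir and return the removed paths."""
    # The pattern itself starts with a dot, so glob also matches hidden files.
    pattern = os.path.join(data_dir, ".sincedb_*")
    removed = []
    for path in sorted(glob.glob(pattern)):
        os.remove(path)
        removed.append(path)
    return removed
```

Calling remove_sincedb("/usr/share/logstash/data/plugins/inputs/file") should then make the file input start over on the next run, provided the data directory has not been changed from the default.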

Grok Filter

Example of Apache access log:

184.252.108.229 - - [20/Sep/2017:13:22:22 +0200] "GET /products/view/123 HTTP/1.1" 200 12798 "https://codingexplained.com/products" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"

Config using the grok pattern HTTPD_COMBINEDLOG:

input {
    http {
        host => "172.16.100.149"
        port => 8080
    }
}
filter {
    grok {
        match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    }
}
output {
    stdout {
        codec => rubydebug
    }
}

Then run:

/usr/share/logstash/bin/logstash -f /etc/logstash/local/basic2/file_input.conf --config.reload.automatic

Make an http request with a log entry:

curl -XPUT http://172.16.100.149:8080 -H "Content-Type: text/plain" -d '184.252.108.229 - - [20/Sep/2017:13:22:22 +0200] "GET /products/view/123 HTTP/1.1" 200 12798 "https://codingexplained.com/products" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"'

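However, the referrer and agent values come out wrapped in double quotes, which the rubydebug output shows escaped as \". This is because HTTPD_COMBINEDLOG (in the legacy pattern set) captures both fields as quoted strings, so the surrounding quotes become part of the value.

Modify the config so that the quotes are matched outside the captured fields, and convert response and bytes to integers: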

input {
    file {
        path => "/etc/logstash/local/data/apache_access.log"
        start_position => "beginning"
    }
    http {
        host => "172.16.100.149"
        port => 8080
    }
}
filter {
    grok {
        match => { "message" => '%{HTTPD_COMMONLOG} "%{GREEDYDATA:referrer}" "%{GREEDYDATA:agent}"' }
    }
    mutate {
        convert => { 
            "response" => "integer"
            "bytes" => "integer"
        }
    }
}
output {
    stdout {
        codec => rubydebug
    }
}

Note that GREEDYDATA is equivalent to .* in regex: it matches as many characters as possible.
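
The effect of moving the quotes outside the capture can be sketched in plain regex. This is Python rather than grok, and both patterns below are simplified stand-ins (assumptions) for the tail of the log line, not the real grok pattern definitions.

```python
import re

# Last two fields of an access-log line (agent shortened for readability).
TAIL = '"https://codingexplained.com/products" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"'

# Quotes inside the capture group: the value keeps its surrounding quotes.
with_quotes = re.compile(r'(?P<referrer>"[^"]*") (?P<agent>"[^"]*")')

# Quotes outside the group and a greedy .* inside, the same idea as
# "%{GREEDYDATA:referrer}" "%{GREEDYDATA:agent}" in the config above.
without_quotes = re.compile(r'"(?P<referrer>.*)" "(?P<agent>.*)"')


def capture(pattern, text):
    """Return the named groups of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.groupdict() if match else None


print(capture(with_quotes, TAIL))
print(capture(without_quotes, TAIL))
```

Even though .* is greedy, the literal quote-space-quote between the two groups forces the engine to backtrack to the only place where the fields can be split, so each value ends up without its quotes.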

Accessing Field Values

To add a static field (type in this example), define it in the input config as follows:

input {
    http {
        host => "172.16.100.149"
        port => 8080
        type => "access"
    }
}

Field values can be referenced elsewhere in the config with the %{field} syntax. For example, to write events to a file named after the type and the date, such as /tmp/access_2017-09-20.log, we can do:

output {
    file {
        path => "/tmp/%{type}_%{+yyyy-MM-dd}.log"
    }
}

%{+yyyy-MM-dd} is replaced by the date of the event's @timestamp (in UTC), formatted with the given pattern.
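
To make the substitution concrete, here is a minimal Python sketch of how such a path template could be expanded. It is an illustration only, not Logstash's implementation: the function name expand is hypothetical, the event is a plain dict, and only the three date tokens used in this example are supported.

```python
import re
from datetime import datetime, timezone

# Only the Joda-style tokens used in this example are mapped (assumption).
JODA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def expand(template, event):
    """Replace %{field} and %{+dateformat} references with values from event."""
    def replace(match):
        key = match.group(1)
        if key.startswith("+"):
            fmt = key[1:]
            for joda, strf in JODA_TO_STRFTIME.items():
                fmt = fmt.replace(joda, strf)
            # Dates are rendered from @timestamp in UTC.
            return event["@timestamp"].astimezone(timezone.utc).strftime(fmt)
        # Unknown fields are left in place as the literal reference.
        return str(event.get(key, match.group(0)))
    return re.sub(r"%\{([^}]+)\}", replace, template)


event = {
    "type": "access",
    "@timestamp": datetime(2017, 9, 20, 11, 22, 22, tzinfo=timezone.utc),
}
print(expand("/tmp/%{type}_%{+yyyy-MM-dd}.log", event))  # /tmp/access_2017-09-20.log
```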

Setting the event time

Although grok extracts the time of the log entry into the field timestamp, by default Logstash sets @timestamp to the time at which it received the event.

To make Logstash use the time from the log entry instead, add the date filter plugin:

    date {
        match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }

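In this format string, dd is the day of the month, MMM the abbreviated month name, yyyy the year, HH:mm:ss the 24-hour time, and Z the time zone offset (for example +0200).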
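
As a cross-check of what the pattern dd/MMM/yyyy:HH:mm:ss Z is expected to produce, the same timestamp can be parsed with Python's strptime. This is only an analogy: the format codes are Python's, not Logstash's, and the helper name parse_apache_time is hypothetical.

```python
from datetime import datetime, timezone

# strptime counterpart of the Logstash pattern "dd/MMM/yyyy:HH:mm:ss Z"
# (assumes the default C/English locale so that %b understands "Sep").
APACHE_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_apache_time(text):
    """Parse an Apache access-log timestamp and normalise it to UTC."""
    return datetime.strptime(text, APACHE_FORMAT).astimezone(timezone.utc)


parsed = parse_apache_time("20/Sep/2017:13:22:22 +0200")
print(parsed.isoformat())  # 2017-09-20T11:22:22+00:00
```

The two-hour shift shows the offset being applied: a local time of 13:22:22 at +0200 corresponds to 11:22:22 UTC, which is how @timestamp is stored.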

Conditional

Final

Access

input {
    file {
        path => "/etc/logstash/local/data/apache_access.log"
        start_position => "beginning"
    }
}
filter {
    mutate {
        replace => { type => "access" }
    }
    grok {
        match => { "message" => '%{HTTPD_COMMONLOG} "%{GREEDYDATA:referrer}" "%{GREEDYDATA:agent}"' }
    }
    if "_grokparsefailure" in [tags] {
        drop { }
    }
    mutate {
        convert => { 
            "response" => "integer"
            "bytes" => "integer"
        }
    }
    date {
        match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
        remove_field => [ "timestamp" ]
    }
    geoip {
        source => "clientip"
    }
    useragent {
        source => "agent"
        target => "ua"
    }
    # Drop requests to the admin pages, static files (.css/.js), favicon.ico, and spiders
    if ([request] =~ /^\/admin\//)
        or ( [request] =~ /\.(css|js)$/ )
        or ( [request] == "/favicon.ico" )
        or ( [ua][device] == "Spider" ) {
        drop { }
    }
    mutate {
        remove_field => [ "headers", "@version", "host" ]
    }
    
}
output {
    #elasticsearch {
    #    hosts => [ "" ]
    #    #index => "%{type}-%{+YYYY-MM-dd}"
    #    http_compression => true
    #    user => ""
    #    password => ""
    #}
    stdout {
        codec => rubydebug {
            metadata => true
        }
    }
}
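
One detail worth checking in a static-file condition like the one in this config is the difference between square brackets and parentheses in a regex. Square brackets define a character class that matches a single character, while parentheses with | list whole alternatives. The Python sketch below compares the two. Logstash uses Ruby regular expressions, but these two constructs behave the same way, and the request paths are made-up examples.

```python
import re

# Brackets: "." followed by ONE character out of c, s, | or j.
char_class = re.compile(r".*\.[css|js]")
# Parentheses: the path must end in the literal suffix .css or .js.
group = re.compile(r"\.(css|js)$")

requests = [
    "/assets/site.css",
    "/assets/app.js",
    "/files/report.csv",
    "/products/view/123",
]

for request in requests:
    # search() is unanchored, like the =~ operator in a Logstash conditional.
    print(request, bool(char_class.search(request)), bool(group.search(request)))
```

The character-class version also matches /files/report.csv, because ".c" is enough to satisfy it, so a condition built on it would drop more requests than intended.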

Java Error

input {
    file {
        path => "/etc/logstash/local/data/java_errors.log"
        start_position => "beginning"
        codec => multiline {
            pattern => "^%{CATALINA_DATESTAMP}"
            negate => true
            what => "previous"
            auto_flush_interval => 5
        }
    }
}
filter {
    mutate {
        replace => { type => "error" }
    }
    grok {
        match => { "message" => "%{CATALINA_DATESTAMP:[@metadata][timestamp]} %{LOGLEVEL:level} %{JAVACLASS:class}: (?<error_message>.*?)(\n|\r\n)"}
    }
    date {
        match => [ "[@metadata][timestamp]", "MMM dd, yyyy hh:mm:ss a" ]
    }
    mutate {
        remove_field => [ "headers", "@version", "host" ]
    }
    
}
output {
    stdout {
        codec => rubydebug {
            metadata => true
        }
    }
    
}
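
The multiline codec settings used here (negate => true, what => "previous") mean that every line which does not start with a timestamp is appended to the previous event. A minimal Python sketch of that grouping logic follows. The START regex is a simplified stand-in for CATALINA_DATESTAMP, the function name group_multiline is hypothetical, and the sample lines are invented for illustration.

```python
import re

# Simplified stand-in for CATALINA_DATESTAMP, e.g. "Sep 9, 2017 6:14:10 AM" (assumption).
START = re.compile(r"^[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} (AM|PM)")


def group_multiline(lines):
    """Merge lines that do not start with a timestamp into the previous event."""
    events = []
    for line in lines:
        if START.match(line) or not events:
            events.append(line)          # a timestamp starts a new event
        else:
            events[-1] += "\n" + line    # continuation, e.g. a stack trace line
    return events


# Invented sample input: one error with a two-line stack trace, then a second error.
SAMPLE = [
    "Sep 9, 2017 6:14:10 AM ERROR com.example.service.ExampleService: An error occurred",
    "java.lang.NullPointerException",
    "\tat com.example.service.ExampleService.run(ExampleService.java:42)",
    "Sep 9, 2017 6:15:02 AM ERROR com.example.dao.ExampleDao: Another error",
]

for event in group_multiline(SAMPLE):
    print(repr(event))
```

In this sketch the last event is emitted as soon as the input ends. With a live file, Logstash cannot know whether more continuation lines will follow, which is what auto_flush_interval is for.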
