NewRelic is a fantastic tool for getting great insights into your application and the services surrounding it. It collects a massive amount of data and makes it easily accessible. Almost every metric and dashboard they offer is crucial to any DevOps or Cloud Engineer.

Now that Elastic has acquired Packetbeat, which is essentially similar in functionality to NewRelic’s agent (i.e. you can now collect not only data from log files, but also system metrics and external services via network sniffing), can the ELK stack, as an open source alternative, replace NewRelic?

tl;dr: almost 🙂

I already did a post back in 2015 when I first got in touch with the ELK stack; this time however I will go into a little more detail and offer a full installation guide bringing together the following components:

  • ELK (ElasticSearch, Logstash & Kibana)
  • AWS ElasticSearch Service
  • ElasticBeanstalk (via ebextension)
  • Laravel (exception logs)
Conveniently, Amazon Web Services now offers ElasticSearch as a Service, so it is no longer necessary to maintain a self-hosted version on EC2.

1) Create ElasticSearch Domain

The setup is pretty boring, but you might want to do something along the lines of the following screenshots:
Set the name of the ElasticSearch instance
 Set the ElasticSearch cluster dimension/size.
 Set the ElasticSearch storage.
 
In our setup the instances will not talk to ElasticSearch directly; instead they ship their data via filebeat (formerly known as logstash-forwarder) to a Logstash instance. Hence we only need to whitelist the public and internal IP of the Logstash instance (see step 2).
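If you go with an IP-based access policy, it could look roughly like this (region, account id, domain name and IPs are placeholders – use your own):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "es:*",
      "Resource": "arn:aws:es:eu-west-1:123456789012:domain/webapplogs/*",
      "Condition": {
        "IpAddress": { "aws:SourceIp": ["XXX.XXX.XXX.XXX"] }
      }
    }
  ]
}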
We end up receiving our ElasticSearch endpoint. Remember: the AWS ElasticSearch Service ships with Kibana pre-installed – for your convenience.

2) Create SSL certificate

We will need an SSL certificate to establish a secure and authenticated connection between the agents/instances and Logstash. This might not be needed if you are running everything within the same VPC, though.
The next few steps get very surreal… but trust me, it works. Please set the correct IP of your Logstash instance:

(openssl.cnf)

[ req ]
#default_bits  = 2048
#default_md  = sha256
#default_keyfile  = privkey.pem
distinguished_name = req_distinguished_name
attributes  = req_attributes
req_extensions = v3_req

[ req_distinguished_name ]
countryName   = Country Name (2 letter code)
countryName_min   = 2
countryName_max   = 2
stateOrProvinceName  = State or Province Name (full name)
localityName   = Locality Name (eg, city)
0.organizationName  = Organization Name (eg, company)
organizationalUnitName  = Organizational Unit Name (eg, section)
commonName   = Common Name (eg, fully qualified host name)
commonName_max   = 64
emailAddress   = Email Address
emailAddress_max  = 64

[ req_attributes ]
challengePassword  = A challenge password
challengePassword_min  = 4
challengePassword_max  = 20

[ v3_req ]
subjectAltName=@alt_names
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = CA:true

[alt_names]
IP.1 = XXX.XXX.XXX.XXX

And then do the following steps:

 

$ sudo mkdir -p /etc/pki/tls/certs
$ sudo mkdir /etc/pki/tls/private
$ sudo openssl req -x509 -nodes -days 3650 -newkey rsa:4096 \
    -keyout /etc/pki/tls/private/logstash.key \
    -out /etc/pki/tls/certs/logstash.crt \
    -config /etc/ssl/openssl.cnf \
    -extensions v3_req

$ sudo chown logstash: /etc/pki/tls/private/logstash.key /etc/pki/tls/certs/logstash.crt
$ sudo chmod 600 /etc/pki/tls/private/logstash.key /etc/pki/tls/certs/logstash.crt

 

This whole custom configuration is necessary so the certificate can be correctly verified by both Logstash and the beats. Basically we are creating a self-signed certificate with the IP of Logstash as SAN (Subject Alternative Name – IP).
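You can quickly verify that the SAN actually made it into the certificate:

$ sudo openssl x509 -in /etc/pki/tls/certs/logstash.crt -noout -text | grep -A1 'Subject Alternative Name'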

3) Logstash

Next we will need an EC2 instance that will run Logstash and thus be responsible for receiving logs & metrics from our application servers and passing them through to our ElasticSearch endpoint.
It won’t need a lot of resources, so you can start with a t2.medium and work your way up if needed. Additionally we are going to host an nginx reverse proxy for the Kibana endpoint. This will allow us to “bridge” the auth system of AWS and replace it with our own simple HTTP auth.
Logstash is a Java application, so you will have to install Java first – if you are on Ubuntu or Debian you can use my java ansible role to do so 🙂
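If you prefer to install Java by hand, something along these lines should do on Ubuntu/Debian (the exact package may differ per release):

$ sudo apt-get update
$ sudo apt-get install -y openjdk-7-jre-headless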
Use something similar to the following as your nginx vhost config:
(nginx-vhost.conf)

 

server {
  listen 80;
  server_name kibana.acme.com;

  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header X-Forwarded-Proto $scheme;

  auth_basic "/dev/null";
  auth_basic_user_file /etc/nginx/htpasswd.conf;
  proxy_set_header Authorization "";

  location /.kibana-4 {
    proxy_pass https://search-webapplogs-xxx.eu-west-1.es.amazonaws.com;
  }

  location ~* ^/(filebeat|topbeat|packetbeat)- {
    proxy_pass https://search-webapplogs-xxx.eu-west-1.es.amazonaws.com;
  }

  location ~ ^/_(aliases|nodes)$ {
    proxy_pass https://search-webapplogs-xxx.eu-west-1.es.amazonaws.com;
  }

  location ~ ^/.*/_search$ {
    proxy_pass https://search-webapplogs-xxx.eu-west-1.es.amazonaws.com;
  }

  location ~ ^/.*/_mapping$ {
    proxy_pass https://search-webapplogs-xxx.eu-west-1.es.amazonaws.com;
  }

  location / {
    proxy_pass https://search-webapplogs-xxx.eu-west-1.es.amazonaws.com/_plugin/kibana/;
  }
}
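
The htpasswd.conf referenced above does not exist yet – you can create it with the htpasswd tool (part of apache2-utils on Ubuntu/Debian; the user name is just an example):

$ sudo apt-get install -y apache2-utils
$ sudo htpasswd -c /etc/nginx/htpasswd.conf kibana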

 

Now download and install Logstash:

$ wget https://download.elastic.co/logstash/logstash/packages/debian/logstash_2.1.1-1_all.deb
$ sudo dpkg -i logstash_2.1.1-1_all.deb

The following Logstash config files have to be put under /etc/logstash/conf.d/

$ wget \
    https://raw.githubusercontent.com/elastic/beats/master/topbeat/etc/topbeat.template.json \
    https://raw.githubusercontent.com/elastic/beats/master/packetbeat/etc/packetbeat.template.json \
    https://raw.githubusercontent.com/logstash-plugins/logstash-output-elasticsearch/master/lib/logstash/outputs/elasticsearch/elasticsearch-template.json
$ sudo mv elasticsearch-template.json /etc/logstash/filebeat-template.json
$ sudo sed -i 's/logstash/filebeat/' /etc/logstash/filebeat-template.json

(01-beats-input.conf)

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash.crt"
    ssl_key => "/etc/pki/tls/private/logstash.key"
  }
}

This will accept connections from beats on port 5044, provided the SSL certificate checks out.

(10-syslog.conf)

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    
    syslog_pri { }
    
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

Simple syslog configuration/grok.

(11-apache.conf)

filter {
  if [type] == "apache" {
    grok {
      match => { "message" => "%{IP:clientip} - - \[%{HTTPDATE:timestamp}\] %{HOSTNAME:domain} \"%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} %{NUMBER:bytes:int} \"(?:%{URI:referrer}|-)\" %{QS:agent}" }
    }

    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    }

    if [clientip] {
      geoip {
        source => "clientip"
        target => "geoip"
        add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
        add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
      }
      
      mutate {
        convert => [ "[geoip][coordinates]", "float" ]
      }
    }
  }
}

Apache access-log configuration. Will also try to resolve the clientip to a geolocation.

(12-laravel.conf)

filter {
  if [type] == "laravel" {
    multiline {
      pattern => "^\["
      what => "previous"
      negate => true
    }

    grok {
      match => { "message" => "(?m)\[%{TIMESTAMP_ISO8601:timestamp}\] %{WORD:env}\.%{LOGLEVEL:severity}: %{GREEDYDATA:content}" }
    }

    mutate {
      replace => [ "message", "%{content}" ]
      remove_field => [ "content" ]
    }
  }
}

Multi-line Laravel exception logs parser.
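For reference, a typical Laravel (monolog) entry that this parses looks roughly like the following (path and message made up) – every line that does not start with “[” gets folded into the previous event:

[2016-01-18 09:15:32] production.ERROR: exception 'ErrorException' with message '…' in /var/app/current/app/Http/Controllers/SomeController.php:42
Stack trace:
#0 …
#1 …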

(30-es-output.conf)

output {
  elasticsearch {
    hosts => ["search-webapplogs-xxx.eu-west-1.es.amazonaws.com:80"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
    template_overwrite => true
    template => "/etc/logstash/filebeat-template.json"
    template_name => "filebeat"
  }
}

Finally push it to our ElasticSearch endpoint.
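Before restarting you can optionally sanity-check the whole pipeline configuration (assuming the default package install path under /opt/logstash):

$ sudo /opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/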

Let’s give it a try:

$ sudo /etc/init.d/logstash restart

Manually set index templates for topbeat and packetbeat:

$ curl -XPUT 'http://search-webapplogs-xxx.eu-west-1.es.amazonaws.com/_template/topbeat' -d@topbeat.template.json
$ curl -XPUT 'http://search-webapplogs-xxx.eu-west-1.es.amazonaws.com/_template/packetbeat' -d@packetbeat.template.json
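To double-check that the templates were accepted, you can simply query them back:

$ curl 'http://search-webapplogs-xxx.eu-west-1.es.amazonaws.com/_template/topbeat?pretty'
$ curl 'http://search-webapplogs-xxx.eu-west-1.es.amazonaws.com/_template/packetbeat?pretty'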

4) ElasticBeanstalk ebextension

As with my other ebextensions, I like writing the heavy part in pure bash; this also allows me to enable certain ebextensions on a project-to-project basis by setting the activator params/envvars.
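In this case the activator is the BEATS environment variable – set it per environment via the EB CLI (or in the console under Software Configuration), e.g.:

$ eb setenv BEATS=enable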

(12-beats.config)

# beats
#
# Author: Gunter Grodotzki 
# Version: 2016-01-18
#
# install and configure beats
# BEATS: enable
container_commands:
  01-beats:
    command: ".ebextensions/beats.sh"

(beats.sh)

#!/bin/bash
#
# Author: Gunter Grodotzki (gunter@grodotzki.co.za)
# Version: 2016-01-18
#
# install and configure beats

set -e

if [[ "${BEATS}" == "enable" ]]; then

  export HOME="/root"
  export PATH="/sbin:/bin:/usr/sbin:/usr/bin:/opt/aws/bin"

  # lets do everything inside .ebextensions so it will clean itself
  cd .ebextensions

  # set optimized LogFormat
  sed -i '/^\s*LogFormat/d' /etc/httpd/conf/httpd.conf
  sed -i '/^\s*CustomLog/d' /etc/httpd/conf/httpd.conf

  cat <<'EOB' > /etc/httpd/conf.d/10-logstash.conf
SetEnvIf Remote_Addr "::1" dummy
SetEnvIf Remote_Addr "127.0.0.1" dummy
LogFormat "%a - - %t %{Host}i \"%r\" %>s %B \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog "logs/access_log" combined env=!dummy
EOB

  # add bash_history logging
  echo 'PROMPT_COMMAND='"'"'history -a >(tee -a ~/.bash_history | logger -t "$USER[$$]")'"'"'' > /etc/profile.d/logstash.sh

  # add key
  mkdir -p /etc/pki/tls/certs

  cat <<'EOB' > /etc/pki/tls/certs/logstash.crt
ENTER HERE THE CONTENT OF THE SSL CERTIFICATE WE CREATED
EOB

  # install beats
  packages=( filebeat-1.0.1 topbeat-1.0.1 packetbeat-1.0.1 )
  for package in "${packages[@]}"; do
    if ! rpm -qa | grep -qw ${package}; then
      rpm -i ${package}-x86_64.rpm
    fi
  done

  # configure filebeat
  cat <<'EOB' > /etc/filebeat/filebeat.yml
filebeat:
  prospectors:
    -
      paths:
        - "/var/log/secure"
        - "/var/log/messages"
      document_type: syslog
    -
      paths:
        - "/var/log/httpd/access_log"
      document_type: apache
    -
      paths:
        - "/var/app/current/storage/logs/laravel*"
      document_type: laravel
output:
  logstash:
    hosts: ["IP.OF.LOGSTASH:5044"]
    tls:
      certificate_authorities: ["/etc/pki/tls/certs/logstash.crt"]
EOB

  # configure topbeat
  cat <<'EOB' > /etc/topbeat/topbeat.yml
input:
  period: 10
  procs: [".*"]
  stats:
    system: true
    proc: true
    filesystem: true
output:
  logstash:
    hosts: ["IP.OF.LOGSTASH:5044"]
    tls:
      certificate_authorities: ["/etc/pki/tls/certs/logstash.crt"]
EOB

  # configure packetbeat
  cat <<'EOB' > /etc/packetbeat/packetbeat.yml
interfaces:
  device: eth0
  type: af_packet
protocols:
  memcache:
    ports: [11211]
  mysql:
    ports: [3306]
  redis:
    ports: [6379]
output:
  logstash:
    hosts: ["IP.OF.LOGSTASH:5044"]
    tls:
      certificate_authorities: ["/etc/pki/tls/certs/logstash.crt"]
EOB

  # start + enable beats
  /etc/init.d/filebeat restart > /dev/null 2>&1
  /etc/init.d/topbeat restart > /dev/null 2>&1
  /etc/init.d/packetbeat restart > /dev/null 2>&1
  chkconfig filebeat on
  chkconfig topbeat on
  chkconfig packetbeat on

fi
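Note that the script expects the beats rpms to be present inside .ebextensions – either commit them to your repository or let the script download them first, e.g. (these were the download URLs at the time of writing):

$ wget https://download.elastic.co/beats/filebeat/filebeat-1.0.1-x86_64.rpm
$ wget https://download.elastic.co/beats/topbeat/topbeat-1.0.1-x86_64.rpm
$ wget https://download.elastic.co/beats/packetbeat/packetbeat-1.0.1-x86_64.rpm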

5) Kibana

The first time you visit your Kibana installation in your browser you will have to add the beats index patterns (filebeat-*, topbeat-* and packetbeat-*) as seen here:

6) Curation

Due to the way ELK works, data will keep on growing. Mainly because of cost you might want to throw away older logs.

You can easily do this with curator and a cronjob:

$ sudo apt install python-pip python-dev
$ sudo pip install pyasn1
$ sudo pip install --upgrade ndg-httpsclient
$ sudo pip install elasticsearch-curator

Run at midnight:

$ curator --port 80 --host search-webapplogs-xxx.eu-west-1.es.amazonaws.com delete indices --older-than 35 --time-unit days --timestring '%Y.%m.%d'
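Wrapped into a cronjob this could look like the following (e.g. in /etc/cron.d/curator – adjust the curator path to wherever pip installed it):

0 0 * * * root /usr/local/bin/curator --port 80 --host search-webapplogs-xxx.eu-west-1.es.amazonaws.com delete indices --older-than 35 --time-unit days --timestring '%Y.%m.%d'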

DONE! Phew… wowses… Creating all those fancy dashboards is out of scope for this post though. You can try to bootstrap your Kibana with ready-made configurations: elastic/beats-dashboards.

As of now I haven’t been able to get packetbeat working with RDS. And there are still some features missing to fully replace NewRelic (though other features are much better – like actually being able to search your logs) – but I am very keen to see what might still come this year.

Update (2016-02-03):

I actually forgot a few steps, which meant geo_point and a few things on topbeat/packetbeat were not working 😉