Logstash Intro
Logstash (the L in the ELK Stack) is probably the most popular log analytics platform. It is responsible for aggregating data from different sources, processing it, and sending it down the pipeline, usually to be indexed directly in Elasticsearch.
In the setup presented here, Logstash bundles the messages that come from the Filebeats, processes them, and passes them on to Elasticsearch. In our case the Elasticsearch cluster (Open Distro) is managed by AWS, while most of the rest, including Logstash, runs in a Kubernetes cluster.
Logstash Deployment
While it’s possible to run several Logstash instances, that is not needed in our case, so below is an example of a Deployment with a single instance. Note that we are using the OSS build docker.elastic.co/logstash/logstash-oss:7.7.1 (Logstash OSS Docker); with the default build we had connection problems to the AWS Elasticsearch Service (Open Distro).
Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash-deployment
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      containers:
        - name: logstash
          env:
            - name: LOGSTASH_PW
              valueFrom:
                secretKeyRef:
                  name: elasticsearch-secrets
                  key: LOGSTASH_PASSWORD
          image: docker.elastic.co/logstash/logstash-oss:7.7.1
          ports:
            - containerPort: 5044
          volumeMounts:
            - name: config-volume
              mountPath: /usr/share/logstash/config
            - name: logstash-pipeline-volume
              mountPath: /usr/share/logstash/pipeline
          resources:
            limits:
              memory: "4Gi"
              cpu: "2500m"
            requests:
              memory: "4Gi"
              cpu: "800m"
      volumes:
        - name: config-volume
          configMap:
            name: logstash-configmap
            items:
              - key: logstash.yml
                path: logstash.yml
        - name: logstash-pipeline-volume
          configMap:
            name: logstash-configmap
            items:
              - key: logstash.conf
                path: logstash.conf
Kubernetes Service
Logstash is exposed as a service to our cluster:
kind: Service
apiVersion: v1
metadata:
  name: logstash-service
  namespace: kube-system
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
spec:
  selector:
    app: logstash
  ports:
    - protocol: TCP
      port: 5044
      targetPort: 5044
  type: LoadBalancer
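Beats shippers running inside the cluster can then reach Logstash through the service DNS name. A minimal sketch of the corresponding Filebeat output section (the host string below assumes the service name and namespace from the manifest above):

```yaml
output.logstash:
  # in-cluster DNS name of the logstash-service defined above
  hosts: ["logstash-service.kube-system.svc.cluster.local:5044"]
```

The port must match the `containerPort`/`targetPort` (5044) on which the beats input listens.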
Basic Logs Processing Configuration
The configuration of the Logstash processing pipeline usually starts in logstash.conf.
Below you find a basic example with 3 sections:
- input - defines the source of events
- filter - defines your processing
- output - defines the sink
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-configmap
  namespace: kube-system
data:
  logstash.yml: |
    http.host: "0.0.0.0"
    path.config: /usr/share/logstash/pipeline
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }
    filter {
      if [kubernetes][labels][logstyle] == "nginx" {
        # Nginx
        grok {
          match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]}( \"%{DATA:[nginx][access][referrer]}\")?( \"%{DATA:[nginx][access][agent]}\")?",
            "%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \\[%{HTTPDATE:[nginx][access][time]}\\] \"-\" %{NUMBER:[nginx][access][response_code]} -" ] }
        }
        # date {
        #   match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
        #   remove_field => "[nginx][access][time]"
        # }
        useragent {
          source => "[nginx][access][agent]"
          target => "[nginx][access][user_agent]"
          remove_field => "[nginx][access][agent]"
        }
        geoip {
          source => "[nginx][access][remote_ip]"
          target => "[nginx][access][geoip]"
        }
      }
      else if [kubernetes][pod][labels][app] == "filebeat" {
        # Filebeat
        grok {
          match => [ "message", "(?<timestamp>%{TIMESTAMP_ISO8601})\s+%{LOGLEVEL:level}\s+%{DATA}\s+%{GREEDYDATA:logmessage}" ]
        }
      }
      else {
        # HTD Java
        grok {
          match => [ "message", "(?<timestamp>%{TIMESTAMP_ISO8601}) - \[(?<thread>[A-Za-z0-9-]+)\] %{LOGLEVEL:level}\s+(?<class>[A-Za-z0-9.]*\.[A-Za-z0-9#_]+)\s* - %{GREEDYDATA:logmessage}" ]
        }
      }
    }
    output {
      elasticsearch {
        ilm_enabled => false
        hosts => ["https://notforeveryone-eyes.es.amazonaws.com:443"]
        user => 'logstash'
        password => '${LOGSTASH_PW}'
        index => "logstash-beta-%{+YYYY.MM.dd}"
      }
    }
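To get a feel for what the Java-log grok pattern extracts, it can be approximated with a plain regular expression. A minimal Python sketch, where the regex is a simplified stand-in for the grok pattern (TIMESTAMP_ISO8601 and LOGLEVEL are reduced to rough equivalents, and the sample log line is hypothetical):

```python
import re

# Rough Python equivalent of the "HTD Java" grok pattern from the pipeline above.
JAVA_LOG = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?)"
    r" - \[(?P<thread>[A-Za-z0-9-]+)\] (?P<level>[A-Z]+)\s+"
    r"(?P<cls>[A-Za-z0-9.]*\.[A-Za-z0-9#_]+)\s* - (?P<logmessage>.*)"
)

# Hypothetical log line, only to illustrate which fields the filter extracts.
line = "2021-06-01 10:15:30,123 - [main] INFO  com.example.UserService - user created"
m = JAVA_LOG.match(line)
print(m.group("level"))       # INFO
print(m.group("cls"))         # com.example.UserService
print(m.group("logmessage"))  # user created
```

This is handy for checking sample lines locally before redeploying the ConfigMap; Kibana's Grok Debugger serves the same purpose against the real pattern.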
That’s it. Feel free to continue with Installing Filebeat to Kubernetes.
Evaluation
This setup works very reliably in my scenario. In our case we haven’t found a better place for the configuration than the ConfigMap. That has one drawback: a change in the configuration does not trigger a restart of Logstash (which could be a good thing as well). We don’t suffer much from this, but if you have any improvement proposals, I’m happy to hear your feedback.
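One option worth considering is Logstash’s built-in pipeline reloading, which picks up changes to the pipeline files without a restart. A sketch of the relevant logstash.yml settings, assuming the mount paths from the Deployment above (the interval value is illustrative):

```yaml
http.host: "0.0.0.0"
path.config: /usr/share/logstash/pipeline
# re-read pipeline config periodically instead of requiring a restart
config.reload.automatic: true
config.reload.interval: 30s
```

Keep in mind that the kubelet syncs ConfigMap-backed volumes periodically, so an edited ConfigMap reaches the pod with some delay before Logstash can reload it.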
I hope it helps someone start using Logstash.