How to Install ELK Stack (Elasticsearch, Logstash, and Kibana) on RHEL 7
Some terms you should know:
DevOps: DevOps (development and operations) is an enterprise software development term for an agile relationship between development and IT operations. The goal of DevOps is to improve that relationship by advocating better communication and collaboration between these two business units.
What does a DevOps Engineer do?
They are either developers who take an interest in deployment and network operations, or sysadmins with a passion for scripting and coding who move into the development side, where they can improve the planning of testing and deployment.
Log Analytics:
Servers, applications, websites, and connected devices generate discrete, time-stamped records of events called logs. Processing and analyzing these logs is called log analytics. Early log analytics solutions were designed for IT operational intelligence use cases, such as root cause analysis and infrastructure monitoring. Over time, log analytics solutions have incorporated additional data sources, machine learning, and other analytical techniques to enable additional use cases in application performance management (APM), security intelligence and event management (SIEM), and business analytics.
Importance of Log Analytics:
Log monitoring systems oversee network activity, inspect system events, and store records of user actions (e.g., renaming a file, opening an application) that occur inside your operating system. Some systems with logging capabilities do not enable logging automatically, so it is important to ensure that all systems have logging turned on. Others generate logs but do not provide an event log management solution.
Companies want to capture and centralize all this data, so they can understand the relationship between operational, security, and change management events and maintain a comprehensive view of their infrastructure.
Why DevOps Tools?
DevOps tooling makes it easier for teams to scale automation and speed up productivity. IT automation eliminates repetitive tasks, freeing teams for more strategic work, and is ideal for managing complex deployments and speeding up the development process.
DevOps tools fit into one or more activities, which supports specific DevOps initiatives: Plan, Create, Verify, Package, Release, Configure, and Monitor.
Some of the DevOps tools are as below:
Nagios, ELK stack, Consul.io, Jenkins, etc.
In this article, we are going to talk about the ELK stack and how to set it up on a Linux machine.
What is ELK?
The ELK Stack is the most common log analytics solution in the modern IT world. It collects logs from all services, applications, networks, tools, servers, and more in an environment into a single, centralized location for processing and analysis. We use it for analytical purposes (e.g., to troubleshoot problems, monitor services, and reduce the time it takes to solve operational issues). Another use for this tool is for security and auditing (e.g., to monitor changes in security groups and changes in permissions). After receiving alerts on these issues, it is easy to act on unauthorized users and activities.
"ELK" is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana.
Elasticsearch is a search and analytics engine. It is a NoSQL database that indexes and stores information.
Logstash is a server‑side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a "stash" like Elasticsearch.
Kibana lets users visualize data with charts and graphs in Elasticsearch.
Filebeat: Installed on client servers that will send their logs to Logstash, Filebeat serves as a log shipping agent that utilizes the lumberjack networking protocol to communicate with Logstash.
Why ELK?
It would be great to have all your logs aggregated in one place, so you can see the process flow and run queries against the logs of all applications from a single location. Without that, debugging an issue means logging into each individual box to look at its logs. With a small number of apps, this is not an issue, but it quickly becomes tedious as the number of apps increases.
Enter the ELK stack. It also improves your daily routine, because there is no need to log into 100 different boxes to follow the logs.
How to set up the ELK server:
We will install the first three components on a single server, which we will refer to as our ELK Server. Filebeat will be installed on all the client servers that we want to gather logs for, which we will refer to collectively as our Client Servers.
Prerequisites:
- Install Java 8
The machine on which we will install ELK should have Java version 8 installed. So, make sure that OpenJDK version 1.8.0_* is installed and running; if it is not, run the yum command to install it, or use the rpm package instead.
--install java through yum
# yum install java-1.8.0-openjdk
Post installation check
Check the version of java
# java -version
It should report a Java 8 version.
Check the path of Java (to confirm which directory it is installed in):
# echo $JAVA_HOME
--install java through rpm package
a) Firstly, download the rpm package (java-1.8.0-openjdk*.rpm) from the Java site.
b) Secondly, copy the rpm package to any directory on the server.
c) Thirdly, install the rpm package with the following command:
# rpm -ivh java-1.8.0-openjdk*.rpm
Post installation check
# java -version
It should report a Java 8 version.
Check the path of Java (to confirm which directory it is installed in):
# echo $JAVA_HOME
To set the java home path you can follow the following steps:
Method 1 (sets the variables for the current shell session only):
JAVA_HOME=/opt/jdk1.8* (adjust the path to your installation)
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH
Method 2 (persists across sessions):
Edit the file /etc/profile (vim /etc/profile) and append the following lines:
export JAVA_HOME=/opt/jdk1.8* (adjust the path to your installation)
export PATH=$PATH:$JAVA_HOME/bin
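After either method, reload the profile and re-check that the variables are set (a quick sanity check; the path printed should match your actual JDK install directory):
# source /etc/profile
# echo $JAVA_HOME
# java -version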
- Set the hostname
# hostnamectl set-hostname "elk-stack.example.com"
Update /etc/hosts file
xx.xx.xx.xx elk-stack.example.com elk-stack (Note: xx.xx.xx.xx should be replaced with your server IP)
- Python should be installed on the server (it is used later for the optional Elasticsearch Python client).
Installation steps of ELK Stack:
- Install Elasticsearch
We will start by importing the GPG key for Elasticsearch; this key is also shared with Logstash and Kibana. To import it, run the following command:
# rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
Next, you need to create a repo file for Elasticsearch in order to install it.
- Go to the directory /etc/yum.repos.d/ and edit the file:
# vim /etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository
baseurl=http://packages.elastic.co/elasticsearch/
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
--refresh yum's repo list to confirm the new repo is picked up
# yum repolist
--Once the repo has been created, install Elasticsearch using yum
# yum install elasticsearch
--now start and enable the service
# systemctl start elasticsearch
# systemctl enable elasticsearch
NOTE: Check the firewall and open port 9200, as Elasticsearch listens on port 9200 by default.
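For example, on RHEL 7 with firewalld as the active firewall (adjust accordingly if you manage iptables directly):
# firewall-cmd --permanent --add-port=9200/tcp
# firewall-cmd --reload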
Run the following command to make sure it is working fine:
# curl -XGET 'localhost:9200/_cat/indices?v'
If Elasticsearch is up, you should see output similar to the below.
getrid={ "query":{ "bool": { "must": { "match": { "TID": "" } }, "filter": { "range": { "eventdate": { "gte": from_date_time , "lte": to_date_time }} } } }
- Install Logstash
Add the Logstash repository, as was done for Elasticsearch above:
# vim /etc/yum.repos.d/logstash.repo
[logstash]
name=Logstash
baseurl=http://packages.elasticsearch.org/logstash/
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
--refresh yum's repo list
# yum repolist
--install logstash through yum
# yum install logstash
- Install Kibana
Add the Kibana repository:
# vim /etc/yum.repos.d/kibana.repo
[kibana]
name=Kibana
baseurl=http://packages.elasticsearch.org/kibana/
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
--refresh yum's repo list
# yum repolist
--install kibana through yum
# yum install kibana
--start and enable the kibana service
# systemctl start kibana
# systemctl enable kibana
NOTE: Port 5601 should be open for Kibana.
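As with Elasticsearch, on a firewalld-managed server this would be, for example:
# firewall-cmd --permanent --add-port=5601/tcp
# firewall-cmd --reload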
Now you can access the Kibana web page at http://xx.xx.xx.xx:5601 (replace xx.xx.xx.xx with your ELK server IP).
How to configure logstash:
We will now create a configuration file for Logstash under the folder /etc/logstash/conf.d.
The file is divided into three sections: input, filter, and output. Let's go through these terms, what they mean, and how they work.
Inputs-- You use inputs to get data into Logstash. An input generates events by reading from a file on the filesystem or by listening on a port for events sent by the Filebeat client.
NOTE: We will discuss Filebeat in a later section.
Filters-- Filters are intermediary processing devices in the Logstash pipeline. You can combine filters with conditionals to perform an action on an event if it meets certain criteria.
Outputs-- Outputs are the final phase of the Logstash pipeline. An event can pass through multiple outputs, but once all output processing is complete, the event has finished its execution.
Configuration:
# vim /etc/logstash/conf.d/logstash.conf
Input section:
# input section
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/ssl/logstash_frwrd.crt"
    ssl_key => "/etc/ssl/logstash-forwarder.key"
    congestion_threshold => "40"
  }
}
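Note that this input expects the SSL certificate and key to already exist at the paths above. If you do not have them, one way to create a self-signed pair is sketched below (an assumption, not part of the original setup; the CN should match the hostname or IP that clients use to reach the ELK server):
# openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -keyout /etc/ssl/logstash-forwarder.key \
    -out /etc/ssl/logstash_frwrd.crt \
    -subj "/CN=elk-stack.example.com"
The same certificate file must later be copied to every Filebeat client (see the Filebeat output section below).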
Filter section: this will parse the logs before sending them to Elasticsearch.
# filter section
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGLINE}" }
    }
    date {
      match => [ "timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
Output section: it defines where Logstash stores the processed logs.
# output section
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
Save and exit the file, then start the Logstash service:
# systemctl start logstash
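Optionally, you can have Logstash validate the pipeline before starting it. The exact invocation depends on the installed version; on older 2.x packages it is, for example:
# /opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/logstash.conf
(Newer releases ship under /usr/share/logstash and use the --config.test_and_exit flag instead.) Also remember that port 5044 must be open in the firewall so Filebeat clients can reach the beats input.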
That's all for the ELK part. Now let's move on to the next topic, which plays a very important part in feeding Elasticsearch, i.e. Filebeat.
Filebeat
As mentioned earlier, Filebeat should be installed on the client servers.
Filebeat is a log data shipper for local files. Installed as an agent on your servers, Filebeat monitors the log directories or specific log files, tails the files, and forwards them either to Elasticsearch or Logstash for indexing.
How does Filebeat work?
When you start Filebeat, it starts one or more inputs that look in the local paths you've specified for log files. For each log file that an input locates, Filebeat starts a harvester. Each harvester reads a single log file for new content and sends the new log data to libbeat, which aggregates the events and sends the aggregated data to the output that you've configured for Filebeat.
To install Filebeat, you can follow either of the following two methods:
Method 1:
Install via rpm package
Download the package from the filebeat or elasticsearch site
Put the package under any directory
Run the below command to install the rpm package
# rpm -ivh filebeat*.rpm
Method 2:
Install via yum repository
First, create a repo file for Filebeat:
# vi /etc/yum.repos.d/filebeat.repo
[beats]
name=Elastic Beats Repository
baseurl=https://packages.elastic.co/beats/yum/el/$basearch
enabled=1
gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
gpgcheck=1
Refresh the repo list:
# yum repolist
Install filebeat via yum:
# yum install filebeat
Now make the following changes in the Filebeat configuration file to connect the client machine to the ELK server:
# vim /etc/filebeat/filebeat.yml
----- under this section, list the log files that need to be analysed
paths:
- /var/log/*.log
- /var/log/secure
- /var/log/messages
-----
----- in the next section, change the document type to syslog
document_type: syslog
-----
----- the last section is output, where you define your ELK server IP address
output:
  logstash:
    hosts: ["xx.xx.xx.xx:5044"]
    tls:
      certificate_authorities: ["/etc/ssl/logstash_frwrd.crt"]
-----
NOTE: the certificate referenced here is the same one used by the Logstash beats input on the ELK server; copy it from the server to this path on each client before starting Filebeat.
Now, finally, start the filebeat service:
# systemctl start filebeat
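To confirm that logs are flowing end to end, query Elasticsearch on the ELK server for the Filebeat indices created by the Logstash output section above:
# curl -XGET 'http://localhost:9200/filebeat-*/_search?pretty'
A JSON response containing hits from your client's log files means the whole pipeline is working.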
That’s it, configurations are now complete.
Information:
You can visit the sites below for more details.
Install the package elasticsearch 6.0.0 (you can select the version according to your need) on the xx.xx.xx.xx machine.
The package can be downloaded from the below link
https://pypi.python.org/pypi/elasticsearch
The installation procedure is given in the below link
https://elasticsearch-py.readthedocs.io/en/master/
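For example, with pip (assuming pip is already available on the machine; the version pin matches the one mentioned above and can be changed to suit your cluster):
# pip install elasticsearch==6.0.0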
We have talked about the ELK stack and Filebeat: what they are, how they work, and how to configure them on a RHEL machine.
I hope you find this post useful……
Please write comments if you find anything is incorrect, or you want to share more information about the above topic as discussed.
See you soon with the next post……stay updated… Happy Learning.