Logstash & Real-time analysis of Web logs | Cloud computing

Posted by Rose.S on Wed, 05 Jan 2022 10:09:34 +0100

1. Install Logstash

1.1 Problem

This case requires:

  • Create a virtual machine and install logstash
  • Minimum configuration: 2 CPUs, 2 GB memory, 10 GB disk
  • Virtual machine IP: 192.168.1.47 (hostname: logstash)

1.2 Steps

To implement this case, follow the steps below.

Step 1: install logstash

1) Configure the hostname, IP address, and yum repository, then edit /etc/hosts

[root@logstash ~]# vim /etc/hosts
192.168.1.41    es-0001
192.168.1.42    es-0002
192.168.1.43    es-0003
192.168.1.44    es-0004
192.168.1.45    es-0005
192.168.1.46    kibana
192.168.1.47    logstash

2) Install java-1.8.0-openjdk and logstash

[root@logstash ~]# yum -y install java-1.8.0-openjdk logstash
[root@logstash ~]# java -version
openjdk version "1.8.0_161"
OpenJDK Runtime Environment (build 1.8.0_161-b14)
OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
[root@logstash ~]# ln -s /etc/logstash /usr/share/logstash/config 
[root@logstash ~]# vim /etc/logstash/conf.d/my.conf
input { 
  stdin {}
}
filter{ }
output{ 
  stdout{}
}
[root@logstash ~]# /usr/share/logstash/bin/logstash

2. Write logstash configuration file

2.1 Problem

This case requires:

  • Write logstash configuration file
  • Standard input adopts json encoding format
  • The standard output adopts rubydebug coding format
  • Start logstash validation

2.2 Steps

To implement this case, follow the steps below.

Step 1: the codec plug-in

1) The codec plug-in

[root@logstash ~]# vim /etc/logstash/conf.d/my.conf
input { 
  stdin { codec => "json" }
}
filter{ }
output{ 
  stdout{ codec => "rubydebug" }
}
[root@logstash ~]# /usr/share/logstash/bin/logstash
Settings: Default pipeline workers: 2
Pipeline main started
a
{
       "message" => "a",
           "tags" => [
          [0] "_jsonparsefailure"
],
      "@version" => "1",
    "@timestamp" => "2020-05-23T12:34:51.250Z",
          "host" => "logstash"
}
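Note that the input `a` is not valid JSON, so the event is tagged `_jsonparsefailure` and the raw line is kept in `message`; a valid JSON line would instead be decoded into fields. Conceptually (this is a rough Python sketch, not Logstash's actual implementation), the json codec behaves like this:

```python
import json

def decode_event(line):
    """Rough sketch of a json codec: parse if possible, tag on failure."""
    try:
        parsed = json.loads(line)
        if isinstance(parsed, dict):
            return parsed           # valid JSON object: fields become the event
    except json.JSONDecodeError:
        pass
    # not valid JSON: keep the raw line and tag the parse failure
    return {"message": line, "tags": ["_jsonparsefailure"]}

print(decode_event('a'))                 # → {'message': 'a', 'tags': ['_jsonparsefailure']}
print(decode_event('{"user": "bob"}'))   # → {'user': 'bob'}
```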

3. Logstash input plug-in

3.1 Problem

This case requires:

  • Write logstash configuration file
  • Read the data from the file and display it on the screen
  • Start logstash validation

3.2 Steps

To implement this case, follow the steps below.

Step 1: the file plug-in

1) The file plug-in

[root@logstash ~]# vim /etc/logstash/conf.d/my.conf
input { 
  file {
    path => ["/tmp/a.log", "/tmp/b.log"]
    type => "testlog"
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb"
  }
}
filter{ }
output{ 
  stdout{ codec => "rubydebug" }
}
[root@logstash ~]# rm -rf /var/lib/logstash/plugins/inputs/file/.sincedb_*
[root@logstash ~]# touch /tmp/a.log /tmp/b.log
[root@logstash ~]# /usr/share/logstash/bin/logstash

Open another terminal and write some data:

[root@logstash ~]#  echo a1 >> /tmp/a.log 
[root@logstash ~]#  echo b1 >> /tmp/b.log

Back on the first terminal, the output appears:

[root@logstash ~]# /usr/share/logstash/bin/logstash
Settings: Default pipeline workers: 2
Pipeline main started
{
       "message" => "a1",
      "@version" => "1",
    "@timestamp" => "2019-03-12T03:40:24.111Z",
          "path" => "/tmp/a.log",
          "host" => "logstash",
          "type" => "testlog"
}
{
       "message" => "b1",
      "@version" => "1",
    "@timestamp" => "2019-03-12T03:40:49.167Z",
          "path" => "/tmp/b.log",
          "host" => "logstash",
          "type" => "testlog"
}
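The `sincedb` file records how far into each file Logstash has already read, so a restart resumes where it left off instead of re-reading everything. The offset-tracking idea can be sketched in Python (file names and the in-memory `offsets` dict are illustrative; real sincedb persists offsets to disk and tracks files by inode):

```python
def read_new_lines(path, offsets):
    """Return lines appended since the last call; offsets mimics sincedb."""
    with open(path) as f:
        f.seek(offsets.get(path, 0))   # resume at the recorded offset
        lines = f.read().splitlines()
        offsets[path] = f.tell()       # remember the new offset
    return lines

# usage sketch
offsets = {}
with open("/tmp/a.log", "w") as f:
    f.write("a1\n")
print(read_new_lines("/tmp/a.log", offsets))   # → ['a1']
with open("/tmp/a.log", "a") as f:
    f.write("a2\n")
print(read_new_lines("/tmp/a.log", offsets))   # → ['a2']  (only the new line)
```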

4. Web log parsing experiment

4.1 Problem

This case requires:

  • Web log parsing experiment
  • Copy a web log and add it to the file
  • Use grok to match each field of the log and convert the log into JSON format

4.2 Steps

To implement this case, follow the steps below.

Step 1: the filter grok plug-in

The grok plug-in:

A plug-in for parsing all kinds of unstructured log data.

grok uses regular expressions to turn unstructured data into structured data.

For group matching, the regular expression must be written for the specific data format.

Although such expressions are hard to write, they are widely applicable.

We will parse Apache logs; if httpd is already installed from an earlier exercise, there is no need to reinstall it.

After the site is accessed from a browser, logs appear in /var/log/httpd/access_log:

[root@es-0005 ~]# cat /var/log/httpd/access_log
192.168.1.254 - - [12/Mar/2019:11:51:31 +0800] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[root@logstash ~]# vim /etc/logstash/conf.d/my.conf
input{
    file {
      path           => [ "/tmp/a.log", "/tmp/b.log" ]
      sincedb_path   => "/var/lib/logstash/sincedb"
      start_position => "beginning"
      type           => "testlog"
   }
}
filter{
    grok{
       match => [ "message",  "(?<key>reg)" ]
    }
}
output{
    stdout{ codec => "rubydebug" }
}
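The `(?<key>reg)` form above is ordinary regular-expression named-capture syntax: `key` becomes the field name and `reg` is the pattern to match. Python spells the same construct `(?P<name>...)`; the pattern and sample line below are illustrative:

```python
import re

# capture the leading IP address of a log line into a field named "clientip"
pattern = r"(?P<clientip>\d+\.\d+\.\d+\.\d+)"
line = '192.168.1.254 - - [12/Mar/2019:11:51:31 +0800] "GET /favicon.ico HTTP/1.1" 404 209'

m = re.search(pattern, line)
print(m.group("clientip"))   # → 192.168.1.254
```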

Copy a log line from /var/log/httpd/access_log into /tmp/c.log on the logstash host:

[root@logstash ~]# echo '192.168.1.252 - - [29/Jul/2020:14:06:57 +0800] "GET /info.html HTTP/1.1" 200 119 "-" "curl/7.29.0"' >/tmp/c.log
[root@logstash ~]# vim /etc/logstash/conf.d/my.conf
input { 
  file {
    path => ["/tmp/c.log"]
    type => "test"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter{ 
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
  }
}
output{ 
  stdout{ codec => "rubydebug" }
}
[root@logstash ~]# /usr/share/logstash/bin/logstash

Find the directory of the predefined pattern (macro) files:

[root@logstash ~]# cd /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns
[root@logstash patterns]# grep COMBINEDAPACHELOG httpd
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}
[root@logstash ~]# vim /etc/logstash/conf.d/my.conf
...
filter{
  grok{
    match => ["message", "%{HTTPD_COMBINEDLOG}"]
  }
}
...

The parsed result:

[root@logstash ~]# /usr/share/logstash/bin/logstash
Settings: Default pipeline workers: 2
Pipeline main started
{
        "message" => "192.168.1.254 - - [15/Sep/2018:18:25:46 +0800] \"GET /noindex/css/open-sans.css HTTP/1.1\" 200 5081 \"http://192.168.1.65/\" \"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0\"",
       "@version" => "1",
     "@timestamp" => "2018-09-15T10:55:57.743Z",
           "path" => "/tmp/a.log",
           "host" => "logstash",
           "type" => "testlog",
       "clientip" => "192.168.1.254",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "15/Sep/2019:18:25:46 +0800",
           "verb" => "GET",
        "request" => "/noindex/css/open-sans.css",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "5081",
       "referrer" => "\"http://192.168.1.65/\"",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0\""
}
...
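HTTPD_COMBINEDLOG is itself just a regular expression assembled from smaller macros like those in the patterns file. A hand-written, simplified equivalent for the common fields (a sketch only; the real grok pattern is more permissive):

```python
import re

# simplified stand-in for grok's HTTPD_COMBINEDLOG pattern
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d+) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

line = ('192.168.1.252 - - [29/Jul/2020:14:06:57 +0800] '
        '"GET /info.html HTTP/1.1" 200 119 "-" "curl/7.29.0"')

fields = COMBINED.match(line).groupdict()
print(fields["verb"], fields["request"], fields["response"])   # → GET /info.html 200
```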

5. Deploy beats and filebeat

5.1 problems

This case requires:

  • Connect the whole ELK pipeline end to end
  • Install and configure beats plug-in on logstash
  • Installing filebeat on a web server
  • Use filebeat to collect web logs and send them to logstash
  • Convert the log into json format and store it in elasticsearch

5.2 Steps

To implement this case, follow the steps below.

Step 1: configure the beats plug-in on logstash

1) Add a beats input and an elasticsearch output to the logstash configuration

[root@logstash ~]# vim /etc/logstash/conf.d/my.conf
input { 
  stdin { codec => "json" }
  file{
    path => ["/tmp/c.log"]
    type => "test"
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb"
  }
  beats {
    port => 5044
  }
} 
filter{ 
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
  }
} 
output{ 
  stdout{ codec => "rubydebug" }
  elasticsearch {
    hosts => ["es-0004:9200", "es-0005:9200"]
    index => "weblog-%{+YYYY.MM.dd}"
  }
}
[root@logstash ~]# /usr/share/logstash/bin/logstash
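The `%{+YYYY.MM.dd}` in the index name is a Joda-style date format expanded from each event's `@timestamp`, so one index is created per day (e.g. `weblog-2020.05.23`). The equivalent formatting in Python terms:

```python
from datetime import datetime, timezone

def weblog_index(ts):
    """Daily index name, mirroring index => "weblog-%{+YYYY.MM.dd}"."""
    return ts.strftime("weblog-%Y.%m.%d")

print(weblog_index(datetime(2020, 5, 23, tzinfo=timezone.utc)))   # → weblog-2020.05.23
```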

2) Install filebeat on the host where Apache was previously installed

[root@web ~]# yum install -y filebeat
[root@web ~]# vim /etc/filebeat/filebeat.yml
24:  enabled: true
28:  - /var/log/httpd/access_log
45:    fields: 
46:       my_type: apache
148, 150: comment out these lines
161: output.logstash:
163:   hosts: ["192.168.1.47:5044"]
180, 181, 182: comment out these lines
[root@web ~]# grep -Pv "^\s*(#|$)" /etc/filebeat/filebeat.yml
[root@web ~]# systemctl enable --now filebeat

Exercise

1. What is Kibana?

A data visualization platform.

2. Which Logstash plug-ins were used?

The codec plug-ins, the file plug-in, the tcp and udp plug-ins, the syslog plug-in, and the filter grok plug-in.

