0663-6.2.0 - Get CDSW login information through Nginx

Posted by NiteCloak on Fri, 13 Sep 2019 07:40:47 +0200

Fayson's GitHub: https://github.com/fayson/cdhproject

1 Purpose of This Document

Task background: We need to record audit information for CDSW logins, such as when a user logged in, whether the login succeeded or failed, and which user name was used.

Difficulty: CDSW itself does not currently record this information; it only keeps simple records of successful logins. After integrating with AD, the auditing could in theory be implemented on the AD side, but the enterprise AD team is not cooperating, so it has to be done on the CDSW side.

Task description: Because CDSW does not support this natively, the approach is to put Nginx in front of CDSW to forward the login page, capture the HTTP login requests in Nginx, and then parse those requests to obtain the audit information.

The task is divided into three stages:

1. Configure Nginx in front of CDSW, so that accessing the Nginx address and port opens the CDSW pages and tasks can be run as usual.

2. Adjust the Nginx configuration to capture the CDSW login information.

3. Write a Python or Shell script to parse the login information and save it to MySQL or Impala for query and analysis.

Test environment
1. RedHat 7.2
2. CDSW version 1.5
3. Nginx version 1.16.0
4. All operations are performed as the root user

2 Install and configure Nginx

1. Download the Nginx source package, extract it, then compile and install it.

The download address is as follows:

http://nginx.org/download/nginx-1.16.0.tar.gz

After extracting the archive, change into the source directory and run the configure script:

./configure --with-stream

Compile and install:

make && make install

The default Nginx installation directory is /usr/local/nginx

Start Nginx after installation by running /usr/local/nginx/sbin/nginx
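The steps above can be collected into one shell function. This is only a minimal sketch: it assumes wget, tar, a C compiler, and the PCRE and zlib development headers are already available on the host. The names install_nginx, NGINX_VERSION and PREFIX are illustrative and do not come from the original post.

```shell
#!/bin/bash
# Illustrative variables (not from the original post).
NGINX_VERSION="1.16.0"
PREFIX="/usr/local/nginx"   # default prefix when ./configure gets no --prefix

# Download, build and start Nginx; stops at the first failing step.
install_nginx() {
    local tarball="nginx-${NGINX_VERSION}.tar.gz"
    wget "http://nginx.org/download/${tarball}" || return 1
    tar -zxf "${tarball}" || return 1
    (
        # Build in a subshell so the caller's working directory is unchanged.
        cd "nginx-${NGINX_VERSION}" || exit 1
        ./configure --with-stream && make && make install
    ) || return 1
    # Running the binary without arguments starts the server.
    "${PREFIX}/sbin/nginx"
}
```

The function is only defined here; calling install_nginx as root performs the actual installation.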

2. Configure Nginx to forward HTTP requests to the CDSW web interface

Modify the Nginx configuration file /usr/local/nginx/conf/nginx.conf to configure forwarding for the CDSW pages:

server {
    listen       80;
    server_name  cdsw.macro.com;

    location / {
        proxy_pass http://cdsw.macro.com;
    }
}

Reload the Nginx configuration file

/usr/local/nginx/sbin/nginx -s reload
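Before reloading, it can help to let Nginx check the configuration file first, so that a typo does not go unnoticed. The following is a small sketch using the standard -t (test configuration) option; the function name safe_reload and the NGINX_BIN variable are illustrative.

```shell
#!/bin/bash
# Path to the nginx binary; override NGINX_BIN if it is installed elsewhere.
NGINX_BIN="${NGINX_BIN:-/usr/local/nginx/sbin/nginx}"

# Reload only when the configuration test passes.
safe_reload() {
    if "${NGINX_BIN}" -t; then
        "${NGINX_BIN}" -s reload
    else
        echo "nginx configuration test failed; not reloading" >&2
        return 1
    fi
}
```

Calling safe_reload after editing nginx.conf reloads the running server only if the syntax check succeeds.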

Modify the hosts file on the local Windows machine that is used to access the CDSW service.

Add the following line to the hosts file:

192.168.0.177 cdsw.macro.com

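The Nginx service is installed on the server 192.168.0.177, so the browser now reaches CDSW through Nginx. On the Nginx server itself, cdsw.macro.com must still resolve to the real CDSW address; otherwise proxy_pass would send requests back to Nginx.

Access the CDSW page after adding the entry:

After logging in, start a session and run the sample code; it executes successfully.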

3 Collect CDSW login information

1. Modify the configuration file of the Nginx service, /usr/local/nginx/conf/nginx.conf

Nginx supports custom log formats, so the access log can be adjusted to record the required login information. In the format used here, the first value is the time, the second is the request line, the third is the status code, the fourth is the request body, the fifth is the client IP address, the sixth is the referer (the page the request came from), and the seventh is the client browser information (user agent). Reload the Nginx configuration after the modification is complete.
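The exact log_format line is not preserved in this post, so the following is a reconstruction rather than the author's configuration. It assumes the seven values are separated by "|" (which is how the processing script later splits each line) and that the time is written with $time_iso8601 (which matches the character offsets that script uses). The format name cdsw_login and the function name print_log_format are hypothetical.

```shell
#!/bin/bash
# Print a candidate log_format snippet for the http { } block of nginx.conf.
print_log_format() {
    cat <<'EOF'
# "cdsw_login" is a hypothetical format name.
# Fields: time, request, status, request body, client IP, referer, user agent.
log_format cdsw_login '$time_iso8601|$request|$status|$request_body|'
                      '$remote_addr|$http_referer|$http_user_agent';
access_log logs/access.log cdsw_login;
EOF
}

print_log_format
```

Note that Nginx only fills $request_body in locations handled by proxy_pass (or the similar fastcgi/uwsgi/scgi directives) and only when the body was read into a memory buffer, which is normally the case for a small login request.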

2. Log in on the CDSW page, then check the Nginx log for the login information.

Log in on the page as a non-existent user

View the Nginx access log


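The log shows the login time, the request method, the status code 401 for the failed login, and the login account and password. Because the request body contains the password in plain text, access to the log file should be restricted. The next step is to write a script that processes the log and persists the login information to MySQL.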
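For a quick look before any database is involved, the login attempts can be counted per status code directly from the log. This is a rough sketch that assumes the "|"-separated layout with the status code in the third field and the word "authenticate" in login requests; the function name count_login_status is illustrative.

```shell
#!/bin/bash
# Count login requests per HTTP status code.
# Reads the files given as arguments, or standard input if none are given.
count_login_status() {
    awk -F'|' '/authenticate/ { count[$3]++ }
               END { for (s in count) print s, count[s] }' "$@" | sort
}
```

For example, count_login_status /usr/local/nginx/logs/access.log prints one line per status code followed by the number of matching login requests.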

3. The script is as follows:

#!/bin/bash
# Parse CDSW login requests from the Nginx access log and store them in MySQL.
HOSTNAME="192.168.0.178"
PORT="3306"
USERNAME="root"
PASSWORD="123456"
DBNAME="cdsw_login_info"
TABLENAME="login_info"
log_dir=/usr/local/nginx/logs/
log_name=$(date -d "yesterday" +"%Y%m%d")

# Convert the hex-escaped quotation marks (\x22) in the Nginx log back to normal
# quotation marks and write the result to a new log file named after the previous day
sed 's#\\x22#"#g' "${log_dir}access.log" > "${log_dir}${log_name}.log"
# Empty the Nginx log file so that each run only processes the previous day's log
: > "${log_dir}access.log"
# Read the new log file line by line (-r keeps backslashes intact)
while IFS= read -r line
do
if [[ $line =~ "authenticate" ]]; then
    # Split on "|": time|request|status|request body|client IP|referer|user agent
    IFS='|' read -r aa request bb cc source_ip referer user_agent <<< "$line"
    # Split the request body on double quotes; the user name is the 7th field
    IFS='"' read -r -a body_fields <<< "$cc"
    username=${body_fields[6]}
    # Turn the timestamp into "YYYY-MM-DD HH:MM:SS"
    login_time="${aa:0:10} ${aa:11:8}"
    # 1 = login succeeded, 0 = login failed, -1 = any other status code
    if [[ $bb = "200" ]]; then
        login_state=1
    elif [[ $bb = "401" ]]; then
        login_state=0
    else
        login_state=-1
    fi
    insert_sql="insert into ${DBNAME}.${TABLENAME}(source_ip,name,referer,user_agent,login_state,login_time) values('$source_ip','$username','$referer','$user_agent',$login_state,'$login_time')"
    mysql -h"${HOSTNAME}" -P"${PORT}" -u"${USERNAME}" -p"${PASSWORD}" -e "${insert_sql}"
fi
done < "${log_dir}${log_name}.log"
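The script builds the INSERT statement by placing values from the log directly between single quotes. The user name, referer and user agent are all supplied by the client, so a value containing a quote would break the statement or change its meaning. One possible mitigation, sketched here under the assumption that MySQL runs with its default handling of backslashes (NO_BACKSLASH_ESCAPES not enabled), is to escape each value first. The helper name sql_escape is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Escape a value for use inside a single-quoted MySQL string literal.
sql_escape() {
    local value=$1
    local sq="'"
    local bs='\'
    value=${value//"$bs"/"$bs$bs"}   # backslash    -> two backslashes
    value=${value//"$sq"/"$sq$sq"}   # single quote -> two single quotes
    printf '%s' "$value"
}
```

In the script, each interpolated value would then be wrapped before building the statement, for example username=$(sql_escape "$username").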


4. Run the script as a test and check the results in MySQL

Before running the script, create the database and table in MySQL:

CREATE DATABASE cdsw_login_info;
USE cdsw_login_info;

CREATE TABLE `login_info` (
  `id` int(5) NOT NULL AUTO_INCREMENT,
  `source_ip` varchar(32),
  `name` varchar(16),
  `referer` varchar(32),
  `user_agent` varchar(256),
  `login_state` int(1),
  `login_time` timestamp,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

After executing the script, the information in the table is as follows:

As shown above, the login account, the login result and the login time are stored in MySQL. SQL can then be used to compute further statistics, such as the number of logins per user.
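As one example of such a statistic, the failed logins per user can be counted from the login_info table. The query below is a sketch based on the table definition above, where login_state = 0 marks a failed login; the function name failed_login_sql is illustrative.

```shell
#!/bin/bash
# Print a query that counts failed logins (login_state = 0) per user name.
failed_login_sql() {
    cat <<'EOF'
SELECT name, COUNT(*) AS failed_logins
FROM cdsw_login_info.login_info
WHERE login_state = 0
GROUP BY name
ORDER BY failed_logins DESC;
EOF
}
```

It could be run with the same client as in the script, for example mysql -h192.168.0.178 -P3306 -uroot -p -e "$(failed_login_sql)".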

5. Configure a cron job that runs the script every day at 0:00 and appends any error output to an error log.

0 0 * * * /root/collect_login_info/nginx111.sh 2>> /root/collect_login_info/error.log
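Because the script copies access.log and then empties it, requests logged between those two steps are lost. A common alternative, shown here only as a sketch and not as the author's method, is to rename the log and ask Nginx to reopen its log files with the standard -s reopen signal. The names rotate_access_log, NGINX_BIN and LOG_DIR, and the .raw.log suffix, are illustrative.

```shell
#!/bin/bash
NGINX_BIN="${NGINX_BIN:-/usr/local/nginx/sbin/nginx}"
LOG_DIR="${LOG_DIR:-/usr/local/nginx/logs}"

# Move the current access log aside and let Nginx create a fresh one.
# Prints the path of the rotated file on success.
rotate_access_log() {
    local target
    target="${LOG_DIR}/$(date -d "yesterday" +"%Y%m%d").raw.log"
    mv "${LOG_DIR}/access.log" "${target}" || return 1
    # Nginx keeps writing to the renamed file until it reopens its logs.
    "${NGINX_BIN}" -s reopen || return 1
    echo "${target}"
}
```

The rotated file returned by rotate_access_log could then be fed to the sed and parsing steps in place of access.log, ideally after a short pause so that Nginx has finished switching files.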
