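Zabbix is a widely used monitoring tool on Linux. It is very flexible and supports a rich set of data types, such as unsigned integers, floating-point numbers, logs, and text. All we need to do is collect the data with scripts; Zabbix then gathers it, draws the graphs, and lets us set alert thresholds. This article walks through monitoring Memcached, PHP-FPM, Tomcat, Nginx, MySQL, and website logs with Zabbix.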
Memcached monitoring. Custom key:
UserParameter=memcached.stat[*],/data/sh/memcached-status.sh "$1"
Contents of the memcached-status.sh script:
- #!/bin/bash
- item=$1
- ip=127.0.0.1
- port=11211
- (echo "stats";sleep 0.5) | telnet $ip $port 2>/dev/null | grep "STAT $item\b" | awk '{print $3}'
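The grep/awk step in the script above picks one `STAT <name> <value>` line out of the `stats` reply. A minimal Python sketch of the same extraction follows; the function name `parse_memcached_stat` and the sample text are hypothetical and only for illustration, and fetching the reply from the server is left out.

```python
def parse_memcached_stat(stats_text, item):
    """Return the value of one STAT line as a string, or None if absent."""
    for line in stats_text.splitlines():
        parts = line.split()
        # Lines of interest have the form: STAT <name> <value>
        if len(parts) == 3 and parts[0] == "STAT" and parts[1] == item:
            return parts[2]
    return None


# Illustrative sample text, not real output from a server.
sample = "STAT pid 1234\r\nSTAT curr_connections 10\r\nSTAT get_hits 42\r\nEND\r\n"
print(parse_memcached_stat(sample, "get_hits"))
```

Matching the whole name field avoids a partial match such as `get` hitting `get_hits`, which is what the `\b` in the grep pattern guards against.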
Import the template (memcached Zabbix template download).
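PHP-FPM monitoring. First enable the PHP-FPM status page: open the php-fpm.conf configuration file, add the status-page setting (the script below expects the page at /fpm_status, which would correspond to the pm.status_path directive), and restart PHP-FPM.
Custom key: UserParameter=php-fpm[*],/data/sh/php-fpm-status.sh "$1"
Contents of the php-fpm-status.sh script: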
- #!/bin/bash
- ##################################
- # Zabbix monitoring script
- #
- # php-fpm:
- # - anything available via FPM status page
- #
- ##################################
- # Contact:
- # vincent.viallet@gmail.com
- ##################################
- # ChangeLog:
- # 20100922 VV initial creation
- ##################################
- # Zabbix requested parameter
- ZBX_REQ_DATA="$1"
- # FPM defaults
- URL="http://localhost/fpm_status"
- WGET_BIN="/usr/bin/wget"
- #
- # Error handling:
- # - need to be displayable in Zabbix (avoid NOT_SUPPORTED)
- # - items need to be of type "float" (allow negative + float)
- #
- ERROR_NO_ACCESS_FILE="-0.9900"
- ERROR_NO_ACCESS="-0.9901"
- ERROR_WRONG_PARAM="-0.9902"
- ERROR_DATA="-0.9903" # either can not connect / bad host / bad port
- # save the FPM stats in a variable for future parsing
- FPM_STATS=$($WGET_BIN -q $URL -O - 2> /dev/null)
- # error during retrieve
- if [ $? -ne 0 -o -z "$FPM_STATS" ]; then
- echo $ERROR_DATA
- exit 1
- fi
- #
- # Extract data from FPM stats
- #
- RESULT=$(echo "$FPM_STATS" | sed -n -r "s/^$ZBX_REQ_DATA: +([0-9]+)/\1/p")
- if [ $? -ne 0 -o -z "$RESULT" ]; then
- echo $ERROR_WRONG_PARAM
- exit 1
- fi
- echo $RESULT
- exit 0
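The sed expression in the script above pulls a numeric value out of a `name: value` line on the status page. The sketch below shows the same idea in Python, assuming the plain-text status format; `parse_fpm_status` and the sample text are hypothetical names and data used only for illustration.

```python
import re


def parse_fpm_status(status_text, item):
    """Return the integer value of a 'name: value' line, or None."""
    pattern = re.compile(
        r"^" + re.escape(item) + r": +([0-9]+)\s*$", re.MULTILINE
    )
    match = pattern.search(status_text)
    return int(match.group(1)) if match else None


# Illustrative sample, shaped like the plain-text FPM status page.
sample = (
    "pool:                 www\n"
    "process manager:      dynamic\n"
    "accepted conn:        12073\n"
    "listen queue:         0\n"
    "active processes:     1\n"
)
print(parse_fpm_status(sample, "accepted conn"))
```

Non-numeric fields such as `pool` return None here, which plays the role of the script's ERROR_WRONG_PARAM branch.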
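Import the template (php-fpm Zabbix template download).
Tomcat monitoring. When I first decided to monitor Tomcat I used JMX, but it is complicated to set up and demanding on the firewall, since several ports have to be opened. So I fell back on Tomcat's built-in status page.
Custom key: UserParameter=tomcat.status[*],/data/sh/tomcat-status.py $1
Because the status page has to be parsed as XML, Python is the more convenient choice.
Contents of the /data/sh/tomcat-status.py script: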
- #!/usr/bin/python
- import urllib2
- import xml.dom.minidom
- import sys
- url = 'http://127.0.0.1:8080/manager/status?XML=true'
- username = 'username'
- password = 'password'
- passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
- passman.add_password(None, url, username, password)
- authhandler = urllib2.HTTPBasicAuthHandler(passman)
- opener = urllib2.build_opener(authhandler)
- urllib2.install_opener(opener)
- pagehandle = urllib2.urlopen(url)
- xmlData = pagehandle.read()
- doc = xml.dom.minidom.parseString(xmlData)
- item = sys.argv[1]
- if item == "memory.free":
-     print doc.getElementsByTagName("memory")[0].getAttribute("free")
- elif item == "memory.total":
-     print doc.getElementsByTagName("memory")[0].getAttribute("total")
- elif item == "memory.max":
-     print doc.getElementsByTagName("memory")[0].getAttribute("max")
- elif item == "threadInfo.maxThreads":
-     print doc.getElementsByTagName("threadInfo")[0].getAttribute("maxThreads")
- elif item == "threadInfo.currentThreadCount":
-     print doc.getElementsByTagName("threadInfo")[0].getAttribute("currentThreadCount")
- elif item == "threadInfo.currentThreadsBusy":
-     print doc.getElementsByTagName("threadInfo")[0].getAttribute("currentThreadsBusy")
- elif item == "requestInfo.maxTime":
-     print doc.getElementsByTagName("requestInfo")[0].getAttribute("maxTime")
- elif item == "requestInfo.processingTime":
-     print doc.getElementsByTagName("requestInfo")[0].getAttribute("processingTime")
- elif item == "requestInfo.requestCount":
-     print doc.getElementsByTagName("requestInfo")[0].getAttribute("requestCount")
- elif item == "requestInfo.errorCount":
-     print doc.getElementsByTagName("requestInfo")[0].getAttribute("errorCount")
- elif item == "requestInfo.bytesReceived":
-     print doc.getElementsByTagName("requestInfo")[0].getAttribute("bytesReceived")
- elif item == "requestInfo.bytesSent":
-     print doc.getElementsByTagName("requestInfo")[0].getAttribute("bytesSent")
- else:
-     print "unsupported item."
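This script targets Tomcat 7. I have not tried it on Tomcat 6; the differences should be in the status page URL and in how the manager user and password are configured. For the script to run, a user with at least the manager-status role must be added to tomcat-users.xml.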
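Every item key handled by the Tomcat script has the form `tag.attribute`, so the lookup can also be written generically. The following is a minimal Python 3 sketch of that idea, not the author's script: `tomcat_status_value` is a hypothetical name, the sample XML is made up for illustration, and fetching the page with authentication is omitted.

```python
import xml.etree.ElementTree as ET


def tomcat_status_value(xml_text, item):
    """Look up an item of the form 'tag.attribute' (e.g. 'memory.free')."""
    tag, _, attribute = item.partition(".")
    if not tag or not attribute:
        return None
    root = ET.fromstring(xml_text)
    # First matching element in document order.
    element = root.find(".//" + tag)
    if element is None:
        return None
    return element.get(attribute)


# Illustrative XML with the element and attribute names used by the script.
sample_xml = (
    '<status><jvm><memory free="100" total="200" max="400"/></jvm>'
    '<connector name="http-8080">'
    '<threadInfo maxThreads="200" currentThreadCount="10" currentThreadsBusy="2"/>'
    '<requestInfo maxTime="50" processingTime="1200" requestCount="30" '
    'errorCount="1" bytesReceived="0" bytesSent="4096"/>'
    "</connector></status>"
)
print(tomcat_status_value(sample_xml, "memory.free"))
```

Unknown tags or attributes return None, which a wrapper could map to the "unsupported item" message.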
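Import the template (tomcat Zabbix template download).
Nginx monitoring. Configure the Nginx status page by adding the following inside server{} in the Nginx configuration file: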
- location /nginx_status {
- stub_status on;
- access_log off;
- }
Custom key: UserParameter=nginx[*],/data/sh/nginx-status.sh "$1"
Contents of the nginx-status.sh script:
- #!/bin/bash
- ##################################
- # Zabbix monitoring script
- #
- # nginx:
- # - anything available via nginx stub-status module
- #
- ##################################
- # Contact:
- # vincent.viallet@gmail.com
- ##################################
- # ChangeLog:
- # 20100922 VV initial creation
- ##################################
- # Zabbix requested parameter
- ZBX_REQ_DATA="$1"
- ZBX_REQ_DATA_URL="$2"
- # Nginx defaults
- URL="http://127.0.0.1/nginx_status"
- WGET_BIN="/usr/bin/wget"
- #
- # Error handling:
- # - need to be displayable in Zabbix (avoid NOT_SUPPORTED)
- # - items need to be of type "float" (allow negative + float)
- #
- ERROR_NO_ACCESS_FILE="-0.9900"
- ERROR_NO_ACCESS="-0.9901"
- ERROR_WRONG_PARAM="-0.9902"
- ERROR_DATA="-0.9903" # either can not connect / bad host / bad port
- # save the nginx stats in a variable for future parsing
- NGINX_STATS=$($WGET_BIN -q $URL -O - 2> /dev/null)
- # error during retrieve
- if [ $? -ne 0 -o -z "$NGINX_STATS" ]; then
- echo $ERROR_DATA
- exit 1
- fi
- #
- # Extract data from nginx stats
- #
- case $ZBX_REQ_DATA in
- active_connections) echo "$NGINX_STATS" | head -1 | cut -f3 -d' ';;
- accepted_connections) echo "$NGINX_STATS" | grep -Ev '[a-zA-Z]' | cut -f2 -d' ';;
- handled_connections) echo "$NGINX_STATS" | grep -Ev '[a-zA-Z]' | cut -f3 -d' ';;
- handled_requests) echo "$NGINX_STATS" | grep -Ev '[a-zA-Z]' | cut -f4 -d' ';;
- reading) echo "$NGINX_STATS" | tail -1 | cut -f2 -d' ';;
- writing) echo "$NGINX_STATS" | tail -1 | cut -f4 -d' ';;
- waiting) echo "$NGINX_STATS" | tail -1 | cut -f6 -d' ';;
- *) echo $ERROR_WRONG_PARAM; exit 1;;
- esac
- exit 0
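The head/tail/cut pipeline above depends on the fixed four-line layout of the stub_status page. As a rough illustration, the same seven values can be extracted with regular expressions in Python; `parse_stub_status` is a hypothetical name, the sample text is illustrative, and the dictionary keys simply mirror the item names used by the script.

```python
import re


def parse_stub_status(text):
    """Parse stub_status output into a dict of integers, or None on mismatch."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    # The counters line holds three numbers: accepts, handled, requests.
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    if not (active and counters and states):
        return None
    return {
        "active_connections": int(active.group(1)),
        "accepted_connections": int(counters.group(1)),
        "handled_connections": int(counters.group(2)),
        "handled_requests": int(counters.group(3)),
        "reading": int(states.group(1)),
        "writing": int(states.group(2)),
        "waiting": int(states.group(3)),
    }


# Illustrative sample in the stub_status layout.
sample = (
    "Active connections: 291 \n"
    "server accepts handled requests\n"
    " 16630948 16630948 31070465 \n"
    "Reading: 6 Writing: 179 Waiting: 106 \n"
)
stats = parse_stub_status(sample)
print(stats["handled_requests"])
```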
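Import the template (nginx Zabbix template download).
MySQL monitoring. Zabbix supports MySQL monitoring out of the box, with a ready-made template and ready-made keys. All we need to do is create a .my.cnf file in /var/lib/zabbix with the following contents: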
- [client]
- host=127.0.0.1
- port=1036
- user=root
- password=root
Website log monitoring. First configure the log format. Assuming your web server is Nginx, add a log format like this:
- log_format withHost '$remote_addr\t$remote_user\t$time_local\t$host\t$request\t'
- '$status\t$body_bytes_sent\t$http_referer\t'
- '$http_user_agent';
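Tabs are used as the separator so that awk can identify the columns reliably. Then set a global access log, so that individual server blocks do not need their own log settings: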
access_log /data/home/logs/nginx/$host.log withHost;
Fetch the last minute of logs on a schedule by setting up a cron job:
* * * * * /data/sh/get_nginx_access.sh
Contents of the script:
- #!/bin/bash
- logDir=/data/home/logs/nginx/
- logNames=`ls ${logDir}/*.*.log |awk -F"/" '{print $NF}'`
- for logName in $logNames;
- do
- # set variables
- split_log="/tmp/split_$logName"
- access_log="${logDir}/$logName"
- status_log="/tmp/$logName"
- # extract the last minute of log entries
- tac $access_log | awk '
- BEGIN{
- FS="\t"
- OFS="\t"
- cmd="date -d \"1 minute ago\" +%H%M%S"
- cmd|getline oneMinuteAgo
- }
- {
- $3 = substr($3,13,8)
- gsub(":","",$3)
- if ($3>=oneMinuteAgo){
- print
- } else {
- exit;
- }
- }' > $split_log
- # count occurrences of each status code
- awk -F'\t' '{
- status[$4" "$6]++
- }
- END{
- for (i in status)
- {
- print i,status[i]
- }
- }
- ' $split_log > $status_log
- done
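The cron job runs every minute because the monitoring interval is one minute. It extracts the last minute of log entries for each domain and counts every status code per domain, so that Zabbix can pick up the data it needs.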
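The counting step can be illustrated in Python as well. This sketch assumes the tab-separated withHost format defined earlier, where the fourth field is the host and the sixth is the status code; `count_status_codes` and the sample lines are hypothetical and only for illustration.

```python
from collections import Counter


def count_status_codes(lines):
    """Count (host, status) pairs in tab-separated access log lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip malformed lines
        host, status = fields[3], fields[5]
        counts[(host, status)] += 1
    return counts


# Illustrative lines: addr, user, time, host, request, status, bytes, referer, agent.
sample_lines = [
    "1.2.3.4\t-\t101500\twww.example.com\tGET / HTTP/1.1\t200\t512\t-\tcurl\n",
    "1.2.3.4\t-\t101501\twww.example.com\tGET /a HTTP/1.1\t404\t0\t-\tcurl\n",
    "1.2.3.5\t-\t101502\twww.example.com\tGET / HTTP/1.1\t200\t512\t-\tcurl\n",
    "broken line\n",
]
counts = count_status_codes(sample_lines)
for (host, status), number in sorted(counts.items()):
    print(host, status, number)
```

Each printed line has the same `domain code count` shape that the status file in /tmp holds.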
Custom keys:
- UserParameter=nginx.detect,/data/sh/nginx-detect.sh
- UserParameter=nginx.access[*],awk -v sum=0 -v domain=$1 -v code=$2 '{if($$1 == domain && $$2 == code ){sum+=$$3} }END{print sum}' /tmp/$1.log
- UserParameter=nginx.log[*],awk -F'\t' -v domain=$1 -v code=$2 -v number=$3 -v sum=0 -v line="" '{if ($$4 == domain && $$6 == code ){sum++;line=line$$5"\n" }}END{if (sum > number) print line}' /tmp/split_$1.log | sort | uniq -c | sort -nr | head -10 | sed -e 's/^/<p>/' -e 's/$/<\/p>/'
Contents of the nginx-detect.sh script:
- #!/bin/bash
- function json_head {
- printf "{"
- printf "\"data\":["
- }
- function json_end {
- printf "]"
- printf "}"
- }
- function check_first_element {
- if [[ $FIRST_ELEMENT -ne 1 ]]; then
- printf ","
- fi
- FIRST_ELEMENT=0
- }
- FIRST_ELEMENT=1
- json_head
- logNames=`ls /data/home/logs/nginx/*.*.log |awk -F"/" '{print $NF}'`
- for logName in $logNames;
- do
- while read domain code count;do
- check_first_element
- printf "{"
- printf "\"{#DOMAIN}\":\"$domain\",\"{#CODE}\":\"$code\""
- printf "}"
- done < /tmp/$logName
- done
- json_end
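Three keys are defined here: nginx.detect discovers all domains and their status codes, nginx.access[*] counts the occurrences of a given status code for a given domain, and nginx.log[*] outputs the ten most frequent URLs when a given status code for a given domain exceeds a given count. Monitoring the Nginx access logs relies on Zabbix's automatic discovery feature: when a domain is added, no script changes are needed, and Zabbix discovers and monitors the new domain automatically.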
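The discovery script assembles its JSON by hand with printf. As a hedged alternative sketch, the same `{"data": [...]}` structure with the `{#DOMAIN}` and `{#CODE}` macros can be produced with Python's json module, which takes care of quoting and commas; `build_discovery` and the sample pairs are hypothetical.

```python
import json


def build_discovery(pairs):
    """Build the discovery JSON from (domain, code) pairs."""
    data = [{"{#DOMAIN}": domain, "{#CODE}": code} for domain, code in pairs]
    return json.dumps({"data": data})


# Illustrative input; in practice the pairs would come from the status files.
pairs = [("www.example.com", "200"), ("www.example.com", "404")]
print(build_discovery(pairs))
```

Using a serializer avoids the first-element bookkeeping that the shell version needs in order to place commas correctly.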