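I received a request from the R&D manager: produce statistics on the API endpoints that a particular service was asked for during a specific time window. The statistics have two dimensions:
1. the total number of requests per endpoint;
2. the number of requests per endpoint per minute, keeping the 20 busiest entries.
The results have to be presented in a specified format.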

A line in the Nginx access log looks like this:

- - [07/Sep/2016:18:23:22 +0800] "POST /bususer/flybusLogin.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"

It is produced by the following log format:

$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"

Here /bususer/flybusLogin.do is the requested API endpoint. The specified time range contains 772,528 records.
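To make the field positions concrete, here is a minimal sketch of pulling the method and the endpoint out of such a line with awk. The function name extract_api and the client address 192.0.2.10 are placeholders I made up for illustration; they are not part of the original data.

```shell
#!/usr/bin/env bash
# Print the request method and API path of each combined-format log line.
# With awk's default whitespace splitting, $6 is the quoted method ("POST)
# and $7 is the request path.
extract_api() {
    awk '{ method = $6; sub(/^"/, "", method); print method, $7 }'
}

# Hypothetical sample line; 192.0.2.10 stands in for the real client IP.
printf '%s\n' '192.0.2.10 - - [07/Sep/2016:18:23:22 +0800] "POST /bususer/flybusLogin.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"' | extract_api
# prints: POST /bususer/flybusLogin.do
```

This relies on the client address being present as the first field; if it is missing, every field number shifts by one.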

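My approach was:
1. Use vim to extract the records in the specified time range into a temporary file (sed works too);
2. Use scp to download that temporary file from the remote server to the local machine;
3. Use awk and bash to clean and extract the data;
4. Store the extracted data in a MariaDB database;
   4.1 Alternative: write the extracted data to another temporary file;
5. Once the data is loaded, analyze and aggregate it with SQL statements.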

The first two steps were done by hand; the remaining steps were carried out by a shell script.

Cleaning, extracting and loading the data took nearly 5 hours in total, which seriously hurt productivity.



item detail
OS CentOS Linux release 7.2.1511 (Core)
Kernel 3.10.0-327.28.3.el7.x86_64
CPU Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Memory 8GB







GoAccess's official website introduces it as follows:

> GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
>
> It provides fast and valuable HTTP statistics for system administrators that require a visual server report on the fly.

For its features and the web log formats it supports, see the linked page.

For frequently asked questions, see the linked page.


The latest GoAccess release at the time of writing is 1.0.2. The official installation guide is GoAccess - Downloads; the material below is taken from that guide.



The only dependency is ncurses.

Distro               NCurses             GeoIP (optional)   Tokyo Cabinet (optional)
Ubuntu/Debian        libncursesw5-dev    libgeoip-dev       libtokyocabinet-dev
Fedora/RHEL/CentOS   ncurses-devel       geoip-devel        tokyocabinet-devel
Arch Linux           ncurses             geoip              compile from source
Gentoo               sys-libs/ncurses    dev-libs/geoip     dev-db/tokyocabinet
Slackware            ncurses             GeoIP              tokyocabinet

You may need to install tools like gcc, make, etc for compiling/building software from source. e.g., base-devel, build-essential, "Development Tools".


# Fedora/RHEL/CentOS
sudo yum install -y epel-release
sudo yum install -y gcc make ncurses ncurses-devel geoip geoip-devel tokyocabinet tokyocabinet-devel

# Ubuntu/Debian
sudo apt-get install libncursesw5-dev libgeoip-dev libtokyocabinet-dev

Via Compile Installation

For the build-from-source steps, see the linked page.


# Create the directory /usr/local/bin
/usr/bin/mkdir -p '/usr/local/bin'
# Copy the goaccess executable into it
/usr/bin/install -c goaccess '/usr/local/bin'

# Create the directory /usr/local/etc
/usr/bin/mkdir -p '/usr/local/etc'
# Copy goaccess.conf into it with mode 644
/usr/bin/install -c -m 644 config/goaccess.conf '/usr/local/etc'

# Create the directory /usr/local/share/doc/goaccess
/usr/bin/mkdir -p '/usr/local/share/doc/goaccess'
# Copy the report resources into it
/usr/bin/install -c -m 644 resources/tpls.html resources/css/app.css resources/css/bootstrap.min.css resources/css/fa.min.css resources/js/app.js resources/js/charts.js resources/js/d3.v3.min.js resources/js/hogan.min.js '/usr/local/share/doc/goaccess'

# Create the directory /usr/local/share/man/man1
/usr/bin/mkdir -p '/usr/local/share/man/man1'
# Copy goaccess.1 into it with mode 644
/usr/bin/install -c -m 644 goaccess.1 '/usr/local/share/man/man1'



sudo make uninstall


cd '/usr/local/bin' && rm -f goaccess
cd '/usr/local/etc' && rm -f goaccess.conf
cd '/usr/local/share/doc/goaccess' && rm -f tpls.html app.css bootstrap.min.css fa.min.css app.js charts.js d3.v3.min.js hogan.min.js
cd '/usr/local/share/man/man1' && rm -f goaccess.1

Configure Options

Multiple options can be used to configure GoAccess. For a complete up-to-date list of configure options, run ./configure --help

  • --enable-debug: Compile with debugging symbols and turn off compiler optimizations. Default is disabled
  • --enable-utf8: Compile with wide character support. Ncursesw is required.
  • --enable-geoip: Compile with GeoLocation support. MaxMind’s GeoIP is required. Default is disabled
  • --enable-tcb=<memhash|btree>: Compile with Tokyo Cabinet storage support. memhash will utilize Tokyo Cabinet’s on-memory hash database. btree will utilize Tokyo Cabinet’s on-disk B+ Tree database. Default is disabled
  • --disable-zlib: Disable zlib compression on B+ Tree database.
  • --disable-bzip: Disable bzip2 compression on B+ Tree database.
  • --with-getline: Use GNU getline() to parse full line requests instead of a fixed size buffer of 4096.

Other options:

  • --disable-option-checking: ignore unrecognized --enable/--with options
  • --disable-FEATURE: do not include FEATURE (same as --enable-FEATURE=no)
  • --enable-FEATURE[=ARG]: include FEATURE [ARG=yes]
  • --enable-silent-rules: less verbose build output (undo: "make V=1")
  • --disable-silent-rules: verbose build output (undo: "make V=0")
  • --enable-dependency-tracking: do not reject slow dependency extractors
  • --disable-dependency-tracking: speeds up one-time build

Running yum info tokyocabinet shows the following description:

> Tokyo Cabinet is a library of routines for managing a database. It is the successor of QDBM. Tokyo Cabinet runs very fast. For example, the time required to store 1 million records is 1.5 seconds for a hash database and 2.2 seconds for a B+ tree database. Moreover, the database size is very small and can be up to 8EB. Furthermore, the scalability of Tokyo Cabinet is great.


# Store data in memory (--enable-tcb=memhash)

# Store data on disk (--enable-tcb=btree)

Note: storing data on disk is slower than storing it in memory.

> A dataset of about 52M hits (12GB size) is parsed in 20 mins (in-memory), 60 mins (on-disk storage). – https://goaccess.io/faq


# Way 1: Directly Download

## via curl
curl -# -o /tmp/goaccess.tar.gz http://tar.goaccess.io/goaccess-1.0.2.tar.gz
cd /tmp && tar -xzf goaccess.tar.gz -C /tmp
## via wget
# wget -q http://tar.goaccess.io/goaccess-1.0.2.tar.gz -P /tmp
# cd /tmp && tar -xzf goaccess-1.0.2.tar.gz -C /tmp

mv goaccess-1.0.2 goaccess
cd ./goaccess

# Way 2: Build from GitHub (Development)
git clone https://github.com/allinurl/goaccess.git /tmp/goaccess # clone into a temporary directory
cd /tmp/goaccess
autoreconf -fiv

# compile
# ./configure --prefix=/usr/local --sysconfdir=/etc/goaccess --enable-geoip --enable-utf8
./configure --prefix=/usr/local --sysconfdir=/etc/goaccess --enable-geoip --enable-utf8 --enable-tcb=memhash --with-getline
make -j 4   #Specifies the number of jobs (commands) to run simultaneously.
sudo make install


[flying@lempstacker ~]$ which goaccess
[flying@lempstacker ~]$ ls /etc/goaccess/goaccess.conf
[flying@lempstacker ~]$ goaccess -V
GoAccess - 1.0.2.
For more details visit: http://goaccess.io
Copyright (C) 2009-2016 by Gerardo Orellana
[flying@lempstacker ~]$

Via Package Manager Installation

GoAccess can also be installed through each GNU/Linux distribution's package manager; see the linked page for details.


sudo yum install -y epel-release
sudo yum install -y goaccess # available in the EPEL repository
  • Official GoAccess’ Debian/Ubuntu Repository: the official repository provides the latest GoAccess release.
echo "deb http://deb.goaccess.io/ $(lsb_release -cs) main" | sudo tee -a /etc/apt/sources.list.d/goaccess.list
wget -O - http://deb.goaccess.io/gnugpg.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install goaccess

Note: .deb packages in the official repo are available through https as well. You may need to install apt-transport-https.

Command Line / Config Options

The command-line options are described on the linked page.

The following options can be supplied to the command or specified in the configuration file. If specified in the configuration file, long options need to be used without prepending --.

Removing the query string with -q can greatly decrease memory consumption, especially on timestamped requests.
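The effect of query strings on the number of distinct requests is easy to see without GoAccess. The sketch below only illustrates the idea with sort and sed; the URLs are made-up examples and the function names are mine.

```shell
#!/usr/bin/env bash
# Count distinct lines on stdin.
count_distinct() {
    sort -u | wc -l | tr -d ' '
}

# Drop everything from the first "?" onward ("?" is literal in basic regex).
strip_query() {
    sed 's/?.*$//'
}

# Hypothetical requests that differ only in a timestamp parameter.
urls='/api/list.do?ts=1473243802
/api/list.do?ts=1473243803
/api/list.do?ts=1473243804'

printf '%s\n' "$urls" | count_distinct                 # prints 3
printf '%s\n' "$urls" | strip_query | count_distinct   # prints 1
```

Every distinct request string becomes its own entry to keep track of, so timestamped URLs multiply the entries while the stripped form collapses them into one.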


The CUSTOM LOG/DATE FORMAT section of the official GoAccess Manual Page briefly describes the configuration file and the time-format, date-format and log-format settings.


The configuration file resides under: %sysconfdir%/goaccess.conf or ~/.goaccessrc

Note: %sysconfdir% is either /etc/, /usr/etc/ or /usr/local/etc/




goaccess -f /tmp/api_access.log


| Log Format Configuration                                  |
| [SPACE] to toggle - [ENTER] to proceed                    |
|                                                           |
| [ ] NCSA Combined Log Format                              |
| [ ] NCSA Combined Log Format with Virtual Host            |
| [ ] Common Log Format (CLF)                               |
| [ ] Common Log Format (CLF) with Virtual Host             |
| [ ] W3C                                                   |
| [ ] Squid Native Format                                   |
|                                                           |
| Log Format - [c] to add/edit format                       |
|                                                           |
|                                                           |
| Date Format - [d] to add/edit format                      |
|                                                           |
|                                                           |
| Time Format - [t] to add/edit format                      |
|                                                           |


Edit the corresponding entries in the configuration file %sysconfdir%/goaccess.conf as shown below, or write them directly to ~/.goaccessrc:

# Apache/NGINX's log formats below.
time-format %H:%M:%S

# Apache/NGINX's log formats below.
date-format %d/%b/%Y

# NCSA Combined Log Format
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"

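Run goaccess -f /tmp/api_access.log again and GoAccess starts parsing the log automatically. When parsing finishes it displays the charts. For the keyboard shortcuts available in the terminal window, see the INTERACTIVE KEYS section of the official documentation.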


time-format is written as %T or %H:%M:%S. If the timestamp is given in microseconds, %f must be used instead. See man strftime for details.

The time-format variable followed by a space, specifies the log format time containing any combination of regular characters and special format specifiers. They all begin with a percentage (%) sign. See man strftime. Typical values are %T or %H:%M:%S.

Note: If a timestamp is given in microseconds, %f must be used as time-format

  • %H: The hour as a decimal number using a 24-hour clock (range 00 to 23).
  • %M: The minute as a decimal number (range 00 to 59).
  • %S: The second as a decimal number (range 00 to 60). (The range is up to 60 to allow for occasional leap seconds.)


See man strftime for details.

The date-format variable followed by a space, specifies the log format date containing any combination of regular characters and special format specifiers. They all begin with a percentage (%) sign. See man strftime.

Note: If a timestamp is given in microseconds, %f must be used as date-format

  • %b: The abbreviated month name according to the current locale.
  • %d: The day of the month as a decimal number (range 01 to 31).
  • %Y: The year as a decimal number including the century.
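To see what these specifiers produce, the same format string can be handed to date, which follows the strftime conventions. This is a small sketch that assumes GNU date (for the -d option); the function name format_like_nginx is only for illustration.

```shell
#!/usr/bin/env bash
# Render an instant with the date-format and time-format specifiers
# joined by ":" the way $time_local joins them.
# LC_ALL=C pins %b to English month abbreviations.
format_like_nginx() {
    LC_ALL=C date -u -d "$1" '+%d/%b/%Y:%H:%M:%S'
}

format_like_nginx '2016-09-07 18:23:22'
# prints: 07/Sep/2016:18:23:22
```

If the output has the same shape as the timestamp in the access log, the date-format and time-format values should fit that log.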


The log-format variable followed by a space or \t for tab-delimited, specifies the log format string.
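As a rough illustration of how the specifiers in the combined log-format line up with a real log line, the awk sketch below labels each part of a sample line with the specifier that would consume it. It is my own approximation of the mapping, not how GoAccess parses internally; describe_line and the address 192.0.2.10 are hypothetical.

```shell
#!/usr/bin/env bash
# Split a combined-format line on double quotes and label each piece with
# the GoAccess specifier from: %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
describe_line() {
    awk -F'"' '{
        split($1, head, " ")           # host, ident, user, [date:time, zone]
        split($3, tail, " ")           # status, bytes
        datetime = substr(head[4], 2)  # drop the leading "["
        printf "%%h=%s\n", head[1]
        printf "%%d=%s\n", substr(datetime, 1, 11)
        printf "%%t=%s\n", substr(datetime, 13)
        printf "%%r=%s\n", $2
        printf "%%s=%s\n", tail[1]
        printf "%%b=%s\n", tail[2]
        printf "%%R=%s\n", $4
        printf "%%u=%s\n", $6
    }'
}

printf '%s\n' '192.0.2.10 - - [07/Sep/2016:18:23:22 +0800] "POST /bususer/flybusLogin.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"' | describe_line
```

The pieces that %^ skips (the ident and user fields and the time zone) are simply not printed.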




Shell Windows



goaccess -f /tmp/api_access.log




sed -n '/02\/Sep\/2016/,/19\/Sep\/2016:12/ p' /tmp/api_access.log > /tmp/access.log

goaccess -f /tmp/access.log -o ~/Desktop/report.html
# goaccess -f /tmp/access.log > ~/Desktop/report.html

google-chrome ~/Desktop/report.html
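The sed range used above to cut out the time window has one behavior worth knowing: the range closes at the first line that matches the end pattern. The sketch below shows this on a tiny made-up sample; extract_range, the sample lines and the address 192.0.2.10 are all hypothetical.

```shell
#!/usr/bin/env bash
# Print the lines of file $3 from the first match of $1 through the
# first later match of $2.
extract_range() {
    sed -n "/$1/,/$2/ p" "$3"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
192.0.2.10 - - [01/Sep/2016:23:59:59 +0800] "GET /a.do HTTP/1.1" 200 0 "-" "-"
192.0.2.10 - - [02/Sep/2016:00:00:01 +0800] "GET /b.do HTTP/1.1" 200 0 "-" "-"
192.0.2.10 - - [19/Sep/2016:12:00:00 +0800] "GET /c.do HTTP/1.1" 200 0 "-" "-"
192.0.2.10 - - [19/Sep/2016:12:00:05 +0800] "GET /d.do HTTP/1.1" 200 0 "-" "-"
EOF

# Slashes in the patterns are escaped because "/" is the sed delimiter.
extract_range '02\/Sep\/2016' '19\/Sep\/2016:12' "$sample"
rm -f "$sample"
```

Only the /b.do and /c.do lines are printed: the range ends at the first 12:xx line on 19 Sep, so /d.do is left out. If the whole hour is wanted, the end pattern has to match the first line after it instead.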


This article only runs a simple test; for more detailed examples see https://github.com/allinurl/goaccess#usage



Here are the scripts I wrote.

Create the database and table:

create database if not exists log_analysis
    default character set=utf8
    default collate=utf8_general_ci;

use log_analysis;

create table if not exists log_analysis (
    id int unsigned not null auto_increment primary key comment 'auto-increment id',
    ip char(20) not null comment 'IP address',
    access_time timestamp not null comment 'API access time, format YYYY-MM-DD HH:MM:SS',
    method enum('POST','GET') default 'POST' comment 'API request method',
    api_url char(60) not null comment 'API URL',
    response_code char(3) not null comment 'response code',
    key `log_ip` (`ip`),
    key `log_access_time` (`access_time`),
    key `log_method` (`method`),
    key `log_api_url` (`api_url`),
    key `log_response_code` (`response_code`)
)engine=innodb default charset=utf8 collate=utf8_general_ci comment 'API access statistics table';

log_path='/tmp/test.txt'
target_file=`mktemp -t tempXXXXX.txt`
# Line format: - - [07/Sep/2016:18:23:22 +0800] "POST /bususer/flybusLogin.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"

while read -r line; do
    ip=`echo "$line" | awk '{print $1}'`
    api_url=`echo "$line" | awk '{print $7}'`
    response_code=`echo "$line" | awk '{print $9}'`

    # Date and time handling: access_time
    timetemp=`echo "$line" | awk '{print $4}'`    # [07/Sep/2016:18:23:22
    timetemp=${timetemp#\[}                        # 07/Sep/2016:18:23:22
    hms=${timetemp#*:}                             # 18:23:22
    ymdtemp=${timetemp%%:*}                        # 07/Sep/2016
    # Turn the month abbreviation into a number so the value fits a timestamp column
    ymd=`echo "$ymdtemp" | awk -v FS='/' '{m=(index("JanFebMarAprMayJunJulAugSepOctNovDec",$2)+2)/3; printf("%s-%02d-%s",$3,m,$1)}'`
    access_time=$ymd' '$hms

    # Method handling: method
    methodtemp=`echo "$line" | awk '{print $6}'`  # "POST
    method=${methodtemp#\"}                        # POST

    mysql -e " insert into log_analysis.log_analysis set ip='$ip',access_time='$access_time',method='$method',api_url='$api_url',response_code='$response_code';"

    # Alternative (step 4.1): write to a temporary file instead of the database
    # echo "$ymd $hms $api_url $ip $response_code $method" >> "$target_file"

    unset ip api_url response_code timetemp hms ymdtemp ymd access_time methodtemp method
done < "$log_path"

# End Script
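A likely reason the script above is slow is that it starts several awk processes and one mysql client for every log line. A possible alternative, sketched here under my own assumptions, is to convert the whole log to tab-separated rows in a single awk pass and load the result in one statement. The function name log_to_tsv and the sample address 192.0.2.10 are hypothetical, and the time zone offset is ignored, as in the script above.

```shell
#!/usr/bin/env bash
# Convert combined-format log lines on stdin to tab-separated rows:
# ip, access_time (YYYY-MM-DD HH:MM:SS), method, api_url, response_code
log_to_tsv() {
    awk 'BEGIN {
        OFS = "\t"
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        stamp = substr($4, 2)        # 07/Sep/2016:18:23:22
        split(stamp, dt, "[/:]")     # day, month, year, hh, mm, ss
        method = $6
        sub(/^"/, "", method)        # strip the opening quote
        when = dt[3] "-" month[dt[2]] "-" dt[1] " " dt[4] ":" dt[5] ":" dt[6]
        print $1, when, method, $7, $9
    }'
}

printf '%s\n' '192.0.2.10 - - [07/Sep/2016:18:23:22 +0800] "POST /bususer/flybusLogin.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"' | log_to_tsv

# The resulting file could then be loaded in one go, for example:
#   LOAD DATA LOCAL INFILE '/tmp/api.tsv' INTO TABLE log_analysis.log_analysis
#     (ip, access_time, method, api_url, response_code);
# (assumes the rows were saved to /tmp/api.tsv and that local_infile is enabled)
```

I have not timed this against the 772,528-line file, so treat it as a direction to try rather than a measured improvement.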


# Total requests per API; -B output is tab-separated, so awk splits on tabs
mysql -D log_analysis -Bse "select api_url as 'API',count(id) as Frequency from log_analysis group by api_url order by Frequency desc;" | awk -F'\t' '{printf("%s —— %s\n",$1,$2)}'

# Top 20 (API, minute) pairs by request count; the formatted time contains a space, hence -F'\t'
mysql -D log_analysis -Bse "select api_url, count(id) as num, date_format(access_time, '%Y年%m月%d日 %H時%i分') as accessTime from log_analysis group by api_url,date_format(access_time, '%Y年%m月%d日 %H時%i分') order by num desc limit 20;" | awk -F'\t' '{printf("%s —— %s —— %s\n",$1,$2,$3)}'
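For comparison, both summaries can also be computed straight from the log file with awk, without a database. The sketch below is one way to do it under my own assumptions: the function names are made up, the sample lines are hypothetical (including /bus/line.do and the 192.0.2.x addresses), and the output format is plain "count endpoint [minute]" rather than the required display format.

```shell
#!/usr/bin/env bash
# Requests per API, most frequent first.
count_per_api() {
    awk '{ hits[$7]++ } END { for (api in hits) print hits[api], api }' |
        sort -k1,1nr -k2,2
}

# Top N (API, minute) pairs by request count; N defaults to 20.
top_per_minute() {
    awk '{ minute = substr($4, 2, 17)      # 07/Sep/2016:18:23
           hits[$7 " " minute]++ }
         END { for (key in hits) print hits[key], key }' |
        sort -k1,1nr -k2,2 | head -n "${1:-20}"
}

sample='192.0.2.10 - - [07/Sep/2016:18:23:22 +0800] "POST /bususer/flybusLogin.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"
192.0.2.11 - - [07/Sep/2016:18:23:40 +0800] "POST /bususer/flybusLogin.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"
192.0.2.10 - - [07/Sep/2016:18:24:01 +0800] "GET /bus/line.do HTTP/1.1" 200 0 "-" "okhttp/3.1.2"'

printf '%s\n' "$sample" | count_per_api
printf '%s\n' "$sample" | top_per_minute 1
```

On the sample, the first call lists /bususer/flybusLogin.do with 2 requests ahead of /bus/line.do with 1, and the second call reports 2 requests for /bususer/flybusLogin.do in the minute 07/Sep/2016:18:23.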


Change Logs

  • 2016.09.20 18:26 Tue Asia/Shanghai
    • First draft completed

  • Note Time: 2016.09.20 18:26 Tue
  • Note Location: Asia/Shanghai
  • Writer: lempstacker