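Until now I had only looked at compiling and installing Zabbix from source, and had not dug any deeper. This post covers the agent auto-registration mechanism in Zabbix: the agent contacts the server on its own initiative, so the server discovers and starts monitoring the agent host automatically. Strictly speaking the server is passive here; the process is only automatic from the operator's point of view.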

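The official Zabbix documentation covers this topic under "Active agent auto-registration". I ran into quite a few pitfalls along the way, but managed to resolve them one by one. The logs were essential for tracking down each failure.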

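The test setup has six machines: one host machine (the server) and five virtual machines (the agents). The server is my personal laptop, and the virtual machines are created with Vagrant.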

Preparation

Host      Role    IP
NoteBook  Server  192.168.0.107
zabbix1   agent   192.168.0.151
zabbix2   agent   192.168.0.152
zabbix3   agent   192.168.0.153
zabbix4   agent   192.168.0.154
zabbix5   agent   192.168.0.155

The operating system is CentOS Linux release 7.2.1511 (Core).

The Vagrantfile is as follows:

# -*- mode: ruby -*-
# vi: set ft=ruby :

# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.

boxes = [
    { :name => :zabbix1, :role => 'agent', :ip => '192.168.0.151' },
    { :name => :zabbix2, :role => 'agent', :ip => '192.168.0.152' },
    { :name => :zabbix3, :role => 'agent', :ip => '192.168.0.153' },
    { :name => :zabbix4, :role => 'agent', :ip => '192.168.0.154' },
    { :name => :zabbix5, :role => 'agent', :ip => '192.168.0.155' },
]

Vagrant.configure(2) do |config|
    config.vm.box = "CentOS7Minimal"

    # Define one VM per entry in boxes
    boxes.each do |opts|
        config.vm.define opts[:name] do |node|
            # Bridged network on host interface wlp3s0, with a fixed IP
            node.vm.network "public_network", ip: opts[:ip], bridge: "wlp3s0"
            node.vm.hostname = "%s.vm" % opts[:name].to_s
            # config.ssh.insert_key = false
            # Install the Zabbix 3.0 agent, enable it at boot and start it
            node.vm.provision "shell", inline: <<-SHELL
                sudo yum install -y http://repo.zabbix.com/zabbix/3.0/rhel/7/x86_64/zabbix-agent-3.0.1-1.el7.x86_64.rpm
                sudo systemctl enable zabbix-agent.service
                sudo systemctl start zabbix-agent.service
            SHELL
        end
    end

end
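  • Use vagrant up to initialize and start the virtual machines
  • Use vagrant ssh to log in, e.g. vagrant ssh zabbix1
  • Use vagrant halt to shut the virtual machines down; you can halt a single VM or all of them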
[flying@lemp zabbix]$ vagrant halt
==> zabbix5: Attempting graceful shutdown of VM...
==> zabbix4: Attempting graceful shutdown of VM...
==> zabbix3: Attempting graceful shutdown of VM...
==> zabbix2: Attempting graceful shutdown of VM...
==> zabbix1: Attempting graceful shutdown of VM...
[flying@lemp zabbix]$

Agent Configuration

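On the agent hosts (the virtual machines), zabbix-agent is installed via yum and its configuration lives in /etc/zabbix. Edit /etc/zabbix/zabbix_agentd.conf and set the following options:

192.168.0.107 is the IP of the server (the monitoring host).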

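# Server: address that is allowed to query this agent (passive checks)
#Server=127.0.0.1
Server=192.168.0.107

# ServerActive: address the agent contacts for active checks and auto-registration
#ServerActive=127.0.0.1
ServerActive=192.168.0.107

# Hostname must be unique per agent; it becomes the Host name shown in the monitoring system
#Hostname=Zabbix server
Hostname=Zabbix server-lemp-01

# HostMetadataItem: item whose value is sent to the server as host metadata
# HostMetadataItem=
HostMetadataItem=system.uname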

After making the changes, restart the service: sudo systemctl restart zabbix-agent.service
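With five agents, editing each file by hand gets repetitive. The following is a minimal sketch of how the same four options could be set from a script. It assumes GNU sed (as shipped with CentOS 7) and values that contain no `|` or `&` characters. The function names set_option and configure_agent are hypothetical helpers introduced for illustration; they are not part of Zabbix.

```shell
#!/usr/bin/env bash

# set_option FILE KEY VALUE
# Replace every active "KEY=..." line in FILE, or append one if none exists.
# Commented-out defaults such as "# KEY=" are left untouched.
set_option() {
    local file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# configure_agent FILE SERVER_IP HOSTNAME
# Apply the four options used in this post to an agent configuration file.
configure_agent() {
    local file=$1 server=$2 hostname=$3
    set_option "$file" Server "$server"
    set_option "$file" ServerActive "$server"
    set_option "$file" Hostname "$hostname"
    set_option "$file" HostMetadataItem system.uname
}
```

Run as root on each VM, this could be called as configure_agent /etc/zabbix/zabbix_agentd.conf 192.168.0.107 "Zabbix server-lemp-01", changing the hostname per machine and then restarting zabbix-agent.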

Server Configuration

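On the server, the configuration also lives in /etc/zabbix. Edit /etc/zabbix/zabbix_server.conf.

Comment out every ListenIP line, so that the server falls back to its default and listens on all interfaces: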

#ListenIP=0.0.0.0
#ListenIP=127.0.0.1

After making the changes, restart the service: sudo systemctl restart zabbix-server.service
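The same edit can be scripted. This is a small sketch, assuming GNU sed; comment_out_option is a hypothetical helper name, not a Zabbix tool.

```shell
#!/usr/bin/env bash

# comment_out_option FILE KEY
# Turn every active "KEY=..." line in FILE into a comment ("#KEY=...").
# Lines for other keys, and lines already commented out, are not changed.
comment_out_option() {
    local file=$1 key=$2
    sed -i "s|^${key}=|#${key}=|" "$file"
}

# Example (as root on the server):
#   comment_out_option /etc/zabbix/zabbix_server.conf ListenIP
#   systemctl restart zabbix-server.service
```

After the restart, ss -ltn on the server should list port 10051 on a wildcard address rather than on 127.0.0.1.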

Testing

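Once the steps above are done, log in to the Zabbix frontend; the newly registered hosts appear under Configuration -> Hosts. Note that hosts are only added if the server has an auto-registration action (Configuration -> Actions, event source "Auto registration") with an "Add host" operation; see the official documentation for details.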

Zabbix server-lemp-01	Applications 10	Items 46	Triggers 19	Graphs 9	Discovery 2	Web	192.168.0.151: 10050	Template OS Linux (Template App Zabbix Agent)	Enabled	ZBX SNMP JMX IPMI	NONE
Zabbix server-lemp-02	Applications 10	Items 46	Triggers 19	Graphs 9	Discovery 2	Web	192.168.0.152: 10050	Template OS Linux (Template App Zabbix Agent)	Enabled	ZBX SNMP JMX IPMI	NONE
Zabbix server-lemp-03	Applications 10	Items 46	Triggers 19	Graphs 9	Discovery 2	Web	192.168.0.153: 10050	Template OS Linux (Template App Zabbix Agent)	Enabled	ZBX SNMP JMX IPMI	NONE
Zabbix server-lemp-04	Applications 10	Items 46	Triggers 19	Graphs 9	Discovery 2	Web	192.168.0.154: 10050	Template OS Linux (Template App Zabbix Agent)	Enabled	ZBX SNMP JMX IPMI	NONE
Zabbix server-lemp-05	Applications 10	Items 46	Triggers 19	Graphs 9	Discovery 2	Web	192.168.0.155: 10050	Template OS Linux (Template App Zabbix Agent)	Enabled	ZBX SNMP JMX IPMI	NONE

Errors

When something fails, check the logs first. The Zabbix logs are in /var/log/zabbix.

# agent (virtual machine)
[vagrant@zabbix1 ~]$ ls /var/log/zabbix/
zabbix_agentd.log
[vagrant@zabbix1 ~]$

# server (host machine)
[flying@lemp ~]$ ls /var/log/zabbix/
zabbix_agentd.log  zabbix_server.log
[flying@lemp ~]$
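To pull only the relevant lines out of the agent log, a filter like the one below can help. It is a sketch that matches the two phrases seen in the error messages in this post ("started to fail" and "cannot connect"); show_agent_errors is a hypothetical helper name, and other failure messages would need additional patterns.

```shell
#!/usr/bin/env bash

# show_agent_errors [LOGFILE]
# Print the lines of the agent log that report failing active checks or
# connection problems. Defaults to the log path used in this post.
# Returns non-zero if no such line is found (grep's usual behavior).
show_agent_errors() {
    local logfile=${1:-/var/log/zabbix/zabbix_agentd.log}
    grep -E 'started to fail|cannot connect' "$logfile"
}

# Example (on an agent VM):
#   show_agent_errors
#   show_agent_errors /var/log/zabbix/zabbix_agentd.log | tail -n 5
```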
  • active check configuration update from [192.168.0.107:10051] started to fail (cannot connect to [[192.168.0.107]:10051]: [113] No route to host)

Resolved by disabling the firewall on the host machine.
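Disabling the firewall entirely is a blunt fix. A narrower alternative is to open only the server port that appears in the error message (10051/tcp). The sketch below assumes the host runs firewalld, the default on CentOS 7, which this post does not confirm. open_zabbix_port is a hypothetical helper name; by default it only prints the commands, and it runs them through sudo when APPLY=1 is set.

```shell
#!/usr/bin/env bash

# open_zabbix_port [PORT]
# Print the firewalld commands that allow inbound TCP traffic on PORT
# (default 10051). With APPLY=1 in the environment, run them via sudo instead.
open_zabbix_port() {
    local port=${1:-10051}
    local prefix="echo"
    if [ "${APPLY:-0}" = "1" ]; then
        prefix="sudo"
    fi
    # Add the rule to the permanent configuration, then reload to activate it
    $prefix firewall-cmd --permanent --add-port="${port}/tcp"
    $prefix firewall-cmd --reload
}

# Example (on the server):
#   open_zabbix_port            # dry run, prints the two commands
#   APPLY=1 open_zabbix_port    # actually changes the firewall
```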

  • active check configuration update from [192.168.0.107:10051] started to fail (cannot connect to [[192.168.0.107]:10051]: [111] Connection refused)

Resolved by commenting out ListenIP=127.0.0.1 in /etc/zabbix/zabbix_server.conf on the host machine.
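Both errors so far are plain TCP connection failures, so they can be reproduced without Zabbix at all. The following sketch tests whether a TCP port accepts connections, using bash's /dev/tcp pseudo-device and the coreutils timeout command. check_port is a hypothetical helper name, and the 3-second limit is an arbitrary choice.

```shell
#!/usr/bin/env bash

# check_port HOST PORT
# Succeed (exit status 0) if a TCP connection to HOST:PORT can be opened
# within 3 seconds, fail otherwise. Nothing is sent over the connection.
check_port() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example (on an agent VM, testing the server port):
#   check_port 192.168.0.107 10051 && echo "reachable" || echo "not reachable"
# Example (on the server, testing an agent port):
#   check_port 192.168.0.151 10050 && echo "reachable" || echo "not reachable"
```

A successful connection only shows that the port is reachable; it does not prove that the peer will accept the request once connected.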

  • Received empty response from Zabbix Agent at [192.168.0.153]. Assuming that agent dropped connection because of access permissions

Resolved by changing Server=127.0.0.1 to Server=192.168.0.107 in /etc/zabbix/zabbix_agentd.conf on the virtual machine.


Change Log

  • 2016.04.07 19:24 Thu Asia/Beijing
    • First draft completed

  • Note Time: 2016.04.07 19:24 Thu
  • Note Location: Asia/Beijing
  • Writer: lempstacker