Monitoring URL Status, IP Availability, Port Status, and SSL Certificate Expiry with blackbox_exporter

Seven
2023-03-08

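1. Introduction to blackbox_exporter

blackbox_exporter is an exporter officially provided by the Prometheus project. It probes target instances over HTTP, HTTPS, DNS, TCP, and ICMP, so that Prometheus can monitor the probed endpoints and collect data about them.

  • HTTP/HTTPS: URL/API availability checks
  • TCP: port listening checks
  • ICMP: host liveness checks
  • DNS: domain name resolution checks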

2. Installing and deploying blackbox_exporter

wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.23.0/blackbox_exporter-0.23.0.linux-amd64.tar.gz

tar -zxvf blackbox_exporter-0.23.0.linux-amd64.tar.gz

mv blackbox_exporter-0.23.0.linux-amd64/ /usr/local/blackbox_exporter

Manage blackbox_exporter with systemd (unit file: /etc/systemd/system/blackbox-exporter.service):

[Unit]
Description=Prometheus Blackbox Exporter
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus

ExecStart=/usr/local/blackbox_exporter/blackbox_exporter \
--config.file=/usr/local/blackbox_exporter/blackbox.yml \
--web.listen-address=:9115
Restart=on-failure

[Install]
WantedBy=multi-user.target

Check the running status:

[root@centos ~]# systemctl status blackbox-exporter.service 
● blackbox-exporter.service - Prometheus Blackbox Exporter
   Loaded: loaded (/etc/systemd/system/blackbox-exporter.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-09-27 16:56:04 CST; 1min 13s ago
 Main PID: 29832 (blackbox_export)
    Tasks: 8 (limit: 49440)
   Memory: 4.9M
   CGroup: /system.slice/blackbox-exporter.service
           └─29832 /apps/blackbox_exporter/blackbox_exporter --config.file=/apps/blackbox_exporter/blackbox.yml --web.listen-address=:9115

Sep 27 16:56:04 centos systemd[1]: Started Prometheus Blackbox Exporter.
Sep 27 16:56:04 centos blackbox_exporter[29832]: ts=2022-09-27T08:56:04.252Z caller=main.go:256 level=info msg="Starting blackbox_exporter" version="(version=0.22.0, >
Sep 27 16:56:04 centos blackbox_exporter[29832]: ts=2022-09-27T08:56:04.253Z caller=main.go:257 level=info build_context="(go=go1.18.5, user=root@4d81de342d10, date=2>
Sep 27 16:56:04 centos blackbox_exporter[29832]: ts=2022-09-27T08:56:04.255Z caller=main.go:269 level=info msg="Loaded config file"
Sep 27 16:56:04 centos blackbox_exporter[29832]: ts=2022-09-27T08:56:04.257Z caller=main.go:417 level=info msg="Listening on address" address=:9115
Sep 27 16:56:04 centos blackbox_exporter[29832]: ts=2022-09-27T08:56:04.258Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
[root@centos ~]# netstat -tnlp | grep 9115
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp6       0      0 :::9115                 :::*                    LISTEN      29832/blackbox_expo 

2.1 URL monitoring with blackbox_exporter

Configure prometheus.yml:

  - job_name: 'http_status'
    metrics_path: /probe # endpoint on blackbox_exporter that serves the probe metrics
    params: # query parameters appended to the URL when Prometheus sends the GET request for metrics
      module: [http_2xx]
    static_configs:
      - targets: ['http://www.xiaomi.com', 'http://www.magedu.com']
        labels: # custom labels attached to the targets
          instance: http_status
          group: web
    relabel_configs:
      - source_labels: [__address__] # copy __address__ (the URL of the current target) into __param_target, which is passed to blackbox_exporter
        target_label: __param_target # key is __param_target; value is http://www.xiaomi.com or http://www.magedu.com
      - source_labels: [__param_target] # read the probe target from __param_target
        target_label: url # expose the probe target as a label named url
      - target_label: __address__ # overwrite __address__ with the blackbox_exporter address so the scrape request is sent to that exporter
        replacement: 172.16.88.20:9115 # address of the blackbox_exporter server
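
The relabeling above boils down to one HTTP request against the exporter. The following is a minimal sketch, not part of the original setup, that assembles that request URL by hand so the probe can be tried without Prometheus. The helper name probe_url is hypothetical, and unlike Prometheus it does not URL-encode the target, so it only suits simple targets without `&` or `#`.

```shell
#!/bin/sh
# probe_url EXPORTER MODULE TARGET
# Build the URL Prometheus effectively requests after relabeling:
#   __address__     -> EXPORTER (host:port of blackbox_exporter)
#   metrics_path    -> /probe
#   params.module   -> module query parameter
#   __param_target  -> target query parameter
# Assumption: TARGET needs no URL encoding.
probe_url() {
    printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

# Addresses and module name are taken from the job definition above.
probe_url 172.16.88.20:9115 http_2xx http://www.xiaomi.com
```

With the exporter running, the result can be passed to curl, for example `curl -s "$(probe_url 172.16.88.20:9115 http_2xx http://www.xiaomi.com)"`, to see the raw probe metrics.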

Validate the configuration, then restart Prometheus:
[root@centos prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
 SUCCESS: prometheus.yml is valid prometheus config file syntax

[root@centos prometheus]# systemctl restart prometheus.service 


2.2 ICMP monitoring with blackbox_exporter

ICMP is the protocol used by ping, so it can be used to probe whether an IP address is online:

[root@centos prometheus]# vim prometheus.yml
[root@centos prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
 SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@centos prometheus]# grep ping_status  -A10  prometheus.yml
  - job_name: 'ping_status'
    metrics_path: /probe
    params:
     module: [icmp]
    static_configs:
    - targets: ['172.16.88.254',"223.6.6.6"]
      labels:
        instance: 'ping_status'
        group: 'icmp'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: ip
      - target_label: __address__
        replacement: 172.16.88.20:9115

Restart Prometheus:
[root@centos prometheus]# systemctl restart prometheus.service 
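
To confirm a probe result by hand, the probe_success sample can be pulled out of the exporter's /probe output. The sketch below assumes the output is in the Prometheus text exposition format; the helper name probe_success_value is hypothetical and the sample input is illustrative, not captured from a real run.

```shell
#!/bin/sh
# probe_success_value: read /probe output on stdin and print the value of
# the probe_success sample (1 = probe succeeded, 0 = probe failed).
probe_success_value() {
    awk '$1 == "probe_success" { print $2 }'
}

# Illustrative input; comment lines (# HELP / # TYPE) are skipped because
# their first field is "#", not the metric name.
printf '%s\n' \
    '# TYPE probe_success gauge' \
    'probe_success 1' \
    'probe_duration_seconds 0.012' | probe_success_value
```

Against the running exporter this could be used as `curl -s 'http://172.16.88.20:9115/probe?module=icmp&target=223.6.6.6' | probe_success_value`.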


2.3 Port monitoring with blackbox_exporter

[root@centos prometheus]# vim prometheus.yml
[root@centos prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
 SUCCESS: prometheus.yml is valid prometheus config file syntax

[root@centos prometheus]# grep port_status  -A10  prometheus.yml
  - job_name: 'port_status'
    metrics_path: /probe
    params:
     module: [tcp_connect]
    static_configs:
      - targets: ['172.16.88.20:51234', '172.16.88.20:9256','172.16.88.20:22']
        labels:
          instance: 'port_status'
          group: 'port'
    relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 172.16.88.20:9115

Restart Prometheus:
[root@centos prometheus]# systemctl restart prometheus.service 


Grafana dashboards
Dashboards for blackbox_exporter can be found on the official Grafana site:

https://grafana.com/grafana/dashboards/

Recommended dashboard IDs: 9965, 13587

Configuring alerts

Create the alert rules:

vim rules/blackbox_rules.yml
groups:
  - name: Service process monitoring
    rules:
      - alert: ProcessAbnormal
        expr: probe_success == 0
        for: 10s  # how long the condition must hold before the alert is sent to Alertmanager
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} process abnormal"
          description: "{{ $labels.job }} process {{ $labels.instance }} has been unreachable for 10 seconds; please check the server."

      - alert: ProcessSlowResponse
        expr: avg_over_time(probe_duration_seconds[1m]) > 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} process responding slowly"
          description: "{{ $labels.job }} process {{ $labels.instance }} response latency exceeds 1s; please check the service status."

      - alert: CertificateExpiryReminder
        expr: (probe_ssl_earliest_cert_expiry - time()) / 3600 / 24 < 10
        for: 600m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} certificate is valid for fewer than 10 days"
          description: "{{ $labels.job }} site {{ $labels.url }} certificate is about to expire; please renew it. The certificate expires in {{ $value }} days."

      - alert: BlackboxProbeHttpFailure
        expr: probe_http_status_code <= 199 OR probe_http_status_code >= 400
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: Blackbox probe HTTP failure (instance {{ $labels.instance }})
          description: "HTTP status code is not 200-399\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

      # Blackbox probe slow HTTP
      - alert: BlackboxProbeSlowHttp
        expr: avg_over_time(probe_http_duration_seconds[1m]) > 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: Blackbox probe slow HTTP (instance {{ $labels.instance }})
          description: "HTTP request took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

      # Blackbox probe slow ping
      - alert: BlackboxProbeSlowPing
        expr: avg_over_time(probe_icmp_duration_seconds[1m]) > 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: Blackbox probe slow ping (instance {{ $labels.instance }})
          description: "Blackbox ping took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
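
The certificate rule compares the expiry timestamp with the current time and converts the difference to days. As a rough illustration of that arithmetic outside Prometheus, the sketch below reproduces the check in shell. The function name cert_expiring and the timestamps are hypothetical; the comparison is done in seconds because shell arithmetic is integer-only, whereas PromQL works with floating-point values.

```shell
#!/bin/sh
# cert_expiring EXPIRY_EPOCH NOW_EPOCH [THRESHOLD_DAYS]
# Succeeds (exit status 0) when fewer than THRESHOLD_DAYS (default 10) remain,
# mirroring: (probe_ssl_earliest_cert_expiry - time()) / 3600 / 24 < 10
cert_expiring() {
    _expiry="$1"
    _now="$2"
    _days="${3:-10}"
    # 86400 = 3600 * 24 seconds per day
    [ $(( _expiry - _now )) -lt $(( _days * 86400 )) ]
}

# Hypothetical timestamps: the certificate expires 9.5 days after "now".
now=1700000000
expiry=$(( now + 820800 ))
if cert_expiring "$expiry" "$now"; then
    echo "certificate expires in fewer than 10 days"
fi
```

In practice the expiry value would come from the probe_ssl_earliest_cert_expiry sample of an HTTPS probe and the current time from `date +%s`.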

Add the alert rules to the Prometheus configuration:

vim prometheus.yml
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "./rules/blackbox_rules.yml"

Restart the Prometheus service so the new rules are loaded:
systemctl restart prometheus.service
Then view the loaded rules in the Prometheus web UI and test that an alert fires.
