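Prometheus + Grafana + Alertmanager: Usage Summary
Prometheus is an open-source monitoring and alerting system with a built-in time series database. Grafana is commonly used alongside it to visualize the data.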
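1. Monitoring System Architecture
1.1 Core Components
Prometheus Server: scrapes and stores time series data, and also provides querying and alert rule configuration management.
Exporters: data collectors, for example node_exporter for machine metrics and MongoDB exporter for MongoDB metrics.
Alertmanager: manages alert notifications.
Grafana: renders monitoring data as charts and dashboards.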
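2. Installing the Base Components
Because this setup is for learning and experimentation, the environment is installed quickly with Docker.
2.1 Installing Node Exporter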
docker-compose-node-export.yml
version: '3'
services:
  node-exporter:
    image: prom/node-exporter
    container_name: node-exporter
    hostname: node-exporter
    restart: always
    ports:
      - "9100:9100"
2.2 Installing Alertmanager
docker-compose-alertmanager.yml
version: '3'
services:
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    hostname: alertmanager
    restart: always
    volumes:
      - /data/docker_file/monitor/conf/alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - "9093:9093"
alertmanager.yml
global:
  smtp_smarthost: 'smtp.qq.com:25'        # QQ Mail SMTP server
  smtp_from: '793272861@qq.com'           # sender address
  smtp_auth_username: '793272861@qq.com'  # SMTP username, i.e. your own mailbox address
  smtp_auth_password: '****************'  # SMTP password of the sender mailbox
  smtp_require_tls: false                 # do not require TLS
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10m
  receiver: live-monitoring
receivers:
  - name: 'live-monitoring'
    email_configs:
      - to: '793272861@qq.com'            # recipient address
2.3 Installing Prometheus
docker-compose-prometheus.yml
version: '3'
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    hostname: prometheus
    restart: always
    volumes:
      - /data/docker_file/prometheus/data:/prometheus
      - /data/docker_file/prometheus/conf/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
# Scrape jobs: targets that Prometheus polls periodically for monitoring data
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['prometheus:9090']
  - job_name: 'node-exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['node-exporter:9100']
Prometheus service discovery
Automatic service discovery can be implemented with Consul.
Access the Prometheus UI at http://localhost:9090/
2.4 Installing Grafana
docker-compose-grafana.yml
version: '3'
services:
  grafana:
    image: grafana/grafana
    container_name: grafana
    hostname: grafana
    restart: always
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - /data/docker_file/grafana/data:/var/lib/grafana
      - /data/docker_file/grafana/log:/var/log/grafana
    ports:
      - "3000:3000"
Adding a data source (Prometheus)
Access Grafana at http://localhost:3000/ (default username: admin, password: admin).
2.5 Combined Docker Compose File
version: '3'
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    hostname: prometheus
    restart: always
    volumes:
      - /data/docker_file/prometheus/data:/prometheus
      - /data/docker_file/prometheus/conf/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    networks:
      - monitor
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    hostname: alertmanager
    restart: always
    volumes:
      - /data/docker_file/monitor/conf/alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - "9093:9093"
    networks:
      - monitor
  grafana:
    image: grafana/grafana
    container_name: grafana
    hostname: grafana
    restart: always
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - /data/docker_file/grafana/data:/var/lib/grafana
      - /data/docker_file/grafana/log:/var/log/grafana
    ports:
      - "3000:3000"
    networks:
      - monitor
  node-exporter:
    image: prom/node-exporter
    container_name: node-exporter
    hostname: node-exporter
    restart: always
    ports:
      - "9100:9100"
    networks:
      - monitor
networks:
  monitor:
    driver: bridge
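3. Configuring Grafana Dashboards
Grafana queries Prometheus with PromQL and renders the results in panels. A Grafana dashboard is made up of multiple panels.
3.1 Downloading Grafana Dashboard Files
Pre-built dashboard files can be downloaded from the official Grafana site and imported into your own Grafana instance for immediate use.
Recommended Grafana dashboards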
JVM (Micrometer)
Spring Boot 2.1 Statistics
Basic host monitoring (CPU, memory, disk, network)
Node Exporter for Prometheus Dashboard CN
Druid Connection Pool Dashboard
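Importing a Grafana dashboard
3.2 Adding or Modifying Grafana Panels (Extension)
The stock Spring Boot 2.1 Statistics dashboard has no panels for outbound requests to third-party services. Using it as an example, we add two panels: Client Request Count and Client Response Time.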
Client Request Count
irate(http_client_requests_seconds_count{instance="$instance", application="$application", uri!~".*actuator.*"}[5m])
Note: the meter in the application must be named http.client.requests.
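To make the irate() query above more concrete, here is a minimal Java sketch of the calculation for a single series: take the last two samples inside the range window and divide the counter increase by the elapsed seconds. The class IrateSketch and its method are hypothetical names written only for illustration. Prometheus performs this computation on the server, and the sketch leaves out details such as staleness handling.

```java
/** Simplified illustration of what PromQL's irate() computes for one counter series. */
final class IrateSketch {

    /**
     * Per-second rate derived from the last two samples in the window.
     *
     * @param timestampsSeconds sample timestamps in seconds, in ascending order
     * @param values            cumulative counter values, one per timestamp
     * @return the per-second rate, or NaN if fewer than two samples are available
     */
    static double irate(double[] timestampsSeconds, double[] values) {
        int n = Math.min(timestampsSeconds.length, values.length);
        if (n < 2) {
            return Double.NaN;
        }
        double elapsed = timestampsSeconds[n - 1] - timestampsSeconds[n - 2];
        if (elapsed <= 0) {
            return Double.NaN;
        }
        double previous = values[n - 2];
        double last = values[n - 1];
        // A counter that went down is treated as a reset, so counting restarts from zero.
        double increase = last < previous ? last : last - previous;
        return increase / elapsed;
    }

    public static void main(String[] args) {
        double[] timestamps = {0, 15, 30};
        double[] requestCounts = {100, 130, 190};
        // Only the last two samples matter: (190 - 130) / 15 = 4.0 requests per second.
        System.out.println(irate(timestamps, requestCounts));
    }
}
```

Because only the last two samples are used, irate() reacts quickly to spikes, and the [5m] range mainly controls how far back Prometheus looks for those two samples.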
Client Response Time
irate(http_client_requests_seconds_sum{instance="$instance", application="$application",uri!~".*actuator.*"}[5m]) / irate(http_client_requests_seconds_count{instance="$instance", application="$application",uri!~".*actuator.*"}[5m])
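The Client Response Time expression divides the rate of the _sum series by the rate of the _count series. Assuming both series are scraped at the same timestamps, the elapsed time cancels out and the result is the average duration of the requests completed between the last two scrapes. The following Java sketch shows that arithmetic; AverageLatencySketch is a hypothetical name, and counter resets are deliberately ignored.

```java
/** Illustrates why rate(sum) / rate(count) yields an average request duration. */
final class AverageLatencySketch {

    /**
     * Average seconds per request between two scrapes.
     *
     * @param previousSum   cumulative seconds spent in requests at the earlier scrape
     * @param lastSum       cumulative seconds spent in requests at the later scrape
     * @param previousCount cumulative number of requests at the earlier scrape
     * @param lastCount     cumulative number of requests at the later scrape
     * @return the average duration in seconds, or NaN when no request completed
     */
    static double averageSeconds(double previousSum, double lastSum,
                                 double previousCount, double lastCount) {
        double requests = lastCount - previousCount;
        if (requests <= 0) {
            // No traffic in the interval: the PromQL division would be 0 / 0, which is NaN.
            return Double.NaN;
        }
        return (lastSum - previousSum) / requests;
    }

    public static void main(String[] args) {
        // 1.5 seconds spent on 30 requests, so about 0.05 seconds per request.
        System.out.println(averageSeconds(12.0, 13.5, 100, 130));
    }
}
```

A panel built on this expression shows gaps when there is no traffic, since the division has no defined result for an interval without requests.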
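4. Integrating Micrometer with Spring Boot
Metrics
Micrometer provides a vendor-neutral interface for timers, gauges, counters, distribution summaries, and long task timers. It uses a dimensional data model which, combined with a dimensional monitoring system, gives efficient access to a particular named metric and lets you drill down across its dimensions.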
4.1 Adding Dependencies
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>${micrometer.version}</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
4.2 Enabling the Prometheus Endpoint
spring:
  application:
    name: spring-boot-node
management:
  metrics:
    # 1. Add global tags; they can later be used as variables when querying data
    tags:
      application: ${spring.application.name}
  endpoints:
    web:
      exposure:
        # 2. Expose the prometheus endpoint
        include: 'health,prometheus'
4.3 Monitoring Third-Party Requests
OkHttpMetricsEventListener makes it straightforward to monitor the requests made by an OkHttp client.
Configuring the OkHttp client event listener
@Bean("okHttpClient")
public OkHttpClient okHttpClient(ConnectionPool connectionPool) {
    return new OkHttpClient().newBuilder()
            .connectionPool(connectionPool)
            .connectTimeout(5, TimeUnit.SECONDS)
            .readTimeout(10, TimeUnit.SECONDS)
            .eventListener(eventListener())
            .build();
}

/**
 * Event listener: OkHttpMetricsEventListener.
 * metricsProperties.getWeb().getClient().getRequestsMetricName() returns 'http.client.requests' by default,
 * which is used as the meter name.
 *
 * @return the event listener that records a metric for each OkHttp call
 */
private EventListener eventListener() {
    // meterRegistry and metricsProperties are fields of the enclosing configuration class (not shown)
    return OkHttpMetricsEventListener.builder(
                    meterRegistry, metricsProperties.getWeb().getClient().getRequestsMetricName())
            .build();
}
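The meter name passed to the listener is dotted (http.client.requests), while the PromQL queries in section 3.2 use http_client_requests_seconds_count and http_client_requests_seconds_sum. The Java sketch below approximates how a timer name ends up as those series names. It is a simplified stand-in for Micrometer's Prometheus naming convention, not its actual implementation: the class and method names are hypothetical, and only the dot replacement and the _seconds suffix are modeled.

```java
/** Simplified approximation of how a Micrometer timer name maps to Prometheus series names. */
final class PrometheusNameSketch {

    /** Replaces dots with underscores and appends the base time unit used for timers. */
    static String timerBaseName(String meterName) {
        String snakeCase = meterName.replace('.', '_');
        return snakeCase.endsWith("_seconds") ? snakeCase : snakeCase + "_seconds";
    }

    /** Series holding the cumulative number of recorded calls. */
    static String countSeries(String meterName) {
        return timerBaseName(meterName) + "_count";
    }

    /** Series holding the cumulative time spent in recorded calls. */
    static String sumSeries(String meterName) {
        return timerBaseName(meterName) + "_sum";
    }

    public static void main(String[] args) {
        System.out.println(countSeries("http.client.requests")); // http_client_requests_seconds_count
        System.out.println(sumSeries("http.client.requests"));   // http_client_requests_seconds_sum
        // A differently named meter produces series that the dashboard queries do not match.
        System.out.println(countSeries("http.client.request"));  // http_client_request_seconds_count
    }
}
```

This is why the meter name matters: a dashboard panel only finds data when the derived series name matches the one used in its query.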
How it works: OkHttpMetricsEventListener.java
public class OkHttpMetricsEventListener extends EventListener {
    // Excerpt: fields (callState, urlMapper, extraTags, registry, requestsMetricName) and other callbacks are omitted.

    /**
     * Header name for URI patterns which will be used for tag values.
     */
    public static final String URI_PATTERN = "URI_PATTERN";

    @Override
    public void callFailed(Call call, IOException e) {
        CallState state = callState.remove(call);
        if (state != null) {
            state.exception = e;
            // The call has finished (with a failure): record the metric
            time(state);
        }
    }

    @Override
    public void responseHeadersEnd(Call call, Response response) {
        CallState state = callState.remove(call);
        if (state != null) {
            state.response = response;
            // The call has finished: record the metric
            time(state);
        }
    }

    private void time(CallState state) {
        String uri = state.response == null ? "UNKNOWN" :
                (state.response.code() == 404 || state.response.code() == 301 ? "NOT_FOUND" : urlMapper.apply(state.request));
        // Define the tags; they become labels that can be used in Prometheus queries and as Grafana variables
        Iterable<Tag> tags = Tags.concat(extraTags, Tags.of(
                "method", state.request != null ? state.request.method() : "UNKNOWN",
                "uri", uri,
                "status", getStatusMessage(state.response, state.exception),
                "host", state.request != null ? state.request.url().host() : "UNKNOWN"
        ));
        // Register the timer and record the duration; Prometheus can then pull the data
        // from the /actuator/prometheus endpoint provided by Spring Boot Actuator
        Timer.builder(this.requestsMetricName)
                .tags(tags)
                .description("Timer of OkHttp operation")
                .register(registry)
                .record(registry.config().clock().monotonicTime() - state.startTime, TimeUnit.NANOSECONDS);
    }
}
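Both callbacks in the excerpt start with callState.remove(call), which implies that the listener stores per-call state when a call starts and removes it when the call ends, so each call is timed exactly once. The Java sketch below shows that bookkeeping pattern using only the standard library. It is a hypothetical stand-in (CallTimingSketch, callStart and callEnd are illustrative names), not Micrometer's code, and it stores only the start time rather than a full state object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Stand-in for the start/remove bookkeeping that lets a listener time each call once. */
final class CallTimingSketch {

    // Start time in nanoseconds for every call that is still in flight.
    private final Map<Object, Long> startTimes = new ConcurrentHashMap<>();
    private final LongSupplier nanoClock;

    CallTimingSketch(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    /** Invoked when a call starts: remember when it began. */
    void callStart(Object call) {
        startTimes.put(call, nanoClock.getAsLong());
    }

    /**
     * Invoked when a call ends, whether it succeeded or failed.
     *
     * @return elapsed nanoseconds, or -1 if the call was never started or was already finished
     */
    long callEnd(Object call) {
        Long start = startTimes.remove(call);
        return start == null ? -1L : nanoClock.getAsLong() - start;
    }

    public static void main(String[] args) {
        CallTimingSketch timing = new CallTimingSketch(System::nanoTime);
        Object call = new Object();
        timing.callStart(call);
        System.out.println("elapsed ns: " + timing.callEnd(call));
        // The state was removed by the first callEnd, so a second completion event is ignored.
        System.out.println("second end: " + timing.callEnd(call));
    }
}
```

Removing the entry on completion also keeps the map from growing, since finished calls leave no state behind.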
4.4 Spring Boot Integration Example
Spring Boot Node