
Deploying Prometheus and Grafana in k8s

Published: 2023/12/4



Intro

Last time we covered integrating Prometheus into ASP.NET Core, along with basic Prometheus usage. When actually deploying to k8s, the static configuration described there no longer works; we need Prometheus service discovery instead.

Deployment plan

Prometheus and Grafana are deployed in a separate namespace, monitoring. This hides the details: other namespaces are unaffected and do not need to know that they exist.

You can create the namespace with kubectl create namespace monitoring, or run kubectl apply on the YAML below:

apiVersion: v1
kind: Namespace
metadata:
  name: monitoring

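We want Prometheus and Grafana to be reachable from the public internet, so ports need to be assigned. NodePorts 31100~31200 are reserved for infrastructure: 31110 for Prometheus and 31120 for Grafana. With the ports planned, nginx can be configured first. Add the following nginx configuration: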

server {
    listen 443;
    server_name monitoring.weihanli.xyz;
    location / {
        proxy_pass http://172.18.0.2:31110;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
server {
    listen 443;
    server_name grafana.weihanli.xyz;
    location / {
        proxy_pass http://172.18.0.2:31120;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

Grafana is straightforward: deploy a service and a deployment. Prometheus keeps its configuration file in a ConfigMap so that it can be managed separately. In addition, because Prometheus uses k8s service discovery, a serviceAccount must be created so that it has permission to read resources in k8s.

Deploying Grafana

Start with the deployment. The deployment YAML is below; adjust it to your needs:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
  labels:
    app: grafana
spec:
  replicas: 1
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: grafana
  minReadySeconds: 0
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              memory: "128Mi"
              cpu: "50m"
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 60
            periodSeconds: 10
          livenessProbe:
            tcpSocket:
              port: 3000
            initialDelaySeconds: 60
            periodSeconds: 10
          ports:
            - containerPort: 3000

Create the Grafana deployment from the YAML above, then create the service:

apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  selector:
    app: grafana
  type: NodePort
  ports:
    - protocol: TCP
      port: 3000
      targetPort: 3000
      nodePort: 31120

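Once the service is created, Grafana is reachable from outside the k8s cluster, and with the nginx configuration above it can be accessed directly by domain name.

Deploying Prometheus

ServiceAccount

First, create a ServiceAccount. k8s uses role-based access control (RBAC) for authorization, so after creating the ServiceAccount we also need a ClusterRole and a ClusterRoleBinding. The ClusterRole specifies the permissions, and the ClusterRoleBinding associates that role with the serviceAccount. For convenience, all three are defined in one YAML file: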

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring

ConfigMap

With the ServiceAccount in place, create the Prometheus configuration file. It is stored in a ConfigMap and mounted into the Prometheus container:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  default: |
    # my global config
    global:
      scrape_interval: 10s # Scrape targets every 10 seconds. The default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A single scrape job whose targets are discovered through the Kubernetes API
    scrape_configs:
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          # Keep only targets whose service carries the should_be_scraped annotation set to true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_should_be_scraped]
            action: keep
            regex: true
          # Copy every pod label onto the scraped metrics
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: k8s_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: k8s_service
          - source_labels: [__meta_kubernetes_pod_name]
            separator: ;
            regex: (.*)
            replacement: $1
            target_label: k8s_pod
            action: replace

Apply the YAML above to deploy the configuration that Prometheus needs.

We can use Prometheus's relabel mechanism to attach metadata to the metrics, so that we know which namespace, which service, and which Pod each metric came from. All available metadata labels are shown on the Prometheus targets page, or see the documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration#kubernetes_sd_config

__meta_kubernetes_service_annotation_prometheus_io_should_be_scraped is a rule I added later. Without it, Prometheus tries to pull metrics from every endpoint it discovers in k8s, so many resources that do not expose Prometheus metrics would be polled continuously. With this rule, Prometheus only pulls metrics when the service's annotations include prometheus.io/should_be_scraped set to "true".

An example configuration for a service that Prometheus should scrape:

apiVersion: v1
kind: Service
metadata:
  name: reservation-server
  annotations:
    prometheus.io/should_be_scraped: "true"
spec:
  selector:
    app: reservation-server
  type: NodePort
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 31220

If a different metrics_path is needed later, the same pattern applies: add an annotation such as prometheus.io/metrics-path and relabel it into the path that metrics are actually pulled from.
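A minimal sketch of that idea follows. It is not part of the original configuration: the annotation name prometheus.io/metrics-path and the path /custom/metrics are assumptions for illustration. Prometheus exposes service annotations as __meta_kubernetes_service_annotation_<name> with non-alphanumeric characters replaced by underscores, and reads the scrape path from the special __metrics_path__ label.

```yaml
# Extra rule appended to relabel_configs of the 'kubernetes-service-endpoints' job.
# Because regex is (.+), the rule only fires when the annotation is present and
# non-empty; otherwise the default path /metrics is left untouched.
relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_metrics_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
---
# Matching annotation on the service (hypothetical path value)
apiVersion: v1
kind: Service
metadata:
  name: reservation-server
  annotations:
    prometheus.io/should_be_scraped: "true"
    prometheus.io/metrics-path: "/custom/metrics"
```

Services without the extra annotation keep being scraped at /metrics, so the rule can be added without touching existing services.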

Deployment

The serviceAccount and config that Prometheus needs are now ready, so applying the YAML below deploys the application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  replicas: 1
  revisionHistoryLimit: 2 # how many old ReplicaSets for this Deployment you want to retain, https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#clean-up-policy
  selector:
    matchLabels:
      app: prometheus
  minReadySeconds: 0
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
        - name: prometheus
          image: prom/prometheus
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              memory: "512Mi"
              cpu: "200m"
          readinessProbe:
            httpGet:
              path: /-/ready
              port: 9090
            initialDelaySeconds: 60
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /-/healthy
              port: 9090
            initialDelaySeconds: 60
            periodSeconds: 10
          ports:
            - containerPort: 9090
          volumeMounts:
            # Mount the 'default' key of the ConfigMap as the Prometheus config file
            - name: config
              mountPath: /etc/prometheus/prometheus.yml
              subPath: default
      volumes:
        - name: config
          configMap:
            name: prometheus-config

Service

Once the deployment is created, create the service from the configuration below and Prometheus becomes accessible:

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
    - protocol: TCP
      port: 9090
      targetPort: 9090
      nodePort: 31110

Sample

Run kubectl get all -n monitoring to check the resources after deployment:

Resources

Open Prometheus and run a simple query to take a look:

Prometheus query

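In Grafana, add a DataSource. For the host, the service name prometheus is enough; Grafana then reaches Prometheus over the cluster-internal network instead of taking a round trip through the public internet.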
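As an alternative to adding the data source through the UI, it could also be declared through Grafana's provisioning files. The sketch below is an assumption rather than part of the original setup: the ConfigMap name grafana-datasources and the file name prometheus.yaml are hypothetical, and the URL relies on the prometheus service on port 9090 living in the same monitoring namespace as Grafana.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources   # hypothetical name
  namespace: monitoring
data:
  prometheus.yaml: |
    # Grafana data source provisioning file
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy              # Grafana's backend issues the queries
        url: http://prometheus:9090  # resolved via the in-cluster service name
        isDefault: true
```

For this to take effect, the ConfigMap would need to be mounted into the Grafana container under /etc/grafana/provisioning/datasources, which is where the Grafana image looks for data source provisioning files by default.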

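Create a new Dashboard to display the query above in Grafana: add a new Panel and enter the query we just ran.

Labels can be used in the Legend with the syntax {{label_name}}

The options on the right make it easy to show the minimum, maximum, average, current value, and total.

To add a filter, for example to show only one app's data, add a condition to the query expression using the syntax metrics_name{label_name="label_value"}

For more query syntax, see the official documentation: https://prometheus.io/docs/prometheus/latest/querying/basics/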

The deployment above does not mount any data volume. In a real deployment, consider mounting the data directory so that the data survives a restart. If you do not care about keeping the data, this can be ignored.
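One way to do this is a PersistentVolumeClaim mounted at the Prometheus data directory. This is a sketch under stated assumptions, not the original author's setup: the claim name prometheus-data and the 10Gi size are hypothetical, the cluster is assumed to have a default StorageClass, /prometheus is assumed to be the storage path used by the prom/prometheus image, and the fsGroup value assumes the image runs as the nobody user (65534).

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data        # hypothetical name
  namespace: monitoring
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi            # assumed size, adjust to your retention needs
---
# Fragment to merge into the Prometheus Deployment's pod template
spec:
  template:
    spec:
      securityContext:
        fsGroup: 65534         # assumption: lets the non-root container user write to the volume
      containers:
        - name: prometheus
          volumeMounts:
            - name: data
              mountPath: /prometheus   # assumed data directory of the prom/prometheus image
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-data
```

The second document is only a fragment: its volumeMounts and volumes entries are meant to be added alongside the existing config volume, not to replace it.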

Reference

https://github.com/OpenReservation/ReservationServer/blob/dev/k8s/prometheus/deployment.yaml
https://github.com/OpenReservation/ReservationServer/blob/dev/k8s/prometheus/configMap.yaml
https://github.com/OpenReservation/ReservationServer/blob/dev/k8s/grafana/deployment.yaml
https://github.com/OpenReservation/ReservationServer/blob/dev/k8s/grafana/service.yaml
https://medium.com/kubernetes-tutorials/monitoring-your-kubernetes-deployments-with-prometheus-5665eda54045
https://prometheus.io/docs/prometheus/latest/configuration/configuration
https://prometheus.io/docs/prometheus/latest/querying/basics/
