Prometheus Thanos Cluster Architecture
I. Limitations of Standalone Prometheus
Let's first review the problems of a single-instance Prometheus deployment.
Single point of failure
A single Prometheus instance is itself a single point of failure, and the availability of the monitoring system matters just as much as that of the systems it watches.
Storage capacity
By default, samples written to the local TSDB are retained for only 15 days. The retention can be changed with --storage.tsdb.retention.time, but the larger the value, the more disk space is required. Even if we expand a single node's disk to keep metrics longer, the data is still not safe on one machine, which is a serious problem for scenarios that need long-term metrics for analysis and reporting.
II. Prometheus Federation
Federation is Prometheus's built-in clustering mechanism; the core idea is cascaded scraping. For example, a company may run N Kubernetes clusters, or span N data centers, with one Prometheus instance deployed per data center (or per k8s cluster); the data of these instances is isolated from one another.
We can create a Prometheus data source for each instance in Grafana or Nightingale and switch dashboards to view each data center (k8s cluster), but fundamentally the data of the instances is still not joined up.
Federation pulls data from multiple instances into one central instance, which then becomes the bottleneck (single point, capacity). To relieve the capacity pressure on the central instance, it should be configured to scrape only the metrics that need cross-instance aggregation or that other teams also care about; most metrics stay on the edge instances, letting them absorb the bulk of the data.
Example configuration:
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 30s
    # the central node scrapes metrics from the /federate path of each edge node
    metrics_path: '/federate'
    # on label conflicts, keep the labels from the scraped (source) data
    honor_labels: true
    params:
      # regex-match every metric whose name starts with aggr:
      # (our naming convention for persisted recording rules)
      'match[]':
        - '{__name__=~"aggr:.*"}'
    static_configs:
      - targets:
        - 'prometheus-edge1:9090'
        - 'prometheus-edge2:9090'
When to use federation
- The edge Prometheus instances absorb the vast majority of the metrics and already cover day-to-day alerting and dashboard needs
- The central instance scrapes only the small set of metrics that must be aggregated across instances, which keeps it below the capacity ceiling of a single Prometheus
In short, federation does not solve the single-point and capacity problems of Prometheus itself; it only offers a simple way to join data across instances. In practice, a remote-storage solution is therefore the more common choice.
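To make the edge-side convention concrete, a rule group that pre-aggregates metrics under the aggr: prefix might look like the sketch below (the http_requests_total metric and the rule names are illustrative assumptions, not part of this deployment):

```yaml
groups:
  - name: aggr_rules
    interval: 30s
    rules:
      # pre-aggregate the per-target series into one cluster-wide series,
      # so the central instance only needs to federate this single metric
      - record: aggr:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m]))
```

The central instance's `'match[]': '{__name__=~"aggr:.*"}'` selector then picks up exactly these pre-aggregated series.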
III. Prometheus Remote Storage
Thanos
The defining feature of Thanos is that it uses object storage for massive time-series retention; this can be a public-cloud OSS or a private MinIO deployment.
Thanos integrates with Prometheus in either sidecar or receiver mode, and consists of the following components:
- Sidecar: runs alongside Prometheus and backs up Prometheus TSDB blocks to object storage. It also implements the Thanos Store API, so the Thanos Querier can query recent metrics from Prometheus through it
- Receiver: Prometheus instances write to the Receiver via Remote Write; the Receiver buffers the data in a local TSDB and periodically uploads blocks to object storage. The Receiver also implements the Thanos Store API, again for the Thanos Querier to consume
- Querier: the Thanos Querier provides a single global query entry point; it automatically deduplicates data across instances and merges data from the underlying components (object storage, Sidecar, Receiver)
- Store: exposes the metric data held in object storage to the Querier
- Compactor (optional): the Thanos Compactor compacts and rotates the data in object storage, e.g. compacting (merging) blocks, downsampling (producing lower-resolution series for long time ranges), and deleting samples past their retention
- Ruler (optional): the Thanos Ruler evaluates Prometheus recording and alerting rules
- Query Frontend (optional): improves query performance by splitting large queries and caching results
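To make the Querier's role concrete, a minimal sketch of its container arguments is shown below. The dnssrv+ prefix tells Thanos to discover gRPC Store API endpoints through DNS SRV lookups against a headless Service; the Service name here matches the thanos-store-gateway Service used later in this walkthrough, but treat the exact values as assumptions:

```yaml
args:
  - query
  - --http-address=0.0.0.0:9090
  # discover every Pod exposing the Store API (sidecars, receivers,
  # store gateways) via DNS SRV lookups against a headless Service
  - --store=dnssrv+_grpc._tcp.thanos-store-gateway.kube-mon.svc.cluster.local
  # series that differ only in this label are treated as replicas and deduplicated
  - --query.replica-label=replica
```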
Sidecar mode
You simply add a Sidecar container to the Prometheus Pod. The Sidecar can upload one TSDB block to object storage every 2 hours, and it is also exposed as a service to the Thanos Querier for querying recent metrics; long-term data is queried from object storage via the Store component.
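A minimal sketch of such a sidecar container, assuming it shares the Prometheus data volume and the object-storage secret (the volume and mount names are assumptions):

```yaml
containers:
  # added alongside the prometheus container in the same Pod
  - name: thanos-sidecar
    image: thanosio/thanos:v0.31.0
    args:
      - sidecar
      # same directory the prometheus container writes its TSDB to
      - --tsdb.path=/prometheus
      - --prometheus.url=http://localhost:9090
      - --objstore.config-file=/etc/secret/thanos.yaml
    ports:
      - name: grpc
        containerPort: 10901
```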
Read path:
- The client sends a request to the Query component via the query API; Query translates the request into StoreAPI calls and fans it out to the sidecar, rule, and store components
- Sidecar receives the Query component's request, converts it into a query API request against the Prometheus instance it is bound to, and Prometheus answers from its local data (recent, short-term data)
- Ruler receives the Query component's request and answers directly from its local data, returning recently evaluated rule results
- Store receives the Query component's request, first walks the meta.json of each block in the object-storage bucket and filters by time range and labels, then reads the block's index and chunks to execute the query (frequently queried index data is cached), returning long-term historical samples and evaluated metrics
Write path:
- Prometheus scrapes the metrics endpoints at the configured interval (and evaluates its persisted recording rules), storing the samples locally in TSDB block format; within the current window the data lives in the WAL, and once the window (2 hours) completes, a TSDB block is written out
- Sidecar watches the Prometheus data directory; when a new block appears it uploads the block to object storage, amending the block's meta.json with Thanos-specific fields (such as external_labels) during upload
- Ruler periodically queries the Query component according to its configured recording rules, stores the evaluation results locally in TSDB block format, and uploads each newly produced block to object storage itself, as long-term historical data
- Compactor periodically compacts and downsamples the blocks in object storage. Each compaction increments the level field in the block's meta.json (starting at 1); downsampling creates new blocks by extracting values from existing blocks at a given sampling step, recording that step as the resolution in meta.json
Alerting path:
- Prometheus periodically evaluates its own alerting rules against the metrics it scrapes and fires alerts to Alertmanager when the conditions are met
- Ruler evaluates its configured alerting rules by periodically querying the Query component for the required metrics, and fires alerts to Alertmanager when the conditions are met
- Alertmanager groups and merges the alerts received from Prometheus and Ruler, and finally sends out notifications through the configured channels
Receiver mode
The flow differs only slightly from Sidecar mode, so a quick overview:
- Reads: the Query component reads the TSDB blocks buffered by the Receiver
- Writes: Prometheus writes over the network via Remote Write; the Receiver picks the handling instance according to the hashring configuration, and it also watches its local directory for newly produced blocks, uploading each new block to object storage
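On the Prometheus side, receiver mode only requires a remote_write entry pointing at the Receiver's write endpoint (19291 is the remote-write port used by the deployment later in this document):

```yaml
remote_write:
  # the Receiver's remote-write listener
  - url: "http://thanos-receiver:19291/api/v1/receive"
```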
IV. Deploying Thanos
Bootstrapping the Kubernetes cluster
Request the ECS nodes
$ python aliyun-ecs-sdk.py apply
Success. Instance creation succeed. InstanceIds: i-hp31frvfcnik5l3t3wdo, i-hp31frvfcnik5l3t3wdp, i-hp31frvfcnik5l3t3wdq
Instance boot successfully: node00001 39.104.22.155 172.16.0.13
Instance boot successfully: node00002 39.104.66.174 172.16.0.11
Instance boot successfully: node00003 39.104.169.161 172.16.0.12
Set the master node IP in roles/kubernetes/vars/main.yml
K8S_MASTER_INTERNAL_ADVERTISE_ADDRESS: "172.16.0.13"
Update the hosts file roles/initial/files/hosts
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
172.16.0.13 k8s-master01
172.16.0.11 k8s-worker02
172.16.0.12 k8s-worker03
Set the ECS hostnames
$ ap -i alicloud.py --tags=initial setup.yml
Refresh the Alibaba Cloud inventory cache
$ ./alicloud.py --refresh-cache
Create the Kubernetes cluster
ap -i alicloud.py --tags=kubernetes setup.yml
Check the cluster status
kc get cs; kc get node; kc get pods -A; kc get --raw='/readyz?verbose'
If everything is healthy, the output looks like this
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true","reason":""}
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready control-plane,master 10m v1.22.2
k8s-worker02 Ready <none> 10m v1.22.2
k8s-worker03 Ready <none> 10m v1.22.2
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-2bhp5 1/1 Running 0 10m
kube-flannel kube-flannel-ds-4m9zq 1/1 Running 0 10m
kube-flannel kube-flannel-ds-jmvsf 1/1 Running 0 10m
kube-system coredns-5485ddfd7b-drwnn 1/1 Running 0 10m
kube-system coredns-5485ddfd7b-twc6v 1/1 Running 0 10m
kube-system etcd-k8s-master01 1/1 Running 0 10m
kube-system kube-apiserver-k8s-master01 1/1 Running 0 10m
kube-system kube-controller-manager-k8s-master01 1/1 Running 1 (10m ago) 10m
kube-system kube-proxy-dr2xj 1/1 Running 0 10m
kube-system kube-proxy-tgsxr 1/1 Running 0 10m
kube-system kube-proxy-vkf5d 1/1 Running 0 10m
kube-system kube-scheduler-k8s-master01 1/1 Running 1 (10m ago) 10m
[+]ping ok
[+]log ok
[+]etcd ok
[+]informer-sync ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/priority-and-fairness-config-consumer ok
[+]poststarthook/priority-and-fairness-filter ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/priority-and-fairness-config-producer ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[+]shutdown ok
readyz check passed
Thanos Receiver Mode
1. namespace
Create the data directories
# k8s-worker02
$ mkdir -p /data/k8s/{prometheus,thanos-store-gateway-cache}
# k8s-worker03
$ mkdir -p /data/k8s/{grafana,minio,thanos-receiver}
Create the kube-mon namespace
$ kc apply -f ns.yml
2. RBAC
Create the ServiceAccount, ClusterRole, and ClusterRoleBinding for prometheus
$ kc apply -f rbac.yml
3. StorageClass
Resource definition
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
# delayed binding: bind only when the first Pod that claims the PVC is scheduled
volumeBindingMode: WaitForFirstConsumer
Create the resource
$ kc apply -f storageclass.yml
4. minio oss
MinIO is an open-source, high-performance distributed object storage service designed for large-scale private-cloud infrastructure. It is compatible with the Amazon S3 API and is well suited to storing images, videos, log files, archives and backups, and container or VM images; objects can be of any size, from a few KB up to 5 TiB.
The manifest below deploys MinIO in standalone mode
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio-local
  labels:
    app: minio
spec:
  accessModes:
    # read-write, mountable by a single node only
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  storageClassName: local-storage
  local:
    path: /data/k8s/minio
  persistentVolumeReclaimPolicy: Retain
  # node affinity: pin to the worker03 node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-worker03
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio-pvc
spec:
  accessModes:
    # read-write, mountable by a single node only
    - ReadWriteOnce
  resources:
    requests:
      storage: 5G
  storageClassName: local-storage # a local PV is preferred
---
apiVersion: v1
kind: Service
metadata:
  name: minio
spec:
  selector:
    app: minio
  type: NodePort
  ports:
    - name: console
      port: 9001
      targetPort: 9001
      nodePort: 30091
    - name: api
      port: 9000
      targetPort: 9000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: minio-pvc
      containers:
        - name: minio
          image: minio/minio:latest
          imagePullPolicy: IfNotPresent
          args: ["server", "--console-address", ":9001", "/data"]
          env:
            # MINIO_ROOT_USER is the WebUI login user name
            - name: MINIO_ROOT_USER
              value: "m1n10_AccessKey"
            # MINIO_ROOT_PASSWORD is the WebUI login password
            - name: MINIO_ROOT_PASSWORD
              value: "m1n10_SecretKey"
          ports:
            # API
            - containerPort: 9000
            # WebUI
            - containerPort: 9001
          readinessProbe:
            httpGet:
              path: /minio/health/ready
              port: 9000
          livenessProbe:
            httpGet:
              path: /minio/health/ready
              port: 9000
            initialDelaySeconds: 10
            periodSeconds: 10
          volumeMounts:
            - mountPath: /data
              name: data
Create the MinIO object storage
$ kc apply -f minio-deploy.yml
Verify
$ kc get deploy minio
NAME READY UP-TO-DATE AVAILABLE AGE
minio 1/1 1 1 65m
$ kc logs -l app=minio
Create the credentials secret
$ kc create secret generic thanos-objectstorage --from-file=thanos.yaml=thanos-minio.yml -n kube-mon
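The thanos-minio.yml wrapped into the secret above is not shown in this document; a typical S3-compatible objstore config for an in-cluster MinIO would look roughly like the sketch below (the bucket name matches this walkthrough, but the endpoint host depends on the namespace MinIO landed in, so treat these values as assumptions):

```yaml
type: S3
config:
  bucket: thanos
  # <service>.<namespace>.svc:9000; adjust to where minio was deployed
  endpoint: minio.default.svc.cluster.local:9000
  access_key: m1n10_AccessKey
  secret_key: m1n10_SecretKey
  # plain HTTP inside the cluster
  insecure: true
```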
Access the WebUI
# user name m1n10_AccessKey
# password m1n10_SecretKey
Create a bucket named thanos
5. node_exporter
Resource definition
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-mon
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: dev
          hostPath:
            path: /dev
        - name: sys
          hostPath:
            path: /sys
        - name: root
          hostPath:
            path: /
        - name: system-dbus-socket
          hostPath:
            path: /var/run/dbus/system_bus_socket
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.5.0
          args:
            - --web.listen-address=0.0.0.0:9110
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
            - --path.rootfs=/host/root
            - --collector.filesystem.ignored-mount-points=^/(proc|var/lib/containerd/.+|/var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
            - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
            - --collector.textfile
            - --collector.netdev.device-exclude="^(lo|docker[0-9]|veth.+)$"
            - --collector.systemd
            - --collector.systemd.unit-whitelist="(docker|ssh).service"
            - --collector.conntrack
            - --collector.cpu
            - --collector.diskstats
            - --collector.filefd
            - --collector.filesystem
            - --collector.loadavg
            - --collector.meminfo
            - --collector.netdev
            - --collector.netstat
            - --collector.ntp
            - --collector.sockstat
            - --collector.stat
            - --collector.time
            - --collector.uname
            - --collector.vmstat
            - --collector.tcpstat
            - --collector.xfs
            - --collector.zfs
            - --no-collector.arp
            - --no-collector.bcache
            - --no-collector.bonding
            - --no-collector.buddyinfo
            - --no-collector.drbd
            - --no-collector.edac
            - --no-collector.entropy
            - --no-collector.hwmon
            - --no-collector.infiniband
            - --no-collector.interrupts
            - --no-collector.ipvs
            - --no-collector.ksmd
            - --no-collector.logind
            - --no-collector.mdadm
            - --no-collector.meminfo_numa
            - --no-collector.mountstats
            - --no-collector.nfs
            - --no-collector.nfsd
            - --no-collector.qdisc
            - --no-collector.runit
            - --no-collector.supervisord
            - --no-collector.timex
            - --no-collector.wifi
          ports:
            - containerPort: 9110
          env:
            - name: HOSTIP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
          resources:
            requests:
              cpu: 150m
              memory: 180Mi
            limits:
              cpu: 150m
              memory: 180Mi
          securityContext:
            runAsUser: 65534
            runAsNonRoot: true
          volumeMounts:
            - mountPath: /host/proc
              name: proc
            - mountPath: /host/sys
              name: sys
            - mountPath: /host/root
              name: root
              readOnly: true
              mountPropagation: HostToContainer
            - mountPath: /var/run/dbus/system_bus_socket
              name: system-dbus-socket
              readOnly: true
      tolerations:
        - operator: "Exists"
Create the resource
$ kc apply -f node-exporter.yml
Verify
$ kc get ds -A
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-flannel kube-flannel-ds 3 3 3 3 3 <none> 84m
kube-mon node-exporter 3 3 3 3 3 kubernetes.io/os=linux 16m
kube-system kube-proxy 3 3 3 3 3 kubernetes.io/os=linux 85m
$ kc -n kube-mon logs -l app=node-exporter
6. thanos store
The Thanos Store component serves the historical metrics kept in object storage (MinIO) to the Querier
Resource definition
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: thanos-store-gateway-local
  namespace: kube-mon
  labels:
    app: thanos-store-gateway
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 2Gi
  storageClassName: local-storage
  local:
    path: /data/k8s/thanos-store-gateway-cache
  persistentVolumeReclaimPolicy: Retain
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-worker02
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: thanos-store-gateway-pvc
  namespace: kube-mon
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2G
---
# this headless Service provides SRV records so the querier can discover Store API endpoints
apiVersion: v1
kind: Service
metadata:
  name: thanos-store-gateway
  namespace: kube-mon
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: grpc
      port: 10901
      targetPort: grpc
  selector:
    # select the StatefulSet Pods labeled thanos-store-api: "true"
    thanos-store-api: "true"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: thanos-store-gateway
  namespace: kube-mon
  labels:
    app: thanos-store-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: thanos-store-gateway
  serviceName: thanos-store-gateway
  template:
    metadata:
      labels:
        app: thanos-store-gateway
        # extra label so the headless Service above can discover these Pods
        thanos-store-api: "true"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - thanos-store-gateway
      containers:
        - name: thanos
          image: thanosio/thanos:v0.31.0
          imagePullPolicy: IfNotPresent
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          args:
            - "store"
            - "--log.level=debug"
            - "--data-dir=/data/$(POD_NAME)"
            - "--objstore.config-file=/etc/secret/thanos.yaml"
            - "--index-cache-size=500MB"
            - "--chunk-pool-size=500MB"
          ports:
            - name: http
              containerPort: 10902
            - name: grpc
              containerPort: 10901
          livenessProbe:
            httpGet:
              port: 10902
              path: /-/healthy
          readinessProbe:
            httpGet:
              port: 10902
              path: /-/ready
          volumeMounts:
            - name: object-storage-config
              mountPath: /etc/secret
              readOnly: false
            - mountPath: /data
              name: thanos-store-gateway-cache-volume
      volumes:
        - name: object-storage-config
          secret:
            secretName: thanos-objectstorage
        - name: thanos-store-gateway-cache-volume
          persistentVolumeClaim:
            claimName: thanos-store-gateway-pvc
Create the resources
$ kc apply -f thanos-store.yml
Verify
$ kc get sts -A
NAMESPACE NAME READY AGE
kube-mon thanos-store-gateway 2/2 93s
$ kc -n kube-mon logs -l app=thanos-store-gateway
# ...
level=info ts=2023-04-08T03:30:03.084382377Z caller=store.go:370 msg="bucket store ready" init_duration=10.13259ms
level=debug ts=2023-04-08T03:30:03.084430366Z caller=fetcher.go:319 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=info ts=2023-04-08T03:30:03.084660673Z caller=intrumentation.go:56 msg="changing probe status" status=ready
level=info ts=2023-04-08T03:30:03.084716398Z caller=grpc.go:131 service=gRPC/server component=store msg="listening for serving gRPC" address=0.0.0.0:10901
level=info ts=2023-04-08T03:30:03.08623262Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.826716ms duration_ms=1 cached=0 returned=0 partial=0
7. thanos receiver
The Thanos Receiver component accepts remote-write requests from any Prometheus instance and stores the data in its local TSDB; it can optionally upload the local TSDB blocks to object storage periodically. Because the Receiver exposes the Store API, the Thanos Querier can read the most recent data from it
Resource definition
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: thanos-receiver
  namespace: kube-mon
  labels:
    app: thanos-receiver
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 2Gi
  storageClassName: local-storage
  local:
    path: /data/k8s/thanos-receiver
  persistentVolumeReclaimPolicy: Retain
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["k8s-worker03"]
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: thanos-receiver-pvc
  namespace: kube-mon
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 2G
  storageClassName: local-storage
---
apiVersion: v1
kind: Service
metadata:
  name: thanos-receiver
  namespace: kube-mon
spec:
  clusterIP: None
  ports:
    - name: grpc
      port: 10901
      targetPort: 10901
    - name: http
      port: 10902
      targetPort: 10902
    - name: remote-write
      port: 19291
      targetPort: 19291
  selector:
    app: thanos-receiver
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hashring-config
  namespace: kube-mon
data:
  # Remote Write request routing; requests without a tenant fall into the
  # default tenant, which is handled by the three Receiver instances below
  hashring.json: |-
    [
      {
        "endpoints": [
          "thanos-receiver-0.thanos-receiver:10901",
          "thanos-receiver-1.thanos-receiver:10901",
          "thanos-receiver-2.thanos-receiver:10901"
        ]
      }
    ]
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: thanos-receiver
  name: thanos-receiver
  namespace: kube-mon
spec:
  selector:
    matchLabels:
      app: thanos-receiver
  serviceName: thanos-receiver
  # number of receiver instances
  replicas: 3
  template:
    metadata:
      labels:
        app: thanos-receiver
        # lets the querier component discover thanos-receiver
        thanos-store-api: "true"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - thanos-receiver
      volumes:
        - name: object-storage-config
          secret:
            secretName: thanos-objectstorage
        - name: hashring-config
          configMap:
            name: hashring-config
        - name: data-volume
          persistentVolumeClaim:
            claimName: thanos-receiver-pvc
      containers:
        - name: thanos-receiver
          image: thanosio/thanos:v0.31.0
          imagePullPolicy: IfNotPresent
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          args:
            - receive
            - --grpc-address=0.0.0.0:10901
            - --http-address=0.0.0.0:10902
            - --remote-write.address=0.0.0.0:19291
            - --receive.replication-factor=1
            - --objstore.config-file=/etc/secret/thanos.yaml
            - --tsdb.path=/var/thanos/receiver/$(POD_NAME)
            - --tsdb.retention=1d
            - --label=receive_replica="$(POD_NAME)"
            - --receive.local-endpoint=$(POD_NAME).thanos-receiver:10901 # this node's endpoint; it must match the hosts recorded in the hashring
            - --receive.hashrings-file=/var/lib/thanos-receive/hashring.json # hashring file describing the Receiver cluster's hash ring
          ports:
            - containerPort: 10901
              name: grpc
            - containerPort: 10902
              name: http
            - containerPort: 19291
              name: remote-write
          livenessProbe:
            failureThreshold: 8
            periodSeconds: 30
            httpGet:
              port: 10902
              scheme: HTTP
              path: /-/healthy
          readinessProbe:
            failureThreshold: 20
            periodSeconds: 5
            httpGet:
              port: 10902
              scheme: HTTP
              path: /-/healthy
          volumeMounts:
            # data directory
            - name: data-volume
              mountPath: /var/thanos/receiver
              readOnly: false
            # hashring configuration
            - name: hashring-config
              mountPath: /var/lib/thanos-receive
            # minio object-storage credentials
            - name: object-storage-config
              mountPath: /etc/secret
              readOnly: false
Create the resources
$ kc apply -f thanos-receiver.yml
persistentvolume/thanos-receiver created
persistentvolumeclaim/thanos-receiver-pvc created
service/thanos-receiver created
configmap/hashring-config created
statefulset.apps/thanos-receiver created
Verify
$ kc get sts -A
NAMESPACE NAME READY AGE
kube-mon thanos-receiver 3/3 83s
kube-mon thanos-store-gateway 2/2 11m
$ kc -n kube-mon logs --tail 10 thanos-receiver-0
# ...
level=info ts=2023-04-08T03:40:50.145702787Z caller=intrumentation.go:56 component=receive msg="changing probe status" status=ready
level=info ts=2023-04-08T03:40:50.145736805Z caller=receive.go:555 component=receive msg="storage started, and server is ready to receive web requests"
level=info ts=2023-04-08T03:40:50.146315543Z caller=receive.go:363 component=receive msg="listening for StoreAPI and WritableStoreAPI gRPC" address=0.0.0.0:10901
level=info ts=2023-04-08T03:40:50.146408006Z caller=grpc.go:131 component=receive service=gRPC/server component=receive msg="listening for serving gRPC" address=0.0.0.0:10901
8. prometheus
The Prometheus configuration ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: configmap-prom-config
  namespace: kube-mon
data:
  # the key ends in .tmpl; thanos renders this template later
  prometheus.yaml.tmpl: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
      # For Thanos
      external_labels:
        cluster: dayo-thanos-demo
        # a unique label per Prometheus replica
        replica: $(POD_NAME)
    # remote write target
    remote_write:
      - url: "http://thanos-receiver:19291/api/v1/receive"
    # rule files
    rule_files:
      - /etc/prometheus/rules/*.yml
    alerting:
      # drop the replica label so alerts deduplicate across replicas
      alert_relabel_configs:
        - regex: replica
          action: labeldrop
      alertmanagers:
        - scheme: http
          path_prefix: /
          static_configs:
            - targets: ['alertmanager:9193']
    # unchanged from a standalone setup
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']
      - job_name: 'node'
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          # rewrite the address to use our custom exporter port
          - source_labels: [__address__]
            action: replace
            regex: ([^:]+):.*
            replacement: $1:9110
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
      - job_name: 'kubelet'
        kubernetes_sd_configs:
          - role: node
        # scrape over https
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          # skip certificate verification
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
      - job_name: 'cadvisor'
        kubernetes_sd_configs:
          - role: node
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
            replacement: $1
          - replacement: /metrics/cadvisor
            target_label: __metrics_path__
      - job_name: 'apiserver'
        kubernetes_sd_configs:
          # endpoints
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_label_component]
            action: keep
            # keep only the endpoints of the apiserver service component
            regex: apiserver
      - job_name: 'pod'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          # discover Endpoints (Pods) via the service annotation prometheus.io/scrape: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          # read http or https from the prometheus.io/scheme annotation
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            # set the scrape-scheme label
            target_label: __scheme__
            regex: (https?)
          # read the metrics path from the prometheus.io/path annotation
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            # set the metrics-path label
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            # ([^:]+)   one or more non-colon characters, i.e. the IP address
            # (?::\d+)? a non-capturing, optional :port
            # (\d+)     the port taken from the annotation
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          # map the Service labels onto the target
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          # expose the namespace as a label
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          # expose the Service name as a label
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_service_name
          # expose the Pod name as a label
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
The Prometheus rule-files ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: configmap-prom-rules
  namespace: kube-mon
data:
  node_records.yml: |+
    groups:
      - name: "node_rules"
        interval: 15s
        rules:
          #################
          # CPU
          #################
          # node CPU usage over the last minute
          - record: node:cpu:cpu_usage
            expr: (1 - sum(irate(node_cpu_seconds_total{mode="idle"}[1m])) by (instance) / sum(irate(node_cpu_seconds_total[1m])) by (instance) )
          # per-core CPU usage over the last minute
          - record: node:cpu:per_cpu_usage
            expr: (1 - sum(irate(node_cpu_seconds_total{mode="idle"}[1m])) by (instance, cpu) / sum(irate(node_cpu_seconds_total[1m])) by (instance, cpu))
          #################
          # Memory
          #################
          # node memory usage
          - record: node:mem:memory_usage
            expr: (1 - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes) / node_memory_MemTotal_bytes)
          # memory used by tmpfs / devtmpfs (MiB)
          - record: node:mem:tmpfs_used
            expr: (node_filesystem_size_bytes{fstype=~".*tmpfs"} - node_filesystem_free_bytes{fstype=~".*tmpfs"}) / 1024 / 1024
          # average unreclaimable slab memory over the last minute (MiB)
          - record: node:mem:slab_sunreclaim
            expr: avg_over_time(node_memory_SUnreclaim_bytes[1m]) / 1024 / 1024
          # average unevictable memory on the LRU lists over the last minute (MiB)
          - record: node:mem:lru_unevictable
            expr: avg_over_time(node_memory_Unevictable_bytes[1m]) / 1024 / 1024
          #################
          # Disk
          #################
          # fraction of disk space used
          - record: node:disk:disk_space_usage
            expr: (1 - (node_filesystem_avail_bytes{fstype=~"ext.*|xfs|btrfs",device=~"/dev/vd.*"} / node_filesystem_size_bytes{fstype=~"ext.*|xfs|btrfs",device=~"/dev/vd.*"}))
          # fraction of inodes used
          - record: node:disk:inode_space_usage
            expr: (1 - (node_filesystem_files_free{fstype="ext4"} / node_filesystem_files{fstype="ext4"}))
          #################
          # DiskIO
          #################
          # read requests per second averaged over 1 minute; maps to r/s in iostat -dxk
          - record: node:disk:read_iops
            expr: sum by (instance) (rate(node_disk_reads_completed_total{device=~"vd.*"}[1m]))
          # write requests per second averaged over 1 minute; maps to w/s in iostat -dxk
          - record: node:disk:write_iops
            expr: sum by (instance) (rate(node_disk_writes_completed_total{device=~"vd.*"}[1m]))
          # read bandwidth per second averaged over 1 minute; maps to rkB/s in iostat -dxk
          - record: node:disk:read_bandwidth
            expr: sum by (instance) (irate(node_disk_read_bytes_total{device=~"vd.*"}[1m]))
          # write bandwidth per second averaged over 1 minute; maps to wkB/s in iostat -dxk
          - record: node:disk:write_bandwidth
            expr: sum by (instance) (irate(node_disk_written_bytes_total{device=~"vd.*"}[1m]))
          # average read latency in ms over 1 minute; maps to r_await in iostat -dxk
          - record: node:disk:read_await
            expr: sum by (instance) (rate(node_disk_read_time_seconds_total{device=~"vd.*"}[1m]) / rate(node_disk_reads_completed_total{device=~"vd.*"}[1m]) * 1000)
          # average write latency in ms over 1 minute; maps to w_await in iostat -dxk
          - record: node:disk:write_await
            expr: sum by (instance) (rate(node_disk_write_time_seconds_total{device=~"vd.*"}[1m]) / rate(node_disk_writes_completed_total{device=~"vd.*"}[1m]) * 1000)
          #################
          # File Descriptor
          #################
          # fraction of system-wide file descriptors in use
          - record: node:proc:os_fd_usage
            expr: (node_filefd_allocated / node_filefd_maximum)
          # fraction of per-process file descriptors in use
          - record: node:proc:proc_fd_usage
            expr: (process_open_fds{job="node"} / process_max_fds{job="node"})
          #################
          # Network
          #################
          # bytes received per second per instance and NIC, averaged over 1 minute
          - record: node:net:network_rx
            expr: sum by(instance, device) (irate(node_network_receive_bytes_total{device=~"eth.*"}[1m]))
          # bytes transmitted per second per instance and NIC, averaged over 1 minute
          - record: node:net:network_tx
            expr: sum by(instance, device) (irate(node_network_transmit_bytes_total{device=~"eth.*"}[1m]))
          #################
          # TCP
          #################
          # inbound packet error ratio per instance and NIC over 5 minutes
          - record: node:tcp:rx_error_rate5m
            expr: sum by(instance, device) (rate(node_network_receive_errs_total{device=~"eth.*"}[5m]) / rate(node_network_receive_packets_total{device=~"eth.*"}[5m]))
          # outbound packet error ratio per instance and NIC over 5 minutes
          - record: node:tcp:tx_error_rate5m
            expr: sum by(instance, device) (rate(node_network_transmit_errs_total{device=~"eth.*"}[5m]) / rate(node_network_transmit_packets_total{device=~"eth.*"}[5m]))
          # inbound packet drop ratio per instance and NIC over 5 minutes
          - record: node:tcp:rx_drop_rate5m
            expr: sum by(instance, device) (rate(node_network_receive_drop_total{device=~"eth.*"}[5m]) / rate(node_network_receive_packets_total{device=~"eth.*"}[5m]))
          # outbound packet drop ratio per instance and NIC over 5 minutes
          - record: node:tcp:tx_drop_rate5m
            expr: sum by(instance, device) (rate(node_network_transmit_drop_total{device=~"eth.*"}[5m]) / rate(node_network_transmit_packets_total{device=~"eth.*"}[5m]))
          # increase of the retransmission ratio compared with 30 minutes ago
          - record: node:tcp:retrans_rate5m
            expr: (irate(node_netstat_Tcp_RetransSegs[1m]) / irate(node_netstat_Tcp_OutSegs[1m])) - (irate(node_netstat_Tcp_RetransSegs[1m] offset 30m) / irate(node_netstat_Tcp_OutSegs[1m] offset 30m))
          # increase of the RST ratio compared with 30 minutes ago
          - record: node:tcp:rst_rate5m
            expr: (irate(node_netstat_Tcp_OutRsts[1m]) / irate(node_netstat_Tcp_OutSegs[1m])) - (irate(node_netstat_Tcp_OutRsts[1m] offset 30m) / irate(node_netstat_Tcp_OutSegs[1m] offset 30m))
          #################
          # TCP Socket
          #################
          # SYN backlog (half-open queue) overflow drops
          - record: node:socket:listen_drop
            expr: irate(node_netstat_TcpExt_ListenDrops[1m])
          # accept queue (full-connection queue) overflows
          - record: node:socket:listen_overflow
            expr: irate(node_netstat_TcpExt_ListenOverflows[1m])
          #################
          # conntrack table
          #################
          # conntrack table usage
          - record: node:net:conntrack_tb_usage
            expr: (node_nf_conntrack_entries / node_nf_conntrack_entries_limit)
node_alerts.yml: |+
groups:
- name: node_alerts
rules:
###### CPU ######
- alert: HostHighCpuLoad
# 最近 1m CPU 使用率超过 80%
expr: node:cpu:cpu_usage > 0.8
for: 0m
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }} 节点 CPU 使用率过高"
description: "最近一分钟内 {{ $labels.instance }} 节点 CPU 使用率超过 80%!\n 当前值:{{ $value }}\n LABELS = {{ $labels }}"
console: "URL: http://baidu.com"
- alert: HostHighCpuCoreLoad
# 最近 1m CPU 某个核心使用率超过 80%
expr: node:cpu:per_cpu_usage > 0.8
for: 1m
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }} 节点 CPU 核心使用率过高"
description: "最近一分钟内 {{ $labels.instance }} 节点 CPU 核心 {{ $labels.cpu }} 使用率超过 80%!\n 当前值:{{ $value }}\n LABELS = {{ $labels }}"
console: "URL: http://baidu.com"
###### Memory ######
- alert: HostHighTmpfsUsed
# tmpfs 内存使用超过 1 GiB
expr: node:mem:tmpfs_used > 200
for: 1m
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }} 节点 tmpfs 使用率过高 !"
description: "最近一分钟内 {{ $labels.instance }} 节点 tmpfs 使用率过高 !\n 当前值:{{ $value }}\n LABELS = {{ $labels }}"
- alert: HostHighMemorySlabUnreclaimUsed
# slab 不可回收内存量内存量过高
expr: node:mem:slab_sunreclaim > 1024
for: 1m
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }} slab 不可回收内存量内存量过高 "
description: "最近一分钟内 {{ $labels.instance }} slab 不可回收内存量内存量过高 !\n 当前值:{{ $value }}\n LABELS = {{ $labels }}"
- alert: HostHighMemoryLruUnreclaimUsed
# slab 不可回收内存量内存量过高
expr: node:mem:lru_unevictable > 2048
for: 1m
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }} lru list 不可回收内存量内存量过高"
description: "最近一分钟内 {{ $labels.instance }} lru list 不可回收内存量内存量过高 !\n 当前值:{{ $value }}\n LABELS = {{ $labels }}"
###### Disk ######
- alert: HostOutOfDiskSpace
# 磁盘空间使用率超过 90%
expr: node:disk:disk_space_usage > 0.9
for: 1m
labels:
severity: warning
annotations:
summary: "最近一分钟内 {{ $labels.instance }} 节点 CPU 使用率超过 80%"
description: "最近一分钟内 {{ $labels.instance }} 节点 CPU 使用率超过 80%!\n 当前值:{{ $value }}\n LABELS = {{ $labels }}"
- alert: HostDiskWillFillIn24Hour
# 通过predict_linear函数根据过去1h的数据,推测4小时后磁盘是否会满
expr: predict_linear(node_filesystem_free_bytes[1h], 24*3600) < 0
for: 0m
labels:
severity: critical
annotations:
summary: "预计实例 {{ $labels.instance }} 挂载点将在一天后打满!"
- alert: HostOutofDiskInodes
expr: node:disk:inode_space_usage > 0.8
for: 1m
labels:
security: warning
annotations:
summary: "节点 {{ $labels.instance }} 磁盘 inode 超过 80%"
description: "节点 {{ $labels.instance }} 磁盘 inode 超过 80%!\n 当前值:{{ $value }}\n LABELS = {{ $labels }}"
- alert: HostInodesWillFillIn24Hour
# 通过predict_linear函数根据过去1h的数据,推测4小时后磁盘 inode是否会满
expr: predict_linear(node_filesystem_files_free[1h], 24*3600) < 0
for: 0m
labels:
severity: critical
annotations:
summary: "预计实例 {{ $labels.instance }} 磁盘 inode 将在一天后打满!"
###### DiskIO ######
- alert: HostUnusualDiskReadLatency
expr: node:disk:read_await > 100
for: 2m
labels:
severity: warning
annotations:
summary: "节点 {{ $labels.instance }} 磁盘 读请求耗时(r_await)异常"
description: "节点 {{ $labels.instance }} 磁盘 读请求耗时(r_await)异常!\n当前值:{{ $value }}\n LABELS = {{ $labels }}"
- alert: HostUnusualDiskWriteLatency
expr: node:disk:write_await > 100
for: 2m
labels:
severity: warning
annotations:
summary: "节点 {{ $labels.instance }} 磁盘 写请求耗时(w_await)异常"
description: "节点 {{ $labels.instance }} 磁盘 写请求耗时(w_await)异常!\n当前值:{{ $value }}\n LABELS = {{ $labels }}"
###### File Descriptor ######
- alert: HostHighSystemFdUsed
expr: node:proc:os_fd_usage > 0.8
for: 1m
labels:
severity: warning
annotations:
summary: "System file descriptor usage on node {{ $labels.instance }} has exceeded 80%"
description: "System file descriptor usage on node {{ $labels.instance }} has exceeded 80%!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostHighProcessFdUsed
expr: node:proc:proc_fd_usage > 0.8
for: 1m
labels:
severity: warning
annotations:
summary: "Process file descriptor usage on node {{ $labels.instance }} has exceeded 80%"
description: "Process file descriptor usage on node {{ $labels.instance }} has exceeded 80%!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
###### TCP ######
- alert: HostNetworkReceiveErrRate
expr: node:tcp:rx_error_rate5m > 0.01
for: 1m
labels:
severity: warning
annotations:
summary: "Abnormal inbound packet error ratio on node {{ $labels.instance }}"
description: "Abnormal inbound packet error ratio on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostNetworkTransmitErrRate
expr: node:tcp:tx_error_rate5m > 0.01
for: 1m
labels:
severity: warning
annotations:
summary: "Abnormal outbound packet error ratio on node {{ $labels.instance }}"
description: "Abnormal outbound packet error ratio on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostNetworkReceiveDropRate
expr: node:tcp:rx_drop_rate5m > 0.01
for: 1m
labels:
severity: warning
annotations:
summary: "Abnormal inbound packet drop ratio on node {{ $labels.instance }}"
description: "Abnormal inbound packet drop ratio on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostNetworkTransmitDropRate
expr: node:tcp:tx_drop_rate5m > 0.01
for: 1m
labels:
severity: warning
annotations:
summary: "Abnormal outbound packet drop ratio on node {{ $labels.instance }}"
description: "Abnormal outbound packet drop ratio on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostUnusualNetworkRetransRate
expr: node:tcp:retrans_rate5m > 20
for: 1m
labels:
severity: warning
annotations:
summary: "Abnormally high TCP retransmission rate on node {{ $labels.instance }}"
description: "Abnormally high TCP retransmission rate on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostUnusualNetworkResetRate
expr: node:tcp:rst_rate5m > 20
for: 1m
labels:
severity: warning
annotations:
summary: "Abnormally high TCP reset (RST) rate on node {{ $labels.instance }}"
description: "Abnormally high TCP reset (RST) rate on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
###### TCP Socket ######
- alert: HostSynBacklogOverflow
expr: node:socket:listen_overflow > 10
for: 1m
labels:
severity: warning
annotations:
summary: "SYN (half-open connection) queue overflow detected on node {{ $labels.instance }}"
description: "SYN (half-open connection) queue overflow detected on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostAcceptBacklogOverflow
expr: node:socket:listen_overflow > 10
for: 1m
labels:
severity: warning
annotations:
summary: "Accept (fully-established connection) queue overflow detected on node {{ $labels.instance }}"
description: "Accept (fully-established connection) queue overflow detected on node {{ $labels.instance }}!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
- alert: HostHighConntrackTableUsage
expr: node:net:conntrack_tb_usage > 80
for: 1m
labels:
severity: warning
annotations:
summary: "Connection tracking table usage on node {{ $labels.instance }} is too high"
description: "Connection tracking table usage on node {{ $labels.instance }} is too high!\n Current value: {{ $value }}\n LABELS = {{ $labels }}"
Create the ConfigMap objects
$ kc apply -f configmap-prometheus-config-receiver.yml
$ kc apply -f configmap-prometheus-rules.yml
Prometheus resource definitions
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: prometheus-local
labels:
app: prometheus
spec:
accessModes:
- ReadWriteMany
capacity:
storage: 40Gi
storageClassName: local-storage
local:
# the node selected by the affinity rules must already have this directory
path: /data/k8s/prometheus
persistentVolumeReclaimPolicy: Retain
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
# select the PV affinity node by hostname
- k8s-worker02
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: prometheus-data
namespace: kube-mon
spec:
selector:
matchLabels:
app: prometheus
accessModes:
- ReadWriteMany
resources:
requests:
storage: 20Gi
storageClassName: local-storage
---
apiVersion: v1
kind: Service
metadata:
name: prometheus
namespace: kube-mon
labels:
app: prometheus
spec:
type: NodePort
selector:
app: prometheus
ports:
- name: http
port: 9090
targetPort: http
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: prometheus
namespace: kube-mon
labels:
app: prometheus
spec:
serviceName: prometheus
replicas: 2
selector:
matchLabels:
app: prometheus
thanos-store-api: "true"
template:
metadata:
labels:
app: prometheus
thanos-store-api: "true"
spec:
serviceAccountName: prometheus
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
topologyKey: kubernetes.io/hostname
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- prometheus
volumes:
- name: prom-config-volume
configMap:
# name of the ConfigMap resource
name: configmap-prom-config
- name: prom-rules-volume # Prometheus Rules
configMap:
# name of the ConfigMap resource
name: configmap-prom-rules
items:
- key: node_records.yml
path: node_records.yml
- key: node_alerts.yml
path: node_alerts.yml
- name: prom-config-shared-volume
emptyDir: { }
- name: data-volume
persistentVolumeClaim:
claimName: prometheus-data
initContainers:
- name: fix-permissions
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
image: busybox:stable
imagePullPolicy: IfNotPresent
command: ['/bin/sh', '-c', "mkdir -p /prometheus/$(POD_NAME) && chown -R nobody:nobody /prometheus"]
volumeMounts:
- name: data-volume
mountPath: /prometheus
containers:
- name: prometheus
image: prom/prometheus:v2.35.0
imagePullPolicy: IfNotPresent
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
args:
- "--config.file=/etc/prometheus-shared/prometheus.yaml"
- "--storage.tsdb.path=/prometheus/$(POD_NAME)"
- "--storage.tsdb.retention.time=6h"
- "--storage.tsdb.no-lockfile"
- "--storage.tsdb.min-block-duration=2h" # let Thanos handle block compaction
- "--storage.tsdb.max-block-duration=2h"
- "--web.enable-admin-api" # enable the admin API for managing TSDB data
- "--web.enable-lifecycle" # enable hot reload via localhost:9090/-/reload
- "--web.listen-address=:9090"
- "--web.external-url=http://0.0.0.0:9090"
ports:
- name: http
containerPort: 9090
resources:
requests:
cpu: 250m
memory: 1Gi
limits:
cpu: 250m
memory: 1Gi
volumeMounts:
- name: prom-config-shared-volume
mountPath: /etc/prometheus-shared/
- name: prom-rules-volume
mountPath: /etc/prometheus/rules/
- name: prom-config-volume
mountPath: /etc/prometheus
- name: data-volume
mountPath: /prometheus
# Thanos sidecar: renders the Prometheus config template
- name: thanos
image: thanosio/thanos:v0.31.0
imagePullPolicy: IfNotPresent
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
args:
- sidecar
- --log.level=debug
- --reloader.config-file=/etc/prometheus/prometheus.yaml.tmpl
- --reloader.config-envsubst-file=/etc/prometheus-shared/prometheus.yaml
- --reloader.rule-dir=/etc/prometheus/rules/
volumeMounts:
- name: prom-config-shared-volume
mountPath: /etc/prometheus-shared/
- name: prom-rules-volume
mountPath: /etc/prometheus/rules/
- name: prom-config-volume
mountPath: /etc/prometheus
- name: data-volume
mountPath: /prometheus
Create the Prometheus resources
$ kc apply -f prometheus-receiver.yml
service/prometheus created
statefulset.apps/prometheus created
Verify
$ kc get sts -l app=prometheus -n kube-mon
NAME READY AGE
prometheus 2/2 2m48s
$ kc -n kube-mon logs --tail 10 -l app=prometheus
ts=2023-04-08T03:49:53.287Z caller=kubernetes.go:313 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-04-08T03:49:53.296Z caller=main.go:1179 level=info msg="Completed loading of configuration file" filename=/etc/prometheus-shared/prometheus.yaml totalDuration=90.057454ms db_storage=874ns remote_storage=382.102µs web_handler=348ns query_engine=831ns scrape=236.853µs scrape_sd=79.543253ms notify=28.406µs notify_sd=21.174µs rules=7.966266ms tracing=7.848µs
ts=2023-04-08T03:49:53.296Z caller=main.go:910 level=info msg="Server is ready to receive web requests."
ts=2023-04-08T03:49:56.818Z caller=main.go:1142 level=info msg="Loading configuration file" filename=/etc/prometheus-shared/prometheus.yaml
ts=2023-04-08T03:49:56.819Z caller=kubernetes.go:313 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-04-08T03:49:56.821Z caller=kubernetes.go:313 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-04-08T03:49:56.834Z caller=main.go:1179 level=info msg="Completed loading of configuration file" filename=/etc/prometheus-shared/prometheus.yaml totalDuration=15.764754ms db_storage=1.574µs remote_storage=125.174µs web_handler=603ns query_engine=1.214µs scrape=82.578µs scrape_sd=2.370842ms notify=13.75µs notify_sd=11.465µs rules=12.105489ms tracing=8.153µs
ts=2023-04-08T03:50:01.736Z caller=dedupe.go:112 component=remote level=info remote_name=216c76 url=http://thanos-receiver:19291/api/v1/receive msg="Done replaying WAL" duration=8.450592918s
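The "Done replaying WAL" line comes from Prometheus's remote-write queue pointed at the Receiver. A minimal sketch of the `remote_write` stanza implied by that log line (the URL is taken from the log; queue tuning and other options are omitted):

```yaml
# Receiver mode: each Prometheus ships samples to the Thanos Receiver
remote_write:
  - url: http://thanos-receiver:19291/api/v1/receive
```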
Open the Prometheus WebUI to verify the configuration
$ kc get svc prometheus -n kube-mon
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
prometheus NodePort 10.108.27.23 <none> 9090:31994/TCP 4m41s
9. thanos query
The Thanos Querier component automatically deduplicates monitoring data across instances and provides a single global query entry point
In Receiver mode the Query WebUI shows no Alerts, Rules, or Targets, because its local data is only the TSDB data shipped over remote write
Resource definitions
---
# Service object exposing thanos-querier as the global query endpoint
apiVersion: v1
kind: Service
metadata:
name: thanos-querier
namespace: kube-mon
labels:
app: thanos-querier
spec:
type: NodePort
selector:
app: thanos-querier
ports:
- port: 9090
targetPort: http
name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: thanos-querier
namespace: kube-mon
labels:
app: thanos-querier
spec:
replicas: 2
selector:
matchLabels:
app: thanos-querier
template:
metadata:
labels:
app: thanos-querier
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
topologyKey: kubernetes.io/hostname
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- thanos-querier
containers:
- name: thanos
image: thanosio/thanos:v0.31.0
imagePullPolicy: IfNotPresent
args:
- query
- --log.level=debug
# labels used to deduplicate replicated series
- --query.replica-label=replica
- --query.replica-label=receive_replica
# the Querier discovers Store and Receiver instances through this endpoint; without it recent data cannot be queried
- --store=dnssrv+thanos-store-gateway:10901
ports:
- name: grpc
containerPort: 10901
- name: http
containerPort: 10902
resources:
requests:
memory: 512Mi
cpu: 250m
limits:
memory: 512Mi
cpu: 250m
livenessProbe:
initialDelaySeconds: 10
httpGet:
path: /-/healthy
port: http
readinessProbe:
initialDelaySeconds: 15
httpGet:
path: /-/ready
port: http
Create the resources
$ kc apply -f thanos-querier.yml
service/thanos-querier created
deployment.apps/thanos-querier created
Verify
$ kc get deploy thanos-querier -n kube-mon
NAME READY UP-TO-DATE AVAILABLE AGE
thanos-querier 2/2 2 2 21s
$ kc -n kube-mon logs -l app=thanos-querier
# ...
level=info ts=2023-04-08T03:56:03.094267518Z caller=intrumentation.go:75 msg="changing probe status" status=healthy
# startup complete, begin listening
level=info ts=2023-04-08T03:56:03.094289948Z caller=http.go:73 service=http/server component=query msg="listening for requests and metrics" address=0.0.0.0:10902
level=info ts=2023-04-08T03:56:03.094453505Z caller=tls_config.go:195 service=http/server component=query msg="TLS is disabled." http2=false
# readiness check passed
level=info ts=2023-04-08T03:56:03.094509886Z caller=intrumentation.go:56 msg="changing probe status" status=ready
level=info ts=2023-04-08T03:56:03.094540468Z caller=grpc.go:131 service=gRPC/server component=query msg="listening for serving gRPC" address=0.0.0.0:10901
level=debug ts=2023-04-08T03:56:08.096982106Z caller=endpointset.go:309 component=endpointset msg="starting to update API endpoints" cachedEndpoints=0
level=debug ts=2023-04-08T03:56:08.100735284Z caller=endpointset.go:312 component=endpointset msg="checked requested endpoints" activeEndpoints=5 cachedEndpoints=0
# ...
# discovered store instance 10.244.2.4:10901
level=info ts=2023-04-08T03:56:08.100807469Z caller=endpointset.go:349 component=endpointset msg="adding new store with [storeAPI]" address=10.244.2.4:10901 extLset=
# discovered receive instance 10.244.1.6:10901
level=info ts=2023-04-08T03:56:08.100829478Z caller=endpointset.go:349 component=endpointset msg="adding new receive with [storeAPI exemplarsAPI]" address=10.244.1.6:10901 extLset="{receive_replica=\"thanos-receiver-2\", tenant_id=\"default-tenant\"}"
# discovered store instance 10.244.2.5:10901
level=info ts=2023-04-08T03:56:08.100846078Z caller=endpointset.go:349 component=endpointset msg="adding new store with [storeAPI]" address=10.244.2.5:10901 extLset=
# discovered receive instance 10.244.1.4:10901
level=info ts=2023-04-08T03:56:08.100866424Z caller=endpointset.go:349 component=endpointset msg="adding new receive with [storeAPI exemplarsAPI]" address=10.244.1.4:10901 extLset="{receive_replica=\"thanos-receiver-0\", tenant_id=\"default-tenant\"}"
# discovered receive instance 10.244.1.5:10901
level=info ts=2023-04-08T03:56:08.100886006Z caller=endpointset.go:349 component=endpointset msg="adding new receive with [storeAPI exemplarsAPI]" address=10.244.1.5:10901 extLset="{receive_replica=\"thanos-receiver-1\", tenant_id=\"default-tenant\"}"
level=debug ts=2023-04-08T03:56:13.094131667Z caller=endpointset.go:309 component=endpointset msg="starting to update API endpoints" cachedEndpoints=5
# ...
Open the WebUI and try running a query
$ kc get svc thanos-querier -n kube-mon
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
thanos-querier NodePort 10.97.136.11 <none> 9090:31974/TCP 6m59s
10. thanos query frontend
Thanos provides the optional Query Frontend
component to improve query performance; its work falls into two areas
- splitting large queries into multiple smaller queries
- caching query results
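The effect of the split is easy to reason about. A minimal sketch of the arithmetic, using the 12h split interval configured below (the 36-hour range is an illustrative example):

```shell
# With --query-range.split-interval=12h, one 36-hour range query is executed
# as ceil(36/12) = 3 sub-queries whose results are stitched back together.
range_hours=36
split_hours=12
subqueries=$(( (range_hours + split_hours - 1) / split_hours ))
echo "subqueries=$subqueries"
```

Each sub-query can then hit the response cache independently, so a repeated dashboard query only recomputes the most recent chunk.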
Resource definitions
apiVersion: v1
kind: Service
metadata:
name: thanos-query-frontend
namespace: kube-mon
labels:
app: thanos-query-frontend
spec:
type: NodePort
selector:
app: thanos-query-frontend
ports:
- port: 9090
name: http
targetPort: 9090
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: thanos-query-frontend
namespace: kube-mon
labels:
app: thanos-query-frontend
spec:
selector:
matchExpressions:
- key: app
operator: In
values: ["thanos-query-frontend"]
template:
metadata:
labels:
app: thanos-query-frontend
spec:
containers:
- name: thanos
image: thanosio/thanos:v0.31.0
imagePullPolicy: IfNotPresent
env:
- name: HOST_IP_ADDRESS
valueFrom:
fieldRef:
fieldPath: status.hostIP
ports:
- containerPort: 9090
name: http
args:
- query-frontend
- --log.level=info
- --log.format=logfmt
- --query-frontend.compress-responses
- --http-address=0.0.0.0:9090
# downstream querier
- --query-frontend.downstream-url=http://thanos-querier.kube-mon.svc.cluster.local:9090
- --query-range.split-interval=12h # split range queries into 12h chunks
- --query-range.max-retries-per-request=10 # max retries when an HTTP request fails
- --query-frontend.log-queries-longer-than=10s # slow-query log threshold
- --labels.split-interval=12h # split long label queries into shorter ones
- --labels.max-retries-per-request=10
- |-
  --query-range.response-cache-config="config":
    max_size: "200MB"
    max_size_items: 0
    validity: 0s
  type: IN-MEMORY
- |-
  --labels.response-cache-config="config":
    max_size: "200MB"
    max_size_items: 0
    validity: 0s
  type: IN-MEMORY
livenessProbe:
failureThreshold: 4
periodSeconds: 30
httpGet:
port: 9090
scheme: HTTP
path: /-/healthy
readinessProbe:
failureThreshold: 20
periodSeconds: 5
httpGet:
port: 9090
scheme: HTTP
path: /-/ready
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 500m
memory: 512Mi
Create the resources
$ kc apply -f thanos-query-frontend.yml
Verify
$ kc get all -n kube-mon -l app=thanos-query-frontend
NAME READY STATUS RESTARTS AGE
pod/thanos-query-frontend-7b56c5b69f-llm6x 1/1 Running 0 2m39s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/thanos-query-frontend NodePort 10.101.88.51 <none> 9090:31335/TCP 2m39s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/thanos-query-frontend 1/1 1 1 2m39s
NAME DESIRED CURRENT READY AGE
replicaset.apps/thanos-query-frontend-7b56c5b69f 1 1 1 2m39s
$ kc -n kube-mon logs -l app=thanos-query-frontend
11. Grafana
Resource definitions
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: grafana-local
labels:
app: grafana
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 1Gi
storageClassName: local-storage
local:
path: /data/k8s/grafana
persistentVolumeReclaimPolicy: Retain
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- k8s-worker03
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: grafana-data
namespace: kube-mon
spec:
selector:
matchLabels:
app: grafana
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: local-storage
---
apiVersion: v1
kind: Service
metadata:
name: grafana
namespace: kube-mon
spec:
type: NodePort
ports:
- port: 3000
selector:
app: grafana
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: grafana
namespace: kube-mon
spec:
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
volumes:
- name: storage
persistentVolumeClaim:
claimName: grafana-data
initContainers:
- name: fix-permissions
image: busybox
command: [chown, -R, "472:472", "/var/lib/grafana"]
volumeMounts:
- mountPath: /var/lib/grafana
name: storage
containers:
- name: grafana
image: grafana/grafana:9.4.7
imagePullPolicy: IfNotPresent
ports:
- containerPort: 3000
name: grafana
env:
# Grafana login username and password
- name: GF_SECURITY_ADMIN_USER
value: admin
- name: GF_SECURITY_ADMIN_PASSWORD
value: admin123
readinessProbe:
failureThreshold: 10
httpGet:
path: /api/health
port: 3000
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
livenessProbe:
failureThreshold: 3
httpGet:
path: /api/health
port: 3000
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
cpu: 150m
memory: 512Mi
requests:
cpu: 150m
memory: 512Mi
volumeMounts:
- mountPath: /var/lib/grafana
name: storage
Create the resources
$ kc apply -f grafana.yml
Verify
$ kc get pods -n kube-mon -o wide -l app=grafana
$ kc -n kube-mon logs -f -l app=grafana
Create the 2 data sources
Import dashboard template 18435 and select the corresponding data source
Check that data is fetched and rendered correctly for each
12. promoter
promoter configuration file promoter-config.yml
---
global:
# used to run PromQL queries for rendering alert graphs
prometheus_url: http://prometheus:9090
dingtalk_api_token: xxx
dingtalk_api_secret: xxx
wechat_api_secret: xxx-xxx
wechat_api_corp_id: xxx
s3:
# Alibaba Cloud OSS, used to store the generated images
access_key: "xxx"
secret_key: "xxx"
# endpoint: "oss-cn-beijing-internal.aliyuncs.com"
endpoint: "oss-cn-beijing.aliyuncs.com"
region: "cn-beijing"
bucket: "xxx"
receivers:
- name: dingtalk
dingtalk_config:
message_type: markdown
markdown:
title: '{{ template "dingtalk.default.title" . }}'
text: '{{ template "dingtalk.default.content" . }}'
at:
atMobiles: [ "138xxxx" ]
isAtAll: true
- name: wechat
wechat_config:
message_type: markdown
message: '{{ template "wechat.default.message" . }}'
to_user: "@all"
agent_id: 1000002
Generate the base64 data for the Secret
$ cat promoter-config.yml | base64
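A hypothetical round-trip showing what the `base64` step produces and that it is lossless; the file path and the two-line config content here are placeholders, not the real promoter config:

```shell
# Encode a stand-in config the way the Secret payload is produced, then decode
# it back to confirm nothing was altered.
printf 'global:\n  prometheus_url: http://prometheus:9090\n' > /tmp/promoter-config.yml
encoded=$(base64 -w0 < /tmp/promoter-config.yml)   # -w0: no line wrapping (GNU base64)
echo "$encoded" | base64 -d
```

The decoded output matches the original file, which is exactly what Kubernetes does when it mounts the Secret.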
Define the promoter Secret object
apiVersion: v1
kind: Secret
metadata:
name: secret-promoter-config
namespace: kube-mon
data:
config.yml: |
# base64-encoded data goes here
Create the Secret object
$ kc apply -f secret-promoter-config.yml
promoter workload definition
apiVersion: v1
kind: Service
metadata:
name: promoter
namespace: kube-mon
labels:
app: promoter
spec:
type: ClusterIP
selector:
app: promoter
ports:
- port: 9194
protocol: TCP
targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: promoter
namespace: kube-mon
labels:
app: promoter
spec:
selector:
matchLabels:
app: promoter
template:
metadata:
labels:
app: promoter
spec:
volumes:
- name: promoter-config
secret:
secretName: secret-promoter-config
containers:
- name: promoter
image: lotusching/promoter:latest
imagePullPolicy: IfNotPresent
command:
- "/promoter/bin/promoter"
- "--config.file=/etc/secret/config.yml"
volumeMounts:
- mountPath: /etc/secret
name: promoter-config
ports:
- name: http
containerPort: 8080
protocol: TCP
Deploy the alertmanager webhook service
$ kc apply -f promoter.yml
Verify
$ kc get pods -n kube-mon -o wide -l app=promoter
$ kc -n kube-mon logs -l app=promoter
13. alertmanager
AlertManager ConfigMap object
apiVersion: v1
kind: ConfigMap
metadata:
name: configmap-alertmanager-config
namespace: kube-mon
data:
alertmanager.yml: |
global:
# how long alertmanager waits without receiving an alert before marking it resolved
resolve_timeout: 5m
# alert routing
route:
# labels used to regroup incoming alerts
# e.g. alerts sharing instance=A and alertname=xx labels are aggregated into one group
group_by: ['instance', 'alertname']
group_wait: 1s
group_interval: 10s
# repeat interval: re-send an unresolved alert every 2 minutes
repeat_interval: 2m
# alert receiver, set to the webhook defined below
receiver: 'promoter-webhook-wechat'
routes:
- match_re:
# severity: ^(error|critical)$
severity: ^(critical)$
receiver: promoter-webhook-dingtalk
continue: true
receivers:
- name: 'promoter-webhook-dingtalk'
webhook_configs:
# promoter Service address and port
- url: "http://promoter:9194/dingtalk/send"
send_resolved: true
- name: 'promoter-webhook-wechat'
webhook_configs:
- url: "http://promoter:9194/wechat/send"
send_resolved: true
AlertManager workload
apiVersion: v1
kind: Service
metadata:
name: alertmanager
namespace: kube-mon
labels:
app: alertmanager
spec:
selector:
app: alertmanager
type: ClusterIP
ports:
- port: 9193
targetPort: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: alertmanager
namespace: kube-mon
labels:
app: alertmanager
spec:
selector:
matchLabels:
app: alertmanager
template:
metadata:
labels:
app: alertmanager
spec:
volumes:
- name: alertmanager-config
configMap:
name: configmap-alertmanager-config
containers:
- name: alertmanager
image: prom/alertmanager:v0.25.0
imagePullPolicy: IfNotPresent
args:
- "--config.file=/etc/alertmanager/alertmanager.yml"
ports:
- containerPort: 9093
name: http
volumeMounts:
- mountPath: "/etc/alertmanager"
name: alertmanager-config
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 100m
memory: 256Mi
Deploy alertmanager
$ kc apply -f configmap-alertmanager-config.yml
$ kc apply -f alertmanager.yml
Verify
$ kc get all -n kube-mon -l app=alertmanager
$ kc -n kube-mon logs -l app=alertmanager
14. Verification
Alert test
The following command will trigger the HostHighTmpfsUsed alert rule
$ dd if=/dev/urandom of=testfile count=300 bs=1M
300+0 records in
300+0 records out
314572800 bytes (315 MB) copied, 1.13119 s, 278 MB/s
Wait a moment
Check OSS
Historical data was also uploaded to MinIO as expected
Grafana dashboards
Rendered correctly
Thanos Sidecar mode
The deployment largely mirrors the Receiver setup, so only the steps are listed; start by provisioning the ECS instances
1. namespace
Create the data directories
# k8s-worker02
$ mkdir -p /data/k8s/{prometheus,thanos-store-gateway-cache}
# k8s-worker03
$ mkdir -p /data/k8s/{grafana,minio,thanos-receiver}
Create the kube-mon namespace
$ kc apply -f ns.yml
2. RBAC
Create the serviceaccount, role, and clusterrolebinding for prometheus
$ kc apply -f rbac.yml
3. StorageClass
Create the resources
$ kc apply -f storageclass.yml
4. MinIO
Create the MinIO object storage
$ kc apply -f minio-deploy.yml
$ kc describe pod -l app=minio
$ kc logs -l app=minio
Create the credentials Secret
$ kc create secret generic thanos-objectstorage --from-file=thanos.yaml=thanos-minio.yml -n kube-mon
Open the WebUI and create the bucket thanos
# username m1n10_AccessKey
# password m1n10_SecretKey
5. node_exporter
Create the resources
$ kc apply -f node-exporter.yml
$ kc -n kube-mon describe pod -l app=node-exporter
$ kc -n kube-mon logs -l app=node-exporter
6. Thanos Store
Create the resources
$ kc apply -f thanos-store.yml
$ kc -n kube-mon describe pod -l app=thanos-store-gateway
$ kc -n kube-mon logs -l app=thanos-store-gateway
7. Prometheus and Thanos Sidecar
Configuration definition
apiVersion: v1
kind: ConfigMap
metadata:
name: configmap-prom-config
namespace: kube-mon
data:
# the file is named .tmpl because thanos renders it into the final config later
prometheus.yaml.tmpl: |
global:
scrape_interval: 15s
scrape_timeout: 15s
# For Thanos
external_labels:
cluster: dayo-thanos-demo
# each Prometheus replica carries a unique label
replica: $(POD_NAME)
# alerting rule files
rule_files:
- /etc/prometheus/rules/*.yml
alerting:
# alert deduplication: drop the per-replica label before alerts reach alertmanager
alert_relabel_configs:
- regex: replica
action: labeldrop
alertmanagers:
- scheme: http
path_prefix: /
static_configs:
- targets: ['alertmanager:9193']
# the scrape configs below are unchanged
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node'
kubernetes_sd_configs:
- role: node
relabel_configs:
# rewrite the address to use the custom node_exporter port
- source_labels: [__address__]
action: replace
regex: ([^:]+):.*
replacement: $1:9110
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- job_name: 'kubelet'
kubernetes_sd_configs:
- role: node
# scrape over https
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# skip certificate verification
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- job_name: 'cadvisor'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecure_skip_verify: true
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
replacement: $1
- replacement: /metrics/cadvisor
target_label: __metrics_path__
- job_name: 'apiserver'
kubernetes_sd_configs:
# endpoints
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_component]
action: keep
# keep only endpoints of the apiserver service component
regex: apiserver
- job_name: 'pod'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
# discover Endpoints (Pods) whose Service carries the annotation prometheus.io/scrape: true
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
# read http or https from the prometheus.io/scheme annotation
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
# set the scrape protocol label
target_label: __scheme__
regex: (https?)
# read the metrics endpoint path from the prometheus.io/path annotation
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
# set the metrics path label
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
# ([^:]+) one or more non-colon characters, matching the IP address
# (?::\d+)? non-capturing group, matching :port zero or one time
# (\d+) the port from the annotation
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
# map the Service labels
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
# map the namespace into a label
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
# map the Service name into a label
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_service_name
# map the Pod name into a label
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: kubernetes_pod_name
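The address-rewrite rule above joins `__address__` and the port annotation with `;` before applying the regex. It can be emulated with sed; note that `sed -E` has no non-capturing groups, so `(?::\d+)?` becomes a capturing group and the port backreference shifts from `\2` to `\3` (the sample address and port are made up):

```shell
# Rewrite "ip:old_port;annotated_port" to "ip:annotated_port", mirroring the
# relabel regex ([^:]+)(?::\d+)?;(\d+) with replacement $1:$2.
rewritten=$(echo '10.244.1.7:8443;9153' | sed -E 's/([^:]+)(:[0-9]+)?;([0-9]+)/\1:\3/')
echo "$rewritten"
```

This is why the rule works whether or not `__address__` already carries a port: the optional group simply discards it.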
Resource definitions
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: prometheus-local
labels:
app: prometheus
spec:
accessModes:
- ReadWriteMany
capacity:
storage: 40Gi
storageClassName: local-storage
local:
# the node selected by the affinity rules must already have this directory
path: /data/k8s/prometheus
persistentVolumeReclaimPolicy: Retain
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
# select the PV affinity node by hostname
- k8s-worker02
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: prometheus-data
namespace: kube-mon
spec:
selector:
matchLabels:
app: prometheus
accessModes:
- ReadWriteMany
resources:
requests:
storage: 20Gi
storageClassName: local-storage
---
apiVersion: v1
kind: Service
metadata:
name: prometheus
namespace: kube-mon
labels:
app: prometheus
spec:
type: NodePort
selector:
app: prometheus
ports:
- name: http
port: 9090
targetPort: http
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: prometheus
namespace: kube-mon
labels:
app: prometheus
spec:
serviceName: prometheus
replicas: 2
selector:
matchLabels:
app: prometheus
thanos-store-api: "true"
template:
metadata:
labels:
app: prometheus
thanos-store-api: "true"
spec:
serviceAccountName: prometheus
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
topologyKey: kubernetes.io/hostname
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- prometheus
volumes:
- name: object-storage-config
secret:
secretName: thanos-objectstorage
- name: prom-config-volume
configMap:
# name of the ConfigMap resource
name: configmap-prom-config
- name: prom-rules-volume # Prometheus Rules
configMap:
# name of the ConfigMap resource
name: configmap-prom-rules
items:
- key: node_records.yml
path: node_records.yml
- key: node_alerts.yml
path: node_alerts.yml
- name: prom-config-shared-volume
emptyDir: { }
- name: data-volume
persistentVolumeClaim:
claimName: prometheus-data
initContainers:
- name: fix-permissions
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
image: busybox:stable
imagePullPolicy: IfNotPresent
command: ['/bin/sh', '-c', "mkdir -p /prometheus/$(POD_NAME) && chown -R nobody:nobody /prometheus"]
volumeMounts:
- name: data-volume
mountPath: /prometheus
containers:
# - name: debug
# image: busybox
# imagePullPolicy: IfNotPresent
# command: ["/bin/sh", "-c", "sleep 3600"]
# volumeMounts:
# - name: prom-config-shared-volume
# mountPath: /etc/prometheus-shared/
# - name: prom-rules-volume
# mountPath: /etc/prometheus/rules/
# - name: prom-config-volume
# mountPath: /etc/prometheus/
# - name: data-volume
# mountPath: /prometheus
- name: prometheus
image: prom/prometheus:v2.35.0
imagePullPolicy: IfNotPresent
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
args:
- "--config.file=/etc/prometheus-shared/prometheus.yaml"
- "--storage.tsdb.path=/prometheus/$(POD_NAME)"
- "--storage.tsdb.retention.time=6h"
- "--storage.tsdb.no-lockfile"
# !!! 10m here is only to exercise Compact merging; a sensible value is 2h !!!
- "--storage.tsdb.min-block-duration=10m" # let Thanos handle block compaction
- "--storage.tsdb.max-block-duration=10m"
- "--web.enable-admin-api" # enable the admin API for managing TSDB data
- "--web.enable-lifecycle" # enable hot reload via localhost:9090/-/reload
- "--web.listen-address=:9090"
- "--web.external-url=http://0.0.0.0:9090"
ports:
- name: http
containerPort: 9090
resources:
requests:
cpu: 250m
memory: 1Gi
limits:
cpu: 250m
memory: 1Gi
volumeMounts:
- name: prom-config-shared-volume
mountPath: /etc/prometheus-shared/
- name: prom-rules-volume
mountPath: /etc/prometheus/rules/
- name: prom-config-volume
mountPath: /etc/prometheus
- name: data-volume
mountPath: /prometheus
- name: thanos
image: thanosio/thanos:v0.31.0
imagePullPolicy: IfNotPresent
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
args:
- sidecar
- --log.level=debug
- --tsdb.path=/prometheus/$(POD_NAME)
- --prometheus.url=http://localhost:9090
- --reloader.config-file=/etc/prometheus/prometheus.yaml.tmpl
- --reloader.config-envsubst-file=/etc/prometheus-shared/prometheus.yaml
- --reloader.rule-dir=/etc/prometheus/rules/
- --objstore.config-file=/etc/secret/thanos.yaml
ports:
- containerPort: 10901
name: grpc
- containerPort: 10902
name: http-sidecar
resources:
requests:
cpu: 250m
memory: 1Gi
limits:
cpu: 250m
memory: 1Gi
volumeMounts:
- name: prom-config-shared-volume
mountPath: /etc/prometheus-shared/
- name: prom-rules-volume
mountPath: /etc/prometheus/rules/
- name: prom-config-volume
mountPath: /etc/prometheus
- name: data-volume
mountPath: /prometheus
- name: object-storage-config
mountPath: /etc/secret
readOnly: false
Create the resources
$ kc apply -f configmap-prometheus-config-sidecar.yml
$ kc apply -f configmap-prometheus-rules.yml
$ kc apply -f prometheus-sidecar-with-store.yml
Check the resources
$ kc -n kube-mon describe pod -l app=prometheus
$ kc -n kube-mon logs -l app=prometheus
8. Thanos Query
Unlike Receiver mode, in Sidecar mode the Query WebUI can show Alerts, Recording rules, and Targets
This is because the sidecar container runs alongside the Prometheus container and its --reloader.config*
flags read and watch the configuration files, so this information is naturally available
Create the resources
$ kc apply -f thanos-querier.yml
$ kc -n kube-mon get pod -l app=thanos-querier
$ kc -n kube-mon describe pod -l app=thanos-querier
$ kc -n kube-mon logs -l app=thanos-querier
9. Thanos frontend Query
$ kc apply -f thanos-query-frontend.yml
$ kc -n kube-mon get pod -l app=thanos-query-frontend
$ kc -n kube-mon describe pod -l app=thanos-query-frontend
$ kc -n kube-mon logs -l app=thanos-query-frontend
10. Thanos Compact
When the volume of monitoring data grows very large, consider deploying the Thanos Compactor component, which handles block compaction, cleanup, and downsampling
- Compaction: merge multiple small blocks into one
- Cleanup: delete blocks past their retention period
- Downsampling: reduce data resolution
--retention.resolution-raw
: how long raw-resolution blocks are kept in object storage before deletion (unit: d, default 0d, i.e. keep forever)
--retention.resolution-5m
: how long 5-minute-resolution blocks are kept; the Compactor downsamples raw blocks older than 40 hours to 5m resolution (unit: d, default 0d)
--retention.resolution-1h
: how long 1-hour-resolution blocks are kept; 5m blocks older than 10 days are further downsampled to 1h resolution (unit: d, default 0d)
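A rough feel for why downsampling matters, ignoring that each downsampled window actually stores several aggregates (count, sum, min, max, counter) rather than one point. For a single series kept for 30 days at the 15s scrape interval used in this setup:

```shell
# Number of stored points per series over 30 days at raw 15s resolution vs
# the 5m and 1h resolutions produced by the Compactor.
seconds=$(( 30 * 24 * 3600 ))
raw=$(( seconds / 15 ))
five_min=$(( seconds / 300 ))
one_hour=$(( seconds / 3600 ))
echo "raw=$raw 5m=$five_min 1h=$one_hour"
```

Dropping from 172,800 raw points to 720 hourly windows is what keeps year-long dashboard queries over object storage tractable.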
Resource definitions
apiVersion: v1
kind: Service
metadata:
name: thanos-compactor
namespace: kube-mon
labels:
app: thanos-compactor
spec:
ports:
- port: 10902
targetPort: http
name: http
selector:
app: thanos-compactor
type: NodePort
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: thanos-compactor
namespace: kube-mon
labels:
app: thanos-compactor
spec:
replicas: 1
selector:
matchLabels:
app: thanos-compactor
serviceName: thanos-compactor
template:
metadata:
labels:
app: thanos-compactor
spec:
volumes:
- name: object-storage-config
secret:
secretName: thanos-objectstorage
containers:
- name: thanos
image: thanosio/thanos:v0.31.0
imagePullPolicy: IfNotPresent
args:
- "compact"
- "--log.level=debug"
- "--data-dir=/data"
- "--objstore.config-file=/etc/secret/thanos.yaml"
- "--retention.resolution-raw=60d" # 保留远端对象存储中多久的数据
# 空间充裕的话可以关闭下采样
# - "--debug.disable-downsampling"
- "--wait"
ports:
- name: http
containerPort: 10902
livenessProbe:
httpGet:
port: 10902
path: /-/healthy
initialDelaySeconds: 10
readinessProbe:
httpGet:
port: 10902
path: /-/ready
initialDelaySeconds: 15
volumeMounts:
- name: object-storage-config
mountPath: /etc/secret
readOnly: false
Create the resources
$ kc apply -f thanos-compactor.yml
Verify
$ kc -n kube-mon describe pod -l app=thanos-compactor
$ kc -n kube-mon logs -l app=thanos-compactor
Access the WebUI
No blocks have been uploaded yet, so the UI shows No block found.
After waiting a while, check again. Recall the Prometheus TSDB flags:
# !!! 10m is set here only to test Compact merging; a sensible production value is 2h !!!
- "--storage.tsdb.min-block-duration=10m" # let Thanos handle compaction
- "--storage.tsdb.max-block-duration=10m"
As shown, the small 10m blocks have been merged together.
11. Grafana
$ kc apply -f grafana.yml
$ kc -n kube-mon describe pod -l app=grafana
$ kc get pods -n kube-mon -o wide -l app=grafana
$ kc -n kube-mon logs -f -l app=grafana
12. Promoter
$ kc apply -f secret-promoter-config.yml
$ kc apply -f promoter.yml
$ kc get pods -n kube-mon -o wide -l app=promoter
$ kc -n kube-mon logs -l app=promoter
13. AlertManager
# Deploy
$ kc apply -f configmap-alertmanager-config.yml
$ kc apply -f alertmanager.yml
# Verify
$ kc get all -n kube-mon -l app=alertmanager
$ kc -n kube-mon logs -l app=alertmanager
14. Functional verification
Alert test
The following command triggers the HostHighTmpfsUsed alert rule (run it inside a tmpfs mount, e.g. /dev/shm):
$ dd if=/dev/urandom of=testfile count=300 bs=1M
300+0 records in
300+0 records out
314572800 bytes (315 MB) copied, 1.13119 s, 278 MB/s
The DingTalk notification came through
The WeChat notification came through
OSS storage
Watch the logs:
$ kc -n kube-mon logs prometheus-0 -c thanos | grep upload
Key output:
level=debug ts=2023-04-10T02:36:39.12636033Z caller=objstore.go:288 msg="uploaded file" from=/prometheus/prometheus-0/thanos/upload/01GXMGAZMBZT67FVVRV0HZCF82/chunks/000001 dst=01GXMGAZMBZT67FVVRV0HZCF82/chunks/000001 bucket="tracing: thanos"
level=debug ts=2023-04-10T02:36:39.208398455Z caller=objstore.go:288 msg="uploaded file" from=/prometheus/prometheus-0/thanos/upload/01GXMGAZMBZT67FVVRV0HZCF82/index dst=01GXMGAZMBZT67FVVRV0HZCF82/index bucket="tracing: thanos"
level=debug ts=2023-04-10T02:36:39.299048784Z caller=objstore.go:288 msg="uploaded file" from=/prometheus/prometheus-0/thanos/upload/01GXMGB2VABREC82XA7NXB360V/chunks/000001 dst=01GXMGB2VABREC82XA7NXB360V/chunks/000001 bucket="tracing: thanos"
level=debug ts=2023-04-10T02:36:39.333543236Z caller=objstore.go:288 msg="uploaded file" from=/prometheus/prometheus-0/thanos/upload/01GXMGB2VABREC82XA7NXB360V/index dst=01GXMGB2VABREC82XA7NXB360V/index bucket="tracing: thanos"
Check the MinIO console
Compaction
The Compact component operates on the blocks in OSS; if no data is found, it does nothing. The logs show that by default it checks once per minute:
# ...
### 02:29:27 fetch metadata
level=debug ts=2023-04-10T02:29:27.561092785Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
### nothing there yet: cached=0 returned=0 partial=0
level=info ts=2023-04-10T02:29:27.562524099Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.478105ms duration_ms=1 cached=0 returned=0 partial=0
### 02:30:27 fetch metadata
level=debug ts=2023-04-10T02:30:27.560498097Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=info ts=2023-04-10T02:30:27.562412735Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=2.012146ms duration_ms=2 cached=0 returned=0 partial=0
### 02:31:27 fetch metadata
level=debug ts=2023-04-10T02:31:27.560375625Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=info ts=2023-04-10T02:31:27.561918533Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.583797ms duration_ms=1 cached=0 returned=0 partial=0
### 02:32:27 fetch metadata
level=debug ts=2023-04-10T02:32:27.560803846Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=info ts=2023-04-10T02:32:27.56304876Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=2.387654ms duration_ms=2 cached=0 returned=0 partial=0
Once the sidecar uploads TSDB blocks to MinIO, the Compact component automatically discovers the new blocks:
level=debug ts=2023-04-10T02:37:27.561358271Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=info ts=2023-04-10T02:37:27.599941009Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=38.670452ms duration_ms=38 cached=28 returned=28 partial=0
level=info ts=2023-04-10T02:38:27.555831736Z caller=compact.go:1291 msg="start sync of metas"
level=debug ts=2023-04-10T02:38:27.555906939Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=debug ts=2023-04-10T02:38:27.568020609Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGAG7A8SWWVPVQ7MPYSS5Q
# ...
level=debug ts=2023-04-10T02:38:27.568293779Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMG9XYT0M89KVSCZ338Q3CP
level=info ts=2023-04-10T02:38:27.56833485Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=6.589682ms duration_ms=6 cached=28 returned=28 partial=0
level=info ts=2023-04-10T02:38:27.56841679Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=12.589463ms duration_ms=12 cached=28 returned=0 partial=0
### block housekeeping (aborted partial uploads, blocks marked for deletion)
level=info ts=2023-04-10T02:38:27.568431643Z caller=clean.go:34 msg="started cleaning of aborted partial uploads"
level=info ts=2023-04-10T02:38:27.568440234Z caller=clean.go:61 msg="cleaning of aborted partial uploads done"
level=info ts=2023-04-10T02:38:27.568448547Z caller=blocks_cleaner.go:44 msg="started cleaning of blocks marked for deletion"
level=info ts=2023-04-10T02:38:27.568457032Z caller=blocks_cleaner.go:58 msg="cleaning of blocks marked for deletion done"
#################################
level=debug ts=2023-04-10T02:38:27.568490022Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=debug ts=2023-04-10T02:38:27.580477594Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGAWPMJ6WY1XVHPCQJW7WC
# ...
level=debug ts=2023-04-10T02:38:27.580742951Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGAM6H4Q5BTEWFFZNG6D8Z
level=info ts=2023-04-10T02:38:27.580853148Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=12.385089ms duration_ms=12 cached=28 returned=0 partial=0
level=info ts=2023-04-10T02:38:27.580874618Z caller=compact.go:1296 msg="start of GC"
### start compaction attempt
level=info ts=2023-04-10T02:38:27.580894454Z caller=compact.go:1319 msg="start of compactions"
level=info ts=2023-04-10T02:38:27.580906902Z caller=compact.go:1355 msg="compaction iterations done"
### first pass of downsampling
level=info ts=2023-04-10T02:38:27.580926822Z caller=compact.go:430 msg="start first pass of downsampling"
level=debug ts=2023-04-10T02:38:27.580961433Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=debug ts=2023-04-10T02:38:27.594650377Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGA9JGPB54GW8GQKTFF3MY
# ...
level=debug ts=2023-04-10T02:38:27.594880554Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMG9VG9Y1QVACCAXG87F6DS
level=info ts=2023-04-10T02:38:27.594955239Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=14.018095ms duration_ms=14 cached=28 returned=0 partial=0
level=debug ts=2023-04-10T02:38:27.595026715Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=debug ts=2023-04-10T02:38:27.608109142Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGAQ4SRGFFHW1G8FYNKDGG
# ...
level=debug ts=2023-04-10T02:38:27.608356229Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGAMG1D8XSBPY11ESQAMHZ
level=info ts=2023-04-10T02:38:27.60877326Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=13.79693ms duration_ms=13 cached=28 returned=0 partial=0
### second pass of downsampling
level=debug ts=2023-04-10T02:38:27.608918151Z caller=downsample.go:246 msg="downsampling bucket" concurrency=1
level=info ts=2023-04-10T02:38:27.608974613Z caller=compact.go:444 msg="start second pass of downsampling"
level=debug ts=2023-04-10T02:38:27.609017232Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=debug ts=2023-04-10T02:38:27.622177502Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGASPAKPDMPZYRAX4WZJKW
# ...
level=info ts=2023-04-10T02:38:27.622523209Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=13.534556ms duration_ms=13 cached=28 returned=0 partial=0
level=debug ts=2023-04-10T02:38:27.622644914Z caller=downsample.go:246 msg="downsampling bucket" concurrency=1
level=info ts=2023-04-10T02:38:27.622707242Z caller=compact.go:451 msg="downsampling iterations done"
level=debug ts=2023-04-10T02:38:27.622745364Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=debug ts=2023-04-10T02:38:27.633623809Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGA9JGPB54GW8GQKTFF3MY
# ...
level=debug ts=2023-04-10T02:38:27.63391206Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGAWPDXHNCV31ZKMRT3G52
level=info ts=2023-04-10T02:38:27.633998101Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=11.278018ms duration_ms=11 cached=28 returned=0 partial=0
level=info ts=2023-04-10T02:38:27.634018874Z caller=retention.go:32 msg="start optional retention"
level=info ts=2023-04-10T02:38:27.634027758Z caller=retention.go:47 msg="optional retention apply done"
level=debug ts=2023-04-10T02:38:27.634065079Z caller=fetcher.go:327 component=block.BaseFetcher msg="fetching meta data" concurrency=32
level=debug ts=2023-04-10T02:38:27.644224495Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMGA21M8F0CRQ6XTMCJFCZD
# ...
level=debug ts=2023-04-10T02:38:27.644457389Z caller=fetcher.go:777 msg="block is too fresh for now" block=01GXMG9VG9Y1QVACCAXG87F6DS
level=info ts=2023-04-10T02:38:27.644539853Z caller=fetcher.go:478 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.502657ms duration_ms=10 cached=28 returned=0 partial=0
level=info ts=2023-04-10T02:38:27.644560797Z caller=clean.go:34 msg="started cleaning of aborted partial uploads"
level=info ts=2023-04-10T02:38:27.644569866Z caller=clean.go:61 msg="cleaning of aborted partial uploads done"
level=info ts=2023-04-10T02:38:27.644578172Z caller=blocks_cleaner.go:44 msg="started cleaning of blocks marked for deletion"
level=info ts=2023-04-10T02:38:27.644587935Z caller=blocks_cleaner.go:58 msg="cleaning of blocks marked for deletion done"
The logs show that the Compact component is now at work, including organizing blocks and deleting blocks marked for deletion (only the raw retention of 60d is configured above; the 5m and 1h resolutions keep the default 0d, so those blocks are retained indefinitely)
Grafana dashboards
Creating the data source is omitted here
Import dashboard 18435
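If you prefer to provision the data source declaratively rather than clicking through the UI, here is a sketch of a Grafana provisioning file. The URL assumes the thanos-query-frontend Service above; the file path and field names follow Grafana's standard datasource provisioning layout:

```yaml
# Sketch only -- e.g. mounted at /etc/grafana/provisioning/datasources/thanos.yaml
apiVersion: 1
datasources:
  - name: Thanos
    type: prometheus          # Thanos speaks the Prometheus query API
    access: proxy
    url: http://thanos-query-frontend.kube-mon.svc.cluster.local:9090
    isDefault: true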
Resource summary
A final summary of the Kubernetes resources:
ConfigMap
$ kc get cm -n kube-mon
NAME DATA AGE
configmap-alertmanager-config 1 3h4m
configmap-prom-config 1 3h26m
configmap-prom-rules 2 3h25m
kube-root-ca.crt 1 3h30m
Secret
$ kc get secret -n kube-mon
NAME TYPE DATA AGE
default-token-lfxrn kubernetes.io/service-account-token 3 3h31m
prometheus-token-6gxhq kubernetes.io/service-account-token 3 3h31m
secret-promoter-config Opaque 1 3h5m
thanos-objectstorage Opaque 1 3h31m
StorageClass
$ kc get sc -n kube-mon
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 3h34m
PV
$ kc get pv -n kube-mon
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
grafana-local 1Gi RWO Retain Bound kube-mon/grafana-data local-storage 3h6m
monio-local 10Gi RWO Retain Bound default/minio-pvc local-storage 3h15m
prometheus-local 40Gi RWX Retain Bound kube-mon/prometheus-data local-storage 3h9m
thanos-store-gateway-local 2Gi RWO Retain Bound kube-mon/thanos-store-gateway-pvc local-storage 3h13m
PVC
$ kc get pvc -n kube-mon
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
grafana-data Bound grafana-local 1Gi RWO local-storage 3h7m
prometheus-data Bound prometheus-local 40Gi RWX local-storage 3h10m
thanos-store-gateway-pvc Bound thanos-store-gateway-local 2Gi RWO local-storage 3h14m
DaemonSet
$ kc get ds -n kube-mon
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
node-exporter 3 3 3 3 3 kubernetes.io/os=linux 3h34m
Deployment
$ kc get deploy -n kube-mon
NAME READY UP-TO-DATE AVAILABLE AGE
alertmanager 1/1 1 1 3h9m
grafana 1/1 1 1 3h9m
promoter 1/1 1 1 3h9m
thanos-querier 2/2 2 2 3h8m
thanos-query-frontend 1/1 1 1 174m
StatefulSet
$ kc get sts -n kube-mon
NAME READY AGE
prometheus 2/2 52m
thanos-compactor 1/1 174m
thanos-store-gateway 2/2 3h16m
Service
$ kc get svc -n kube-mon -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
alertmanager ClusterIP 10.108.96.79 <none> 9193/TCP 3h10m app=alertmanager
grafana NodePort 10.107.122.187 <none> 3000:32549/TCP 3h10m app=grafana
prometheus NodePort 10.98.89.117 <none> 9090:30549/TCP 3h13m app=prometheus
promoter ClusterIP 10.101.77.189 <none> 9194/TCP 3h10m app=promoter
thanos-compactor NodePort 10.106.233.137 <none> 10902:31336/TCP 175m app=thanos-compactor
thanos-querier NodePort 10.111.95.215 <none> 9090:31295/TCP 3h9m app=thanos-querier
thanos-query-frontend NodePort 10.99.87.116 <none> 9090:32734/TCP 175m app=thanos-query-frontend
thanos-store-gateway ClusterIP None <none> 10901/TCP 3h17m thanos-store-api=true
Images
$ ctr --namespace k8s.io images ls -q|grep -v 'sha256'
docker.io/grafana/grafana:9.4.7
docker.io/library/busybox:latest
docker.io/lotusching/promoter:latest
docker.io/minio/minio:latest
docker.io/prom/alertmanager:v0.25.0
docker.io/prom/node-exporter:v1.5.0
docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
docker.io/rancher/mirrored-flannelcni-flannel:v0.20.1
docker.io/thanosio/thanos:v0.31.0
registry.aliyuncs.com/google_containers/coredns:v1.8.4
registry.aliyuncs.com/google_containers/etcd:3.5.0-0
registry.aliyuncs.com/google_containers/kube-apiserver:v1.22.2
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.22.2
registry.aliyuncs.com/google_containers/kube-proxy:v1.22.2
registry.aliyuncs.com/google_containers/kube-scheduler:v1.22.2
registry.aliyuncs.com/google_containers/pause:3.5
registry.aliyuncs.com/google_containers/pause:3.6
七、Troubleshooting
Various issues, large and small, came up during deployment; here is the general troubleshooting approach.
Check Pod status; watch the READY, STATUS, and RESTARTS columns
$ kc get pods -o wide -n kube-mon
If the status is abnormal, inspect the Pod details
$ kc -n kube-mon describe pod -l app=<name>
- Check whether the Pod was scheduled normally
- Check whether the PV/PVC is mounted correctly
- …
If the Pod problem is PV/PVC related, check that the PV and PVC are correctly bound; pay attention to the NAME and CLAIM columns:
NAME                             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS    REASON   AGE
persistentvolume/grafana-local   1Gi        RWO            Retain           Bound    kube-mon/grafana-data   local-storage            3h20m
The Systemd service-unit panels on the dashboard show no data because node_exporter runs in a container; handling this is deferred for now and a fix will be added later
- No data uploaded to MinIO
  - Check whether 01GX...-style block directories have been generated under the data directory. If not, that is normal: not enough time has passed. To verify the upload path quickly, shorten the --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration flags and recreate the Prometheus workload
  - Check the upload logs: kc -n kube-mon logs prometheus-0 -c thanos | grep upload