Monitoring Cloud Servers with Grafana and VictoriaMetrics

I. Installing VictoriaMetrics

1. Single-node deployment

bash
# Download VictoriaMetrics
wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.91.3/victoria-metrics-linux-amd64-v1.91.3.tar.gz
tar xzf victoria-metrics-linux-amd64-v1.91.3.tar.gz

# Create the storage directory
mkdir -p /data/victoria-metrics

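# Start the service. The release archive extracts a binary named victoria-metrics-prod.
# A retentionPeriod without a unit suffix is interpreted as months, so 3 means 3 months.
./victoria-metrics-prod \
  -storageDataPath=/data/victoria-metrics \
  -retentionPeriod=3 \
  -httpListenAddr=:8428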
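
Once the single-node instance is running, it can be checked through its Prometheus-compatible HTTP API. The following is a minimal sketch, assuming the instance listens on `localhost:8428` as configured above; the helper names `build_query_url` and `instant_query` are hypothetical and only illustrate the shape of an instant query.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus-compatible /api/v1/query endpoint."""
    params = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def instant_query(base_url, expr, timeout=5):
    """Run an instant query and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(build_query_url(base_url, expr), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Assumption: VictoriaMetrics is reachable at this address.
    print(build_query_url("http://localhost:8428", "up"))
```

Calling `instant_query("http://localhost:8428", "up")` against a live instance should return a JSON document with a `status` field, which makes it a quick smoke test after startup.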

2. Cluster deployment configuration

yaml
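# docker-compose.yml
version: '3'
services:
  vmstorage:
    image: victoriametrics/vmstorage
    command:
      - "--storageDataPath=/storage"
    volumes:
      - storage-data:/storage

  vmselect:
    image: victoriametrics/vmselect
    depends_on:
      - vmstorage
    command:
      # vmstorage serves vmselect queries on port 8401
      - "--storageNode=vmstorage:8401"
    ports:
      - "8481:8481"

  vminsert:
    image: victoriametrics/vminsert
    depends_on:
      - vmstorage
    command:
      # vmstorage accepts data from vminsert on port 8400
      - "--storageNode=vmstorage:8400"
    ports:
      - "8480:8480"

# Named volume referenced by vmstorage
volumes:
  storage-data: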
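
In cluster mode, writes go through vminsert and reads go through vmselect, and both URL paths carry a tenant (account) ID. The sketch below shows how those URLs are typically put together, assuming the default HTTP ports (8480 for vminsert, 8481 for vmselect), that those ports are reachable from the client, and tenant `0`; the function name `cluster_urls` is hypothetical.

```python
def cluster_urls(vminsert_host, vmselect_host, account_id=0):
    """Return remote-write and query base URLs for one VictoriaMetrics cluster tenant."""
    return {
        # Where vmagent or Prometheus remote_write should send data.
        "remote_write": f"http://{vminsert_host}:8480/insert/{account_id}/prometheus/api/v1/write",
        # What a Prometheus-type Grafana data source should use as its URL.
        "query_base": f"http://{vmselect_host}:8481/select/{account_id}/prometheus",
    }


if __name__ == "__main__":
    for name, url in cluster_urls("vminsert", "vmselect").items():
        print(name, url)
```

With the cluster layout, the Grafana data source would point at the `query_base` URL rather than at port 8428.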

II. Grafana Configuration

1. Installing Grafana

bash
# Add the Grafana APT repository (note: apt-key is deprecated on recent Debian/Ubuntu releases)
wget -q -O - https://packages.grafana.com/gpg.key | apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | tee /etc/apt/sources.list.d/grafana.list

# Install Grafana
apt update
apt install -y grafana

# Enable the service at boot and start it
systemctl enable grafana-server
systemctl start grafana-server

2. Data source configuration

json
{
  "name": "VictoriaMetrics",
  "type": "prometheus",
  "url": "http://localhost:8428",
  "access": "proxy",
  "basicAuth": false,
  "isDefault": true
}
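
A data source definition like this can be created through Grafana's HTTP API (`POST /api/datasources`) instead of the UI. The sketch below only builds the request, assuming Grafana runs at `http://localhost:3000` and that you already hold an API or service-account token; `build_datasource_request` and the token value are hypothetical placeholders.

```python
import json
import urllib.request

# Same fields as the JSON definition in this section.
DATASOURCE = {
    "name": "VictoriaMetrics",
    "type": "prometheus",
    "url": "http://localhost:8428",
    "access": "proxy",
    "basicAuth": False,
    "isDefault": True,
}


def build_datasource_request(grafana_url, api_token, datasource=DATASOURCE):
    """Build (but do not send) a POST request that creates the data source."""
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(datasource).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_datasource_request("http://localhost:3000", "YOUR_TOKEN")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping the build step separate from the send step makes it easy to inspect the payload before touching a live Grafana instance.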

III. Metrics Collection

1. vmagent configuration

yaml
# vmagent.yml
global:
  scrape_interval: 15s
  # evaluation_interval applies to rule evaluation rather than scraping,
  # so it is omitted from the vmagent config.

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']

  - job_name: 'mysql'
    static_configs:
      - targets: ['localhost:9104']

2. Service discovery configuration

yaml
# service-discovery.yml
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
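
The `keep` relabel action can be easy to misread, so here is a rough simulation of its semantics: the values of `source_labels` are joined with `;`, and the target is kept only if the joined string matches the regex in full. This is an illustrative sketch using Python's `re` module rather than the RE2 engine that Prometheus-compatible scrapers use; `keep_target` is a hypothetical name.

```python
import re

SCRAPE_LABEL = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"


def keep_target(labels, source_labels, regex, separator=";"):
    """Return True if a target would survive a `keep` relabel rule."""
    # Missing labels count as empty strings, as in Prometheus relabeling.
    value = separator.join(labels.get(name, "") for name in source_labels)
    # Relabel regexes are anchored at both ends, hence fullmatch.
    return re.fullmatch(regex, value) is not None


if __name__ == "__main__":
    annotated = {SCRAPE_LABEL: "true"}
    unannotated = {}
    print(keep_target(annotated, [SCRAPE_LABEL], "true"))
    print(keep_target(unannotated, [SCRAPE_LABEL], "true"))
```

The practical consequence is that pods without the annotation are dropped, since an empty string does not match `true`.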

IV. Alerting

1. Alert rules

yaml
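# rules.yml
# VictoriaMetrics does not evaluate rules itself; load this file into vmalert
# or another Prometheus-compatible rule evaluator.
groups:
  - name: server_alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{$labels.instance}}"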
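
To make the rule's arithmetic concrete, the sketch below reproduces it by hand: `irate` takes the per-second increase of the idle counter between the last two samples, the per-core idle rates are averaged, and the result is subtracted from 100. It also models `for: 5m` as "the condition held at every evaluation across 300 seconds". This is a simplified illustration that ignores counter resets; the function names and the 15-second evaluation step are assumptions.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, counter_value) samples."""
    (t1, v1), (t2, v2) = samples[-2:]
    return (v2 - v1) / (t2 - t1)


def cpu_usage_percent(idle_rates):
    """100 - avg(idle rate) * 100, with one idle rate per CPU core of an instance."""
    return 100 - (sum(idle_rates) / len(idle_rates)) * 100


def should_fire(usage_series, threshold=80, for_seconds=300, step_seconds=15):
    """True once the newest evaluations have all exceeded the threshold for for_seconds."""
    needed = for_seconds // step_seconds + 1  # evaluations spanning the full window
    recent = usage_series[-needed:]
    return len(recent) == needed and all(u > threshold for u in recent)


if __name__ == "__main__":
    # The idle counter of one core grew by 1.875 s over 15 s of wall time.
    idle = irate([(0, 100.0), (15, 101.875)])
    usage = cpu_usage_percent([idle, idle])
    print(idle, usage, should_fire([usage] * 21))
```

A single spike above 80% therefore stays in the pending state and does not notify anyone until it has persisted for the full five minutes.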

2. Alert notifications

Grafana alert notification channel configuration:

json
{
  "name": "webhook",
  "type": "webhook",
  "settings": {
    "url": "http://alert-webhook:8080/notify",
    "httpMethod": "POST"
  }
}

V. Dashboards

1. System monitoring panel

json
{
  "dashboard": {
    "panels": [
      {
        "title": "CPU Usage",
        "type": "graph",
        "datasource": "VictoriaMetrics",
        "targets": [
          {
            "expr": "100 - (avg by (instance) (irate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)"
          }
        ]
      }
    ]
  }
}

2. Performance monitoring panel

json
{
  "dashboard": {
    "panels": [
      {
        "title": "Memory Usage",
        "type": "gauge",
        "datasource": "VictoriaMetrics",
        "targets": [
          {
            "expr": "(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100"
          }
        ]
      }
    ]
  }
}
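
Panel definitions like these become a dashboard once they are wrapped in the payload that Grafana's dashboard API (`POST /api/dashboards/db`) expects, which also needs a dashboard title. The sketch below builds such a payload for the CPU panel from the previous subsection; `wrap_dashboard` and the title "Server Overview" are hypothetical, and the result is only serialized, not sent.

```python
import json

CPU_PANEL = {
    "title": "CPU Usage",
    "type": "graph",
    "datasource": "VictoriaMetrics",
    "targets": [
        {
            "expr": '100 - (avg by (instance) '
                    '(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'
        }
    ],
}


def wrap_dashboard(title, panels, overwrite=False):
    """Wrap panels in the body shape used by POST /api/dashboards/db."""
    return {
        # id and uid are null so that Grafana creates a new dashboard.
        "dashboard": {"id": None, "uid": None, "title": title, "panels": panels},
        "overwrite": overwrite,
    }


if __name__ == "__main__":
    print(json.dumps(wrap_dashboard("Server Overview", [CPU_PANEL]), indent=2))
```

Generating dashboards from code in this way keeps them reviewable in version control, which is one reason to prefer it over hand-editing in the UI.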

VI. Performance Tuning

1. VictoriaMetrics tuning

bash
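# Tuned VictoriaMetrics startup flags
# (retentionPeriod is repeated here because it defaults to 1 month when omitted)
./victoria-metrics-prod \
  -storageDataPath=/data/victoria-metrics \
  -retentionPeriod=3 \
  -memory.allowedPercent=60 \
  -search.maxUniqueTimeseries=1000000 \
  -search.maxQueryDuration=30s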

2. Grafana performance tuning

ini
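# grafana.ini
[server]
http_port = 3000
root_url = https://grafana.example.com

[database]
type = mysql
host = localhost:3306
name = grafana
user = grafana
# Placeholder value; replace it with a strong password
password = password

# Recent Grafana versions no longer read the [session] section;
# a Redis-backed cache is configured through [remote_cache] instead.
[remote_cache]
type = redis
connstr = addr=127.0.0.1:6379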
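
Any `grafana.ini` setting can also be supplied through an environment variable named `GF_<SECTION>_<KEY>`, which avoids keeping the database password in the file. The helper below sketches that naming rule (upper-case, with `.` and `-` replaced by `_`); the function name `grafana_env_name` is hypothetical.

```python
import re


def grafana_env_name(section, key):
    """Return the environment variable that overrides [section] key in grafana.ini."""
    raw = f"GF_{section}_{key}".upper()
    # Dots and dashes are not valid in variable names; Grafana maps them to underscores.
    return re.sub(r"[.\-]", "_", raw)


if __name__ == "__main__":
    print(grafana_env_name("database", "password"))
    print(grafana_env_name("server", "root_url"))
```

For example, exporting the variable printed for `("database", "password")` before starting `grafana-server` takes precedence over the value in the file.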

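Best Practices

  1. Data collection
  • Set reasonable scrape intervals
  • Filter out metrics you do not need
  • Configure service discovery
  • Manage data retention
  2. Alerting
  • Set sensible thresholds
  • Define alert severity levels
  • Refine alert rules
  • Manage notification channels
  3. Visualization
  • Lay out panels sensibly
  • Optimize queries
  • Configure auto-refresh
  • Set up access control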

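This guide has walked through a complete setup for a Grafana + VictoriaMetrics monitoring system on a cloud server. Keep in mind that the configuration needs ongoing adjustment and tuning as your actual workloads change.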

Important notes:

  1. Update component versions regularly
  2. Monitor storage capacity
  3. Optimize query performance
  4. Back up your data
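
For the storage-capacity point, a back-of-the-envelope estimate can help size the disk before real usage data exists. The sketch below multiplies active series by samples per day and retention; the `bytes_per_sample` default of 1.0 is purely a placeholder assumption, and the real figure should be measured on your own data. The function name is hypothetical.

```python
def estimate_storage_bytes(active_series, scrape_interval_s, retention_days,
                           bytes_per_sample=1.0):
    """Rough disk estimate: series x samples per day x days x bytes per sample."""
    samples_per_series_per_day = 86400 / scrape_interval_s
    return active_series * samples_per_series_per_day * retention_days * bytes_per_sample


if __name__ == "__main__":
    # 100,000 series scraped every 15 s and kept for about 3 months (90 days).
    estimate = estimate_storage_bytes(100_000, 15, 90)
    print(f"{estimate / 1e9:.2f} GB")
```

Treat the result as an order-of-magnitude figure and leave headroom, since label churn and indexes add to the on-disk size.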

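For production monitoring systems, establish a solid backup process to keep monitoring data safe. Also watch the performance and resource usage of the monitoring stack itself so that it stays stable.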
