33 breaking releases

0.34.0	Aug 5, 2024
0.33.0	Apr 9, 2024
0.32.0	Feb 20, 2024
0.30.0	Nov 29, 2023
0.13.0	Nov 26, 2020


Elasticsearch exporter

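Prometheus Elasticsearch exporter capable of handling large clusters. Caution: enabling all metrics may overload the Prometheus server, because the exporter can export close to 1 million metrics. To avoid overloading the Prometheus server, run multiple Elasticsearch exporters, each targeting a small set of specific metrics.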

$ curl -s http://127.0.0.1:9222/metrics | wc
 940272 1887011 153668390

Try it out with Docker

$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT

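Features

Metric collection is decoupled from serving the /metrics page
Zero/empty metrics are skipped (controlled by the flag exporter_allow_zero_metrics)
Elasticsearch "millis" are converted to seconds
Elasticsearch "kilobytes" are converted to bytes
All time-based metrics are converted to f64 seconds, and the keyword millis is replaced with seconds
_bytes and _seconds suffixes are added
The metric tree namespace is preserved down to the last leaf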
- elasticsearch_cat_indices_pri_warmer_total_time_seconds_bucket
- elasticsearch_cat_health_unassign
- elasticsearch_nodes_info_jvm_mem_heap_max_in_bytes
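A custom namespace label, vin_cluster_version, is added to make it easy to compare metrics between cluster versions
Cluster metadata is refreshed automatically every 5 minutes
/_nodes/* APIs inject extra labels by mapping the node ID to the fetched cluster metadata
- name (node ID mapped to node name), namespace -> name
- version (Elasticsearch node version), namespace -> vin_cluster_version
- IP, namespace -> ip
Metrics are deleted automatically based on a lifetime setting; by default, a metric is deleted 600 seconds after it was last seen.
Metric names are normalized to snake case: colons are replaced with underscores and brackets are replaced with colons (:). As the examples show, slashes also become underscores.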
- "transport_actions_cluster:monitor/nodes/info[n]_requests_count" -> "transport_actions_cluster_monitor_nodes_info:n:_requests_count"
- "transport_actions_internal:cluster/coordination/join/ping_requests_count" -> "transport_actions_internal_cluster_coordination_join_ping_requests_count"
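
The renaming rules above can be sketched in Rust as follows. This is an illustrative sketch, not the exporter's actual code: the function name normalize_metric_name is hypothetical, the slash-to-underscore rule is inferred from the two examples above, and the camelCase handling is an assumption about what "snake case" covers.

```rust
/// Hypothetical helper that applies the renaming rules described above.
/// Assumptions: '/' is treated like ':' (inferred from the examples), and
/// camelCase boundaries are split with an underscore.
fn normalize_metric_name(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            // Brackets become colons: info[n] -> info:n:
            '[' | ']' => out.push(':'),
            // Colons and slashes become underscores.
            ':' | '/' => out.push('_'),
            // Uppercase letters are lowercased; an underscore is inserted
            // when the previous character is a lowercase letter or a digit.
            c if c.is_ascii_uppercase() => {
                let split = out
                    .chars()
                    .next_back()
                    .map_or(false, |p| p.is_ascii_lowercase() || p.is_ascii_digit());
                if split {
                    out.push('_');
                }
                out.push(c.to_ascii_lowercase());
            }
            // Everything else is copied unchanged.
            c => out.push(c),
        }
    }
    out
}
```

Calling normalize_metric_name on the two raw names listed above yields the two normalized names shown next to them.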

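Options

Configurable labels to "skip" and/or "include" (flags: exporter_include_labels, exporter_skip_labels)
Configurable metrics to skip (flag exporter_skip_metrics)
Configurable global timeout (flag elasticsearch_global_timeout)
Configurable global polling interval (flag exporter_poll_default_interval)
Configurable per-metric polling interval (flag exporter_poll_intervals)
Configurable metrics collection (flag exporter_metrics_enabled)
Configurable metrics namespace (flag exporter_metrics_namespace): metrics are prefixed with the custom namespace instead of elasticsearch
Configurable metadata collection (flag exporter_metadata_refresh_interval)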
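
How a global polling interval and per-subsystem intervals could interact is sketched below. The PollSettings type and its interval_for method are hypothetical, and the rule that a per-subsystem entry overrides the global default is an assumption, not something the exporter documents here.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical container mirroring exporter_poll_default_interval and
/// exporter_poll_intervals.
struct PollSettings {
    default_interval: Duration,
    per_subsystem: HashMap<String, Duration>,
}

impl PollSettings {
    /// Returns the subsystem-specific interval when one is configured,
    /// otherwise falls back to the global default (assumed behaviour).
    fn interval_for(&self, subsystem: &str) -> Duration {
        self.per_subsystem
            .get(subsystem)
            .copied()
            .unwrap_or(self.default_interval)
    }
}
```

With a 15s default and a 5s entry for cluster_health, interval_for("cluster_health") gives 5s and any other subsystem gives 15s.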

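TLS validation

The certificate path is set with the flag --elasticsearch_certificate_path=CERTIFICATE_PATH.

TLS validation is enabled by default and needs no configuration. By default the certificate is validated: this verifies that the certificate provided by the server is signed by a trusted certificate authority (CA), and also that the server's hostname (or IP address) matches the names identified by the Common Name (CN) or Subject Alternative Name (SAN) in the certificate.

No validation

The certificate provided by the server is not validated.

Pass the flag --elasticsearch_certificate_validation=none

Full validation

Validates the certificate fully: this verifies that the certificate provided by the server is signed by a trusted certificate authority (CA), and also that the server's hostname (or IP address) matches the names identified by the Common Name (CN) or Subject Alternative Name (SAN) in the certificate.

Pass the flag --elasticsearch_certificate_validation=full

Partial validation

Verifies that the certificate provided by the server is signed by a trusted certificate authority (CA), but does not perform hostname verification.

Pass the flag --elasticsearch_certificate_validation=partial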
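
The three modes differ only in which checks they perform, which can be summarized in a small Rust sketch. The CertificateValidation enum and its methods are hypothetical names used for illustration; only the flag values none, full and partial and their described behaviour come from the text above.

```rust
use std::str::FromStr;

/// Hypothetical model of --elasticsearch_certificate_validation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CertificateValidation {
    Full,
    Partial,
    Disabled, // corresponds to the flag value "none"
}

impl Default for CertificateValidation {
    /// Validation is on by default, checking both CA trust and hostname.
    fn default() -> Self {
        CertificateValidation::Full
    }
}

impl FromStr for CertificateValidation {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "full" => Ok(CertificateValidation::Full),
            "partial" => Ok(CertificateValidation::Partial),
            "none" => Ok(CertificateValidation::Disabled),
            other => Err(format!("unknown validation mode: {}", other)),
        }
    }
}

impl CertificateValidation {
    /// True when the certificate must be signed by a trusted CA.
    fn verifies_chain(self) -> bool {
        !matches!(self, CertificateValidation::Disabled)
    }

    /// True when the hostname (or IP) must match the CN or SAN.
    fn verifies_hostname(self) -> bool {
        matches!(self, CertificateValidation::Full)
    }
}
```

In this model, partial keeps the CA check and drops the hostname check, while none drops both.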

Usage cheat sheet

Scrape /_nodes/stats subsystem metrics for the thread_pool path

$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="nodes_stats=true" --elasticsearch_path_parameters="nodes_stats=thread_pool"

Scrape /_nodes/stats subsystem metrics for the thread_pool and fs paths

$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="nodes_stats=true" --elasticsearch_path_parameters="nodes_stats=thread_pool,fs"

Scrape only the total.indexing and total.search metrics from /_stats

$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="stats=true" --elasticsearch_query_filter_path="stats=indices.*.total.indexing,indices.*.total.search"

Scrape only the search.fetch* metrics from /_cat/shards. In this case elasticsearch_query_filter_path must always include index,shard, and dot notation is not supported. Example:

$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="cat_shards=true" --elasticsearch_query_filter_path="cat_shards=index,shard,search*fetch*"
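
To see which column names a pattern such as search*fetch* would select, a minimal wildcard matcher can be sketched in Rust. This is only an illustration of simple `*` globbing: glob_match is a hypothetical helper, and the exact wildcard semantics applied by the exporter and by Elasticsearch are not confirmed here.

```rust
/// Hypothetical helper: returns true when `text` matches `pattern`,
/// where '*' in the pattern matches any run of bytes (including none).
fn glob_match(pattern: &str, text: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    let (mut pi, mut ti) = (0usize, 0usize);
    let mut star: Option<usize> = None; // index of the most recent '*'
    let mut mark = 0usize; // text index where that '*' started matching

    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            // Remember the star and first try matching it against nothing.
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            // Literal byte matches; advance both cursors.
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Mismatch: let the last star absorb one more byte and retry.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing stars can match the empty remainder.
    while pi < p.len() && p[pi] == b'*' {
        pi += 1;
    }
    pi == p.len()
}
```

Under this sketch, search*fetch* matches a sample column name such as search.fetch_total but not search.query_total, while index and shard match only themselves.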

$ curl -s http://127.0.0.1:9222
Proper Elasticsearch exporter

Available /_cat subsystems:
 - cat_allocation
 - cat_shards
 - cat_indices
 - cat_segments
 - cat_nodes
 - cat_recovery
 - cat_health
 - cat_pending_tasks
 - cat_aliases
 - cat_thread_pool
 - cat_plugins
 - cat_fielddata
 - cat_nodeattrs
 - cat_repositories
 - cat_templates
 - cat_transforms
Available /_cluster subsystems:
 - cluster_health
Available /_nodes subsystems:
 - nodes_usage
 - nodes_stats
 - nodes_info
Available /_stats subsystems:
 - stats

Exporter settings:
elasticsearch_url: http://127.0.0.1:9200
elasticsearch_global_timeout: 30s
elasticsearch_query_fields:
elasticsearch_subsystem_timeouts:
 - nodes_stats: 15s
elasticsearch_path_parameters:
 - nodes_info: http,jvm,thread_pool
 - nodes_stats: breaker,indices,jvm,os,process,transport,thread_pool
exporter_skip_labels:
 - cat_allocation: health,status
 - cat_fielddata: id
 - cat_indices: health,status
 - cat_nodeattrs: id
 - cat_nodes: health,status,pid
 - cat_plugins: id,description
 - cat_segments: health,status,checkpoint,prirep
 - cat_shards: health,status,checkpoint,prirep
 - cat_templates: composed_of
 - cat_thread_pool: node_id,ephemeral_node_id,pid
 - cat_transforms: health,status
 - cluster_stats: segment,patterns
exporter_include_labels:
 - cat_aliases: index,alias
 - cat_allocation: node
 - cat_fielddata: node,field
 - cat_health: shards
 - cat_indices: index
 - cat_nodeattrs: node,attr
 - cat_nodes: ip,name,node_role
 - cat_pending_tasks: index
 - cat_plugins: name
 - cat_recovery: index,shard,stage,type
 - cat_repositories: index
 - cat_segments: index,shard
 - cat_shards: index,node,shard
 - cat_templates: name,index_patterns
 - cat_thread_pool: node_name,name,type
 - cat_transforms: index
 - cluster_health: status
 - nodes_info: name
 - nodes_stats: name
 - nodes_usage: name
 - stats: index
exporter_skip_metrics:
 - cat_aliases: filter,routing_index,routing_search,is_write_index
 - cat_nodeattrs: pid
 - cat_recovery: start_time,start_time_millis,stop_time,stop_time_millis
 - cat_templates: order
 - nodes_usage: _nodes_total,_nodes_successful,since
exporter_poll_default_interval: 15s
exporter_poll_intervals:
 - cluster_health: 5s
exporter_skip_zero_metrics: true
exporter_metrics_enabled:
 - cat_health: true
 - cat_indices: true
 - nodes_info: true
 - nodes_stats: true
exporter_metrics_namespace: elasticsearch
exporter_metadata_refresh_interval: 180s
exporter_metrics_lifetime_default_interval: 15s
exporter_metrics_lifetime_interval:
 - cat_indices: 180s
 - cat_nodes: 60s
 - cat_recovery: 60s

Self-exported metrics

# HELP elasticsearch_subsystem_request_duration_seconds The Elasticsearch subsystem request latencies in seconds.
# TYPE elasticsearch_subsystem_request_duration_seconds histogram
elasticsearch_subsystem_request_duration_seconds_bucket{cluster="devnull",subsystem="/_nodes/os",le="0.005"} 0
elasticsearch_subsystem_request_duration_seconds_sum{cluster="devnull",subsystem="/nodes_stats"} 0.130069193
elasticsearch_subsystem_request_duration_seconds_count{cluster="devnull",subsystem="/nodes_stats"} 1
# HELP http_request_duration_seconds The HTTP request latencies in seconds.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{handler="/metrics",le="0.005"} 1
http_request_duration_seconds_sum{handler="/metrics"} 0.004372555
http_request_duration_seconds_count{handler="/metrics"} 1
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.24
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 16
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 25006080
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1605894185.46
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1345773568
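
The histogram series above can be turned into a mean latency by dividing _sum by _count. The Rust sketch below does this on raw exposition text; sample_value and mean_latency_seconds are hypothetical helpers, and the sketch assumes samples carry no trailing timestamp and that the label set is written exactly as it appears in the output.

```rust
/// Returns the value of the first non-comment line that starts with
/// `series_prefix` (metric name plus its exact label block).
fn sample_value(exposition: &str, series_prefix: &str) -> Option<f64> {
    exposition
        .lines()
        .filter(|line| !line.starts_with('#'))
        .find(|line| line.starts_with(series_prefix))
        // The value is assumed to be the last whitespace-separated token.
        .and_then(|line| line.split_whitespace().last())
        .and_then(|value| value.parse::<f64>().ok())
}

/// Mean latency in seconds for a histogram: `<metric>_sum / <metric>_count`.
/// Returns None when either series is missing or the count is zero.
fn mean_latency_seconds(exposition: &str, metric: &str, labels: &str) -> Option<f64> {
    let sum = sample_value(exposition, &format!("{}_sum{}", metric, labels))?;
    let count = sample_value(exposition, &format!("{}_count{}", metric, labels))?;
    if count > 0.0 {
        Some(sum / count)
    } else {
        None
    }
}
```

For the http_request_duration_seconds series shown above, with a sum of 0.004372555 over 1 request, the mean equals the sum.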

Debugging

Levels: info,warn,error,debug,trace

Debug HTTP requests

export RUST_LOG=info,reqwest=debug

Trace everything

export RUST_LOG=trace

Development

Run

cargo run --bin elasticsearch_exporter

Test

cargo test

License

MIT
