Prometheus HTTP API 查询

在 Prometheus 服务器的 /api/v1 下可以访问当前稳定的 HTTP API。新添加的 API 接口在如果没有任务破坏性，也会加在这个 endpoint 。

接口格式

Prometheus 的 API 响应格式是 JSON，每一个成功的 API 请求都会返回 2xx 的状态码。

针对 Prometheus 的 API 的无效请求会返回一个 JSON 的错误对象，并且会返回下列 HTTP 响应码：

400 Bad Request ：一般情况下是参数丢失或者错误造成的。
422 Unprocessable Entity ：一般情况下是表达式无法执行。
503 Service Unavailable ：一般情况下是请求超时或者被中止了。

当请求没有到达 API 就发生错误的时候，会返回其他非 2xx 的状态码。

如果请求中存在不妨碍请求执行的错误， Prometheus 可能返回一组警告。所有请求成功的数据将在对应的数据字段中返回。

JSON 响应的返回格式如下：

{
  "status": "success" | "error",
  "data": <data>,

  // 仅当 status 为 error 时设置。数据字段仍然可以保存额外的数据。
  "errorType": "<string>",
  "error": "<string>",

  // 仅当执行请求时有警告时，数据字段中仍然有数据.

  "warnings": ["<string>"]
}

通用的占位符定义如下：

: 一般输入的时间戳以 RFC3339 的格式来进行提供，或者是一个单位是秒的 Unix 时间戳，可选的小数位数表示秒级精度。输出时间戳是 Unix时间戳，单位为秒。
: Prometheus 时间序列选择器像http_requests_total 或 http_requests_total{方法=~"(GET|POST)}，需要 URL-encoded。
: Prometheus 的时间字符。比如 5m 表示 5 分钟。
: 布尔值， true 或者 false 。

表达式查询

查询语言表达式可以在计算瞬间向量，也可以在计算范围向量。下面的小节描述每种类型的表达式查询的API端点。

瞬时查询

使用 GET 或者 POST 访问下列 API 接口，可以在单个时间点计算瞬时查询:

GET /api/v1/query
POST /api/v1/query

URL 查询参数：

query= : Prometheus 查询表达式。
time= : 时间戳，可选的一个选项，不填的话查询当前时间的值，支持 UNIX 时间戳和 RFC 3339 格式的时间描述。
timeout= : 超时时间，是一个可选的选项。默认使用 -query.timeout 指定的值。

在请求 API 接口时可以使用 POST 方法并且 header 头可以使用 Content-Type: application/x-www-form-urlencoded , 当大查询中包含违反服务端 URL 字符限制时比较有用。

查询结果的数据部分格式如下:

{
  "resultType": "matrix" | "vector" | "scalar" | "string",
  "result": <value>
}

<value> 是查询结果，根据查询的类型不同，格式也不一样。

下面这个示例会查询 up 指标在 2021-11-08T23:10:51.781Z 时间的值。

$ curl 'http://localhost:9090/api/v1/query?query=up&time=2021-11-08T23:10:51.781Z'
{
   "status" : "success",
   "data" : {
      "resultType" : "vector",
      "result" : [
         {
            "metric" : {
               "__name__" : "up",
               "job" : "prometheus",
               "instance" : "localhost:9090"
            },
            "value": [ 1636384171.781, "1" ]
         },
         {
            "metric" : {
               "__name__" : "up",
               "job" : "node",
               "instance" : "localhost:9100"
            },
            "value" : [ 1636384171.781, "0" ]
         }
      ]
   }
}

范围时间查询

使用 GET 或者 POST 访问下列 API 接口，可以在一个时间段内计算查询表达式：

GET /api/v1/query_range
POST /api/v1/query_range

URL 查询参数：

query= : Prometheus 查询表达式
start= : 开始时间戳
end= : 结束时间戳
step= : 步长。简单理解就是查询的这段时间中两个数据点的间隔是多长时间，比如每分钟一个点，或者每十秒一个点，这就是步长。格式是 1m 、10s 这种。
timeout= : 超时时间，是一个可选的选项。如果不指定，默认使用 -query.timeout 指定的值。

和进行瞬时查询一样，在请求 API 接口时可以使用 POST 方法并且 header 头可以使用 Content-Type: application/x-www-form-urlencoded , 当大查询中包含违反服务端 URL 字符限制时比较有用。

查询结果的数据部分格式如下:

{
  "resultType": "matrix",
  "result": <value>
}

<value> 是查询结果，

下面的示例在 30 秒范围内查询表达式 up 的值，查询步长为15秒。

$ curl 'http://localhost:9090/api/v1/query_range?query=up&start=2021-11-08T23:10:51.781Z&end=2021-11-08T23:10:51.781Z&step=15s'
{
   "status" : "success",
   "data" : {
      "resultType" : "matrix",
      "result" : [
         {
            "metric" : {
               "__name__" : "up",
               "job" : "prometheus",
               "instance" : "localhost:9090"
            },
            "values" : [
               [ 1636384171.781, "1" ],
               [ 1636384186.781, "1" ],
               [ 1636384201.781, "1" ]
            ]
         },
         {
            "metric" : {
               "__name__" : "up",
               "job" : "node",
               "instance" : "localhost:9091"
            },
            "values" : [
               [ 1636384171.781, "0" ],
               [ 1636384186.781, "0" ],
               [ 1636384201.781, "1" ]
            ]
         }
      ]
   }
}

查询元数据

Prometheus 提供了一组 API 接口来查询关于 series 及其 label 的元数据。

通过 Label 查找 Series

个人理解，不一定对。Label 是 Metric 的一个标签，一个 Metric 的所有 Label 构成了这个 Metric 的 Series。

下面的接口会返回匹配 Label 的时间序列列表

GET /api/v1/series
POST /api/v1/series

URL 查询参数如下：

match[]= : 要查询的指标的 Series，要至少提供一个match[] 参数。
start= : 开始时间戳
end= : 结束时间戳

查询结果的数据部分由一个对象列表组成，其中包含标识每个 series 的 label 名称和 label值，以成对的方式出现。

下面的例子会返回 up 和 process_start_time_seconds{job="prometheus"} 所有匹配的 Series :

$ curl -g 'http://localhost:9090/api/v1/series?' --data-urlencode 'match[]=up' --data-urlencode 'match[]=process_start_time_seconds{job="prometheus"}'
{
   "status" : "success",
   "data" : [
      {
         "__name__" : "up",
         "job" : "prometheus",
         "instance" : "localhost:9090"
      },
      {
         "__name__" : "up",
         "job" : "node",
         "instance" : "localhost:9091"
      },
      {
         "__name__" : "process_start_time_seconds",
         "job" : "prometheus",
         "instance" : "localhost:9090"
      }
   ]
}

获取 label 名称

这个接口会返回一个 Label 名称的列表：

GET /api/v1/labels
POST /api/v1/labels

URL 查询参数如下：

match[]= : 要查询的指标的 Series，可选。
start= : 开始时间戳，可选。
end= : 结束时间戳，可选。

这个接口最后返回的是一个 JSON 格式，是一个字符类型的 Label 名称列表。

我们来看一个例子：

$ curl 'localhost:9090/api/v1/labels'
{
    "status": "success",
    "data": [
        "__name__",
        "call",
        "code",
        "config",
        "dialer_name",
        "endpoint",
        "event",
        "goversion",
        "handler",
        "instance",
        "interval",
        "job",
        "le",
        "listener_name",
        "name",
        "quantile",
        "reason",
        "role",
        "scrape_job",
        "slice",
        "version"
    ]
}

获取 label 的值

这个接口可以获取指定 Label 名称的值：

GET /api/v1/label/<label_name>/values

URL 查询参数如下：

match[]= : 要查询的指标的 Series，可选。
start= : 开始时间戳，可选。
end= : 结束时间戳，可选。

这个接口最后返回的是一个 JSON 格式，是一个字符类型的 Label 名称列表。

我们来看一个例子，我们指定一个 Label 名称 job，

$ curl http://localhost:9090/api/v1/label/job/values
{
   "status" : "success",
   "data" : [
      "node",
      "prometheus"
   ]
}

查询 Exemplar

这是一个实验性的接口，未来可能会改变，当前版本是 v2.31.0。这个接口需要使用 --enable-feature=exemplar-storage 来进行开启，不开启是无法使用的。

Exemplars 这个接口包含 seriesLabels 、exemplars 两个字段，

seriesLabels 是一个 Label 键值对的列表，列举了当前 Metric 的所有 Label。

exemplars 是列举了存储的 traceID= 和他的值。

后边我们专门找个时间聊聊这些实验特性。

以下接口返回特定时间范围内有效 PromQL 查询的示例列表:

GET /api/v1/query_exemplars
POST /api/v1/query_exemplars

URL 查询参数如下：

query= : Prometheus 表达式查询语句。
start= : 开始时间戳。
end= : 结束时间戳。

我们来看个例子，指定 test_exemplar_metric_total 这个 Metric 去查询

$ curl -g 'http://localhost:9090/api/v1/query_exemplars?query=test_exemplar_metric_total&start=2020-09-14T15:22:25.479Z&end=2020-09-14T15:23:25.479Z'
{
    "status": "success",
    "data": [
        {
            "seriesLabels": {
                "__name__": "test_exemplar_metric_total",
                "instance": "localhost:8090",
                "job": "prometheus",
                "service": "bar"
            },
            "exemplars": [
                {
                    "labels": {
                        "traceID": "EpTxMJ40fUus7aGY"
                    },
                    "value": "6",
                    "timestamp": 1600096945.479,
                }
            ]
        },
        {
            "seriesLabels": {
                "__name__": "test_exemplar_metric_total",
                "instance": "localhost:8090",
                "job": "prometheus",
                "service": "foo"
            },
            "exemplars": [
                {
                    "labels": {
                        "traceID": "Olp9XHlq763ccsfa"
                    },
                    "value": "19",
                    "timestamp": 1600096955.479,
                },
                {
                    "labels": {
                        "traceID": "hCtjygkIHwAN9vs4"
                    },
                    "value": "20",
                    "timestamp": 1600096965.489,
                },
            ]
        }
    ]
}

Target 查询结果格式

Target

这个接口会返回 Prometheus Target 发现的当前状态。

GET /api/v1/targets

默认情况下 activeTargets 和 droppedTargets 都是请求响应的一部分。

discoveredLabels 表示在进行 Label 重新标记发生之前，服务发现找到的未修改的原始 Label。

labels 表示 Label 重新标记后的结果。

$ curl http://localhost:9090/api/v1/targets
{
  "status": "success",
  "data": {
    "activeTargets": [
      {
        "discoveredLabels": {
          "__address__": "127.0.0.1:9090",
          "__metrics_path__": "/metrics",
          "__scheme__": "http",
          "job": "prometheus"
        },
        "labels": {
          "instance": "127.0.0.1:9090",
          "job": "prometheus"
        },
        "scrapePool": "prometheus",
        "scrapeUrl": "http://127.0.0.1:9090/metrics",
        "globalUrl": "http://example-prometheus:9090/metrics",
        "lastError": "",
        "lastScrape": "2017-01-17T15:07:44.723715405+01:00",
        "lastScrapeDuration": 0.050688943,
        "health": "up",
        "scrapeInterval": "1m",
        "scrapeTimeout": "10s"
      }
    ],
    "droppedTargets": [
      {
        "discoveredLabels": {
          "__address__": "127.0.0.1:9100",
          "__metrics_path__": "/metrics",
          "__scheme__": "http",
          "__scrape_interval__": "1m",
          "__scrape_timeout__": "10s",
          "job": "node"
        },
      }
    ]
  }
}

对于上边的这个查询，可以添加一个查询参数 state ，这个参数可以指定对应的 Target 状态，比如 state=active, state=dropped, state=any 。当然，这是个可选参数，不是必须的。

注意，对于过滤掉的目标，仍然返回一个空数组，里边的值会丢掉。比如：

$ curl 'http://localhost:9090/api/v1/targets?state=active'
{
  "status": "success",
  "data": {
    "activeTargets": [
      {
        "discoveredLabels": {
          "__address__": "127.0.0.1:9090",
          "__metrics_path__": "/metrics",
          "__scheme__": "http",
          "job": "prometheus"
        },
        "labels": {
          "instance": "127.0.0.1:9090",
          "job": "prometheus"
        },
        "scrapePool": "prometheus",
        "scrapeUrl": "http://127.0.0.1:9090/metrics",
        "globalUrl": "http://example-prometheus:9090/metrics",
        "lastError": "",
        "lastScrape": "2017-01-17T15:07:44.723715405+01:00",
        "lastScrapeDuration": 50688943,
        "health": "up"
      }
    ],
    "droppedTargets": []
  }
}

查询结果格式

我们通过表达式对 Prometheus 进行查询，会将返回的值放在结果的 data 部分。

<sample_value> 是数值的例子。

JSON 不支持特殊浮点值，比如 NaN 、Inf 、-Inf ，所以样本值是作为引用的 JSON 字符串而不是原始数字传输的。

范围向量

范围向量返回的结果类型是 matrix 。对应的result 属性的格式如下:

[
  {
    "metric": { "<label_name>": "<label_value>", ... },
    "values": [ [ <unix_time>, "<sample_value>" ], ... ]
  },
  ...
]

瞬时向量

瞬时向量返回的结果类型是 vector 。对应的result 属性的格式如下:

[
  {
    "metric": { "<label_name>": "<label_value>", ... },
    "value": [ <unix_time>, "<sample_value>" ]
  },
  ...
]

标量

标量返回的结果类型是 scalar 。对应的result 属性的格式如下:

[ <unix_time>, "<scalar_value>" ]

字符串

字符串返回的结果类型是 string 。对应的result 属性的格式如下:

[ <unix_time>, "<string_value>" ]

Alertmanagers

下面这个接口会返回 Prometheus 当前的告警管理状态。我想了一阵子，暂时没想到这个接口的应用场景，大家有找到这个接口的使用场景可以告诉我一下。

GET /api/v1/alertmanagers

在响应里会包含活跃的告警管理和被遗弃的告警管理。

$ curl http://localhost:9090/api/v1/alertmanagers
{
  "status": "success",
  "data": {
    "activeAlertmanagers": [
      {
        "url": "http://127.0.0.1:9090/api/v1/alerts"
      }
    ],
    "droppedAlertmanagers": [
      {
        "url": "http://127.0.0.1:9093/api/v1/alerts"
      }
    ]
  }
}

Rules

Prometheus 的 /rules 接口会返回当前加载的已经触发的告警规则和记录的告警规则列表。此外，它还返回由每个警报规则的 Prometheus 实例触发的当前处于 active 状态的告警。

/rules 是一个新的接口，当然和总体的 API v1 接口的稳定性相比还差一些。

GET /api/v1/rules

对这个接口进行访问的时候可以使用一个type 参数可以对结果进行过滤，这个参数有两个值，当 type=alert 时只返回告警规则，当type=record时，只返回记录规则。当没有参数或者参数为空的时候，不对结果进行过滤。

$ curl http://localhost:9090/api/v1/rules

{
    "data": {
        "groups": [
            {
                "rules": [
                    {
                        "alerts": [
                            {
                                "activeAt": "2018-07-04T20:27:12.60602144+02:00",
                                "annotations": {
                                    "summary": "High request latency"
                                },
                                "labels": {
                                    "alertname": "HighRequestLatency",
                                    "severity": "page"
                                },
                                "state": "firing",
                                "value": "1e+00"
                            }
                        ],
                        "annotations": {
                            "summary": "High request latency"
                        },
                        "duration": 600,
                        "health": "ok",
                        "labels": {
                            "severity": "page"
                        },
                        "name": "HighRequestLatency",
                        "query": "job:request_latency_seconds:mean5m{job=\"myjob\"} > 0.5",
                        "type": "alerting"
                    },
                    {
                        "health": "ok",
                        "name": "job:http_inprogress_requests:sum",
                        "query": "sum by (job) (http_inprogress_requests)",
                        "type": "recording"
                    }
                ],
                "file": "/rules.yaml",
                "interval": 60,
                "name": "example"
            }
        ]
    },
    "status": "success"
}

Alerts

/alerts 接口会返回一个活跃的告警列表。

和 /rules 一样是一个新的接口，稳定性还在进一步验证当中。

GET /api/v1/alerts

我们来看一个例子。

$ curl http://localhost:9090/api/v1/alerts

{
    "data": {
        "alerts": [
            {
                "activeAt": "2018-07-04T20:27:12.60602144+02:00",
                "annotations": {},
                "labels": {
                    "alertname": "my-alert"
                },
                "state": "firing",
                "value": "1e+00"
            }
        ]
    },
    "status": "success"
}

查询 Target 和 Metric 元数据

查询 Target 元数据

这个接口会返回 Metric 当前可以获取到的所有 Target 的元数据。这也是一个实验性的接口。

GET /api/v1/targets/metadata

URL query parameters:

match_target=: Label 选项，会匹配指定好的 Label，如果为空会返回所有。
metric=: 指定要查询的 Metric 名字，如果不指定会返回所有，建议指定，不然太恐怖。
limit=: 一个限制，最大要返回的 Target 的数量。

curl -G http://localhost:9091/api/v1/targets/metadata \
    --data-urlencode 'metric=go_goroutines' \
    --data-urlencode 'match_target={job="prometheus"}' \
    --data-urlencode 'limit=2'
{
  "status": "success",
  "data": [
    {
      "target": {
        "instance": "127.0.0.1:9090",
        "job": "prometheus"
      },
      "type": "gauge",
      "help": "Number of goroutines that currently exist.",
      "unit": ""
    },
    {
      "target": {
        "instance": "127.0.0.1:9091",
        "job": "prometheus"
      },
      "type": "gauge",
      "help": "Number of goroutines that currently exist.",
      "unit": ""
    }
  ]
}

下面这个例子会返回 Target 中所有包含 instance="127.0.0.1:9090 这个 Label 的 Metric .

curl -G http://localhost:9091/api/v1/targets/metadata \
    --data-urlencode 'match_target={instance="127.0.0.1:9090"}'
{
  "status": "success",
  "data": [
    // ...
    {
      "target": {
        "instance": "127.0.0.1:9090",
        "job": "prometheus"
      },
      "metric": "prometheus_treecache_zookeeper_failures_total",
      "type": "counter",
      "help": "The total number of ZooKeeper failures.",
      "unit": ""
    },
    {
      "target": {
        "instance": "127.0.0.1:9090",
        "job": "prometheus"
      },
      "metric": "prometheus_tsdb_reloads_total",
      "type": "counter",
      "help": "Number of times the database reloaded block data from disk.",
      "unit": ""
    },
    // ...
  ]
}

查询 Metric 元数据

这个接口会返回 Metric 的元数据，比如 HELP 、TYPE、UNIT 等信息。如果有缺失的会留空，这也是个实验性的接口。不过看起来比较有用，比如在展示页面中展示某个指标的时候可以把他的 HELP 信息拿出来直接展示。

GET /api/v1/metadata

URL query parameters:

limit=: 要返回的 Metric 的最大数量.
metric=: 要查询的 Metric 名字，如果为空返回所有.

下面我们来看一个例子，由于只指定了 limit 参数，所有按照这个数量进行过滤，获取到了 cortex_ring_tokens http_requests_total 两个 Metric 的元数据。http_requests_total 有多个对象，不太清楚为什么是一样的两个数据。每个指标会至少有一个 HELP 信息，因为 Prometheus 规定了


curl -G http://localhost:9090/api/v1/metadata?limit=2

{
  "status": "success",
  "data": {
    "cortex_ring_tokens": [
      {
        "type": "gauge",
        "help": "Number of tokens in the ring",
        "unit": ""
      }
    ],
    "http_requests_total": [
      {
        "type": "counter",
        "help": "Number of HTTP requests",
        "unit": ""
      },
      {
        "type": "counter",
        "help": "Amount of HTTP requests",
        "unit": ""
      }
    ]
  }
}

下面这个例子会返回 Metric http_requests_total 的元数据。

curl -G http://localhost:9090/api/v1/metadata?metric=http_requests_total

{
  "status": "success",
  "data": {
    "http_requests_total": [
      {
        "type": "counter",
        "help": "Number of HTTP requests",
        "unit": ""
      },
      {
        "type": "counter",
        "help": "Amount of HTTP requests",
        "unit": ""
      }
    ]
  }
}

6.6 HTTP API 查询

Prometheus HTTP API 查询

接口格式

表达式查询

瞬时查询

范围时间查询

查询元数据

通过 Label 查找 Series

获取 label 名称

获取 label 的值

查询 Exemplar

Target 查询结果格式

Target

查询结果格式

范围向量

瞬时向量

标量

字符串

Alertmanagers

Rules

Alerts

查询 Target 和 Metric 元数据

查询 Target 元数据

查询 Metric 元数据

results matching ""

No results matching ""