Logo

dev-resources.site

for different kinds of informations.

How to Configure a Remote Data Store for Prometheus

Published at
12/21/2024
Categories
prometheus
monitoring
devops
sre
Author
Hrish B
Categories
4 categories in total
prometheus
open
monitoring
open
devops
open
sre
open
How to Configure a Remote Data Store for Prometheus

Introduction

The Prometheus monitoring tool can store its metrics either locally or remotely. You can configure a remote data store using the remote_write configuration. This article describes the various data store options available as well as how to set up a remote store.

Overview of Remote Storage

By default, Prometheus stores data locally wherever it is installed. The data directory can be configured by using the --storage.tsdb.path command line option when starting Prometheus.
In practice you can use a separate disk for higher performance attached to the machine where Prometheus is running.

However, this may not be possible or optimal in all situations as you might want a data store that is more suited for time series data, and has larger storage capabilities for higher data retention. Prometheus would usually run in a standalone VM or a Kubernetes pod or a Docker container, and it would not have access to such data stores by default.

A remote store can add these capabilities to Prometheus. The remote storage option can be set by using the remote_write key in the Prometheus configuration YAML file.

  • Introduction
  • Overview of Remote Storage
  • Remote Store Architecture
  • Remote Store Configuration
    • Basic Syntax
    • Security and Authentication
    • Remote Write Protocol Configuration
    • Network Configuration
    • Metrics Configuration
    • Queue Configuration
  • Remote Storage Options
  • Troubleshooting
    • Prometheus failing to write to the remote storage
    • Network connectivity between Prometheus and the remote store
    • If there is a proxy in between, it might be dropping packets or might not be running
    • Requests are timing out due to network issues
    • Requests are timing out due to the remote store not being able to keep up
  • Best Practices
  • Conclusion
  • FAQ

Remote Store Architecture

Prometheus remote write architecture

Remote Store Configuration

Basic Syntax

A very simple configuration for a remote store that accepts unauthenticated connections would look like this:

remote_write:
- url: "http://192.168.23.4/api/v1/write"
  name: "production-metrics"

You can have multiple remote_write sections in the same Prometheus configuration.

Based on your requirements and the features supported by the remote write server you can configure other options. Let us look at them one by one.

Security and Authentication

To protect your metrics data in transit whether it is traveling via your internal network or through the internet, you can enable both TLS as well as authentication. The remote store server should
support these options.

# Remote write configuration for Prometheus
remote_write:
- url: "https://prometheus-data-store.mydb.io/api/v1/write"
  name: "production-metrics"

  headers:
    Authorization: "Bearer <token>"

  basic_auth:
    username: "prometheus"
    password: "secret-password"

  tls_config:
    insecure_skip_verify: false
    ca_file: "/path/to/ca.pem"
    cert_file: "/path/to/cert.pem"
    key_file: "/path/to/key.pem"

This sample configuration does the following:

  • Adds a Bearer token for authentication, as well as basic auth options. In practice you would use only one of these.
  • Adds a tls_config assuming you have a custom CA which has issued the certificates for the remote store's server. If it's a certificate issued by a well-known CA, you would not have to configure this. This option would come in handy when you have a private CA.

You can also create a separate authorization section for more options while setting the Authorization header. Note that the options below are mutually exclusive - the example is only for illustration.

# Example 1: Default Bearer type with direct credentials
authorization:
  type: Bearer
  credentials: "eyJhbGciOiJIPoI1NiIsInR5cCI6IkpXVCJ9..."

# Example 2: Bearer type with credentials from file. This is mutually exclusive with credentials_file
authorization:
  type: Bearer
  credentials_file: "/etc/prometheus/token.txt"

# Example 3: Custom type with direct credentials
authorization:
  type: CustomAuth
  credentials: "secret-token-123"

Remote Write Protocol Configuration

As of this writing, the remote write specification is undergoing a change.
You probably don't have to worry about this section unless you are optimizing for very specific cases. You can configure the protobuf_message object that Prometheus uses when sending metrics.
This depends on what your remote store server supports.

remote_write:
- url: "http://192.168.23.4/api/v1/write"
  name: "production-metrics"

  protobuf_message: prometheus.WriteRequest

Network Configuration

Based on the properties of your remote store server, you can tune some functional settings.

The remote_timeout key sets the timeout for requests to the remote write endpoint. The default value is 30s. You would not need to set this unless you have a noisy network, or there are shorter timeouts in the network path between your Prometheus server and the remote store server.

If your remote store is behind a proxy server, you can configure the proxy details in the YAML.

remote_write:
- url: "http://192.168.23.4/api/v1/write"
  remote_timeout: 45s
  name: "production-metrics"

  # Proxy configuration
  proxy_url: "http://proxy.internal:4200"
  proxy_connect_header:
    "Proxy-Authorization": ["Basic xxxxxxxxxxxxxxxxxxxx"]
    "X-Custom-Proxy-Header": ["app1", "app2"]
  proxy_from_environment: false

  follow_redirects: true
  enable_http2: true

Metrics Configuration

You can use a relabel_config key to modify or drop specific metrics before they are written to the remote store. The relabel syntax is identical to that used in the scrape_config section. You might want to do this if:

  • You have multiple remote stores and want specific metrics to go to specific stores to avoid unnecessary storage costs.
  • You have one remote store but don't want certain metrics to be written there but let them remain with Prometheus' local storage.
remote_write:
  write_relabel_configs:
    - source_labels: [__name__]
      regex: 'test_metric.*'
      action: drop
    - source_labels: [environment]
      regex: 'staging'
      action: drop

Queue Configuration

The queue_config has settings to fine tune the queue that is used to write to remote storage. Prometheus creates an internal queue for each remote write server. As it collects metrics, Prometheus maintains a write-ahead log (WAL) that it can replay if there's a crash. Each remote destination queue picks up metrics data from the WAL and sends it to the remote store server. Each queue can also have multiple shards, which can be used to configure the amount of parallelism for each queue.

You will have to to tune the queue settings only if you have a very high volume of data and/or are facing issues with the remote store struggling to keep up with your Prometheus server.

You can check out these great writeups on tuning the queue settings for remote_write.

Remote Storage Options

A non-exhaustive list of software that supports the Prometheus remote write protocol includes:

  • Thanos
  • VictoriaMetrics
  • Splunk
  • OpenTSDB
  • Kafka
  • InfluxDB
  • Google BigQuery

Troubleshooting

Prometheus failing to write to the remote storage

This can be caused by a number of issues:

Network connectivity between Prometheus and the remote store

Check if you can reach the remote store using ping or curl.

If there is a proxy in between, it might be dropping packets or might not be running

Check if the proxy is running. Verify that the proxy configuration as well as the Prometheus remote_write proxy settings are correct. Check the proxy server's logs for any errors. The proxy might be blocking large packets.

Requests are timing out due to network issues

Run a traceroute from your Prometheus server to the remote store to see if packets are being dropped.

Requests are timing out due to the remote store not being able to keep up

Tune the queue configuration. If this happens suddenly, it's important to find out the root cause.

  • The number of metrics might have increased due to autoscaling events or an increase in cardinality.
  • The remote store might have disk issues.

Best Practices

  • Backup your data in the remote store.
  • Add security and authentication between your Prometheus and the remote store server. If your remote store does not support this natively, you can add a proxy like nginx in between and configure it to have TLS and authentication.
  • Monitor your remote store metrics for indications of trouble.
  • If you are in a regulated industry, ensure that your remote store is compliant with your requirements. E.g. if it's managed by a cloud vendor, ascertain that their security credentials are sufficient for your needs.

Conclusion

The remote store functionality in Prometheus offers a scalable and flexible way of adding a dedicated storage backend for Prometheus metrics. You can use the remote store for increased data retention,
durability of data, and offline data analysis.

Featured ones: