Logo

dev-resources.site

for different kinds of informations.

CockroachDB: trace logging with Datadog

Published at
1/11/2023
Categories
cockroachdb
datadog
Author
world2mark
Categories
2 categories in total
cockroachdb
open
datadog
open
Author
10 person written this
world2mark
open
CockroachDB: trace logging with Datadog

trace logs

Trace logs are the detailed internal synchronous & asynchronous operations that take place to fulfill a complete transaction or deliver a result.

CockroachDB trace logs

In CockroachDB these operations may include any combination of network interactions, calls to the storage layer, SQL layer, distribution layer, query batching, reads/updates/writes, and various runtime coordinator services.

Normally this is the level of detail is what you’d find inside a CockroachDB debug ZIP bundle, and specifically the jaeger-file that lets you visualize and interact with the execution flow to identify where processing/waiting time is spent.

improving database performance

Datadog provides tools to assist in the tasks of database tuning, query-tuning, and improvements to schemas.

What’s special is that trace-logs are sent to Datadog in real-time during query execution.

In the Datadog portal you can monitor, filter, and explore the same jaeger-style activity on these dissected operations to help guide you with:

  • narrowing down bottle-necks in the cluster network topology
  • capturing transient issues tied to network fluctuations
  • isolating root-causes of poor-performing queries
  • observing application behaviour that send queries
  • tracking down the impact of sub-optimal schemas
  • providing additional context for Cockroach Labs support teams

to the architecture

The trace logging system is divided into 3 areas of integration.

  1. The CockroachDB cluster (source; host machine)
  2. The Datadog agent (OTLP protocol; host machine)
  3. The Datadog web portal (sink; in cloud)

Core Architecture

architecture of CockroachDB and Datadog integration

1) in the cloud: Datadog

Datadog is an integrated cloud-portal offering data-aggregation services, including all the tools needed to specify data-sources, infrastructure observability, build interactive dashboards, explore metrics, define alerts, monitors, and so on.

CockroachDB data source integration (Datadog marketplace)

CockroachDB is one of the many data-source integrations that’s available through their vendor marketplace. This service establishes the core settings for a typical CRDB installation that are required to capture metrics, text-log data, and defines all the back-end monitored properties.

2) on the host machine: Datadog agent

Here we have the Datadog agent. It’s a binary executable that runs as a service in the background, listening to OpenTelemetry data sources. You can find installation instructions on the main portal with download links specific to your target OS.

Datadog agents

They provide agents for many environments, including cloud-based and operator-driven deployments for managed Kubernetes and OpenShift.

3) on the host machine: CockroachDB

CockroachDB relies on the OpenTelemetry protocol as part of the observability instrumentation. this can be enabled by applying the cluster setting property trace.opentelemetry.collector to associate with the Datadog agent.

In this post the agent running locally on my laptop along with CockroachDB so I’m sticking to the default Datadog OLTP workload port for the telemetry connection.


to the installation

create an API key

Datadog provides a multi-tenant view to user-content and data-sources. API keys are used to identify and differentiate who’s data we’re working with, and where the data comes from.

Core Architecture

API keys in Datadog

install the agent

Browse to the Agents Page and download the binary suitable for your OS or platform.

Once installed, configuration files made available so that you can associate your CockroachDB installation with the agent, and associate the agent with your Datadog account.

associate the Agent with Datadog

Using the API key generated above, you must edit the datadog.yaml file and save the key. The OTLP listening port is 4317 by default and can also be changed if you have multiple listeners or agents present.

edit ~/.datadog-agent/datadog-agent/datadog.yaml

...
...
api_key: **4fxxxxxxxxxxxxxxxxxxxxxxxxxxxxe0**
...
...
otlp_config:
  receiver:
    protocols:
      grpc:
        endpoint: localhost:4317
...
...
Enter fullscreen mode Exit fullscreen mode

associate CockroachDB with the agent

Next step (optional): Tag your services and specify CockroachDB endpoints that match your database installation.

Tagging is important where multiple Cockroach DB instances are running on the same server and you need to differentiate between them on your dashboards.

The default prometheus URL setting is localhost:8080/_status/vars but can be changed as follows:

edit ~/.datadog-agent/conf.d/cockroachdb.d/conf.yaml

...
...
## Every instance is scheduled independent of the others.
#
instances:
  - openmetrics_endpoint: http://localhost:8091/_status/vars
    tags:
      - service:mzmz-crdb-8091
      - env:mzmz-env-8091
    service: mzmz-crdb-8091
    ...
    ...
Enter fullscreen mode Exit fullscreen mode

CockroachDB cluster settings

The last setting is to establish the link between CockroachDB and the Datadog agent. This is accomplished by setting the trace-collector to the running Datadog agent as follows:

root@localhost:26257/defaultdb> set cluster setting trace.opentelemetry.collector='localhost:4317';
SET CLUSTER SETTING
Time: 78ms total (execution 78ms / network 0ms)

root@localhost:26257/defaultdb> set cluster setting sql.trace.log_statement_execute=on;
SET CLUSTER SETTING
Time: 41ms total (execution 41ms / network 0ms)

root@localhost:26257/defaultdb> set tracing=on;
SET TRACING
Time: 0ms total (execution 0ms / network 0ms)
Enter fullscreen mode Exit fullscreen mode

Reference

CockroachLabs Docs: Set Tracing

SET TRACING changes the trace recording state of the current session. A trace recording can be inspected with the SHOW TRACE FOR SESSION statement.
SET TRACING=off;  -- Trace recording is disabled.
SET TRACING=cluster;  -- Trace recording is enabled; distributed traces are collected.
SET TRACING=on;  -- Same as cluster.
SET TRACING=kv;  -- Same as cluster except that "kv messages" are collected instead of regular trace messages.
SET TRACING=results; -- Result rows and row counts are copied to the session trace. This must be specified in order for the output of a query to be printed in the session trace.
Enter fullscreen mode Exit fullscreen mode

testing the environment

  1. Launch CockroachDB
  2. Open the Datadog Agent local admin console (default address is http://127.0.0.1:5002
  3. Under Checks --> Checks Summary you should see all green checkmarks.
    Checks Summary for Datadog Agent
    In my installation, I have 2 CockroachDB instances configured, but only one is running.
  4. Sign into your Datadog HQ account
  5. Left hand navigation bar: APM --> Traces
    APM Traces
    Tracing from CockroachDB emits large volumes of data. Searching for sql txn will remove the noise and focus the trace-spans onto your queries for inspection.
  6. Upon running a query, the full trace and spans are sent to this UI and can be dissected.
    APM Traces
    Visualize where (and how) time is spent processing a query, and can be used as a baseline for comparison against related queries. A typical example is to confirm range hot-spots due to long wait-times acquiring a write-latch.

The UI lets you explore the query history and with capabilities such as sorting by duration, isolating long-running statements is a quick process.

By adjusting the filters, you can switch to other CockroachDB run-time activities such as liveness inspections, tracking jobs, or garbage-collector event logs.


conclusion

Tracing capabilities provided by Datadog offers a non-invasive solution for processing and visualizing logs across many deployment options. It's a fully-managed & integrated cloud solution that offers all the back-end instrumentation allowing you to focus on the health of your applications.


terminology & resources

APM

Application Performance Monitoring

OTLP

The OpenTelemetry Protocol specification describes the encoding/transport/delivery mechanism of telemetry data between sources/collectors/backends. What is OpenTelemetry?

Datadog & CockroachDB integration

The Datadog Agent ingests metrics and traces in the OpenTelemetry format (OTLP), which can be produced by OpenTelemetry-instrumented applications such as CockroachDB.

configure CRDB

CockroachDB cluster configuration

configure Datadog

Client-side Datadog configuration (client-side Datadog configuration)
Cloud account Datadog integration (cloud account Datadog integration)

trace

Information about the sub-operations performed as part of a high-level operation (a query or a transaction). This information is internally represented as a tree of "spans", with a special "root span" representing a whole SQL transaction
CockroachLabs Docs: Set Tracing

datadog Article's
30 articles in total
Favicon
Distributed Tracing in Microservices: AWS X-Ray vs DataDog
Favicon
Understanding Datadog: Monitoring and Observability for Modern Applications
Favicon
Golang: Como a observabilidade e profiling revelaram um throttling quase indetectável
Favicon
Datadog : how to filter metrics on tag "team"
Favicon
Self-Healing and Monitoring: A comprehensive guide to revolutionizing System Resilience Through Automation
Favicon
Setting Up Datadog Agent for Nginx Log Collection on AWS EC2
Favicon
How to Monitor your AWS EC2/Workspace with Datadog
Favicon
Getting Started with DataDog's APM: A Developer's Guide
Favicon
Integrating Keycloak with Datadog: Enabling Keycloak Traces in Kubernetes using Datadog APM
Favicon
Monitor EC2 instance metrics with Datadog (step-by-step)
Favicon
Datadog PHP APM filtering
Favicon
Datadog Resource Inventory
Favicon
The Saga of the Replica Lag
Favicon
Send the logs of your Shuttle-powered backend to Datadog
Favicon
Monitoring SQS with Datadog
Favicon
Forecast log costs pre-production
Favicon
Datadog-Cloudformation Integration using Serverless Framework
Favicon
Datadog Alternative: When to Move Out?
Favicon
New dog is ready to rock
Favicon
Datadog vs New Relic: A Duel for Dominance in LLM Observability Platforms
Favicon
Quick tip: Monitoring SingleStoreDB Cloud using Datadog
Favicon
Full Stack Observability: Connecting AWS with Datadog
Favicon
Jump into Datadog With AWS Serverless CDK App
Favicon
How to use DataDog to find utilisation of AWS EBS volume
Favicon
Monitorando RabbitMQ implantado no Kubernetes como Cluster Operator com Datadog Autodiscovery
Favicon
Find your application's hidden secrets using opentelemetry
Favicon
Datadog - export more than 5000 records.
Favicon
Send the logs of your Shuttle-powered backend to Datadog (outdated)
Favicon
CockroachDB: trace logging with Datadog
Favicon
CI/CD con Synthetic Monitoring de Datadog y Bitbucket pipelines

Featured ones: