There are many different ways to operate Datadog; the steps below are one illustrative example.
If you are already familiar with the Datadog install process, you may skip this section.
If this documentation conflicts with the Datadog install documentation, defer to the Datadog documentation.
- Generate an API Key and export it
export DD_API_KEY={{KEY}}
- Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
- Add the Datadog Helm repo
helm repo add datadog https://helm.datadoghq.com
helm repo update
- Install the agent on each of your clusters:
helm upgrade --install datadog --create-namespace -n datadog -f https://raw.githubusercontent.com/solo-io/solo-cop/main/tools/datadog/datadog-values.yaml --set datadog.site='datadoghq.com' --set datadog.apiKey=$DD_API_KEY datadog/datadog
- Patch Gloo Mesh Mgmt
kubectl patch -n gloo-mesh deployment gloo-mesh-mgmt-server --patch-file https://raw.githubusercontent.com/solo-io/solo-cop/main/tools/datadog/gloo-mesh-patch-file.yaml
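Because the agent has to be installed on each cluster, the install step can be wrapped in a small loop. The following is a minimal sketch, not part of the original tooling: the context names in CONTEXTS are hypothetical placeholders for your own kubeconfig contexts, and the DRY_RUN switch is an assumption added here so the commands can be reviewed before anything is applied. It relies on helm's --kube-context flag to target each cluster.

```shell
# Sketch: install the Datadog agent on several clusters.
# CONTEXTS holds hypothetical kubeconfig context names; replace them with your own.
# DRY_RUN=1 (the default in this sketch) only prints each command.
VALUES_URL="https://raw.githubusercontent.com/solo-io/solo-cop/main/tools/datadog/datadog-values.yaml"
CONTEXTS="${CONTEXTS:-mgmt cluster1 cluster2}"
DRY_RUN="${DRY_RUN:-1}"

install_datadog() {
  ctx="$1"
  # Fail early instead of installing an agent with an empty key.
  if [ -z "${DD_API_KEY:-}" ]; then
    echo "DD_API_KEY is not set" >&2
    return 1
  fi
  # Same command as in the steps above, plus --kube-context to pick the cluster.
  set -- helm upgrade --install datadog datadog/datadog \
    --kube-context "$ctx" --create-namespace -n datadog \
    -f "$VALUES_URL" --set datadog.site=datadoghq.com
  if [ "$DRY_RUN" = "1" ]; then
    # Mask the key so it does not end up in terminal history or logs.
    printf '%s\n' "$* --set datadog.apiKey=***"
  else
    "$@" --set "datadog.apiKey=$DD_API_KEY"
  fi
}

install_all() {
  for c in $CONTEXTS; do
    install_datadog "$c" || return 1
  done
}
```

Run install_all to preview the commands, then DRY_RUN=0 install_all to apply them.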
- Create a new dashboard.
- Import gloo-mesh-dashboard.json into the new dashboard.
- Edit the Percent Agents Connected module.
  - Create a query for each workload cluster with clamp_max set to 1.
  - Update the formula to average each cluster and multiply by 100.
- Update distributions to show percentiles.
  - Navigate to metrics summary.
  - Locate the gloo_mesh.gloo_mesh_reconciler_time_sec / gloo_mesh.gloo_mesh_translation_time_sec distributions.
  - In Advanced, toggle the Enable percentiles and threshold queries setting.
- Apply Unit to Trial License Expiration
  - Navigate to metrics summary.
  - Locate the gloo_mesh.solo_io_gloo_trial_license gauge.
  - Set the Unit dropdown to minutes.
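As a worked illustration of the Percent Agents Connected formula from the steps above, suppose each per-cluster query $q_i$ reports how many agents from workload cluster $i$ are connected (an assumption about the underlying metric, which the steps do not name), and there are $N$ workload clusters. Clamping each query at 1, averaging, and multiplying by 100 then amounts to:

```latex
\[
\text{Percent Agents Connected} = \frac{100}{N} \sum_{i=1}^{N} \min(q_i, 1)
\]
% Example with N = 3 and (q_1, q_2, q_3) = (1, 2, 0):
\[
\frac{100}{3}\,(1 + 1 + 0) \approx 66.7
\]
```

With three clamped queries labeled a, b, and c in the dashboard editor (the labels are assumed here), the formula field would read (a + b + c) / 3 * 100.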
- The Times charts on the top left represent system load. Darker colors represent slower translation and reconcile times.
- The Warnings and Errors charts on the top right list warnings and errors in Solo CRD translations to Istio resources.
- The second row shows the 75th, 90th, and 99th percentiles for translations and reconciliations.
- The three boxes in the second row on the right represent component configuration.
- The third row represents agent connection tracking.
- The fourth and fifth rows represent resource usage and CPU throttling for mesh components.
The following example assumes that you are running in the istio-system namespace, that you are using gateway injection, and that you have installed Istio 1-14-4.
- Patch Istiod
kubectl patch -n istio-system deployment istiod-1-14-4 --patch-file istiod-patch-file.yaml
- Update the sidecar configuration to inject Datadog annotations.
For production, make this update with Helm.
  - Find the injector config map, istio-sidecar-injector-1-14-4.
  - Add the three fields in istio-proxy-annotations.yaml underneath data.config.injectedAnnotations.
- Redeploy workloads.
- Repeat per cluster.
- Update the distributions for Envoy to allow for percentiles.
- Create a new dashboard and import istio-dashboard.json.
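The Istiod patch and the workload redeploy have to be repeated on every cluster, so they can be scripted. The following is a minimal sketch under stated assumptions: the context names in ISTIO_CONTEXTS and the namespaces in WORKLOAD_NAMESPACES are hypothetical placeholders, the helper names are invented for illustration, and the DRY_RUN switch is added here so the commands can be reviewed first. Editing the injector config map remains a manual step between the two phases.

```shell
# Sketch: repeat the Istiod patch and workload restart on each cluster.
# ISTIO_CONTEXTS and WORKLOAD_NAMESPACES are hypothetical; replace them with your own.
# DRY_RUN=1 (the default in this sketch) only prints each command.
ISTIO_CONTEXTS="${ISTIO_CONTEXTS:-cluster1 cluster2}"
WORKLOAD_NAMESPACES="${WORKLOAD_NAMESPACES:-default}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Phase 1: patch Istiod on one cluster (same command as in the steps above).
patch_istiod() {
  run kubectl --context "$1" patch -n istio-system deployment istiod-1-14-4 \
    --patch-file istiod-patch-file.yaml
}

# Phase 2: restart every deployment in the listed namespaces so that
# sidecars are re-injected with the updated annotations.
restart_workloads() {
  for ns in $WORKLOAD_NAMESPACES; do
    run kubectl --context "$1" rollout restart deployment -n "$ns" || return 1
  done
}

# Call the named function once per cluster context.
each_cluster() {
  for c in $ISTIO_CONTEXTS; do
    "$1" "$c" || return 1
  done
}
```

A typical sequence would be each_cluster patch_istiod, then the config map edit on each cluster, then each_cluster restart_workloads.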
- Change in resources
- Request information
- Pilot xDS times
- Total number of specified resources