Dgraph exposes metrics via the /debug/vars endpoint in JSON format and the
/debug/prometheus_metrics endpoint in Prometheus’ text-based format. Dgraph
doesn’t store the metrics; it only exposes their values at that instant. You
can either poll these endpoints from your monitoring system or install
Prometheus. Set targets in the Prometheus configuration file to the IP
addresses of your Dgraph instances and run Prometheus using the command
prometheus --config.file my_config.yaml.
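As an illustration, a minimal scrape configuration might look like the sketch below. The job name, the scrape interval, and the two target addresses are placeholders assumed for this example (6080 and 8080 are the default HTTP ports of Zero and Alpha); substitute the addresses of your own instances.

```shell
# Write a minimal Prometheus configuration.
# The targets are placeholder addresses, not values from a real cluster.
cat > my_config.yaml <<'EOF'
scrape_configs:
  - job_name: "dgraph"
    metrics_path: "/debug/prometheus_metrics"
    scrape_interval: "2s"
    static_configs:
      - targets:
          - "172.16.0.2:6080" # Dgraph Zero (assumed address)
          - "172.16.0.3:8080" # Dgraph Alpha (assumed address)
EOF

# Then start Prometheus against it:
#   prometheus --config.file my_config.yaml
```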
The raw data scraped by Prometheus is available via the
/debug/prometheus_metrics endpoint on Dgraph Alphas. To plot the metrics in
Grafana, import grafana_dashboard.json by following these instructions.
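If you would rather poll the endpoint directly than run Prometheus, a small script along these lines could fetch the current snapshot. This is only a sketch: the ALPHA_ADDR default and the helper names fetch_metrics and samples_only are assumptions made for the example.

```shell
# Assumed address of one Alpha's HTTP port; override with your own instance.
ALPHA_ADDR="${ALPHA_ADDR:-localhost:8080}"

# Fetch the current metrics snapshot in Prometheus' text-based format.
fetch_metrics() {
  curl -fsS --max-time 5 "http://$1/debug/prometheus_metrics"
}

# Drop the "# HELP" and "# TYPE" comment lines, keeping only the samples.
samples_only() {
  grep -v '^#'
}

# Poll once; report on stderr if nothing came back.
fetch_metrics "$ALPHA_ADDR" | samples_only \
  || echo "no metrics received from $ALPHA_ADDR" >&2
```

Because Dgraph only exposes the values at that instant, a poller like this has to run on a schedule and store the samples itself.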
Amazon CloudWatch
Route53’s health checks can be used to create standard CloudWatch alarms that
notify on a change in the status of the /health endpoints of Alpha and Zero.
Assuming that the endpoints to monitor are publicly accessible and that you
have AWS credentials and awscli set up, we’ll go through an example of setting
up a simple CloudWatch alarm configured to alert via email for the Alpha
endpoint alpha.acme.org:8080/health. Dgraph Zero’s /health endpoint can be
monitored in a similar way.
Create the Route53 health check
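The health check is created with awscli’s aws route53 create-health-check
command, which reads its configuration from a file. The file
/tmp/create-healthcheck.json needs to contain the values for the parameters
required to create the health check.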
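A sketch of this step is shown below. The parameter values in the JSON file are assumptions chosen for the alpha.acme.org:8080/health example (plain HTTP, a 30-second interval, three failures before the check is considered unhealthy); adjust them for your endpoint, for instance by using HTTPS if the endpoint is served over TLS.

```shell
# Health check parameters; the values are illustrative assumptions.
cat > /tmp/create-healthcheck.json <<'EOF'
{
  "Type": "HTTP",
  "FullyQualifiedDomainName": "alpha.acme.org",
  "Port": 8080,
  "ResourcePath": "/health",
  "RequestInterval": 30,
  "FailureThreshold": 3
}
EOF

# Create the health check and print only its ID.
# The caller reference just needs to be unique per request; a timestamp works.
aws route53 create-health-check \
  --caller-reference "$(date '+%Y%m%d%H%M%S')" \
  --health-check-config file:///tmp/create-healthcheck.json \
  --query 'HealthCheck.Id' \
  || echo "create-health-check failed; check the awscli setup and credentials" >&2
```

Keep the ID that the command prints, since the CloudWatch alarm refers to the health check by that ID.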
Currently, Route53 metrics are only available in the US East (N. Virginia)
region. The CloudWatch alarm (and the SNS topic) should therefore be created
in us-east-1.
[Optional] Creating an SNS topic
SNS topics are used to create message delivery channels. If you do not have any SNS topics configured, you can create one with awscli’s aws sns create-topic command.
Creating a CloudWatch alarm
The alarm is created with awscli’s aws cloudwatch put-metric-alarm command,
with --alarm-actions set to the ARN of the SNS topic and the --dimensions of
the alarm set to the health check ID.
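The SNS topic and alarm steps could be scripted roughly as follows. The topic name ops, the address ops@acme.org, the alarm name, and the threshold settings are assumptions made for this sketch, as are the helper names create_topic and create_alarm. HEALTH_CHECK_ID is expected to hold the ID returned when the health check was created.

```shell
# Create (or reuse) an SNS topic in us-east-1, subscribe an email address,
# and print the topic ARN. $1 = topic name, $2 = email address.
create_topic() {
  topic_arn="$(aws sns create-topic --region us-east-1 --name "$1" \
    --query 'TopicArn' --output text)" || return 1
  aws sns subscribe --region us-east-1 --topic-arn "$topic_arn" \
    --protocol email --notification-endpoint "$2" >/dev/null || return 1
  echo "$topic_arn"
}

# Alarm when the Route53 health check reports unhealthy (status below 1).
# $1 = health check ID, $2 = SNS topic ARN.
create_alarm() {
  aws cloudwatch put-metric-alarm --region us-east-1 \
    --alarm-name dgraph-alpha-health \
    --alarm-description "Dgraph Alpha /health check is failing" \
    --namespace AWS/Route53 \
    --metric-name HealthCheckStatus \
    --dimensions "Name=HealthCheckId,Value=$1" \
    --statistic Minimum \
    --period 60 \
    --evaluation-periods 1 \
    --threshold 1 \
    --comparison-operator LessThanThreshold \
    --treat-missing-data breaching \
    --alarm-actions "$2"
}

# Runs only when HEALTH_CHECK_ID is set in the environment.
if [ -n "${HEALTH_CHECK_ID:-}" ]; then
  arn="$(create_topic ops ops@acme.org)" && create_alarm "$HEALTH_CHECK_ID" "$arn"
fi
```

SNS sends a confirmation email to the subscribed address, and notifications are delivered only after the recipient confirms the subscription.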
Internal endpoints
If the Alpha endpoint is internal to the VPC network, create a Lambda function that periodically (triggered using CloudWatch Event Rules) requests the /health
path and publishes CloudWatch metrics, which can then be used to create
the required CloudWatch alarms. The architecture and the CloudFormation template
to achieve this can be found
here.
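The referenced template is not reproduced here. As a rough illustration of the probe-and-publish logic such a function performs, the shell sketch below could be run on a schedule from a host inside the VPC. It is not the Lambda function itself, and the URL, the Dgraph/Health namespace, and the AlphaHealthy metric name are all hypothetical.

```shell
# Hypothetical internal address of the Alpha /health endpoint.
ALPHA_HEALTH_URL="${ALPHA_HEALTH_URL:-http://alpha.internal:8080/health}"

# Print 1 when the endpoint answers successfully, 0 otherwise.
probe_health() {
  if curl -fsS --max-time 5 -o /dev/null "$1" 2>/dev/null; then
    echo 1
  else
    echo 0
  fi
}

# Publish the probe result as a custom CloudWatch metric.
# The namespace and metric name are made up for this sketch.
publish_metric() {
  aws cloudwatch put-metric-data \
    --namespace "Dgraph/Health" \
    --metric-name "AlphaHealthy" \
    --value "$1"
}

status="$(probe_health "$ALPHA_HEALTH_URL")"
publish_metric "$status" || echo "could not publish metric (status=$status)" >&2
```

An alarm on this custom metric can then be defined in the same way as for the Route53 health check, using the custom namespace and metric name in place of AWS/Route53 and HealthCheckStatus.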