Overview

This guide walks you through migrating your Dgraph database from managed cloud services to a self-hosted environment. It provides step-by-step instructions for deployment across various cloud providers and methods, supporting goals like cost savings, increased control, and compliance.
This guide supplements the Dgraph self-managed documentation; refer to it for complete details on each deployment option.

Deployment options

When migrating to self-hosted Dgraph, your deployment choice depends on several key factors: data size, team expertise, budget constraints, and control requirements. Here’s how these factors influence your deployment decision:
Data Size Considerations:
  • Under 100 GB: Docker Compose or Linux are suitable options
  • 100 GB to 1 TB: Kubernetes or Linux can handle the load
  • Over 1 TB: Kubernetes is required for proper scaling and management
Team Expertise Factors:
  • High Kubernetes Experience: Kubernetes deployment is recommended
  • Limited Kubernetes Experience: Docker Compose or Linux are more approachable
Budget Constraints:
  • Cost Optimized: Linux provides the most economical option
  • Balanced: Docker Compose offers a good middle ground
  • Enterprise: Kubernetes provides enterprise-grade features
Control Requirements:
  • Maximum Control: Linux gives you full control over the environment
  • Managed Infrastructure: Kubernetes automates much of the infrastructure management (scheduling, self-healing, rolling updates)
Available Deployment Methods:
  • Kubernetes: Best for large-scale deployments, enterprise environments, and teams with K8s expertise
  • Docker Compose: Ideal for development, testing, and smaller production workloads
  • Linux: Perfect for cost-conscious deployments and teams wanting maximum control

Prerequisites

Before starting your migration, ensure you have the necessary tools, access, and resources.

Required tools

Command Line Tools

  • kubectl (v1.24+)
  • helm (v3.8+)
  • dgraph CLI tool
  • curl or similar HTTP client
  • Cloud provider CLI tools

Development Tools

  • Docker (for local testing)
  • Git (for configuration management)
  • Text editor or IDE
  • SSH client
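
Before proceeding, it helps to confirm the required versions are installed. A minimal check, assuming each tool is already on your PATH:

# Verify tool versions meet the minimums listed above
kubectl version --client        # want v1.24+
helm version --short            # want v3.8+
dgraph version
curl --version | head -1
docker --version
git --version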

Access requirements

  • Dgraph Cloud: Admin access to export data
  • Hypermode Graph: Database access credentials
  • Network: Ability to download large datasets

Data export from source systems

The first step in migration is safely exporting your data from your current managed service. This section covers export procedures for both Dgraph Cloud and Hypermode Graphs.

Exporting from Dgraph Cloud

Dgraph Cloud provides several methods for exporting your data, including admin API endpoints and the web interface.

Method 1: Using the Web Interface

1. Access Export Function

Log into your Dgraph Cloud dashboard and navigate to your cluster.

2. Navigate to Export

Click the “Export” tab in your cluster management interface.

3. Configure Export Settings

Select your export format and destination. Dgraph Cloud supports JSON and RDF. Click “Start Export” and monitor the progress; large datasets may take several hours.

4. Download Exported Data

Once the export completes, download your exported data files.

Method 2: Using Admin API

You can also trigger an export through the /admin GraphQL endpoint:

curl -X POST https://your-cluster.cloud.dgraph.io/admin \
  -H "Content-Type: application/json" \
  -d '{"query": "mutation { export(input: { format: \"rdf\" }) { response { message code } } }"}'

Method 3: Bulk export for large datasets

For datasets larger than 10 GB, use the bulk export feature:
# $(date ...) does not expand inside single quotes, so build the
# destination path in a shell variable first
EXPORT_DEST="s3://your-backup-bucket/$(date +%Y-%m-%d)"

curl -X POST https://your-cluster.cloud.dgraph.io/admin \
  -H "Content-Type: application/json" \
  -d "{\"query\": \"mutation { export(input: { destination: \\\"${EXPORT_DEST}\\\", format: \\\"rdf\\\", namespace: 0 }) { response { message code } } }\"}"

Exporting from Hypermode Graphs


Using admin endpoint

For smaller datasets, you can use the admin endpoint to export your graph. For larger datasets, contact Hypermode Support to facilitate the export.
curl --location 'https://<YOUR_CLUSTER_NAME>.hypermode.host/dgraph/admin' \
--header 'Content-Type: application/json' \
--header 'Dg-Auth: ••••••' \
--data '{"query":"mutation {\n  export(input: { format: \"rdf\" }) {\n    response {\n      message\n      code\n    }\n  }\n}","variables":{}}'

Export validation and preparation

Always validate your exported data before proceeding with the migration.

Data integrity checks

# Check file sizes and contents
ls -lah exported_data/
file exported_data/*

# For RDF exports, count triples
if [[ -f "exported_data/g01.rdf.gz" ]]; then
  zcat exported_data/g01.rdf.gz | wc -l
fi

# For JSON exports, validate structure
if [[ -f "exported_data/g01.json.gz" ]]; then
  zcat exported_data/g01.json.gz | jq '.[] | keys' | head -10
fi

Prepare for transfer

1. Organize Export Files

# Create organized directory structure
mkdir -p migration_data/{data,schema,acl,scripts}

# Move files to appropriate directories
mv exported_data/*.rdf.gz migration_data/data/
mv schema* migration_data/schema/
mv acl* migration_data/acl/

2. Create Checksums

# Generate checksums for integrity verification
cd migration_data
find . -type f -name "*.gz" -exec sha256sum {} \; > checksums.txt
find . -type f -name "*.json" -exec sha256sum {} \; >> checksums.txt

3. Compress for Transfer

# Create final migration package
cd ..
tar -czf migration_package_$(date +%Y%m%d).tar.gz migration_data/

# Verify package
tar -tzf migration_package_*.tar.gz | head -10
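
After transferring the package to the destination host, the checksum file verifies that nothing was corrupted in transit. A minimal sketch, assuming the package was copied as-is:

# Unpack and verify every checksum recorded earlier
tar -xzf migration_package_*.tar.gz
cd migration_data
sha256sum -c checksums.txt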

Pre-migration planning

Proper planning is crucial for a successful migration. This section helps you assess your current environment and plan the migration strategy.

1. Assess current environment

# For Dgraph Cloud
curl -X POST https://your-cluster.cloud.dgraph.io/admin \
  -H "Content-Type: application/json" \
  -d '{"query": "{ state { groups { id checksum tablets { predicate space } } } }"}'

2. Infrastructure sizing

CPU Requirements

  • Alpha Nodes: 2-4 cores per 1M edges
  • Zero Nodes: 1-2 cores (coordination only)
  • Load Balancer: 2-4 cores
  • Monitoring: 1-2 cores

Memory Requirements

  • Alpha Nodes: 4-8 GB base + 1 GB per 10M edges
  • Zero Nodes: 2-4 GB (metadata storage)
  • Load Balancer: 2-4 GB
  • Monitoring: 4-8 GB

Storage Requirements

  • Data Volume: 3-5x compressed export size
  • WAL Logs: 20-50 GB per node
  • Backup Space: 2x data volume
  • Monitoring: 50-100 GB
For example, a 20 GB compressed export implies roughly 60-100 GB of data volume, 20-50 GB of WAL per node, and 120-200 GB of backup space, before monitoring storage.

Network Requirements

  • Internal: 1 Gbps minimum between nodes
  • External: 100 Mbps minimum for clients
  • Bandwidth: Plan for 3x normal traffic during migration
  • Latency: <10 ms between data nodes

Data Migration and Import

1. Verify Cluster Status

kubectl get pods -n dgraph
kubectl port-forward -n dgraph svc/dgraph-dgraph-alpha 8080:8080 &
curl http://localhost:8080/health
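
The health endpoint returns one entry per Alpha instance; a quick jq check confirms each one reports healthy (a sketch assuming jq is installed):

# Expect every instance to report "healthy"
curl -s http://localhost:8080/health | jq '.[].status'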

2. Import Schema

kubectl port-forward -n dgraph svc/dgraph-dgraph-alpha 8080:8080 &
curl -X POST localhost:8080/admin/schema \
  -H "Content-Type: application/json" \
  -d @schema_backup.json
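
To confirm the schema actually landed, read it back with a DQL schema query. A sketch, assuming the port-forward above is still active and jq is installed:

# Count the predicates now known to the cluster
curl -s -X POST localhost:8080/query \
  -H "Content-Type: application/dql" \
  -d 'schema {}' | jq '.data.schema | length'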

3. Import Data

Refer to the Dgraph bulk loader documentation for efficient handling of larger datasets.

# The export file must be reachable from inside the pod
# (for example, on a PersistentVolume mounted at /data)
kubectl run dgraph-live-loader \
  --image=dgraph/dgraph:v23.1.0 \
  --restart=Never \
  --namespace=dgraph \
  --command -- dgraph live \
  --files /data/export.rdf.gz \
  --alpha dgraph-dgraph-alpha:9080 \
  --zero dgraph-dgraph-zero:5080
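
The loader runs as a one-off pod, so follow its logs to watch progress and confirm the final N-Quads count (the pod name matches the kubectl run above):

# Stream loader output until the load completes
kubectl logs -f dgraph-live-loader -n dgraph

# Clean up the completed pod afterwards
kubectl delete pod dgraph-live-loader -n dgraph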

4. Restore ACL Configuration

# Replace with your actual endpoint
DGRAPH_ENDPOINT="localhost:8080"  # Adjust for your deployment

curl -X POST $DGRAPH_ENDPOINT/admin \
  -H "Content-Type: application/json" \
  -d '{"query": "mutation { addUser(input: {name: \"admin\", password: \"password\"}) { user { name } } }"}'


Post-Migration Verification

Data Integrity Checklist

  • Count total nodes and compare with original
  • Verify specific data samples
  • Test query performance
  • Validate application connections

1. Data Integrity Check

# has(_predicate_) was removed in Dgraph v1.1; count typed nodes instead
curl -X POST localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"query": "{ nodeCount(func: has(dgraph.type)) { count(uid) } }"}'

2. Performance Testing

time curl -X POST localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"query": "{ users(func: allofterms(name, \"john\")) { name email } }"}'

Monitoring and Maintenance

1. Setup Monitoring Stack

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace
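
Dgraph Alphas expose Prometheus metrics on their HTTP port at /debug/prometheus_metrics; confirm the endpoint responds before wiring up scrape configs (a sketch reusing the port-forward pattern from earlier):

# Spot-check that the Alpha metrics endpoint is serving data
kubectl port-forward -n dgraph svc/dgraph-dgraph-alpha 8080:8080 &
curl -s http://localhost:8080/debug/prometheus_metrics | head -5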

2. Backup Strategy

Set up automated daily backups to ensure data protection. Exports are triggered through the /admin GraphQL endpoint rather than a CLI subcommand, so the job below runs curl against an Alpha. This is a sketch; adjust the service name and bucket for your deployment:
# Notes: Dgraph writes each export into its own timestamped folder
# under the destination, so a fixed bucket path is safe here, and
# the S3 credentials must be configured on the Alpha nodes.
kubectl create cronjob dgraph-backup \
  --image=curlimages/curl:8.8.0 \
  --schedule="0 2 * * *" \
  --namespace=dgraph \
  -- curl -s -X POST http://dgraph-dgraph-alpha:8080/admin \
  -H "Content-Type: application/json" \
  -d '{"query": "mutation { export(input: { destination: \"s3://your-backup-bucket\", format: \"rdf\" }) { response { message code } } }"}'
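
Once created, verify the CronJob is registered and trigger a one-off test run rather than waiting for the schedule:

# Confirm registration, run a manual test, and inspect its output
kubectl get cronjob dgraph-backup -n dgraph
kubectl create job --from=cronjob/dgraph-backup dgraph-backup-test -n dgraph
kubectl logs -f job/dgraph-backup-test -n dgraph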

Additional Resources

Dgraph Operational Runbooks

The following runbooks provide operational guidance for various scenarios you may encounter during and after migration.

Migration Validation Checklist

Post-Migration Validation

Use this checklist to ensure your migration was successful:
Data Integrity
  • Total node count matches source
  • Random data samples verified
  • Schema imported correctly
  • Indexes functioning properly
Performance
  • Query response times acceptable
  • Throughput meets requirements
  • Resource utilization within limits
  • No memory leaks detected
Operations
  • Monitoring and alerting active
  • Backup procedures tested
  • Scaling mechanisms verified
  • Security policies enforced
Application Integration
  • All clients connecting successfully
  • Authentication working
  • API endpoints responding
  • Load balancing functional
This migration guide is a living document. Please contribute improvements, report issues, or share your experiences to help the community. For additional support, join the Dgraph community or consult the operational runbooks in the Hypermode ops-runbooks repository.