Day 22 challenge

Goal: master DQL querying and multi-language client integration with Dgraph
Theme: context engineering week - advanced graph query mastery
Time investment: ~30 minutes
Welcome to Day 22! Yesterday you built sophisticated knowledge graphs with Dgraph. Today you’ll master DQL (Dgraph Query Language) for complex graph queries and learn to integrate Dgraph with your agents using multiple programming languages. DQL enables sophisticated graph traversal and analysis that powers intelligent agent reasoning.

What you’ll accomplish today

  • Master DQL syntax for complex graph queries
  • Use Ratel (Dgraph’s UI) to explore the news knowledge graph
  • Learn multi-hop graph traversal and aggregation techniques
  • Integrate Dgraph with agents using Python, JavaScript, and Go clients
  • Build sophisticated graph-powered agent workflows
This requires access to a Dgraph instance (free Cloud instance available) and familiarity with basic programming concepts. Be sure to complete Day 21 first. You’ll work with multiple client libraries.

Step 1: DQL fundamentals

DQL (Dgraph Query Language) is designed specifically for graph traversal and analysis:

Basic DQL syntax

{
  articles(func: type(Article)) {
    Article.title
    Article.abstract
    Article.uri
  }
}

Key DQL concepts

  • Functions: Entry points for queries (type(), eq(), allofterms(), etc.)
  • Predicates: Properties to retrieve or traverse
  • Variables: Store intermediate results (var(func: ...))
  • Filters: Refine results at any level (@filter())
  • Aggregations: Calculate values across sets (count, sum, avg)

DQL vs. other query languages

DQL advantages:
  • Native graph traversal with unlimited depth
  • Variables for complex multi-stage queries
  • Built-in aggregation and filtering at any level
  • Optimized for distributed graph operations
DQL thinking: Unlike SQL joins, DQL follows relationships naturally. Think in terms of traversing paths through connected data rather than combining tables.
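That path-following mindset can be sketched in a few lines of plain Python. This is a toy in-memory graph, not pydgraph — the node IDs and edge names are hypothetical:

```python
# A tiny adjacency-map "graph": traversal means following edges,
# not joining tables. (Hypothetical data for illustration only.)
graph = {
    "article:1": {"topic": ["topic:ai"], "title": "AI breakthrough"},
    "article:2": {"topic": ["topic:ai", "topic:climate"], "title": "AI for climate"},
    "topic:ai": {"name": "AI"},
    "topic:climate": {"name": "Climate"},
}

def traverse(start_nodes, edge):
    """Follow one edge from each start node, like a nested DQL block."""
    results = []
    for uid in start_nodes:
        results.extend(graph[uid].get(edge, []))
    return results

# "articles -> topic" is one hop; no join condition is needed
topics = traverse(["article:1", "article:2"], "topic")
print(sorted(set(topics)))
```

A nested DQL block is essentially this hop repeated: each level of nesting calls `traverse` on the results of the previous one.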

Step 2: Exploring with Ratel

Ratel is Dgraph’s built-in UI for query development and visualization. We’ll use Ratel to execute queries and explore the results. Follow the steps described in Day 21 to connect Ratel to your Hypermode Graph.

Filtering and ordering

You can filter results using the @filter directive:
{
  articles(func: type(Article)) @filter(has(Article.abstract)) {
    Article.title
    Article.abstract
  }
}
This returns only articles that have an abstract. To order results, use the orderasc or orderdesc parameter:
{
  articles(func: type(Article), orderasc: Article.title) {
    Article.title
    Article.abstract
  }
}
Schema Improvement: Add an @index to Article.title to enable fast sorting:
<Article.title>: string @index(exact) .

Date filtering

Your schema includes Article.published as a date field. To filter by date:
{
  recent_articles(func: type(Article)) @filter(ge(Article.published, "2025-01-01T00:00:00Z")) {
    Article.title
    Article.published
  }
}
Schema Improvement: Add a datetime index for faster date-based queries:
<Article.published>: datetime @index(hour) .

Nested traversals

Follow relationships between entities with nested queries:
{
  topics(func: type(Topic)) {
    Topic.name
    ~Article.topic {  # Traverse reverse edge to articles
      Article.title
      Article.abstract
    }
  }
}
You can also query articles and include their related entities:
{
  articles(func: type(Article)) {
    Article.title
    Article.topic {
      Topic.name
    }
    Article.org {
      Organization.name
    }
  }
}
The schema has a full-text index on Topic.name, enabling text search:
{
  topics(func: anyoftext(Topic.name, "technology AI")) {
    Topic.name
    ~Article.topic {
      Article.title
    }
  }
}
Schema Improvement: Add full-text search to Article titles and abstracts:
<Article.title>: string @index(fulltext) .
<Article.abstract>: string @index(fulltext, term) .

Geospatial queries

Your schema has Geo.location as a geo field, enabling location-based queries:
{
  nearby_locations(func: near(Geo.location, [-74.0060, 40.7128], 50000)) {
    Geo.name
    Geo.location
    ~Article.geo {
      Article.title
    }
  }
}
This finds locations within 50km (50,000 meters) of New York City coordinates and their associated articles.

Vector similarity search

The schema also includes Article.embedding with an HNSW vector index, allowing semantic searches:
query vector_search($embedding: string, $limit: int) {
  articles(func: similar_to(Article.embedding, $limit, $embedding)) {
    uid
    Article.title
    Article.abstract
    score
  }
}
This finds the $limit articles whose embeddings are most similar to the given vector.
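To build intuition for what the HNSW index answers approximately, here is a brute-force client-side sketch using cosine similarity. The vectors and the `top_k` helper are hypothetical, and real embeddings have hundreds of dimensions:

```python
import math

def cosine_similarity(a, b):
    """What 'similar' means for embeddings: the angle between vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, candidates, k=5):
    """Brute-force version of the nearest-neighbor search the index accelerates."""
    scored = [(cosine_similarity(query_vec, vec), uid)
              for uid, vec in candidates.items()]
    return [uid for _, uid in sorted(scored, reverse=True)[:k]]

# Toy 2-dimensional "embeddings"
candidates = {"a1": [1.0, 0.0], "a2": [0.9, 0.1], "a3": [0.0, 1.0]}
print(top_k([1.0, 0.0], candidates, k=2))  # ['a1', 'a2']
```

The HNSW index avoids this O(n) scan by navigating a layered proximity graph, which is why it scales to millions of articles.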

Advanced queries: Combining multiple filters

Combine multiple filters for complex queries:
{
  tech_articles_2025(func: type(Article)) @filter(
    anyoftext(Article.abstract, "technology AI") AND
    ge(Article.published, "2025-01-01") AND
    has(Article.geo)
  ) {
    Article.title
    Article.abstract
    Article.published
    Article.geo {
      Geo.name
      Geo.location
    }
    Article.topic {
      Topic.name
    }
  }
}

Additional schema improvements

To enable more advanced queries, consider these improvements:
  1. Add indexes to organization and author names for searching:
    <Organization.name>: string @index(exact, term) .
    <Author.name>: string @index(exact, term) .
    
  2. Add count indexing to quickly count relationships:
    <Article.topic>: [uid] @count @reverse .
    <Author.article>: [uid] @count @reverse .
    
  3. Add unique ID constraints for article URIs:
    <Article.uri>: string @index(exact) @upsert .
    
  4. Add date partitioning for more efficient date range queries:
    <Article.published>: datetime @index(year, month, day, hour) .
    
These enhancements will provide more query capabilities without requiring changes to your data model.

Client directives

DQL offers several client-side directives that modify query behavior without affecting the underlying data.

The @cascade directive

The @cascade directive filters out nodes where any of the requested fields are null or empty:
{
  articles(func: type(Article)) @cascade {
    Article.title
    Article.abstract
    Article.topic {
      Topic.name
    }
  }
}
This returns only articles that have all three fields: title, abstract, and at least one topic.
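The same pruning can be emulated client-side on a parsed response, which is a handy way to check what @cascade will drop. The `cascade` helper and the sample rows are hypothetical:

```python
def cascade(nodes, required):
    """Client-side sketch of @cascade: keep only nodes where every
    required field is present and non-empty."""
    return [n for n in nodes if all(n.get(f) for f in required)]

# Hypothetical parsed query result: B lacks an abstract and has no topics
articles = [
    {"Article.title": "A", "Article.abstract": "...", "Article.topic": [{"Topic.name": "AI"}]},
    {"Article.title": "B", "Article.abstract": None, "Article.topic": []},
]
kept = cascade(articles, ["Article.title", "Article.abstract", "Article.topic"])
print([a["Article.title"] for a in kept])  # ['A']
```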

The @facets directive

While not currently configured in your schema, facets let you add metadata to edges. To add and query facets, you’d update your schema like this:
<Article.topic>: [uid] @reverse @facets(relevance: float) .
Then query with:
{
  articles(func: type(Article)) {
    Article.title
    Article.topic @facets(relevance) {
      Topic.name
    }
  }
}

The @filter directive (with multiple conditions)

Combine multiple filter conditions using logical operators:
{
  articles(func: type(Article)) @filter(has(Article.abstract) AND (anyoftext(Article.abstract, "climate") OR anyoftext(Article.abstract, "weather"))) {
    Article.title
    Article.abstract
  }
}

The @recurse directive

For recursive traversals (useful if your graph has hierarchical relationships):
{
  topics(func: type(Topic)) {
    Topic.name
    subtopics @recurse(depth: 3) {
      name
      subtopics
    }
  }
}
Note: This would require adding a self-referential subtopics predicate to your schema.
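What @recurse does can be sketched as a plain recursive walk. The topic hierarchy below is hypothetical, since the current schema has no subtopics predicate:

```python
def recurse(graph, uid, edge="subtopics", depth=3):
    """Client-side sketch of @recurse(depth: 3): follow the same edge
    repeatedly, stopping at the depth limit."""
    node = {"name": graph[uid]["name"]}
    if depth > 1 and graph[uid].get(edge):
        node[edge] = [recurse(graph, child, edge, depth - 1)
                      for child in graph[uid][edge]]
    return node

# Hypothetical three-level topic hierarchy
topics = {
    "t1": {"name": "Technology", "subtopics": ["t2"]},
    "t2": {"name": "AI", "subtopics": ["t3"]},
    "t3": {"name": "LLMs", "subtopics": []},
}
tree = recurse(topics, "t1", depth=3)
print(tree["subtopics"][0]["subtopics"][0]["name"])  # LLMs
```

The depth limit matters: without it, a cyclic graph would recurse forever, which is exactly why DQL makes you state it.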

Aggregation queries

DQL provides functions for aggregating data:

Basic count

{
  total_articles(func: type(Article)) {
    count(uid)
  }
}

Count with grouping

{
  topics(func: type(Topic)) {
    Topic.name
    article_count: count(~Article.topic)
  }
}
This counts how many articles are associated with each topic.
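The per-topic counts come back as plain JSON keyed by your aliases, so ranking them is a one-liner in the client. The response shape below is a hypothetical example matching the query's aliases:

```python
# Hypothetical parsed response from the count-with-grouping query above
response = {
    "topics": [
        {"Topic.name": "AI", "article_count": 12},
        {"Topic.name": "Climate", "article_count": 7},
        {"Topic.name": "Sports", "article_count": 21},
    ]
}

def rank_topics(data, top_n=2):
    """Sort topics by their article_count alias, highest first."""
    ranked = sorted(data["topics"], key=lambda t: t["article_count"], reverse=True)
    return [(t["Topic.name"], t["article_count"]) for t in ranked[:top_n]]

print(rank_topics(response))  # [('Sports', 21), ('AI', 12)]
```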

Multiple aggregations

{
  articles(func: type(Article)) {
    Article.title
    Article.topic {
      # Value variable; requires @index(exact) on Topic.name
      n as Topic.name
    }
    topic_min: min(val(n))
    topic_max: max(val(n))
    topic_count: count(Article.topic)
  }
}

Value-based aggregations

For numeric fields with appropriate indexes (not in your current schema):
{
  # This would require adding a numeric wordCount field with an @index(int)
  article_stats(func: type(Article)) {
    min_words: min(Article.wordCount)
    max_words: max(Article.wordCount)
    avg_words: avg(Article.wordCount)
    sum_words: sum(Article.wordCount)
  }
}

Grouping with @groupby

Group and aggregate data (requires adding @index directives to the fields used in @groupby):
{
  articles(func: type(Article)) @groupby(Article.published) {
    count: count(uid)
  }
}
Note: This would require <Article.published>: datetime @index(month) in the schema.

Date-based aggregations

DQL has no built-in date-truncation function, so month-level rollups group on the stored timestamp and are then bucketed client-side:
{
  publications_by_month(func: type(Article)) @groupby(Article.published) {
    count: count(uid)
  }
}
Note: This requires the proper datetime index on Article.published; collapse the per-timestamp groups into months in your client code.
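A minimal client-side sketch of the month bucketing, assuming ISO-8601 timestamps (the sample dates are hypothetical):

```python
from collections import Counter

def bucket_by_month(timestamps):
    """Collapse ISO-8601 timestamps into YYYY-MM buckets client-side."""
    return Counter(ts[:7] for ts in timestamps)  # "2025-01-03T..." -> "2025-01"

published = [
    "2025-01-03T10:00:00Z",
    "2025-01-28T08:30:00Z",
    "2025-02-14T12:00:00Z",
]
print(dict(bucket_by_month(published)))  # {'2025-01': 2, '2025-02': 1}
```

Slicing the first seven characters works because ISO-8601 timestamps sort and group lexically by year and month.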

Combined advanced example

This example combines multiple directives and aggregations:
{
  topic_statistics(func: type(Topic)) @filter(has(~Article.topic)) {
    Topic.name
    total_count: count(~Article.topic)
    recent: ~Article.topic @filter(ge(Article.published, "2025-01-01T00:00:00Z")) {
      recent_count: count(uid)
    }
    all: ~Article.topic {
      p as Article.published
    }
    oldest: min(val(p))
    newest: max(val(p))
  }
}
This returns each topic with article statistics, including total count, recent count, and publication date ranges.

Step 3: Client integrations

Dgraph provides clients for multiple programming languages, including Python, Go, and JavaScript. You can use these clients to connect to your Dgraph instance and perform operations like queries, mutations, and transactions.

Setup and basic connection

import pydgraph
import json

# Create Dgraph client
def create_client():
    # Use the connection string for your Hypermode Graph from Day 21
    client = pydgraph.open("dgraph://<YOUR_HYPERMODE_GRAPH_CONNECTION_STRING>")
    return client

# Example agent integration class
class NewsGraphAgent:
    def __init__(self):
        self.client = create_client()

    def search_articles_by_topic(self, topic_name, limit=10):
        """Search articles using full-text search on topic names"""
        query = f"""
        {{
          topics(func: anyoftext(Topic.name, "{topic_name}")) {{
            Topic.name
            articles: ~Article.topic (first: {limit}) {{
              uid
              Article.title
              Article.abstract
              Article.published
              Article.uri
            }}
          }}
        }}
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            return json.loads(response.json)
        finally:
            txn.discard()

    def search_articles_by_content(self, search_terms, limit=10):
        """Full-text search across article titles and abstracts"""
        query = f"""
        {{
          articles(func: anyoftext(Article.title, "{search_terms}"), first: {limit}) {{
            uid
            Article.title
            Article.abstract
            Article.published
            Article.topic {{
              Topic.name
            }}
            Article.org {{
              Organization.name
            }}
          }}
        }}
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            return json.loads(response.json)
        finally:
            txn.discard()

    def get_recent_articles(self, days_back=30, limit=20):
        """Get articles published within the last N days"""
        from datetime import datetime, timedelta
        cutoff_date = (datetime.now() - timedelta(days=days_back)).isoformat() + "Z"

        query = f"""
        {{
          recent_articles(func: type(Article), orderdesc: Article.published, first: {limit})
            @filter(ge(Article.published, "{cutoff_date}")) {{
            uid
            Article.title
            Article.abstract
            Article.published
            Article.topic {{
              Topic.name
            }}
            Article.org {{
              Organization.name
            }}
            Article.geo {{
              Geo.name
              Geo.location
            }}
          }}
        }}
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            return json.loads(response.json)
        finally:
            txn.discard()

    def search_articles_near_location(self, latitude, longitude, radius_meters=50000, limit=10):
        """Find articles associated with locations near given coordinates"""
        query = f"""
        {{
          nearby_locations(func: near(Geo.location, [{longitude}, {latitude}], {radius_meters})) {{
            Geo.name
            Geo.location
            articles: ~Article.geo (first: {limit}) {{
              uid
              Article.title
              Article.abstract
              Article.published
              Article.topic {{
                Topic.name
              }}
            }}
          }}
        }}
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            return json.loads(response.json)
        finally:
            txn.discard()

    def vector_similarity_search(self, embedding_vector, limit=5):
        """Perform semantic search using article embeddings"""
        query = """
        query vector_search($embedding: string, $limit: int) {
          articles(func: similar_to(Article.embedding, $limit, $embedding)) {
            uid
            Article.title
            Article.abstract
            Article.published
            score
            Article.topic {
              Topic.name
            }
            Article.org {
              Organization.name
            }
          }
        }
        """

        variables = {
            "$embedding": json.dumps(embedding_vector),
            "$limit": str(limit)
        }

        txn = self.client.txn()
        try:
            response = txn.query(query, variables=variables)
            return json.loads(response.json)
        finally:
            txn.discard()

    def get_topic_statistics(self):
        """Get comprehensive statistics for each topic"""
        query = """
        {
          topic_statistics(func: type(Topic)) @filter(has(~Article.topic)) {
            Topic.name
            total_articles: count(~Article.topic)
            recent: ~Article.topic @filter(ge(Article.published, "2025-01-01T00:00:00Z")) {
              recent_articles: count(uid)
            }
            all: ~Article.topic {
              p as Article.published
            }
            oldest_article: min(val(p))
            newest_article: max(val(p))
          }
        }
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            return json.loads(response.json)
        finally:
            txn.discard()

    def complex_filtered_search(self, content_terms, topic_terms=None, since_date="2025-01-01", has_location=False):
        """Advanced search combining multiple filters and conditions"""
        location_filter = "AND has(Article.geo)" if has_location else ""
        # Topic names live on the related Topic nodes, so filter the nested
        # block; @cascade then drops articles with no matching topic
        topic_filter = f'@filter(anyoftext(Topic.name, "{topic_terms}"))' if topic_terms else ""

        query = f"""
        {{
          filtered_articles(func: type(Article)) @filter(
            anyoftext(Article.abstract, "{content_terms}") AND
            ge(Article.published, "{since_date}T00:00:00Z")
            {location_filter}
          ) @cascade {{
            uid
            Article.title
            Article.abstract
            Article.published
            Article.topic {topic_filter} {{
              Topic.name
            }}
            Article.geo {{
              Geo.name
              Geo.location
            }}
            Article.org {{
              Organization.name
            }}
          }}
        }}
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            return json.loads(response.json)
        finally:
            txn.discard()

    def analyze_publication_trends(self):
        """Analyze publication patterns over time using groupby"""
        query = """
        {
          publication_trends(func: type(Article)) @groupby(Article.published) {
            article_count: count(uid)
          }
        }
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            data = json.loads(response.json)
            return self._process_publication_trends(data)
        finally:
            txn.discard()

    def get_normalized_article_data(self, limit=10):
        """Get flattened article data using @normalize"""
        query = f"""
        {{
          articles(func: type(Article), first: {limit}) @normalize {{
            title: Article.title
            abstract: Article.abstract
            published: Article.published
            uri: Article.uri
            topics: Article.topic {{
              name: Topic.name
            }}
            organizations: Article.org {{
              name: Organization.name
            }}
            location: Article.geo {{
              name: Geo.name
            }}
          }}
        }}
        """

        txn = self.client.txn()
        try:
            response = txn.query(query)
            return json.loads(response.json)
        finally:
            txn.discard()

    def _process_publication_trends(self, data):
        """Collapse @groupby results into per-month publication buckets"""
        groups = data.get('publication_trends', [])
        rows = groups[0].get('@groupby', []) if groups else []

        counts = {}
        for row in rows:
            published = row.get('Article.published') or ''
            month = published[:7]  # truncate ISO timestamp to YYYY-MM
            counts[month] = counts.get(month, 0) + row.get('article_count', 0)

        processed_trends = [
            {'month': month, 'article_count': count}
            for month, count in sorted(counts.items())
        ]

        return {
            'trends': processed_trends,
            'total_months': len(processed_trends),
            'peak_month': max(processed_trends, key=lambda x: x['article_count']) if processed_trends else None
        }
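One caution on the pattern above: several agent methods inline user input into query strings with f-strings, which breaks (or is injectable) when the input contains quotes. A minimal hardening sketch — `dql_string` and `build_topic_query` are hypothetical helpers, with json.dumps used purely for its string escaping:

```python
import json

def dql_string(value):
    """Quote a user-supplied string for inlining into a DQL query;
    json.dumps escapes embedded quotes and backslashes."""
    return json.dumps(str(value))

def build_topic_query(topic_name, limit=10):
    # Hypothetical hardened version of the topic-search query builder
    return f"""
    {{
      topics(func: anyoftext(Topic.name, {dql_string(topic_name)})) {{
        Topic.name
        articles: ~Article.topic (first: {int(limit)}) {{
          Article.title
        }}
      }}
    }}
    """

q = build_topic_query('tech "quoted" term', limit=5)
print('\\"quoted\\"' in q)  # embedded quotes are escaped -> True
```

Where the client supports them (as in vector_similarity_search above), DQL query variables are the cleaner option; escaping is the fallback for values that must be inlined.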

What you’ve accomplished

In 30 minutes, you’ve mastered advanced graph querying:
  • DQL mastery: learned sophisticated query patterns for graph traversal and analysis
  • Ratel exploration: used visual tools to understand graph structure and optimize queries
  • Client integration: connected to Dgraph from agent code using the Python client (the JavaScript and Go clients follow the same query-and-transaction pattern)
  • Agent integration: connected graph reasoning capabilities to intelligent agents
  • Advanced patterns: built complex analysis workflows using graph-native operations

The power of graph querying

DQL enables reasoning that traditional databases can’t.
Traditional queries: “What articles mention OpenAI?”
Graph-powered queries: “What entities are connected to OpenAI through 2-3 degrees of separation, how has this network evolved over time, and what does the pattern suggest about competitive positioning?”
This completes your mastery of context engineering fundamentals.

Week 4 Complete

You’ve mastered context engineering - from prompts to sophisticated graph reasoning. Ready for production deployment in Week 5!

Pro tip for today

Build a comprehensive graph analysis workflow:
Create a complete analysis workflow that:
1. Takes a business question (e.g., "How is the AI industry competitive landscape evolving?")
2. Extracts relevant entities and relationships from the question
3. Designs appropriate DQL queries to explore the graph
4. Analyzes patterns across multiple dimensions (temporal, network, sentiment)
5. Synthesizes insights that answer the original business question
6. Explains the graph reasoning behind each conclusion

Show me both the technical implementation and the business insights it reveals.
This demonstrates the full power of graph-powered agent reasoning.
Time to complete: ~30 minutes
Skills learned: DQL mastery, Ratel exploration, multi-language client integration, graph-powered agent reasoning, advanced analysis workflows
Week 4 complete: context engineering mastery achieved!
Remember: Graph querying is about following the connections that reveal hidden insights. The most valuable discoveries often come from relationships that weren’t obvious until you traversed the graph.