Comparison functions (`eq`, `ge`, `gt`, `le`, `lt`) in the query root (`func:`)
can only be applied to indexed predicates. Comparison functions can be used in
`@filter` directives even on predicates that haven’t been indexed. Filtering on
non-indexed predicates can be slow for large datasets, as it requires iterating
over all of the possible values at the level where the filter is being used.
All other functions, in the query root or in a filter, can only be applied to
indexed predicates.
For functions on string valued predicates, if no language preference is given,
the function is applied to all languages and strings without a language tag. If
a language preference is given, the function is applied only to strings of the
given language.
Term matching
allofterms
Syntax Example: allofterms(predicate, "space-separated term list")
Schema Types: string
Index Required: term
Matches strings that have all specified terms in any order, case-insensitive.
Usage at root
Query Example: all nodes that have `name` containing terms `indiana` and
`jones`, returning the English name and genre in English.
Usage as filter
Query Example: all Steven Spielberg films that contain the words `indiana` and
`jones`. The `@filter(has(director.film))` removes nodes with name Steven
Spielberg that aren’t the director; the data also contains a character in a
film called Steven Spielberg.
anyofterms
Syntax Example: anyofterms(predicate, "space-separated term list")
Schema Types: string
Index Required: term
Matches strings that have any of the specified terms in any order (case
insensitive).
Usage at root
Query Example: All nodes that have a `name` containing either `poison` or
`peacock`. Many of the returned nodes are movies, but people like Joan Peacock
also meet the search terms because, without a cascade directive, the query
doesn’t require a genre.
Usage as filter
Query Example: All Steven Spielberg movies that contain `war` or `spies`. The
`@filter(has(director.film))` removes nodes with name Steven Spielberg that
aren’t the director; the data also contains a character in a film called
Steven Spielberg.
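The difference between the two term-matching functions can be sketched in a few lines of Python. This is only an illustration of the matching semantics described above, not Dgraph's implementation: the `terms` helper is a hypothetical stand-in for the term tokenizer, and the sample names are made up.

```python
import re

def terms(text: str) -> set[str]:
    """Split text into lowercase terms (a simplified, assumed tokenizer)."""
    return set(re.findall(r"\w+", text.lower()))

def allofterms(value: str, term_list: str) -> bool:
    """True when every term in term_list occurs in value, in any order."""
    return terms(term_list) <= terms(value)

def anyofterms(value: str, term_list: str) -> bool:
    """True when at least one term in term_list occurs in value."""
    return bool(terms(term_list) & terms(value))

# Hypothetical sample values for a name predicate.
names = ["Indiana Jones and the Temple of Doom", "Jones", "The Peacock Spring"]
print([n for n in names if allofterms(n, "jones indiana")])
print([n for n in names if anyofterms(n, "poison peacock")])
```

Both checks compare whole terms, which is why term matching does not find a term embedded inside a longer word.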
N-gram search
Syntax Example: ngram(predicate, "a string of text")
Schema Types: string
Index Required: ngram
The `ngram` index tokenizes a string into shingles (contiguous sequences of n
words), with support for stop word removal and stemming. The `ngram` function
matches strings that contain the given sequence of terms.
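To make the idea of shingles concrete, here is a minimal Python sketch that produces the contiguous n-word sequences of a string. It assumes simple whitespace tokenization and leaves out stop word removal and stemming; the function name and the choice of `n` are illustrative, not taken from Dgraph.

```python
def shingles(text: str, n: int) -> list[str]:
    """Return every contiguous sequence of n words in text, lowercased."""
    words = text.lower().split()
    # A text with fewer than n words yields no shingles.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

print(shingles("The quick brown fox", 2))
```

Because each shingle preserves word order, a lookup on shingles can tell a sequence of terms apart from the same terms appearing in a different order.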
Usage at root
Query Example: all nodes that have a `name` containing `quick`, `brown`, and
`fox`.
Regular expressions
Syntax Examples: regexp(predicate, /regular-expression/) or, for
case-insensitive matching, regexp(predicate, /regular-expression/i)
Schema Types: string
Index Required: trigram
Matches strings by regular expression. The regular expression language is that
of Go regular expressions.
Query Example: At root, match nodes with `Steven Sp` at the start of `name`,
followed by any characters. For each such matched UID, match the films
containing `ryan`. Note the difference with `allofterms`, which would match only
the term `ryan`, whereas regular expression search also matches within terms,
such as `bryan`.
Technical details
A trigram is a substring of three contiguous runes. For example, `Dgraph` has
trigrams `Dgr`, `gra`, `rap`, `aph`.
To ensure efficiency of regular expression matching, Dgraph uses
trigram indexing. Dgraph converts
the regular expression to a trigram query, uses the trigram index and trigram
query to find possible matches, and applies the full regular expression search
only to those candidates.
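The two-phase approach described above (narrow down with the trigram index, then run the full regular expression on the survivors) can be sketched as follows. This is a simplified illustration under stated assumptions: the required trigrams are supplied by hand instead of being derived from the pattern, Python's `re` module stands in for Go regular expressions, and all function names are hypothetical.

```python
import re
from collections import defaultdict

def trigrams(s: str) -> list[str]:
    """All substrings of three contiguous characters of s."""
    return [s[i:i + 3] for i in range(len(s) - 2)]

def build_index(values: list[str]) -> dict[str, set[int]]:
    """Map each trigram to the positions of the values that contain it."""
    index = defaultdict(set)
    for pos, value in enumerate(values):
        for tri in trigrams(value):
            index[tri].add(pos)
    return index

def regexp_search(values, index, pattern, required):
    """Phase 1: intersect index entries for the required trigrams.
    Phase 2: run the full regular expression only on those candidates."""
    candidates = set(range(len(values)))
    for tri in required:
        candidates &= index.get(tri, set())
    rx = re.compile(pattern)
    return [values[i] for i in sorted(candidates) if rx.search(values[i])]

print(trigrams("Dgraph"))
values = ["bryan adams", "ryan gosling", "brian may"]
index = build_index(values)
print(regexp_search(values, index, r"^ryan", ["rya", "yan"]))
```

In this sketch the index lookup discards `brian may` without ever running the regular expression, and the full expression then rejects `bryan adams`, which shares the trigrams but not the anchored match.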
Writing efficient regular expressions and limitations
Keep the following in mind when designing regular expression queries.
- At least one trigram must be matched by the regular expression (patterns shorter than 3 runes aren’t supported), since Dgraph requires regular expressions that can be converted to a trigram query.
- The number of alternative trigrams matched by the regular expression should be as small as possible (`[a-zA-Z][a-zA-Z][0-9]` isn’t a good idea). Many possible matches means the full regular expression is checked against many strings, whereas if the expression enforces more trigrams to match, Dgraph can make better use of the index and check the full regular expression against a smaller set of possible matches.
- Thus, the regular expression should be as precise as possible. Matching longer strings means more required trigrams, which helps to effectively use the index.
- If repeat specifications (`*`, `+`, `?`, `{n,m}`) are used, the entire regular expression must not match the empty string or any string: for example, `*` may be used like `[Aa]bcd*` but not like `(abcd)*` or `(abcd)|((defg)*)`.
- Repeat specifications after bracket expressions (e.g. `[fgh]{7}`, `[0-9]+` or `[a-z]{3,5}`) are often considered as matching any string because they match too many trigrams.
- If the partial result (for a subset of trigrams) exceeds 1000000 UIDs during index scan, the query is stopped to prohibit expensive queries.
Fuzzy matching
Syntax: match(predicate, string, distance)
Schema Types: string
Index Required: trigram
Matches predicate values by calculating the Levenshtein distance to
the string, also known as fuzzy matching. The distance parameter must be
greater than zero (0). Using a greater distance value can yield more, but less
accurate, results.
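For readers unfamiliar with Levenshtein distance, the following Python sketch computes it with the standard dynamic-programming recurrence and uses it as a filter. It compares whole strings and is meant only to show how the distance threshold behaves; how Dgraph tokenizes values and uses the trigram index for `match` is not covered here, and `fuzzy_match` is a hypothetical helper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]

def fuzzy_match(values, target, distance):
    """Keep the values whose distance to target is at most `distance`."""
    return [v for v in values if levenshtein(v, target) <= distance]

print(fuzzy_match(["Steven", "Stephan", "Christopher"], "Stephen", 2))
```

Raising the threshold admits strings that need more edits, which is why larger distance values return more results that resemble the target less.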
Query Example: At root, fuzzy match nodes similar to `Stephen`, with a distance
value of less than or equal to 8.
Vector Similarity Search
Syntax Example: similar_to(predicate, 3, "[0.9, 0.8, 0, 0]")
Alternatively, the vector can be passed as a variable:
similar_to(predicate, 3, $vec)
This function finds the nodes whose `predicate` value is close to the provided
vector. The search is based on the distance metric specified in the index
(`cosine`, `euclidean`, or `dotproduct`). A shorter distance indicates greater
similarity. The second parameter, `3`, specifies that the top 3 matches be
returned.
Schema Types: float32vector
Index Required: hnsw
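The ranking behavior of a top-k similarity search can be illustrated with an exact, brute-force Python sketch. Note the swap: an `hnsw` index performs approximate nearest-neighbor search, while this example scans every vector, so it shows only what "the k closest vectors under a distance metric" means. The node UIDs and vectors are made up, and only the Euclidean and cosine metrics are shown.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 minus the cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def similar_to(nodes, k, query, metric=euclidean):
    """Return the UIDs of the k vectors closest to query (exact scan)."""
    ranked = sorted(nodes, key=lambda uid: metric(nodes[uid], query))
    return ranked[:k]

# Hypothetical UID-to-vector data.
nodes = {
    "0x1": [0.9, 0.8, 0.0, 0.0],
    "0x2": [0.0, 0.0, 1.0, 1.0],
    "0x3": [1.0, 1.0, 0.0, 0.0],
    "0x4": [0.5, 0.5, 0.5, 0.5],
}
print(similar_to(nodes, 3, [0.9, 0.8, 0.0, 0.0]))
```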
Full-Text Search
Syntax Examples: alloftext(predicate, "space-separated text") and
anyoftext(predicate, "space-separated text")
Schema Types: string
Index Required: fulltext
Apply full-text search with stemming and stop words to find strings matching all
or any of the given text.
The following steps are applied during index generation and to process full-text
search arguments:
- Tokenization (according to Unicode word boundaries).
- Conversion to lowercase.
- Unicode-normalization (to Normalization Form KC).
- Stemming using language-specific stemmer (if supported by language).
- Stop words removal (if supported by language).
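The steps above can be sketched as a small Python pipeline. Everything language-specific here is a loudly labeled assumption: `toy_stem` is a crude suffix stripper standing in for a real language-specific stemmer, `STOP_WORDS` is a tiny made-up list, and the regular expression only approximates Unicode word-boundary tokenization.

```python
import re
import unicodedata

STOP_WORDS = {"the", "which", "a", "an", "and"}  # toy list, not Dgraph's

def toy_stem(token: str) -> str:
    """Crude stand-in for a stemmer: strip a trailing 'ing' or 's'."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def fulltext_tokens(text: str) -> list[str]:
    tokens = re.findall(r"\w+", text)                            # tokenization
    tokens = [t.lower() for t in tokens]                         # lowercase
    tokens = [unicodedata.normalize("NFKC", t) for t in tokens]  # normalization
    tokens = [toy_stem(t) for t in tokens]                       # stemming
    return [t for t in tokens if t not in STOP_WORDS]            # stop words

def alloftext(value: str, query: str) -> bool:
    """True when every processed query token appears in the processed value."""
    return set(fulltext_tokens(query)) <= set(fulltext_tokens(value))

print(fulltext_tokens("The dog which barks"))
print(alloftext("Dogs barking at night", "the dog which barks"))
```

Because the same processing is applied to both the stored string and the search text, differently inflected forms of a word end up as the same token.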
Language | Country Code | Stemming | Stop words |
---|---|---|---|
Arabic | ar | ✓ | ✓ |
Armenian | hy | ✓ | |
Basque | eu | ✓ | |
Bulgarian | bg | ✓ | |
Catalan | ca | ✓ | |
Chinese | zh | ✓ | ✓ |
Czech | cs | ✓ | |
Danish | da | ✓ | ✓ |
Dutch | nl | ✓ | ✓ |
English | en | ✓ | ✓ |
Finnish | fi | ✓ | ✓ |
French | fr | ✓ | ✓ |
Gaelic | ga | ✓ | |
Galician | gl | ✓ | |
German | de | ✓ | ✓ |
Greek | el | ✓ | |
Hindi | hi | ✓ | ✓ |
Hungarian | hu | ✓ | ✓ |
Indonesian | id | ✓ | |
Italian | it | ✓ | ✓ |
Japanese | ja | ✓ | ✓ |
Korean | ko | ✓ | ✓ |
Norwegian | no | ✓ | ✓ |
Persian | fa | ✓ | |
Portuguese | pt | ✓ | ✓ |
Romanian | ro | ✓ | ✓ |
Russian | ru | ✓ | ✓ |
Spanish | es | ✓ | ✓ |
Swedish | sv | ✓ | ✓ |
Turkish | tr | ✓ | ✓ |
For example, stemming lets a search match `dog`, `dogs`, `bark`, `barks`,
`barking`, etc. Stop word removal eliminates `the` and `which`.
Inequality
equal to
Syntax Examples:
- `eq(predicate, value)`
- `eq(val(varName), value)`
- `eq(predicate, val(varName))`
- `eq(count(predicate), value)`
- `eq(predicate, [val1, val2, ..., valN])`
- `eq(predicate, [$var1, "value", ..., $varN])`
Schema Types: `int`, `float`, `bool`, `string`, `dateTime`
Index Required: An index is required for the `eq(predicate, ...)` forms (see
table below) when used at query root. For `count(predicate)` at the query root,
the `@count` index is required. For variables, the values have been calculated
as part of the query, so no index is required.
Type | Index Options |
---|---|
int | int |
float | float |
bool | bool |
string | exact, hash, term, fulltext |
dateTime | dateTime |
The boolean constants are `true` and `false`, so with `eq` this becomes, for
example, `eq(boolPred, true)`.
Query Example: Movies with exactly thirteen genres.
less than, less than or equal to, greater than and greater than or equal to
Syntax Examples, for inequality `IE`:
- `IE(predicate, value)`
- `IE(val(varName), value)`
- `IE(predicate, val(varName))`
- `IE(count(predicate), value)`
With `IE` replaced by:
- `le`: less than or equal to
- `lt`: less than
- `ge`: greater than or equal to
- `gt`: greater than
Schema Types: `int`, `float`, `string`, `dateTime`
Index Required: An index is required for the `IE(predicate, ...)` forms (see
table below) when used at query root. For `count(predicate)` at the query root,
the `@count` index is required. For variables, the values have been calculated
as part of the query, so no index is required.
Type | Index Options |
---|---|
int | int |
float | float |
string | exact |
dateTime | dateTime |
`Steven` in `name` and have directed more than `100` actors.
`initial_release_date` greater than that of the movie Minority Report.
between
Syntax Example: between(predicate, startDateValue, endDateValue)
Schema Types: Scalar types, including `dateTime`, `int`, `float` and `string`
Index Required: `dateTime`, `int`, `float`, and `exact` on strings
Returns nodes that match an inclusive range of indexed values. The `between`
keyword performs a range check on the index to improve query efficiency, helping
to prevent a wide-ranging query on a large set of data from running slowly.
A common use case for the `between` keyword is to search within a dataset
indexed by `dateTime`. The following example query demonstrates this use case.
Query Example: Movies initially released in 1977, listed by genre.
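To illustrate why a range check on a sorted index is cheaper than scanning every value, here is a minimal Python sketch of an inclusive range lookup using binary search. The parallel `values` and `uids` lists are a hypothetical stand-in for an index, and the dates and UIDs are made up; this is not how Dgraph stores its indexes.

```python
from bisect import bisect_left, bisect_right

def between(values, uids, start, end):
    """Return UIDs whose indexed value v satisfies start <= v <= end.

    values must be sorted ascending; uids[i] is the node holding values[i].
    """
    lo = bisect_left(values, start)   # first position with value >= start
    hi = bisect_right(values, end)    # first position with value > end
    return uids[lo:hi]

# Hypothetical release dates as ISO 8601 strings, which sort chronologically.
dates = ["1976-11-21", "1977-01-01", "1977-05-25", "1977-12-31", "1978-06-16"]
nodes = ["0x1", "0x2", "0x3", "0x4", "0x5"]
print(between(dates, nodes, "1977-01-01", "1977-12-31"))
```

Both endpoints are included in the result, matching the inclusive behavior described above.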
UID
Syntax Examples:
- `q(func: uid(<uid>))`
- `predicate @filter(uid(<uid1>, ..., <uidn>))`
- `predicate @filter(uid(a))` for variable `a`
- `q(func: uid(a,b))` for variables `a` and `b`
- `q(func: uid($uids))` for multiple UIDs in DQL variables. You have to set the value of this variable as a string (e.g. `"[0x1, 0x2, 0x3]"`) in `queryWithVars`.
For variable `a`, `uid(a)` represents the set of UIDs stored in `a`. For
value variable `b`, `uid(b)` represents the UIDs from the UID to value map. With
two or more variables, `uid(a,b,...)` represents the union of all the variables.
`uid(<uid>)`, like an identity function, will return the requested UID even if
the node does not have any edges.
If the UID of a node is known, values for the node can be read directly.
uid_in
Syntax Examples:
- `q(func: ...) @filter(uid_in(predicate, <uid>))`
- `predicate1 @filter(uid_in(predicate2, <uid>))`
- `predicate1 @filter(uid_in(predicate2, [<uid1>, ..., <uidn>]))`
- `predicate1 @filter(uid_in(predicate2, uid(myVariable)))`
While the `uid` function filters nodes at the current level based on UID, the
`uid_in` function allows looking ahead along an edge to check that it leads to a
particular UID. This can often save an extra query block and avoids returning
the edge.
`uid_in` cannot be used at root. It accepts multiple UIDs as its argument, and
it accepts a UID variable (which can contain a map of UIDs).
Query Example: The collaborations of Marc Caro and Jean-Pierre Jeunet (UID
0x99706). If the UID of Jean-Pierre Jeunet is known, querying this way removes
the need to have a block extracting his UID into a variable and the extra edge
traversal and filter for `~director.film`.
type
Query Example: all nodes of type “Animal”.
`type(Animal)` is equivalent to `eq(dgraph.type, "Animal")`.
`type()` can also be used as a filter.
has
Syntax Example: has(predicate)
Schema Types: all
Determines if a node has a particular predicate.
Query Example: First five directors and all their movies that have a release
date recorded. Directors have directed at least one film, which is equivalent
in semantics to `gt(count(director.film), 0)`.
Geolocation
As of now we only support indexing Point, Polygon and MultiPolygon geometry
types. However, Dgraph can
store other types of geolocation data.
Mutations
To make use of the geo functions, you need a geo index on your predicate. You
can add a `Point`, or associate a `Polygon` with a node. Adding a
`MultiPolygon` is similar.
Query
near
Syntax Example: near(predicate, [long, lat], distance)
Schema Types: geo
Index Required: geo
Matches all entities where the location given by `predicate` is within
`distance` meters of GeoJSON coordinate `[long, lat]`.
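The "within `distance` meters" test can be illustrated with a great-circle distance check in Python. This sketch uses the haversine formula on a spherical Earth, which is an assumption chosen for simplicity and not necessarily how Dgraph computes distances; the UIDs and coordinates are illustrative. Coordinates follow the `[long, lat]` order used above.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed sphere)

def haversine_m(a, b):
    """Great-circle distance in meters between two [long, lat] points."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def near(locations, center, distance):
    """UIDs whose point lies within `distance` meters of center."""
    return [uid for uid, point in locations.items()
            if haversine_m(point, center) <= distance]

# Hypothetical UID-to-point data.
locations = {
    "0x1": [-122.469829, 37.771935],
    "0x2": [-122.400000, 37.800000],
}
print(near(locations, [-122.469829, 37.771935], 1000))
```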
Query Example: Tourist destinations within 1000 meters (1 kilometer) of a point
in Golden Gate Park in San Francisco.
within
Syntax Example: within(predicate, [[[long1, lat1], ..., [longN, latN]]])
Schema Types: geo
Index Required: geo
Matches all entities where the location given by `predicate` lies within the
polygon specified by the GeoJSON coordinate array.
Query Example: Tourist destinations within the specified area of Golden Gate
Park, San Francisco.
contains
Syntax Examples: contains(predicate, [long, lat]) or
contains(predicate, [[long1, lat1], ..., [longN, latN]])
Schema Types: geo
Index Required: geo
Matches all entities where the polygon describing the location given by
`predicate` contains GeoJSON coordinate `[long, lat]` or the given GeoJSON
polygon.
Query Example: All entities that contain a point in the flamingo enclosure of
San Francisco Zoo.
intersects
Syntax Example: intersects(predicate, [[[long1, lat1], ..., [longN, latN]]])
Schema Types: geo
Index Required: geo
Matches all entities where the polygon describing the location given by
`predicate` intersects the given GeoJSON polygon.