In every app’s development lifecycle, there comes a point where the underlying
schema no longer fits the requirements and must be changed. That requires a
migration of both the schema and the underlying data. This article guides you
through common migration scenarios you can encounter with Dgraph and helps you
avoid the pitfalls around them.
These are the most common scenarios that can occur:
- Renaming a type
- Renaming a field
- Changing a field’s type
- Adding `@id` to an existing field
Avoid migration as long as you can, because there can be scenarios where you
also need to update downstream clients, which can be hard. So, it’s always best
to try things out first, and push them to production only once you are
confident enough.
Renaming a type
Let’s say you had the following schema:
```graphql
type User {
  id: ID!
  name: String
}
```
and you had your app working fine with it. Now, you feel that the name `AppUser`
would be more sensible than `User`, because `User` seems a bit generic to you.
Then you are in a situation where you need a migration.
This can be handled in a couple of ways:
- Migrate all the data for type `User` to use the new name `AppUser`. OR,
- Just use the `@dgraph(type: ...)` directive to maintain backward compatibility with the existing data.
Depending on your use case, you might find option #1 or option #2 better for
you. For example, if you have accumulated very little data for the `User` type
so far, you might want to go with option #1. But if you have an active app with
a very large dataset, then updating the node of each user may not be something
you want to commit to, as that can require some maintenance downtime. So,
option #2 could be a better choice in such conditions.
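If you do go with option #1, the data migration can be done with a DQL upsert
block run against Dgraph’s DQL endpoint. The following is a minimal sketch,
assuming the default GraphQL-to-DQL predicate mapping (`User.name`); on a large
dataset you’d likely want to batch it:

```dql
upsert {
  query {
    # find every node currently typed as User, capturing its name
    users as var(func: type(User)) {
      names as User.name
    }
  }
  mutation {
    set {
      # retype the nodes and copy the data to the new predicate
      uid(users) <dgraph.type> "AppUser" .
      uid(users) <AppUser.name> val(names) .
    }
    delete {
      # drop the old type label and old predicate values
      uid(users) <dgraph.type> "User" .
      uid(users) <User.name> * .
    }
  }
}
```

After the upsert completes, you can apply the new GraphQL schema.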
Option #2 makes your new schema compatible with your existing data. Here’s an
example:
```graphql
type AppUser @dgraph(type: "User") {
  id: ID!
  name: String
}
```
So, no downtime required. Migration is done by just updating your schema. Fast,
easy, and simple.
Note that, irrespective of which option you choose for the migration on the
Dgraph side, you still need to migrate your GraphQL clients to use the new name
in queries and mutations. For example, the query `getUser` would now be renamed
to `getAppUser`, so your downstream clients need to update that bit in the code.
Renaming a field
Just like renaming a type, let’s say you had the following working schema:
```graphql
type User {
  id: ID!
  name: String
  phone: String
}
```
and now you figure that it would be better to rename `phone` to `tel`. You need
a migration.
You have the same two choices as before:
- Migrate all the data for the field `phone` to use the new name `tel`. OR,
- Just use the `@dgraph(pred: ...)` directive to maintain backward compatibility with the existing data.
Here’s an example if you want to go with option #2:
```graphql
type User {
  id: ID!
  name: String
  tel: String @dgraph(pred: "User.phone")
}
```
Again, note that, irrespective of which option you choose for the migration on
the Dgraph side, you still need to migrate your GraphQL clients to use the new
name in queries and mutations. For example, the following query:
```graphql
query {
  getUser(id: "0x05") {
    name
    phone
  }
}
```
would now have to be changed to:
```graphql
query {
  getUser(id: "0x05") {
    name
    tel
  }
}
```
So, your downstream clients need to update that bit in the code.
Changing a field’s type
There can be multiple scenarios in this category:
- List -> single item
- `String` -> `Int`
- Any other combination you can imagine
It is strongly advisable that you figure out a solid schema before going to
production, so that you don’t have to deal with such cases later. Nevertheless,
if you end up in such a situation, you have to migrate your data to fit the new
schema. There is no easy way around it.
As an example scenario, say you initially had this schema:
```graphql
type Todo {
  id: ID!
  task: String
  owner: Owner
}

type Owner {
  name: String! @id
  todo: [Todo] @hasInverse(field: "owner")
}
```
and later you decide that you want an owner to have only one todo at a time.
So, you want to make your schema look like this:
```graphql
type Todo {
  id: ID!
  task: String
  owner: Owner
}

type Owner {
  name: String! @id
  todo: Todo @hasInverse(field: "owner")
}
```
If you try updating your schema, you may end up getting an error like this:
```txt
resolving updateGQLSchema failed because succeeded in saving GraphQL schema but failed to
alter Dgraph schema - GraphQL layer may exhibit unexpected behavior, reapplying the old
GraphQL schema may prevent any issues: Schema change not allowed from [uid] => uid without
deleting pred: owner.todo
```
This is a red flag. As the error message says, you should revert to the old
schema so that your clients keep working correctly. In such cases, you should
migrate your data to fit the new schema before applying the new schema. The
steps for such a data migration vary from case to case and can’t all be listed
here; what you need to keep in mind while making such changes is that you must
migrate your data first.
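For the list -> single-item example above, one way to unblock the schema change
is to drop the old predicate entirely before applying the new schema, by
POSTing a JSON payload like the following to Dgraph’s `/alter` HTTP endpoint.
Note that this deletes all data stored under that predicate, so export anything
you need to keep first, and use the exact predicate name reported in the error
message:

```json
{ "drop_attr": "Owner.todo" }
```

Once the predicate is gone, applying the new GraphQL schema should succeed, and
you can re-import the (reshaped) data.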
Adding `@id` to an existing field
Let’s say you had the following schema:
```graphql
type User {
  id: ID!
  username: String
}
```
and now you think that `username` must be unique for every user. So, you change
the schema to this:
```graphql
type User {
  id: ID!
  username: String! @id
}
```
Now, here’s the catch: with the old schema, it was possible that multiple users
existed with the username `Alice`. If that was the case, queries would break
after the schema change. For example, if you run this query:
```graphql
query {
  getUser(username: "Alice") {
    id
  }
}
```
then it might error out saying:
```txt
A list was returned, but GraphQL was expecting just one item. This indicates an internal
error - probably a mismatch between the GraphQL and Dgraph/remote schemas. The value was
resolved as null (which may trigger GraphQL error propagation) and as much other data as
possible returned.
```
So, while making such a schema change, you need to make sure that the underlying
data really honors the uniqueness constraint on the `username` field. If it
doesn’t, you need to do a data migration to enforce that constraint first.
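One way to check for violations before applying the change is a DQL query that
groups users by username; any group with a count greater than 1 is a duplicate
that must be cleaned up first. A sketch, assuming the default `User.username`
predicate mapping:

```dql
{
  usernames(func: type(User)) @groupby(User.username) {
    count(uid)
  }
}
```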
Unused fields
For example, let’s assume that you have deployed the following schema:
```graphql
type TestDataMigration {
  id: ID!
  someInfo: String!
  someOtherInfo: String
}
```
Then you create a `TestDataMigration` object with a `someOtherInfo` value.
Then you update the schema and remove the field:
```graphql
type TestDataMigration {
  id: ID!
  someInfo: String!
}
```
The data you previously created is still in the graph database! Moreover, if
you delete the `TestDataMigration` object using its `id`, the GraphQL API
delete operation is successful.
If you followed the GraphQL - DQL schema mapping, you understand that Dgraph
has used the known list of predicates (`id`, `someInfo`) and removed them. In
fact, Dgraph also removed the `dgraph.type` predicate, so this
`TestDataMigration` node isn’t visible anymore to the GraphQL API.
The point is that a node with this `uid` still exists and has a
`someOtherInfo` predicate. This data was created initially, and nothing in the
process of deploying a new schema version and then using a delete operation by
ID instructed Dgraph to delete this predicate.
You end up with a node without a type (i.e., without a `dgraph.type` predicate)
and with an old predicate value that is ‘invisible’ to your GraphQL API!
When doing a GraphQL schema deployment, you must take care of data cleaning and
data migration. The good news is that DQL offers you the tools to identify
(search) potential issues and to correct the data (mutations).
In the previous case, you can alter the database and completely delete the
predicate, or you can write an ‘upsert’ DQL query that searches for the nodes
of interest and deletes the unused predicate for those nodes.
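For the example above, such an upsert could look like the following sketch: it
finds every node still holding the removed predicate and deletes it (assuming
the default `TestDataMigration.someOtherInfo` predicate mapping):

```dql
upsert {
  query {
    # nodes that still carry the removed predicate
    nodes as var(func: has(TestDataMigration.someOtherInfo))
  }
  mutation {
    delete {
      uid(nodes) <TestDataMigration.someOtherInfo> * .
    }
  }
}
```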
New non-nullable field
Another obvious example appears if you deploy a new version containing a new
non-nullable field for an existing type. The existing nodes of that type in
the graph don’t have this predicate, so a GraphQL query reaching those nodes
returns a list of errors. You can easily write an ‘upsert’ DQL mutation to find
all nodes of this type that don’t have the new predicate and update them with a
default value.
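As a sketch, suppose you added a non-nullable `status: String!` field to the
`Todo` type from earlier (the field name and the `"PENDING"` default used here
are hypothetical). An upsert can backfill a default on every node missing the
predicate:

```dql
upsert {
  query {
    # Todo nodes that don't have the new predicate yet
    missing as var(func: type(Todo)) @filter(NOT has(Todo.status))
  }
  mutation {
    set {
      uid(missing) <Todo.status> "PENDING" .
    }
  }
}
```

Run this before (or immediately after) deploying the new schema so that GraphQL
queries never hit a node with a missing non-nullable field.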