Build a typed GraphQL API on Django with Strawberry. Design a clean schema, batch nested resolvers with DataLoaders to eliminate N+1, paginate with Relay connections, and secure against query-depth abuse.
GraphQL hands clients exactly the data they ask for — and hands your database an N+1 problem by default. Strawberry is the modern, type-hint-native GraphQL library for Python. This tutorial builds a production schema on Django, solves the query-explosion problem properly with DataLoaders, models mutations and pagination the way real clients expect, and locks the single endpoint down against the abuse vectors GraphQL uniquely exposes.
GraphQL is not automatically better than REST — it is a different set of trade-offs. It shines when many different clients need different shapes of the same data (a web app, a mobile app, and partners all hitting one graph), when you want to avoid the over-fetching and under-fetching that forces REST into endpoint sprawl, and when a strongly typed, introspectable schema is itself valuable. It costs you caching simplicity (no more leaning on HTTP caching by URL), a new performance failure mode (N+1 by construction), and a bigger security surface (one endpoint accepting arbitrary query shapes). Choose it deliberately, for those upsides — not because it is fashionable.
Strawberry uses Python's native type hints and dataclasses, so your schema is just typed code — there is no parallel DSL to keep in sync with your models. It has first-class async support, a clean Django integration via strawberry-django, and code-first ergonomics that make refactors safe because your editor and type checker understand the schema. Graphene still works and has a large install base, but for new projects Strawberry is the better default: less boilerplate, real types, and an actively maintained ecosystem.
pip install "strawberry-graphql[django]" strawberry-graphql-django
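# urls.py
from django.urls import path
from strawberry.django.views import GraphQLView

from .schema import schema

urlpatterns = [
    # Async resolvers (such as the DataLoader-backed ones later) need
    # AsyncGraphQLView from the same module instead.
    path("graphql/", GraphQLView.as_view(schema=schema)),
]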
Map Django models to types with strawberry_django.type. The auto sentinel pulls each field's type from the model, so the schema stays in lockstep with migrations and you don't restate types by hand:
import strawberry
import strawberry_django
from strawberry import auto

from . import models  # assumes models.py lives in the same app

@strawberry_django.type(models.Author)
class Author:
    id: auto
    name: auto
    books: list["Book"]

@strawberry_django.type(models.Book)
class Book:
    id: auto
    title: auto
    published: auto
    author: Author

@strawberry.type
class Query:
    books: list[Book] = strawberry_django.field()
    authors: list[Author] = strawberry_django.field()

schema = strawberry.Schema(query=Query)
Schema design is API design, and a published GraphQL schema is effectively forever — clients couple to your field names and nullability. A few rules that age well: name by domain concept rather than table; prefer non-null only where the data is genuinely always present, because nullability is hard to tighten later without a breaking change; expose enums as real GraphQL enums so clients get validation for free; and return connections, not bare lists, for anything that can grow, so you never have to retrofit pagination onto a field clients already treat as a full list.
When a field needs logic, write a resolver. The info argument carries request context — the current user, the request object, and (critically) your per-request DataLoaders:
@strawberry.type
class Query:
    @strawberry.field
    def my_books(self, info: strawberry.Info) -> list[Book]:
        user = info.context.request.user
        if not user.is_authenticated:
            raise PermissionError("Login required.")
        return list(models.Book.objects.filter(author__user=user))
Keep resolvers thin: authorize, then delegate to a model method or service function. Business logic embedded in resolvers is as hard to reuse and test as business logic embedded in views.
A query for 50 books, each asking for its author, runs one query for the books and then fifty more, one per author resolver. Nest a level deeper (each author's publisher) and that is another fifty; nest through a list relation (each author's books, then each book's reviews) and the count multiplies into hundreds or thousands. GraphQL's flexibility is exactly what makes this so easy to trigger: the client controls the shape, so you cannot hand-tune one queryset per endpoint the way you would in REST. The naive resolver looks innocent and is a latent outage:
# One query PER book: the classic N+1
@strawberry.field
def author(self, info) -> Author:
    return models.Author.objects.get(id=self.author_id)
You cannot solve this by adding select_related to one queryset, because there is no single queryset — the engine resolves fields lazily as it walks the client's tree. The answer is batching.
A DataLoader collects every key requested during one tick of the event loop and resolves them in a single batched query. It also caches within the request, so asking for the same author twice hits the database once. This is the single most important pattern in production GraphQL.
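To make the batching mechanics concrete, here is a toy model in plain asyncio. It is only a sketch of the idea, not Strawberry's implementation: `TinyLoader`, `fake_batch`, and `demo` are hypothetical names, and the "database" is a function that records how it was called.

```python
import asyncio


class TinyLoader:
    """Toy model of a DataLoader: one batch per event-loop tick, cached per instance."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.cache = {}    # key -> Future; lives as long as this loader (one request)
        self.pending = []  # keys collected since the last dispatch
        self._task = None  # keeps a reference to the in-flight dispatch task

    def load(self, key):
        if key in self.cache:  # repeated key: reuse the same future
            return self.cache[key]
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self.cache[key] = future
        if not self.pending:  # first key of this tick schedules the dispatch
            self._task = loop.create_task(self._dispatch())
        self.pending.append(key)
        return future

    async def _dispatch(self):
        keys, self.pending = self.pending, []
        values = await self.batch_fn(keys)  # ONE call for every key in the tick
        for key, value in zip(keys, values):
            self.cache[key].set_result(value)


batch_calls = []  # records each batch so the effect is visible


async def fake_batch(keys):
    batch_calls.append(list(keys))
    return [f"author-{k}" for k in keys]


async def demo():
    loader = TinyLoader(fake_batch)
    # Three loads issued in the same tick: two distinct keys, one repeat.
    return await asyncio.gather(loader.load(1), loader.load(2), loader.load(1))
```

Running `asyncio.run(demo())` resolves all three loads while `fake_batch` is called once with the two distinct keys. The dispatch task only starts after the current synchronous code yields, which is why every key requested in the same tick lands in the same batch.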
from strawberry.dataloader import DataLoader

async def load_authors(keys: list[int]) -> list[models.Author | None]:
    # One query for the whole batch, indexed by primary key.
    rows = {a.id: a async for a in
            models.Author.objects.filter(id__in=keys)}
    return [rows.get(k) for k in keys]  # MUST match input order & length
The loader's return list must match the order and length of the input keys — that contract is how the batched results map back to each individual caller. A key with no row returns None in its slot. Loaders cache, so they cannot be global; build them fresh per request and stash them on context:
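from strawberry.django.context import StrawberryDjangoContext

class Context(StrawberryDjangoContext):
    def __init__(self, request):
        super().__init__(request=request, response=None)
        # A fresh loader (and therefore a fresh cache) for every request.
        self.author_loader = DataLoader(load_function=load_authors)

# Return Context(request) from your view's get_context() so resolvers see it.

# the resolver becomes a one-liner that batches automatically:
@strawberry.field
async def author(self, info) -> Author:
    return await info.context.author_loader.load(self.author_id)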
Now 50 books trigger one batched author query no matter how deeply the client nests, and a thousand books trigger one query of a thousand IDs. You will write a loader for essentially every foreign key in your graph — treat that as the cost of doing GraphQL correctly, not as optional polish. For reverse relations (an author's books) the same pattern applies, grouping rows by the parent key.
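For the reverse direction, the batch function returns one list per parent key. The sketch below uses plain Python and synchronous functions so the grouping step stands out; a real loader function would be async and would replace `fetch_books_for_authors` with a single ORM query. `BOOK_ROWS` and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows standing in for Book records: (book_id, author_id, title).
BOOK_ROWS = [
    (1, 10, "Dune"),
    (2, 11, "Emma"),
    (3, 10, "Children of Dune"),
]


def fetch_books_for_authors(author_ids):
    """Stand-in for ONE query such as Book.objects.filter(author_id__in=author_ids)."""
    wanted = set(author_ids)
    return [row for row in BOOK_ROWS if row[1] in wanted]


def load_books_by_author(author_ids):
    """Batch function for a reverse relation: one list of books per requested author."""
    grouped = defaultdict(list)
    for _book_id, author_id, title in fetch_books_for_authors(author_ids):
        grouped[author_id].append(title)
    # Same order and length as the input keys; authors without books get [].
    return [grouped.get(author_id, []) for author_id in author_ids]
```

An author with no books gets an empty list in its slot rather than being skipped, which keeps the order-and-length contract intact.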
Mutations are resolvers under a Mutation type. Use input types for arguments and return the affected object so clients can update their local cache without a refetch:
@strawberry.input
class CreateBookInput:
    title: str
    author_id: int

@strawberry.type
class Mutation:
    @strawberry.mutation
    def create_book(self, info, data: CreateBookInput) -> Book:
        user = info.context.request.user
        author = models.Author.objects.get(id=data.author_id)
        if author.user_id != user.id:
            raise PermissionError("Not your author.")
        return models.Book.objects.create(
            title=data.title, author=author)

schema = strawberry.Schema(query=Query, mutation=Mutation)
For predictable error handling, return a union of a success type and one or more error types instead of throwing. Clients branch on __typename — "did I get a Book or a ValidationError?" — which is far more robust than parsing error strings out of the top-level errors array. This is the GraphQL idiom for expected, recoverable failures; reserve thrown errors for the genuinely exceptional.
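The shape of that idiom can be sketched with plain dataclasses, leaving out the framework decorators. Everything here is illustrative: `ValidationError`, its fields, and `describe` are hypothetical, and the class name stands in for what a client would read from __typename.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Book:
    title: str


@dataclass
class ValidationError:
    field: str
    message: str


def create_book(title: str) -> Book | ValidationError:
    """Return the error as data instead of raising for an expected failure."""
    if not title.strip():
        return ValidationError(field="title", message="Title must not be empty.")
    return Book(title=title.strip())


def describe(result: Book | ValidationError) -> str:
    """What a client does with __typename, approximated with the class name."""
    typename = type(result).__name__
    if typename == "ValidationError":
        return f"{result.field}: {result.message}"
    return f"created {result.title}"
```

The caller never parses a message string to decide what happened; it branches on the type and then reads structured fields.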
Cursor pagination is the GraphQL norm, and Strawberry's relay module gives you Connection types with edges, nodes, pageInfo, and opaque cursors out of the box:
@strawberry.type
class Query:
    books: strawberry.relay.ListConnection[Book] = (
        strawberry_django.connection())
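Keyset-based cursors are constant time at any depth and stable under inserts, unlike offset pagination, where a new row shifts every subsequent page and users see duplicates. Check what your connection class actually encodes, though: Strawberry's stock ListConnection wraps a position in the list inside its opaque cursor, so it gives clients the Relay shape without keyset behavior on its own; for large or fast-changing tables, back the connection with a keyset query. The shape (edges { node { ... } cursor } plus pageInfo { hasNextPage endCursor }) is what Apollo, Relay, and urql expect, so client tooling paginates your API with little or no custom code. Always enforce a maximum first/last so nobody requests ten million edges in one call.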
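A keyset cursor with a hard page cap can be sketched without any framework. This is an illustration of the technique, not how Strawberry's connection classes are implemented; `MAX_PAGE_SIZE`, the `id:` cursor format, and the function names are assumptions, and the list comprehension stands in for a `WHERE id > ... ORDER BY id LIMIT ...` query.

```python
import base64

MAX_PAGE_SIZE = 100  # hypothetical hard cap on `first`


def encode_cursor(last_id: int) -> str:
    return base64.urlsafe_b64encode(f"id:{last_id}".encode()).decode()


def decode_cursor(cursor: str) -> int:
    prefix, _, value = base64.urlsafe_b64decode(cursor.encode()).decode().partition(":")
    if prefix != "id":
        raise ValueError("malformed cursor")
    return int(value)


def paginate(rows, first, after=None):
    """rows: dicts sorted by ascending 'id'. Returns a Relay-shaped page."""
    first = max(0, min(first, MAX_PAGE_SIZE))  # clamp whatever the client asked for
    after_id = decode_cursor(after) if after else None
    # Keyset step: equivalent to WHERE id > after_id ORDER BY id LIMIT first + 1.
    candidates = [r for r in rows if after_id is None or r["id"] > after_id]
    window = candidates[: first + 1]  # one extra row tells us if a next page exists
    page = window[:first]
    edges = [{"node": r, "cursor": encode_cursor(r["id"])} for r in page]
    return {
        "edges": edges,
        "pageInfo": {
            "hasNextPage": len(window) > first,
            "endCursor": edges[-1]["cursor"] if edges else None,
        },
    }
```

Because the cursor names the last row seen rather than a position, a row inserted earlier in the ordering does not shift the next page.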
GraphQL subscriptions push data over WebSockets — live order status, notifications, collaborative editing. Strawberry supports them via an async generator resolver, typically backed by Django Channels and a Redis pub/sub layer. They are powerful but operationally heavier than queries (you now run a stateful WebSocket layer), so add them when you have a real-time requirement, not speculatively.
REST spreads its attack surface across many URLs, each with its own narrow contract. GraphQL concentrates it into one endpoint that accepts arbitrary shapes — which means a single crafted query can nest thirty levels deep, request millions of rows, or map your entire schema. These protections are mandatory, not optional:
from strawberry.extensions import QueryDepthLimiter, MaxTokensLimiter

schema = strawberry.Schema(
    query=Query, mutation=Mutation,
    extensions=[QueryDepthLimiter(max_depth=10),
                MaxTokensLimiter(max_token_count=2000)],
)
Authenticate at the view — reuse Django sessions for a same-origin web client, or validate a JWT for third parties — then authorize per field. GraphQL's granularity means one query can touch public and private fields at once, so gating the whole endpoint is too coarse: guard sensitive fields individually with a permission extension or a check at the top of the resolver. Field-level authorization is more work than REST's per-view permissions, but it matches GraphQL's per-field access model.
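The "check at the top of the resolver" option can be factored into a small decorator so the rule is written once. This is a framework-free sketch: `login_required_field`, `resolve_email`, and `fake_info` are hypothetical, and the `info.context.request.user` chain is assumed to match the context shape used earlier.

```python
import functools
from types import SimpleNamespace


def login_required_field(resolver):
    """Hypothetical decorator: reject anonymous viewers before the resolver body runs."""

    @functools.wraps(resolver)
    def wrapper(root, info, *args, **kwargs):
        user = info.context.request.user
        if not user.is_authenticated:
            raise PermissionError("Login required.")
        return resolver(root, info, *args, **kwargs)

    return wrapper


@login_required_field
def resolve_email(root, info):
    # Sensitive field: only reached when the check above passed.
    return root.email


def fake_info(is_authenticated: bool):
    """Test double mimicking the info.context.request.user chain."""
    user = SimpleNamespace(is_authenticated=is_authenticated)
    return SimpleNamespace(context=SimpleNamespace(request=SimpleNamespace(user=user)))
```

Public fields on the same type stay undecorated, so one query can mix open and guarded fields and only the guarded ones fail for an anonymous viewer.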
You lose HTTP caching by URL, so caching moves into the resolver layer. Three tiers: the per-request DataLoader cache (free, automatic); a short-lived application cache (Redis) for hot, expensive resolvers keyed by argument and user; and persisted-query plus CDN caching for public, cacheable queries where a stable query hash becomes a cache key. Always include the viewer in the cache key for anything user-specific, or you will serve one user another's data — the GraphQL equivalent of the cross-tenant leak.
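One way to make the viewer part of every key is to build keys through a single helper. The sketch below is an assumption about how such a helper might look, not an API from Strawberry or Django; `resolver_cache_key` and the `gql:` prefix are hypothetical.

```python
import hashlib
import json


def resolver_cache_key(field, args, viewer_id=None):
    """Build a cache key from the field name, its arguments, and the viewer."""
    payload = json.dumps(
        {"field": field, "args": args, "viewer": viewer_id},
        sort_keys=True,  # argument order must not change the key
        default=str,     # fall back to str() for non-JSON values such as dates
    )
    digest = hashlib.sha256(payload.encode()).hexdigest()
    return f"gql:{field}:{digest}"
```

Passing `viewer_id=None` is reserved for data that is genuinely public; anything user-specific passes the viewer's ID so two users can never share an entry.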
A real graph has dozens of types and hundreds of fields, and a single schema.py becomes unmaintainable fast. Split types by domain into modules, define each Query/Mutation fragment locally, and merge them at the root with strawberry.tools.merge_types:
from strawberry.tools import merge_types
CombinedQuery = merge_types("Query", (BookQuery, AuthorQuery, OrderQuery))
CombinedMutation = merge_types("Mutation", (BookMutation, OrderMutation))
schema = strawberry.Schema(query=CombinedQuery, mutation=CombinedMutation)
This keeps each domain's types, resolvers, and loaders co-located with the Django app they belong to, so the graph scales the same way your codebase does. The root schema becomes a thin assembly point rather than a god-module everyone edits and conflicts on.
When multiple teams or services each own part of the graph, federation lets them publish independent subgraphs that compose into one supergraph a gateway serves. Strawberry supports Apollo Federation: a type can be defined in one service and extended with fields in another, joined by a key field. This is an organizational tool more than a technical one — reach for it when separate teams need to ship their slices of the graph on independent schedules, not for a single-team app where the merge-types approach above is simpler and has no gateway to operate.
Because one query fans out across many resolvers, "the request is slow" is not actionable — you need to know which field was slow. Add a tracing extension that records per-resolver timing, and export it to your APM (the same Sentry/OpenTelemetry stack you use elsewhere):
from strawberry.extensions.tracing import OpenTelemetryExtension

schema = strawberry.Schema(
    query=Query, mutation=Mutation,
    extensions=[OpenTelemetryExtension],
)
Now a slow query shows up as a span tree: you can see that author took 4ms total across a batched loader while recommendations took 800ms, and you optimize the right field. Pair this with logging the operation name (insist clients send named operations) so your dashboards group by query, not by the single /graphql/ URL that otherwise hides everything behind one route.
Per-URL rate limiting is meaningless when every request hits /graphql/. Limit on something that reflects real cost instead: the computed query complexity from your cost analysis, or the operation name plus user. A cheap { me { name } } and an expensive nested report should not count the same against a quota. Track spend per user in Redis keyed by a rolling window, and reject when the summed complexity over the window exceeds the budget — this is rate limiting that actually maps to database load rather than to request count.
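The rolling-window budget can be sketched in memory to show the accounting. A production version would keep the same data in Redis as described above; `ComplexityBudget` and its parameters are hypothetical, and the `cost` argument is assumed to come from whatever complexity analysis you run on the query.

```python
import time
from collections import defaultdict, deque


class ComplexityBudget:
    """Rolling-window budget on summed query cost per user (in-memory stand-in for Redis)."""

    def __init__(self, budget, window_seconds):
        self.budget = budget
        self.window = window_seconds
        self.spend = defaultdict(deque)  # user_id -> deque of (timestamp, cost)

    def allow(self, user_id, cost, now=None):
        now = time.monotonic() if now is None else now
        entries = self.spend[user_id]
        # Drop spend that has aged out of the window.
        while entries and entries[0][0] <= now - self.window:
            entries.popleft()
        if sum(c for _, c in entries) + cost > self.budget:
            return False  # rejected requests are not charged
        entries.append((now, cost))
        return True
```

A cheap query barely dents the budget while one expensive report can consume most of it, which is the behavior a per-request counter cannot express.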
You rarely rewrite a REST API to GraphQL in one cut, and you shouldn't. Run both side by side: stand up /graphql/ next to your existing /api/, and have GraphQL resolvers call the same service-layer functions your DRF views already use, so business logic lives in one place and both APIs stay consistent. Move one client feature at a time onto the graph, measure, and only deprecate a REST endpoint once nothing calls it. The shared service layer is what makes this safe — if your logic lives in views or resolvers instead, the two APIs drift and you maintain everything twice.
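The shared service layer can be illustrated in a few lines. The names here (`list_books_for_author`, `rest_books_view`, `resolve_books`) are hypothetical and the list stands in for the ORM; the point is only that both entry points call the same function.

```python
from dataclasses import dataclass


@dataclass
class Book:
    id: int
    title: str
    author_id: int


BOOKS = [Book(1, "Dune", 10), Book(2, "Emma", 11)]  # stand-in for the ORM


def list_books_for_author(author_id: int) -> list[Book]:
    """Service function: the single home of the business rule."""
    return [b for b in BOOKS if b.author_id == author_id]


def rest_books_view(author_id: int) -> dict:
    """What a REST view would do: call the service, serialize to JSON-ready data."""
    books = list_books_for_author(author_id)
    return {"results": [{"id": b.id, "title": b.title} for b in books]}


def resolve_books(author_id: int) -> list[Book]:
    """What a GraphQL resolver would do: call the same service, return objects."""
    return list_books_for_author(author_id)
```

If the filtering rule changes, it changes in one function and both APIs pick it up together.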
Execute queries directly against the schema in tests and assert on both data and query count — the query-count assertion is your permanent regression net for DataLoaders:
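import pytest
from asgiref.sync import async_to_sync

@pytest.mark.django_db
def test_books_query_is_batched(rf, django_assert_max_num_queries):
    query = "{ books { title author { name } } }"
    # Async loaders cannot run under execute_sync; drive the async executor instead.
    execute = async_to_sync(schema.execute)
    with django_assert_max_num_queries(2):  # books + ONE batched authors
        result = execute(query, context_value=Context(rf.get("/graphql/")))
    assert result.errors is None

def test_depth_limit_rejects_deep_query():
    # Nest real fields: unknown fields would fail validation even with no limiter.
    deep = "{ books " + "{ author { books " * 6 + "{ title }" + " } }" * 6 + " }"
    result = schema.execute_sync(deep)
    assert result.errors  # blocked by depth limiter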
If someone removes a loader, the first test fails because the query count jumps; if someone loosens the depth limiter, the second fails. Both bugs are caught in CI instead of production.
Strawberry makes GraphQL on Django feel like writing ordinary typed Python, and code-first typing keeps the schema honest as you refactor. The non-negotiables for production are DataLoaders on every nested relation (or you will have an N+1 disaster), cursor pagination via Relay connections with hard page caps, union return types for clean mutation errors, and a full set of guardrails — depth limits, complexity caps, introspection off, persisted queries, resolver timeouts — because one URL accepting arbitrary shapes is a far bigger attack surface than REST. Move caching into the resolver layer and always key it by viewer. Add query-count and depth tests so neither the batching nor the limits can silently regress. Adopt GraphQL for its real strengths — many clients, one typed graph, no over-fetching — and pay its costs deliberately, and it becomes a genuine pleasure to build on.