Generic repository pattern
First, to avoid misunderstandings, let me explain what I understand by generic repository. Have your ever seen an interface like this:
Or maybe you saw it’s twin brother that have a slightly
different variant of
Inspiration for the first of these examples comes from
official Microsoft documentation for ASP.NET MVC 4.
As for the second example you can find countless number of blogs that
describe this variant of the repository pattern e.g.
and also here
sometimes with slight variantions like returning
IEnumerable<TEntity> instead of
And in the later case often with an additional method for generating
So what is wrong with them you may ask? So far almost nothing,
not counting of course badly naming of the methods from Microsoft example(
they should be called
But “almost nothing” does not equal “nothing”. One problem that I find with these interfaces is that they violate Interface Segregation Principle. They expose full set of CRUD operations even for entities for which e.g. deleting does not make sense (for example when you deactivate users instead of deleting them from DB; also see Udi Dahan post about deleting data). But this problem can be easily solved by splitting this interface into three - one for reading, one for updating and one for deleting entities.
The real problem that I have with these interfaces comes from their improper usage. The original idea behind them is that they should be used as a base interfaces for your custom repository interfaces, just like this:
And that your command handlers and services (in other words clients of your custom repositories) should decide what methods are needed and should be put on your custom repository interfaces.
That is the theory. Unfortunately what I already saw a few times in my career instead is this:
Someone has created a working implementation of
What is worse this implementation is almost always registered in IoC container and
can be injected into your command handlers and
services like any other dependency:
This looks nice and clean but is not. I will tell you more about why
this is wrong later. Now I want to deal with one “solution” to the
GenericRepository<T> misinterpretation that
I often hear from other developers.
This solution sounds like this (dialog during code-review):
JIM SENIOR: Have you ever heard that
ISession or Entity Framework
DbSet is a repository?
Indeed what you just created is a tin wrapper over either
Actually we can replace this
GenericRepository<T> by e.g.
DbSet and get pretty must the same results.
The only service that
IGenericRepository<T> provides is that it hides
most of the thirty methods that
JONNY JUNIOR: Oh, indeed what you just said make sense.
I guess using generic repository
pattern here was a bit of overengineering. (Happily gets back to coding…)
For me using either
GenericRepository<T> or raw
DbSet is wrong most of the
time (one exception that I can accept is when you write
the most CRUDest application ever, then don’t bother
DbSet in your services). And why? Due to the following reasons:
- The only option to make sure that your LINQ queries will be properly translated to SQL is to test them against the same kind of database that you use in production environment. But when your queries are scattered over methods of your services it may be hard to create integration tests for them. For example look at the code:
To execute above query you must fulfill two if’s conditions. This will make an integration test for the above query less readable and more fragile. Instead imagine that this query is encapsulated by a repository method. In integration test you would just call that repo method and check the results - simple isn’t it?
I am sure that you agree with me that inline LINQ queries inside services are not reusable and that they have a nasty tendency to duplicate themselves over the codebase. Even when a programmer decides to extract query to it’s own method, it will usually be a private method on a particular service. Moving queries to repository methods makes them automatically reusable across entire application.
Inline LINQ queries are not named. Usually the only clue what a particular query does (without going deep it’s logic) is the name of the variable that holds query result. Unfortunately for us inventing a good variable names is a skill that only comes with the experience and since we have a lot of junior devs in our industry we are faced with names like
orders. Wrapping the query inside a repo method will automatically give it a name. Even if this name is not perfect we can refactor it later and all places that call this method will benefit from our refactoring automatically!
Sometimes for performance reasons we are forced to use raw SQL to get our data from DB. Do you really want to litter your business logic with low level technical stuff like
DbConnections, query parameters and
SqlExceptions? Let’s hide this low level stuff inside a repository and let our business code concentrate on business logic. Also see Single level of abstraction principle.
So what is the solution you may ask? Get ready…
What we need is the “specific” repository pattern
We should start repository design by specifying it’s interface.
The interface should contain only methods required by clients of
the repository. In other words if nobody needs to delete entities of a given type
or it does not make sense from business point of view
we will not add
Delete method to the interface.
If you are afraid that you will end up with different names for
basic CRUD operations like
Delete on one repo and
Remove on the other
you may create helper interfaces like
ICanUpdateEntity<TEntity> etc. that will contain only methods for
specific usage like deleting, updating etc.
Then the repository interface can inherit
appropriate subset of them.
None of the methods on the repository interface should return
Also make sure that the repository implementation does not
IQueryable<T> value hidden as
ToArray() to materialize query results before returning them
to the client.
When it comes to the repository implementation, the implementation is free
to inherit from abstract
GenericRepository<TEntity> base class.
Alternatively it may use
DbSet directly if it is more convenient.
No matter what approach you choose remember that “excessive” methods
inherited from base class
may be hidden by the repository interface.
Please remember that your repository is NOT responsible for managing
database transactions. This concern is best managed using
Unit of Work pattern.
This pattern is already implemented by both
(think change tracking and dirty checking),
we only need a better interface over them:
For the most web applications it is enough to start transaction using
at the beginning of the HTTP request and either
the end of the request. This can be done by using either an action filter
or a decorator around command handlers and/or services.
Example repository created using the above guidelines:
This should be obvious by now, but let’s not take chances. Every method of our repositories should be covered by one or more integration tests, which should use the same kind of DB that we use in production environment. Remember always use integration tests to test your repositories.
Turbocharging the repository pattern
There is no rose without thorns and presented above approach also has some serious drawbacks. Some of them can be fixed by using a different architecture than classic 3-layer arch. Most common problems with “specific” repositories are as follows:
- Repositories can over long periods of time accumulate
dozens and dozens of
Find*methods. Often these methods will be very similar to each other. There are two ways to combat this unwanted grow. One is to use a query object pattern. Basically you group several of these
Find*methods together into one more general
Findmethod. That method should accept an object that will represent a query criteria. For example:
To create a query from the criteria object we examine each search criteria and build query step-by-step:
The second solution to this problem is more robust and reliable. Usually too big repositories are accompanied by huge services and overgrown entities. You can slim down both your repos and services by using something that I call CQRS-light. It differs from full-blown CQRS by using exactly the same database tables for both reads and writes. When doing CQRS-light we can use the same ORM framework for both reading and writing data and slowly migrate to real CQRS only in these parts of our application that really need it (do recall this 80+ columns searchable grid that generates 20+ inner join query that halts your DB server? - real CQRS can help here).
The diagram below presents typical architecture of CQRS-light application:
The key principles of CQRS-light are:
Split all user actions into two categories. In the first put all actions that can modify the state of the system like e.g. creating a new order in an e-commerce app. In the second
category put all actions that do not modify state of the system e.g. viewing an order details. First category represents commands (writes), the second one queries (reads). Only commands can change state of the system.
Query handlers do NOT use repositories to access data. They access DB using whatever technology they want. Usual configurations include a single ORM on both read and write sides, ORM for writes and micro-ORM like Dapper for reads or using ORM for writes and raw SQL for reads.
Command handlers can only use repositories to access and modify data. Command handlers should not call query handlers to fetch data from database. If a command handler needs to execute a complex query and this query can be answered by a query handler you should duplicate this query logic and put it in both query handler and in a repository method (read and write sides must be separated).
Query handlers are tested only using integration tests. For command handlers you will have unit and optionally integration tests. Repositories will be tested using integration tests.
CQRS even in the “light” version is a huge topic and deserves a blog post of it’s own. MediatR library is a good starting point if you want to find out more about CQRS-light approach.
Let us return to the subject of the “specific” repository pattern drawbacks. The second drawback that I want to mention is unwanted migration of the business logic into query definitions. For example even this simple query:
contains a piece of business logic that describes what
it means for an order to be active.
Usually ORM’s prevent us from encapsulating such pieces of logic
into a separate properties like
What we need here is the specification pattern. You can find pretty decent overview of the specification pattern here. Our query method when we use the specification pattern should look similar to: