My friends and family are under attack in Ukraine.
Donate to protect them directly or help international organizations.

Repositories Without ORMs

Whether you want to use ORMs or not is up to you. I'm here to show you how you can write clean and testable code in case you want to try it without an ORM.

Let's focus on the following objectives:

Avoid tight coupling to third party libraries.
Access data from any storage.
Represent data using objects.
Write easily testable code.

Repository Interface

Before we jump into the implementation, we need to figure out its interface. We will not create generic methods that may or may not be used. Unused code is dead code. The easiest way to determine the interface is to start using the repository before we create it.

Say we need a list of featured products for our home page:

$this->products->getFeatured();

Or perhaps we need a list of recommendations based on the product that we're currently viewing:

$this->products->getRecommendedByProduct($productNumber);

We pick names that fit best within the context, instead of generic names. Reusability should not come at the expense of readability.

Once we have a few usages, we can create an interface:

interface ProductRepository
{
    /**
     * @return Product[]
     */
    public function getFeatured(): array;

    /**
     * @return Product[]
     */
    public function getRecommendedByProduct(ProductNumber $productNumber): array;
}

Notice that the second method takes a ProductNumber argument. This is just a little quality-of-life improvement to avoid accidentally passing empty strings, or strings that contain values that are not valid product numbers. Objects in, objects out. This makes for an interface that is easier to read and harder to misuse.

Repository Implementation

We can have as many implementations as we like, perhaps even multiple implementations in the same application, such as when we need to synchronize with a secondary data source.

Say we want to fetch products using PDO:

final class PdoProductRepository implements ProductRepository
{
    public function __construct(private PDO $pdo) {}
}

A clean way to supply this class with a database connection is to simply add PDO to the constructor parameters. When we instantiate the class, we need to provide a PDO instance. This is called dependency injection.

Let's implement one of the methods:

/**
 * @return Product[]
 */
public function getFeatured(): array
{
    $rows = $this->pdo
        ->prepare(/* raw SQL */)
        ->execute(/* bind parameters */);

    return array_map(function(array $row) {
        return $this->productFromRow($row);
    }, $rows);
}

Fetching data is generally the same. We write one or more queries exactly the way we want them, then prepare and execute to get the rows. We then manually map rows to our objects. Since we have multiple methods returning products, we can reuse the mapping logic in a private productFromRow. That method can also be in the Product. It depends on what makes more sense in our application.

What about reusing complex portions of queries, such as with query builders? There are libraries that allow us to do just that, but without the overhead of an ORM or a special query language. I typically use Aura.SqlQuery.

Mapping

Mapping is just a method that takes data in one format, and converts it to the desired object.

private function productFromRow(array $row): Product
{
    return new Product(
        new ProductNumber($row['number']),
        new ProductName($row['name']),
        new Money($row['price'], $row['currency']),
        new ImagePath($row['img_path'])
    );
}

Notice that we use some additional classes like ProductNumber and ProductName. This gives us extra validation, so that if the database values are corrupt or if we accidentally pass the wrong values, we'll immediately stop execution and log the exception. There is no point in continuing if execution will result in an error down the line.

ProductNumber ensures that it adheres to our application's definition of a product number. For example, it must be alphanumeric with 12 characters. This way, we can't accidentally pass some other number in there, and then spend hours debugging unexpected 404s.

ProductName ensures that the name is not empty. It would make no sense to pass an empty string around just so that it eventually breaks the UI without notifying the developers.

The other value objects work similarly. The exact rules depend on our application. It's not necessary to represent every value using classes in order for these repositories to work. However, the extra effort makes code significantly easier to read and debug, especially as the application complexity grows.

Our Product object does not map every single column to the object. Having a mega-object that is a reflection of the table is seldom useful. We start with the object that we need, and query only what that object requires. The data can even come from multiple tables. The object's structure should be independent of the table. We can always have other objects with a different combination of columns, such as ProductDetails. It's better to have multiple specialized objects that are easy to work with, than giant reusable data bags where half of the properties are null.

Persisting

We don't give the repository one big object to save it. That often creates problems, like the need to track which properties changed, so that we don't override the whole graph when updating just one value.

We go for more specialized methods. Let's add some to the interface:

// This could be tied to a toggle
public function deactivateProduct(ProductNumber $number);

// This could be tied to a form
public function updateDetailsProduct(ProductNumber $number, ProductDetails $details);

// This could be tied to an inline editor for the name
public function renameProduct(ProductNumber $number, ProductName $name);

It seems like more work initially, but pays off because we seldom even need to debug anymore. Bugs can still happen, but they become extremely easy to pinpoint, so this approach saves time.

Testing

With fewer dependencies, testing becomes easy. I usually have acceptance tests for the application as a whole. For repositories, I write small integration tests:

protected function setUp(): void
{
    // Connect to test database
    $this->pdo = new PDO('mysql:host=localhost;port=3307;dbname=integration');
    // Reset to a known state; perhaps execute some additional queries
    $this->pdo->exec(file_get_contents('reset.sql'));
}

/** @test */
public function getFeatured(): void
{
    // Arrange
    $repo = new PdoProductRepository($this->pdo);
    // Act
    $products = $repo->getFeatured();
    // Assert
    self::assertCount(2, $products);
}

Because these tests are slower than unit tests, we should extract as much unit-testable logic out of it as possible, to keep it lean.

For all other classes that don't connect to the database, we can write unit tests:

/**
  * @test
  * @expectedException DomainException
  */
public function cannotCreateWithEmptyString()
{
    new ProductName('');
}

Any classes that use the repositories can also be unit-tested. For example, if we have a controller that requires a repository, we'd write something like this:

protected function setUp(): void
{
    // Mock the interface, not the implementation
    $this->products = $this->createMock(ProductRepository::class);
    // Reset to a known state; perhaps execute some additional queries
    $this->controller = new HomeController($this->products);
}

We can now stub repository methods and test the controller. With the approach presented in this article, we maintain full control over every step, yet the code remains quite simple.