Building flexible components with generics and protocols

Published on: November 11, 2019

Recently I wanted to build a generic data source layer. This data source would be able to return pretty much anything from a local cache, or if the local cache doesn't contain the requested object, it would fetch the object from a server and then cache the result locally before returning it to me. To achieve this, I figured that I would write a generic local cache, a generic remote cache and a wrapper that would combine both caches, allowing me to transparently retrieve objects without having to worry about where the object came from.

It didn't take long before I saw the first compiler warnings and remembered that generics can be extremely hard to bend to your will. Especially if your generic object uses other generic objects to transparently do generic things.

In this blog post, I will show you my approach to tackling complicated problems like this, and how I use pseudo-code to design and implement a fluent API that works exactly as I wanted. We'll go over the following topics:

  • Designing an API without getting lost in the details
  • Knowing how simple generics are used in Swift, and how you can use them in your code
  • Understanding that you can have protocols with generic requirements, also known as associated types
  • Combining generics with protocols that have associated types

Are you ready to enter the brain-melting world of generics? Good, let's go!

Designing an API without getting lost in the details

I promised you generics. Instead, I'm going to show you how to design an API that uses generics first. This is to establish a goal, something we can work towards throughout this blog post. Generics are complicated enough as they are and I don't want you to get confused to the point where you're not sure what we were building again.

In the introduction of this post, I mentioned that I wanted to build a generic data store that cached data locally, and would use a remote data store as a backup in case the required data didn't exist locally.

A good way to get started building something like this is to write down some pseudo-code that demonstrates how you would like to use the API or component you're building. Here's what I wrote for the caching layer:

let localDataStore = UserDataStore()
let remoteDataStore = UserApi()
let dataStore = CacheBackedDataStore(localDataStore, remoteDataStore)

dataStore.fetch(userID) { result in 
  // handle result
}

This is pretty straightforward, right? You can see that I want to create two stores and a wrapping store. The wrapping store is the one that's used to retrieve information and it uses a callback to inform the caller about the results. Simple and straightforward, just how I like it. Keep in mind that whatever we design has to work with more than user objects. We also want to be able to store other information in this structure, for example, documents that belong to the user.

Let's dive a bit deeper and write a pseudo-implementation for CacheBackedDataStore:

class CacheBackedDataStore {
  let localStore: LocalStore
  let remoteStore: RemoteStore

  func fetch(_ identifier: IdentifierType, completion: @escaping (Result<T, Error>) -> Void) {
    localStore.fetchObject(identifier) { result in 
      if let result = try? result.get() {
         completion(.success(result))
      } else {
        remoteStore.fetchObject(identifier) { result in 
          if let result = try? result.get() {
            completion(.success(result))
          } else {
            // extract error and forward to the completion handler
          }
        }
      }
    }
  }
}

You might notice the type T on the result here. This type T is where our generic adventure begins. It's the start of a rabbit hole where we're going to turn everything into objects that could be anything. At this point, we have enough "design" to get started with setting up some of our building blocks. To do this, we're going to have a look at generics in Swift.

Adding simple generics to your code

In the pseudo-code design that I showed you in the previous section, I used the type T. Whenever we write code with generics in Swift, we typically use T to flag a type as generic. A generic type can be pretty much anything as long as it satisfies the constraints that are specified for it. If we don't specify any constraints, T can be anything you want. An example of a generic type that can be anything you want is Array. Let's look at two identical ways to define an empty array in Swift:

let words1 = [String]()
let words2 = Array<String>()

Notice that the second way uses the type name Array followed by <String>. This informs the compiler that we're defining an array where the type of element is String. Now let's try to imagine what the type definition for Array might look like:

struct Array<T> {
  // implementation code
}

This code declares a struct of type Array that contains some type T that is generic; it could be anything we want, as long as we specify it when creating an instance or when we use it as a type. In the earlier example, let words2 = Array<String>(), we defined T to be String. Let's look at one more basic example before we move on:

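struct SpecializedPrinter<T> {
  func print(_ object: T) {
    // Call the global print function explicitly. A bare print(object)
    // would resolve to this method and recurse forever.
    Swift.print(object)
  }
}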

This code declares a SpecializedPrinter that's generic over T and it has a function called print, that takes an object of type T and prints it to the console. If you paste the above into a Playground, you can use this SpecializedPrinter struct as follows:

let printer = SpecializedPrinter<String>()
printer.print("Hello!") // this is fine
printer.print(10) // this is not okay since T for this printer is String, not Int
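
The printer above leaves T unconstrained, so any type is accepted. To give you an idea of what a constraint looks like, here's a small sketch of a generic function that only accepts types that conform to Comparable. The function name largest is made up for this illustration and isn't part of the caching layer we're building:

```swift
// Hypothetical helper, only here to illustrate a generic constraint.
// T can be any type, as long as it conforms to Comparable.
func largest<T: Comparable>(_ items: [T]) -> T? {
  guard var candidate = items.first else { return nil }

  for item in items.dropFirst() {
    if item > candidate {
      candidate = item
    }
  }

  return candidate
}

let largestNumber = largest([3, 10, 7])       // T is inferred as Int
let largestWord = largest(["pear", "apple"])  // T is inferred as String
```

Passing an array of a type that isn't Comparable should be rejected by the compiler, in the same way that printer.print(10) is rejected above.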

Now that you know a bit about generics, I think we can write the first bit of code for the CacheBackedDataSource object:

struct CacheBackedDataSource<T> {
  func find(_ objectID: String, completion: @escaping (Result<T, Error>) -> Void) {

  }
}

We're not doing much here, but it's an important milestone in your journey to mastering generics in Swift. You have written a data source that claims to cache any type (T) and will do an asynchronous lookup for an item based on a string identifier. The find(_:completion:) function will call the completion block with a Result object that contains an instance of T, or an Error object.

In the pseudo-code from earlier in this post, there were two properties:

let localStore: LocalStore
let remoteStore: RemoteStore

Since the caching layer should be as generic and flexible as possible, let's define LocalStore and RemoteStore as protocols. This will give us tons of flexibility, allowing any object to act as the local or remote store as long as they implement the appropriate functionality:

protocol LocalStore {

}

protocol RemoteStore {

}

And in these protocols, we will define methods to fetch the object we need, and in the local store, we'll define a method that persists an object.

protocol LocalStore {
  func find(_ objectID: String, completion: @escaping (Result<T, Error>) -> Void)
  func persist(_ object: T)
}

protocol RemoteStore {
  func find(_ objectID: String, completion: @escaping (Result<T, Error>) -> Void)
}

Unfortunately, this doesn't work. Our protocols don't know what T is since they're not generic. So how do we make these protocols generic? That's the topic of the next section.

Adding generics to protocols

While we can define a generic parameter on a struct by adding it to the type declaration between <>, just like we did for struct CacheBackedDataSource<T>, this is not allowed for protocols. If you want to have a protocol with a generic parameter, you need to declare the generic type as an associatedtype on the protocol. An associatedtype does not have to be implemented as a generic on objects that implement the protocol. I will demonstrate this shortly. For now, let's fix the local and remote store protocols so you can see associatedtype in action:

protocol LocalStore {
  associatedtype StoredObject

  func find(_ objectID: String, completion: @escaping (Result<StoredObject, Error>) -> Void)
  func persist(_ object: StoredObject)
}

protocol RemoteStore {
  associatedtype TargetObject

  func find(_ objectID: String, completion: @escaping (Result<TargetObject, Error>) -> Void)
}

Notice how we're not using a short name like T here. This is because the associated type does not necessarily have to be generic, and we want the purpose of this type to be communicated a bit better than we typically do when defining a generic parameter on a struct. Let's create two structs that conform to LocalStore and RemoteStore to see how associatedtype works in the context of objects that conform to our protocols.

struct ArrayBackedUserStore: LocalStore {
  func find(_ objectID: String, completion: @escaping (Result<User, Error>) -> Void) {

  }

  func persist(_ object: User) {

  }
}

struct RemoteUserStore: RemoteStore {
  func find(_ objectID: String, completion: @escaping (Result<User, Error>) -> Void) {

  }
}

All that's needed to implement the protocol's associatedtype in this example is to use the same type in all places where the protocol uses its associated type. An alternative that's a bit more verbose would be to define a typealias inside of a conforming object and use the protocol's associatedtype where we currently use the User object. An example of this would look like this:

struct RemoteUserStore: RemoteStore {
  typealias TargetObject = User

  func find(_ objectID: String, completion: @escaping (Result<TargetObject, Error>) -> Void) {

  }
}

I prefer the former way, where we use User in place of TargetObject. It's just easier to read in my opinion.
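
Earlier I mentioned that an associatedtype doesn't have to be implemented as a generic, and the User stores show that. The opposite works too: a conforming type can fill in the associated type with its own generic parameter. The sketch below assumes a hypothetical StringIdentifiable protocol, an ObjectNotFound error and an InMemoryStore class, none of which are part of the code in this post. It repeats the LocalStore protocol so you can paste it into a Playground on its own:

```swift
protocol LocalStore {
  associatedtype StoredObject

  func find(_ objectID: String, completion: @escaping (Result<StoredObject, Error>) -> Void)
  func persist(_ object: StoredObject)
}

// Hypothetical: persist(_:) needs some way to learn an object's identifier.
protocol StringIdentifiable {
  var id: String { get }
}

// Hypothetical error for lookups that come up empty.
struct ObjectNotFound: Error {}

// A class, so persist(_:) can change the storage without being `mutating`.
final class InMemoryStore<Object: StringIdentifiable>: LocalStore {
  private var storage = [String: Object]()

  // StoredObject is inferred to be Object, the generic parameter.
  func find(_ objectID: String, completion: @escaping (Result<Object, Error>) -> Void) {
    if let object = storage[objectID] {
      completion(.success(object))
    } else {
      completion(.failure(ObjectNotFound()))
    }
  }

  func persist(_ object: Object) {
    storage[object.id] = object
  }
}

// Made-up model type for this sketch.
struct Note: StringIdentifiable {
  let id: String
  let text: String
}

let noteStore = InMemoryStore<Note>()
noteStore.persist(Note(id: "note-1", text: "Buy milk"))

var foundText: String?
noteStore.find("note-1") { result in
  foundText = try? result.get().text
}
```

Whether a store like this is generic or tied to a single type is up to the conforming type. The protocol only cares that StoredObject is used consistently.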

Since we're dealing with data that comes from a remote server in RemoteUserStore, it would be quite convenient to constrain TargetObject so that only Decodable types can be used in its place. We can do this as follows:

protocol RemoteStore {
  associatedtype TargetObject: Decodable

  func find(_ objectID: String, completion: @escaping (Result<TargetObject, Error>) -> Void)
}

If we try to use User in place of TargetObject and User isn't Decodable, we're shown the following error by the compiler:

candidate would match and infer 'TargetObject' = 'User' if 'User' conformed to 'Decodable'
func find(_ objectID: String, completion: @escaping (Result<User, Error>) -> Void) {
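
One way to satisfy this constraint is to make User conform to Decodable. I haven't shown you the definition of User in this post, so the properties below are assumptions that I made up for this sketch, and so is the JSON payload:

```swift
import Foundation

// Assumed shape of User; the real model may look different.
struct User: Decodable {
  let id: String
  let name: String
}

// Made-up payload standing in for a server response.
let json = Data(#"{"id": "user-1", "name": "Ada"}"#.utf8)
let decodedUser = try? JSONDecoder().decode(User.self, from: json)
```

Because all of the stored properties in this sketch are Decodable themselves, the compiler can synthesize the conformance, and a store that uses User as its TargetObject meets the Decodable requirement.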

We now have the following code prepared for the CacheBackedDataSource and the local and remote store protocols:

struct CacheBackedDataSource<T> {
  func find(_ objectID: String, completion: @escaping (Result<T, Error>) -> Void) {

  }
}

protocol LocalStore {
  associatedtype StoredObject

  func find(_ objectID: String, completion: @escaping (Result<StoredObject, Error>) -> Void)
  func persist(_ object: StoredObject)
}

protocol RemoteStore {
  associatedtype TargetObject: Decodable

  func find(_ objectID: String, completion: @escaping (Result<TargetObject, Error>) -> Void)
}

Let's add some properties for the local and remote store to the CacheBackedDataSource:

struct CacheBackedDataSource<T> {
  let localStore: LocalStore
  let remoteStore: RemoteStore

  func find(_ objectID: String, completion: @escaping (Result<T, Error>) -> Void) {

  }
}

Unfortunately, this won't compile. The Swift compiler reports the following errors:

error: protocol 'LocalStore' can only be used as a generic constraint because it has Self or associated type requirements
  let localStore: LocalStore
                  ^

error: protocol 'RemoteStore' can only be used as a generic constraint because it has Self or associated type requirements
  let remoteStore: RemoteStore
                   ^

Let's see how we can fix these errors in the next section.

Using a protocol with associated type requirements as a generic constraint

Before we look deeper into the compiler errors we're currently stuck with, I want to quickly recap what we've got prepared so far. Even though the code doesn't compile, it's quite impressive already. We have a generic data source that can retrieve any object T.

We also have a protocol for a local store that caches any type we want; the adopter of the protocol decides exactly which type is cached. All that matters is that the object that implements the protocol has a find method that performs a lookup based on an identifier and invokes a callback with a Result object. It also has a persist method that is expected to store objects of the same type that the local store can fetch.

Lastly, we have a protocol for a remote store that fetches any kind of object, as long as it conforms to Decodable. Similar to how local store works, the implementer of the RemoteStore can decide what the type of the TargetObject will be.

This is really powerful stuff and if the above is a bit confusing to you, that's okay. It's not simple or straightforward even though the code looks fairly simple. Try following along with the code we've written so far, re-read what you've learned and maybe take a short break to let it sink in. I'm sure it will eventually.

In order to use the local and remote store protocols as types on the CacheBackedDataSource, we need to add generic parameters to the CacheBackedDataSource, and constrain these parameters so they have to implement our protocols. Replace your current implementation of CacheBackedDataSource with the following:

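struct CacheBackedDataSource<Local: LocalStore, Remote: RemoteStore> {
  private let localStore: Local
  private let remoteStore: Remote

  // The stored properties are private, so the synthesized memberwise
  // initializer can't be used from other files. Declare one explicitly.
  init(localStore: Local, remoteStore: Remote) {
    self.localStore = localStore
    self.remoteStore = remoteStore
  }

  func find(_ objectID: String, completion: @escaping (Result<Local.StoredObject, Error>) -> Void) {

  }
}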

The declaration of CacheBackedDataSource now has two generic parameters, Local and Remote. Each has to conform to its respective protocol. This means that the localStore and remoteStore should not be of type LocalStore and RemoteStore. Instead, they should be of type Local and Remote. Note that Result<T, Error> has been replaced with Result<Local.StoredObject, Error>. The find method now uses whatever type of object the LocalStore stores as the type for its Result. This is really powerful because the underlying store now dictates the type of objects returned by the data source object.

There's still one problem though. Nothing prevents us from locally storing something that's completely incompatible with the remote store. Luckily we can apply constraints to the generic parameters of our struct. Update the declaration of CacheBackedDataSource as follows:

struct CacheBackedDataSource<Local: LocalStore, Remote: RemoteStore> where Local.StoredObject == Remote.TargetObject

We can now only create CacheBackedDataSource objects that use the same type of object for the local and remote stores. Before I show you how to create an instance of CacheBackedDataSource, let's implement the find method first:

func find(_ objectID: String, completion: @escaping (Result<Local.StoredObject, Error>) -> Void) {
  localStore.find(objectID) { result in
    do {
      let object = try result.get()
      completion(.success(object))
    } catch {
      self.remoteStore.find(objectID) { result in
        do {
          let object = try result.get()
          self.localStore.persist(object)
          completion(.success(object))
        } catch {
          completion(.failure(error))
        }
      }
    }
  }
}

The find method works by calling find on the local store. If the requested object is found, then the callback is invoked and the result is passed back to the caller. If an error occurred, for example, because the object wasn't found, the remote store is used. If the remote store finds the requested object, it's persisted in the local store and the result is passed back to the caller. If the object wasn't found or an error occurred in the remote store, we invoke the completion closure with the received error.

Note that this setup is extremely flexible. The implementation of CacheBackedDataSource doesn't care what it's caching. It only knows how to use a local store with a fallback to a remote store. Pretty awesome, right? Let's wrap this up by creating an instance of the CacheBackedDataSource:

let localUserStore = ArrayBackedUserStore()
let remoteUserStore = RemoteUserStore()
let cache = CacheBackedDataSource(localStore: localUserStore, remoteStore: remoteUserStore)
cache.find("someObjectId") { (result: Result<User, Error>) in

}

All you need to do is create instances of your stores, and supply them to the cache. You can then call find on your cache and the compiler is able to understand that the result object that's passed to the completion closure for find is a Result<User, Error>.

Take a look at the pseudo-code I showed you at the beginning of this post. It's very close to what we ended up implementing, and it's just as powerful as we imagined! If you've been following along, try to create some CacheBackedDataSource objects for other types. It should be fairly straightforward.
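
To give you a head start, here's a sketch of what that could look like for documents. Everything that's specific to documents is hypothetical: the Document model, the DocumentNotFound error, InMemoryDocumentStore and FakeDocumentApi are names I made up for this example. The protocols and the data source are repeated, in slightly condensed form, so the snippet runs on its own in a Playground:

```swift
protocol LocalStore {
  associatedtype StoredObject
  func find(_ objectID: String, completion: @escaping (Result<StoredObject, Error>) -> Void)
  func persist(_ object: StoredObject)
}

protocol RemoteStore {
  associatedtype TargetObject: Decodable
  func find(_ objectID: String, completion: @escaping (Result<TargetObject, Error>) -> Void)
}

struct CacheBackedDataSource<Local: LocalStore, Remote: RemoteStore>
  where Local.StoredObject == Remote.TargetObject {
  let localStore: Local
  let remoteStore: Remote

  func find(_ objectID: String, completion: @escaping (Result<Local.StoredObject, Error>) -> Void) {
    localStore.find(objectID) { result in
      if let object = try? result.get() {
        completion(.success(object))
        return
      }
      // Local miss: ask the remote store and cache whatever it returns.
      self.remoteStore.find(objectID) { result in
        if case .success(let object) = result {
          self.localStore.persist(object)
        }
        completion(result)
      }
    }
  }
}

// Everything below is hypothetical and only exists for this sketch.
struct Document: Decodable {
  let id: String
  let title: String
}

struct DocumentNotFound: Error {}

final class InMemoryDocumentStore: LocalStore {
  private var documents = [String: Document]()

  func find(_ objectID: String, completion: @escaping (Result<Document, Error>) -> Void) {
    if let document = documents[objectID] {
      completion(.success(document))
    } else {
      completion(.failure(DocumentNotFound()))
    }
  }

  func persist(_ object: Document) {
    documents[object.id] = object
  }
}

// Pretends to be a server and counts how often it is asked for a document.
final class FakeDocumentApi: RemoteStore {
  private(set) var requestCount = 0

  func find(_ objectID: String, completion: @escaping (Result<Document, Error>) -> Void) {
    requestCount += 1
    completion(.success(Document(id: objectID, title: "Document \(objectID)")))
  }
}

let documentApi = FakeDocumentApi()
let documentCache = CacheBackedDataSource(localStore: InMemoryDocumentStore(), remoteStore: documentApi)

// Look up the same document twice and collect the titles we get back.
var titles = [String]()
for _ in 1...2 {
  documentCache.find("doc-1") { result in
    if let document = try? result.get() {
      titles.append(document.title)
    }
  }
}
```

Both stores call their completion handlers synchronously to keep the sketch small. After the two lookups, the fake API should have been asked only once, because the second lookup is served by the local store.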

In summary

You have learned so much in this blog post. I wouldn't be surprised if you have to read it one or two more times to make complete sense out of all these generics and type constraints. And we haven't even covered all of it! Generics are an unbelievably powerful and complex feature of the Swift language but I hope that I have been able to help you make some sense of them. Overall, you now know that adding <T> to an object's declaration adds a generic parameter, which means that anytime you use T inside of that object, it's whatever the user of that object decided it to be.

You also learned that you can add associatedtype to a protocol to have it support generic types. And to top it off, you learned how you can use a protocol that has an associated type as a constraint for an object's generic parameter for maximum flexibility. If your brain hurts a bit after reading all this then again, don't worry. This stuff is hard, confusing, weird and complex. And if you have any questions, comments or need somebody to talk to because you feel lost now, don't hesitate to reach out on Twitter!
