Understanding type erasure in Swift

Published on: May 18, 2020

Swift's type system is (mostly) fantastic. Its tight constraints and flexible generics allow developers to express complicated concepts in an extremely safe manner because the Swift compiler will detect and flag any inconsistencies within the types in your program.

While this is great most of the time, there are times where Swift's strict typing gets in the way of what we're trying to build. This is especially true if you're working on code that involves protocols and generics.

With protocols and generics, you can express ideas that are insanely complex and flexible. But sometimes you're coding along happily and the Swift compiler starts yelling at you. You've hit one of those scenarios where your code is so flexible and dynamic that Swift isn't having it.

Let's say you want to write a function that returns an object that conforms to a protocol that has an associated type. That's not going to happen unless you use an opaque result type.

But what if you don't want to return the exact same concrete type from your function all the time? Unfortunately, opaque result types won't help you there. Luckily, Swift 5.7, which came out in 2022, allows us to define so-called primary associated types, which allow us to specialize our opaque return types and existential (any) types where needed.

It's important to note that primary associated types remove many of the reasons to use type erasure in your app, but they don't make type erasure completely obsolete.
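
To make that concrete, here is a minimal sketch using the standard library's Collection protocol, which declares Element as its primary associated type as of Swift 5.7. The function names evenNumbers and numbers(asSet:) are made up for illustration.

```swift
// An opaque result type: always the same concrete type (an Array here),
// but callers can still see that Element is Int.
func evenNumbers() -> some Collection<Int> {
    return [2, 4, 6]
}

// An existential type: the concrete type may differ per call,
// and callers can still see that Element is Int.
func numbers(asSet: Bool) -> any Collection<Int> {
    if asSet {
        return Set([1, 2, 3])
    } else {
        return [1, 2, 3]
    }
}

let evens = evenNumbers()
let total = evens.reduce(0, +) // works because Element is known to be Int

let boxed = numbers(asSet: true)
let boxedCount = boxed.count
```

The some version has to return the same concrete type every time, while the any version can pick a different one on each call. In both cases the caller keeps access to the element type.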

So when the Swift compiler keeps yelling at you and you have no idea how to make it stop, it might be time to apply some type erasure.

In this week's blog post I will explain what type erasure is and show an example of how type erasure can be used to craft highly flexible code that the Swift compiler will be happy to compile.

There are multiple scenarios where type erasure makes sense and I want to cover two of them.

Using type erasure to hide implementation details

The most straightforward way to think of type erasure is to consider it a way to hide an object's "real" type. Some examples that come to mind immediately are Combine's AnyCancellable and AnyPublisher. An AnyPublisher in Combine is generic over an Output and a Failure. If you're not familiar with Combine, you can read up in the Combine category on this blog. All you really need to know about AnyPublisher is that it conforms to the Publisher protocol and wraps another publisher. Combine comes with tons of built-in publishers like Publishers.Map, Publishers.FlatMap, Future, Publishers.Filter, and many, many more.

Often when you're working with Combine, you will write functions that set up a chain of publishers. You usually don't want to expose the publishers you used to callers of your function. In essence, all you want to expose is that you're creating a publisher that emits values of a certain type (Output) or fails with a specific error (Failure). So instead of writing this:

func fetchData() -> URLSession.DataTaskPublisher {
  return URLSession.shared.dataTaskPublisher(for: someURL)
}

You will usually want to write this:

func fetchData() -> AnyPublisher<(data: Data, response: URLResponse), URLError> {
  return URLSession.shared.dataTaskPublisher(for: someURL)
    .eraseToAnyPublisher()
}

By applying type erasure to the publisher created in fetchData we are now free to change its implementation as needed, and callers of fetchData don't need to care about the exact publisher that's used under the hood.
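
As a small self-contained sketch of the same idea, the hypothetical fetchGreeting function below builds its publisher from Just and map, and callers only ever see AnyPublisher<String, Never>. Just emits synchronously, which keeps the example easy to follow.

```swift
import Combine

// Callers see AnyPublisher<String, Never>, not the concrete type
// produced by Just and map.
func fetchGreeting() -> AnyPublisher<String, Never> {
    return Just("hello")
        .map { $0.uppercased() }
        .eraseToAnyPublisher()
}

var received = [String]()

// Keep the returned AnyCancellable alive for as long as values are needed.
let cancellable = fetchGreeting().sink { value in
    received.append(value)
}
```

If the body of fetchGreeting later switches to a different publisher chain, the call site keeps compiling as long as Output and Failure stay the same.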

When you think about how you can refactor this code, you might be tempted to try and use a protocol instead of an AnyPublisher. And you'd be right to wonder why we wouldn't.

Because Publisher has Output and Failure associated types, we can't simply return Publisher from our function. Returning some Publisher would allow the code to compile, but the result would be pretty useless because callers can no longer see what Output and Failure are:

func fetchData() -> some Publisher {
  return URLSession.shared.dataTaskPublisher(for: someURL)
}

fetchData().sink(receiveCompletion: { completion in
  print(completion)
}, receiveValue: { output in
  print(output.data) // Value of type '(some Publisher).Output' has no member 'data'
})

Because some Publisher hides the concrete types of Publisher's associated types, there is no way to do anything useful with the output or completion in this example. An AnyPublisher hides the underlying type just like some Publisher does, except you can still define what the Output and Failure types are for the publisher by writing AnyPublisher<Output, Failure>.

With primary associated types in Swift 5.7, you can write the following code:

func fetchData() -> any Publisher<(data: Data, response: URLResponse), URLError> {
  return URLSession.shared.dataTaskPublisher(for: someURL)
}

The only problem with using primary associated types on a Publisher is that not all methods that exist on publishers like an AnyPublisher are added to the Publisher protocol. This means that we might lose some functionality by not using AnyPublisher.
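
The same angle-bracket syntax also works with some, which brings back the Output that was hidden in the earlier some Publisher example. The sketch below assumes an SDK in which Publisher declares Output and Failure as primary associated types (Xcode 14 and later); makeGreeting is a hypothetical name.

```swift
import Combine

// The opaque type still hides the concrete publisher (Just<String> here),
// but Output and Failure are visible to callers.
func makeGreeting() -> some Publisher<String, Never> {
    return Just("hello")
}

var greetings = [String]()

let subscription = makeGreeting().sink { greeting in
    // greeting is known to be a String, so String API is available.
    greetings.append(greeting.uppercased())
}
```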

I will show you how type erasure works in the next section. But first I want to show you a slightly different application of type erasure from the Combine framework. In Combine, you'll find an object called AnyCancellable. If you use Combine, you will encounter AnyCancellable when you subscribe to a publisher using one of Combine's built-in subscription methods.

Without going into too much detail, Combine has a protocol called Cancellable. This protocol requires that conforming objects implement a cancel method that can be called to cancel a subscription to a publisher's output. Combine provides three objects that conform to Cancellable:

  1. AnyCancellable
  2. Subscribers.Assign
  3. Subscribers.Sink

The Assign and Sink subscribers match up with two of Publisher's methods:

  1. assign(to:on:)
  2. sink(receiveCompletion:receiveValue:)

These two methods both return AnyCancellable instances rather than Subscribers.Assign and Subscribers.Sink. Apple could have chosen to make both of these methods return Cancellable instead of AnyCancellable.

But they didn't.

The reason Apple applies type erasure in this example is that they don't want users of assign(to:on:) and sink(receiveCompletion:receiveValue:) to know which type is returned exactly. It simply doesn't matter. All you need to know is that it's an AnyCancellable. Not just that it's Cancellable, but that it could be any Cancellable.

Because AnyCancellable erases the type of the original Cancellable by wrapping it, you don't know if the AnyCancellable wraps a Subscribers.Sink or some other kind of internal, private Cancellable that we're not supposed to know about.
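
To see this wrapping from the outside, here is a minimal sketch that puts a custom Cancellable inside an AnyCancellable. DownloadTask is a hypothetical type for illustration; AnyCancellable's initializer that accepts any Cancellable is part of Combine.

```swift
import Combine

// A hypothetical Cancellable that remembers whether it was cancelled.
final class DownloadTask: Cancellable {
    private(set) var isCancelled = false

    func cancel() {
        isCancelled = true
    }
}

let task = DownloadTask()

// From here on, code holding `erased` has no idea it wraps a DownloadTask.
let erased = AnyCancellable(task)

// Cancelling the wrapper forwards the call to the wrapped object.
erased.cancel()
```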

If you need to hide implementation details in your code, or if you want to return an object that conforms to a protocol with an associated type while keeping that associated type accessible and the concrete type hidden, type erasure just might be what you're looking for.

Applying type erasure in your codebase

To apply type erasure to an object, you need to define a wrapper. Let's look at an example:

protocol DataStore {
  associatedtype StoredType

  func store(_ object: StoredType, forKey key: String)
  func fetchObject(forKey key: String) -> StoredType?
}

class AnyDataStore<StoredType>: DataStore {
  private let storeObject: (StoredType, String) -> Void
  private let fetchObject: (String) -> StoredType?

  init<Store: DataStore>(wrappedStore: Store) where Store.StoredType == StoredType {
    self.storeObject = wrappedStore.store
    self.fetchObject = wrappedStore.fetchObject
  }

  func store(_ object: StoredType, forKey key: String) {
    storeObject(object, key)
  }

  func fetchObject(forKey key: String) -> StoredType? {
    return fetchObject(key)
  }
}

This example defines a DataStore protocol and a type-erasing wrapper called AnyDataStore. The purpose of the AnyDataStore is to provide an abstraction that hides the underlying data store entirely, much like Combine's AnyPublisher. The AnyDataStore object makes extensive use of generics, and if you're not too familiar with them this object probably looks a little bit confusing.

The AnyDataStore itself is generic over StoredType. This is the type of object that the underlying DataStore stores. The initializer for AnyDataStore is generic over Store where Store conforms to DataStore and the objects that are stored in the Store must match the objects stored by the AnyDataStore. Due to the way this wrapper is set up that should always be the case but Swift requires us to be explicit.

We want to forward any calls on AnyDataStore to the wrapped store, but we can't hold on to the wrapped store since that would require making AnyDataStore generic over the underlying data store, which would expose the underlying data store. Instead, we capture references to the methods we need in the storeObject and fetchObject properties and forward any calls to store(_:forKey:) and fetchObject(forKey:) to their respective stored references.
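
The forwarding idea can also be sketched with explicit closures instead of method references, which some readers find easier to follow. Everything in this sketch (Cache, AnyCache, DictionaryCache, UppercasingCache) is a hypothetical name chosen for illustration, and it stores String values so it runs without UIKit.

```swift
protocol Cache {
    associatedtype Value

    func insert(_ value: Value, forKey key: String)
    func value(forKey key: String) -> Value?
}

final class AnyCache<Value>: Cache {
    private let insertValue: (Value, String) -> Void
    private let lookUpValue: (String) -> Value?

    init<C: Cache>(_ cache: C) where C.Value == Value {
        // Each closure captures the wrapped cache, so its concrete type
        // never shows up in AnyCache's own type.
        insertValue = { value, key in cache.insert(value, forKey: key) }
        lookUpValue = { key in cache.value(forKey: key) }
    }

    func insert(_ value: Value, forKey key: String) {
        insertValue(value, key)
    }

    func value(forKey key: String) -> Value? {
        return lookUpValue(key)
    }
}

final class DictionaryCache: Cache {
    private var storage = [String: String]()

    func insert(_ value: String, forKey key: String) { storage[key] = value }
    func value(forKey key: String) -> String? { storage[key] }
}

final class UppercasingCache: Cache {
    private var storage = [String: String]()

    func insert(_ value: String, forKey key: String) { storage[key] = value.uppercased() }
    func value(forKey key: String) -> String? { storage[key] }
}

// Two different concrete caches share one element type once erased.
let caches: [AnyCache<String>] = [
    AnyCache(DictionaryCache()),
    AnyCache(UppercasingCache())
]

for cache in caches {
    cache.insert("swift", forKey: "language")
}

let results = caches.map { $0.value(forKey: "language") }
```

Both wrapped caches here are classes, so the closures capture a reference and writes made through the wrapper stay visible. Wrapping a struct would capture a copy instead, which is worth keeping in mind when a store holds its own state.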

It's quite a generics feast and again, if you're not too familiar with them this can look confusing. I wrote about generics a while ago so make sure to click through to that post if you want to learn more.

Let's see how this AnyDataStore can be used in an example:

import UIKit

class InMemoryImageStore: DataStore {
  var images = [String: UIImage]()

  func store(_ object: UIImage, forKey key: String) {
    images[key] = object
  }

  func fetchObject(forKey key: String) -> UIImage? {
    return images[key]
  }
}

struct FileManagerImageStore: DataStore {
  typealias StoredType = UIImage

  func store(_ object: UIImage, forKey key: String) {
    // write image to file system
  }

  func fetchObject(forKey key: String) -> UIImage? {
    return nil // grab image from file system
  }
}

class StorageManager {
  func preferredImageStore() -> AnyDataStore<UIImage> {
    if Bool.random() {
      let fileManagerStore = FileManagerImageStore()
      return AnyDataStore(wrappedStore: fileManagerStore)
    } else {
      let memoryStore = InMemoryImageStore()
      return AnyDataStore(wrappedStore: memoryStore)
    }
  }
}

In the code snippet above I create two different data stores and a StorageManager that is responsible for providing a preferred storage solution. Since the StorageManager decides which storage we want to use, it returns an AnyDataStore that's generic over UIImage. So when you call preferredImageStore() all you know is that you'll receive an object that conforms to DataStore and provides UIImage objects.

Of course, the StorageManager I wrote is pretty terrible. When you're working with data and storing it you need a lot more control over what happens and whether data is persisted. And more importantly, a StorageManager that will randomly switch between stores is not that useful. However, the important part here is not whether or not my StorageManager is good. It's that you can use type erasure to hide what's happening under the hood while making your code more flexible in the process.
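
For comparison, and because primary associated types came up at the start of this post, here is a rough sketch of how a similar function could look without a hand-written wrapper on Swift 5.7 or newer. It assumes DataStore is redeclared with StoredType as a primary associated type, and it stores String values so it runs without UIKit. InMemoryStringStore, DiscardingStringStore, and preferredStringStore(keepInMemory:) are hypothetical names.

```swift
// Assumption: StoredType is declared as a primary associated type.
protocol DataStore<StoredType> {
    associatedtype StoredType

    func store(_ object: StoredType, forKey key: String)
    func fetchObject(forKey key: String) -> StoredType?
}

final class InMemoryStringStore: DataStore {
    private var values = [String: String]()

    func store(_ object: String, forKey key: String) { values[key] = object }
    func fetchObject(forKey key: String) -> String? { values[key] }
}

struct DiscardingStringStore: DataStore {
    func store(_ object: String, forKey key: String) {
        // Intentionally drops the value.
    }

    func fetchObject(forKey key: String) -> String? { nil }
}

// The existential hides the concrete store but keeps StoredType visible.
func preferredStringStore(keepInMemory: Bool) -> any DataStore<String> {
    if keepInMemory {
        return InMemoryStringStore()
    }
    return DiscardingStringStore()
}

let memoryBacked = preferredStringStore(keepInMemory: true)
memoryBacked.store("hello", forKey: "greeting")
let stored = memoryBacked.fetchObject(forKey: "greeting")

let discarding = preferredStringStore(keepInMemory: false)
discarding.store("hello", forKey: "greeting")
let dropped = discarding.fetchObject(forKey: "greeting")
```

A wrapper like AnyDataStore is still the tool to reach for when you can't change the protocol declaration or need to support older Swift versions.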

The example of AnyDataStore I just showed you is very similar to the AnyPublisher scenario that I described in the previous section. It's pretty complex but I think it's good to know this exists and how it (possibly) looks under the hood.

In the previous section, I also mentioned AnyCancellable. An object like that is much simpler to recreate because it doesn't involve any generics or associated types. Let's try to create something similar except my version will be called AnyPersistable:

protocol Persistable {
  func persist()
}

class AnyPersistable: Persistable {
  private let wrapped: Persistable

  init(wrapped: Persistable) {
    self.wrapped = wrapped
  }

  func persist() {
    wrapped.persist()
  }
}

An abstraction like the one I showed could be useful if you're dealing with a whole bunch of objects that need to be persisted but you want to hide what these objects really are. Since there are no complicated generics involved in this example it's okay to hold on to the Persistable object that's wrapped by AnyPersistable.
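
As a usage sketch, the snippet below repeats the wrapper so it can run on its own and then persists two unrelated types through it. PersistenceLog, Draft, and Settings are hypothetical types that exist only to make the effect observable.

```swift
protocol Persistable {
    func persist()
}

final class AnyPersistable: Persistable {
    private let wrapped: any Persistable

    init(wrapped: any Persistable) {
        self.wrapped = wrapped
    }

    func persist() {
        wrapped.persist()
    }
}

// Records what was persisted so the forwarding is visible.
final class PersistenceLog {
    private(set) var entries = [String]()

    func record(_ entry: String) {
        entries.append(entry)
    }
}

struct Draft: Persistable {
    let log: PersistenceLog
    func persist() { log.record("draft") }
}

struct Settings: Persistable {
    let log: PersistenceLog
    func persist() { log.record("settings") }
}

let persistenceLog = PersistenceLog()

// The array's element type is AnyPersistable; Draft and Settings stay hidden.
let pending = [
    AnyPersistable(wrapped: Draft(log: persistenceLog)),
    AnyPersistable(wrapped: Settings(log: persistenceLog))
]

pending.forEach { $0.persist() }
```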

In summary

In this post, you learned about type erasure. I showed you what type erasure is, and why it's used. You saw how Apple's Combine framework uses type erasure to abstract Publisher and Cancellable objects and hide their implementation details. This can be really useful, especially if you're working on a framework or library and you don't want others to know which objects you use internally, so users can't make assumptions about how your API works under the hood.

After explaining how type erasure is used, I showed you two examples. First, you saw a complicated example that uses generics and stores references to functions as closures. It's pretty complex if you haven't seen anything like it before so don't feel bad if it looks a little crazy to you. I know that with time and experience, a construction like the one I showed you will start to make more sense. Type erasure can be a pretty complicated topic.

The second example I showed you was simpler because it doesn't involve any generics. It mimics what Apple does with Combine's AnyCancellable to hide the underlying Cancellable objects from developers.

If you have any questions about this post or if you have feedback for me, reach out to me on Twitter.
