Removing duplicate values from an array in Swift

Published by donnywals on

Arrays in Swift can hold on to all kinds of data. A common desire developers have when they use arrays, is to remove duplicate values from their arrays. Doing this is, unfortunately, not trivial. Objects that you store in an array are not guaranteed to be comparable. This means that it's not always possible to determine whether two objects are the same. For example, the following model is not comparable:

struct Point {
  let x: Int
  let y: Int
}

However, a keen eye might notice that two instances of Point could easily be compared and two points that are equal would have the same x and y values. Before you can remove defaults from an array, you should consider implementing Equatable for the object you're storing in your array so we can use the == operator to compare them:

extension Point: Equatable {
  static func ==(lhs: Point, rhs: Point) -> Bool {
    return lhs.x == rhs.x && lhs.y == rhs.y
  }
}

Once you've determined that you're dealing with Equatable objects, you can define an extension on Array that will help you remove duplicate values as follows:

extension Array where Element: Equatable {
  func uniqueElements() -> [Element] {
    var out = [Element]()

    for element in self {
      if !out.contains(element) {
        out.append(element)
      }
    }

    return out
  }
}

This way of removing duplicates from your array is not very efficient. You have to loop through all of the elements in the array to build a new array and every call to contains loops through the out array. This isn't ideal and in some cases we can do better.

I say some cases because the more optimal solution requires that the element in your array conforms to Hashable. For the Point struct I showed you earlier this is easy to achieve. Since Point only has Int properties and Int is Hashable, Swift can synthesize an implementation of Hashable when we conform Point to Hashable:

extension Point: Hashable {}

If the elements in your array are already hashable, you don't have to declare this extension yourself.

For an array of hashable elements, we can use the following implementation of uniqueElements():

extension Array where Element: Hashable {
  func uniqueElements() -> [Element] {
    var seen = Set<Element>()
    var out = [Element]()

    for element in self {
      if !seen.contains(element) {
        out.append(element)
        seen.insert(element)
      }
    }

    return out
  }
}

This code looks very similar to the previous version but don't be fooled. It's much better. Note that I defined a Set<Element> in this updated implementation. A Set enforces uniqueness and allows us to look up elements in constant time. This means that seen.contains(element) doesn't have to loop over all elements in the set to find the element you're looking for. This is a buge improvement over the Equatable version of this algorithm because it removes an entire nested loop (which is hidden in contains) from our implementation. Note that the loop from this code can be cleaned up a bit with a compactMap instead of a for loop. This doesn't change the performance but I think it looks a bit nicer:

extension Array where Element: Hashable {
  func uniqueElements() -> [Element] {
    var seen = Set<Element>()

    return self.compactMap { element in
      guard !seen.contains(element)
        else { return nil }

      seen.insert(element)
      return element
    }
  }
}

Functionally these two implementations are the same, and they also have the same performance characteristics so pick whichever one you like.

There is one more way of implementing uniqueElements() for an array of Hashable elements that is even more efficient. It does come with one caveat though. When you use this last version, you might lose the order of your original array which means you should only use this version if you don't care about the ordering of your array:

extension Array where Element: Hashable {
  func unsortedUniqueElements() -> [Element] {
    let set = Set(self)
    return Array(set)
  }
}

By converting the array to a Set, all duplicate values are automatically dropped from the array. We can then convert the set back to an array to get an array with all duplicate values removed. Because sets don't enforce any ordering, you might lose the original order of your array. The previous two versions of uniqueElements preserved the ordering of your input. You should use this version if you need your array's order to be preserved. If you don't care about order and your elements are Hashable I would recommend to use the Set approach I showed last.

I hope this quick tip gave you some useful insights into arrays and how you can deduplicate them. If you have questions, feedback or alternative solutions don't hesitate to reach out on Twitter.


Learn Combine with my new book

Learn everything you need to know about Combine and how you can use it in your projects with my new book Practical Combine. You'll get eleven chapters, a Playground and a handful of sample projects to help you get up and running with Combine as soon as possible.

The book is available as a digital download for just $19.99!

Get Practical Combine

Receive weekly updates about my posts