Array of Objects - No Duplicates

Explanation of different ways to remove duplicates from an Array of objects

Published: 15 January 2020

Tags: remove duplicates, javascript, arrays, objects, Sets

Standard Approach - Ugly But Fast


Consider a data set to remove duplicates from:

const personsArr = [
  { id: "59bcc1ed-a9e5-409a-b14a-ea8ccb8085e6", username: "Steve_McDermott67" },
  { id: "70315223-a0cd-459d-988d-b7974cc8e9bc", username: "Sienna.Collier70" },
  { id: "22913077-0877-461a-9b69-bf217d484583", username: "Gerard79" },
  { id: "18d5897c-47a2-458a-b694-2530505d72b4", username: "Mozelle.McKenzie" },
  { id: "22913077-0877-461a-9b69-bf217d484583", username: "Gerard79" },
  { id: "bc824009-75a4-421b-810a-ede14d11b022", username: "Ally86" },
  { id: "afa72482-1783-46d3-a930-03792e439d60", username: "Giles.Dietrich" },
  { id: "18d5897c-47a2-458a-b694-2530505d72b4", username: "Mozelle.McKenzie" },
  { id: "bc824009-75a4-421b-810a-ede14d11b022", username: "Ally86" },
  { id: "59bcc1ed-a9e5-409a-b14a-ea8ccb8085e6", username: "Steve_McDermott67" },
]

The obvious way is to create a new array by looping over personsArr.

Then keep track of all persons that have already been included and skip them, like so:

const includedPersons = {
  // will contain data like
  // 'bc824009-75a4-421b-810a-ede14d11b022': true
}

const newArr = personsArr.filter(person => {
  // if this person hasn't been included before, include them
  if (!includedPersons[person.id]) {
    includedPersons[person.id] = true
    return true
  }
  // otherwise skip them
  return false
})

This approach is very verbose and ugly to look at. However, it is the fastest in terms of execution speed, since there is no nested looping and an object property lookup is far faster than scanning an array.
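
For the sample personsArr above, the filter keeps exactly one entry per unique id:

console.log(newArr.length) // 6 - the original 10 entries contained 4 duplicates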

A Seemingly Better Approach - Clean but Expensive


Another way is to use the array's built-in reduce method to shorten the code, like so:

const newArr = personsArr.reduce((accumulator, currentElement) => {
  // find whether the current element already exists in the accumulator
  const foundIndex = accumulator.findIndex(elm => elm.id === currentElement.id)

  // if it doesn't, add it to the accumulator
  if (foundIndex < 0) accumulator.push(currentElement)

  // always return the accumulator at the end
  return accumulator
}, [])

However, here we have a nested loop - we are looping over personsArr and then for each element, we also loop over the contents of the accumulator.

Although findIndex will stop at the first instance of finding the correct id, it can still be a very expensive operation.

Imagine having an array of 10,000 items. For the very first item, the accumulator is empty, so the check is fast.

But once the accumulator holds 5,000 elements, every findIndex call may have to scan thousands of entries before giving up, so the total work grows roughly quadratically with the array size.

This is a big no-no in terms of speed and performance.
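
One rough way to see the gap is to time both approaches against a larger synthetic array (the exact numbers will vary by engine and machine, but the difference grows quickly with input size):

// build a synthetic array of 10,000 entries, half of them duplicates
const bigArr = Array.from({ length: 10000 }, (_, i) => ({
  id: String(i % 5000),
  username: `user${i % 5000}`,
}))

console.time("object lookup")
const seen = {}
bigArr.filter(person => {
  if (seen[person.id]) return false
  seen[person.id] = true
  return true
})
console.timeEnd("object lookup")

console.time("reduce + findIndex")
bigArr.reduce((acc, cur) => {
  if (acc.findIndex(elm => elm.id === cur.id) < 0) acc.push(cur)
  return acc
}, [])
console.timeEnd("reduce + findIndex")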

Best Approach - Clean & Fast


The modern approach involves using Sets in JavaScript. Consider the following:

const stringifiedArr = personsArr.map(person => JSON.stringify(person))
// the line above converts every person object into a string:
// '{"id":"59bcc1ed-a9e5-409a-b14a-ea8ccb8085e6","username":"Steve_McDermott67"}'

const setOfStringifiedValues = new Set(stringifiedArr)
// the line above creates a set that only contains unique values

const newArrFromSet = Array.from(setOfStringifiedValues)
// the line above creates a new array from the set (each element is still a stringified object)

const finalArr = newArrFromSet.map(person => JSON.parse(person))
// finally, we parse each element of the array to recreate our objects

Explanation:

Sets won't allow us to push a duplicate value into them and are amazingly fast at looking up values.

But there is a catch - sets can only prevent duplication of primitive types like strings, numbers, etc. If we try to apply the same principle to objects or arrays, sets will allow duplicates. The reason is that all objects are reference types: even though two objects may have the same key/value pairs, sets will treat them as different objects (which they technically are).
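
A tiny demo of this behaviour:

const primitives = new Set(["a", "a", "b"])
console.log(primitives.size) // 2 - the duplicate "a" is dropped

const objects = new Set([{ id: 1 }, { id: 1 }])
console.log(objects.size) // 2 - both kept: same contents, different references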

So, to get around this problem, we convert all of our objects to strings (using JSON.stringify), then create a new Set from these stringified values (where the Set happily eliminates duplicates).

Finally, we convert the set back to an array and parse each stringified object to turn it back into a JavaScript object (using JSON.parse).

Steps are as follows:

(1) Stringify all objects inside the array

(2) Create a set with these values

(3) Convert the set back to an array

(4) Convert all stringified values back to JavaScript objects

To make the code super clean, we can write it all in one expression:

const newArr = Array.from(
  new Set(personsArr.map(person => JSON.stringify(person)))
).map(stringifiedPerson => JSON.parse(stringifiedPerson))
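
One caveat worth knowing: JSON.stringify is sensitive to key order, so two objects holding the same data with keys in a different order will stringify differently and will not be treated as duplicates:

const a = JSON.stringify({ id: 1, username: "Ally86" })
const b = JSON.stringify({ username: "Ally86", id: 1 })
console.log(a === b) // false - same data, different key order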

MDN Reference for Sets: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set