onestopjs

by Martin Koparanov



Why Big O doesn't tell the whole story

Published: Feb 11, 2023

If you are already experienced with algorithmic complexity, you already know what I will tell you here. I won't claim that the Big O notation is a sham, or that you have been misled.
However, if you are new to the world of algorithms, you may find a deeper understanding of the matter after reading this.

To understand why an algorithm's Big O notation may be misleading, we first have to understand what it is and what it is not.

What Big O is not

Big O describes how an algorithm scales with its input. Just that, nothing more.

Big O does not actually tell you the exact time an algorithm will take to execute. It can't do that. Nothing can.

What Big O is

Shortly put, Big O is a technique used to describe the time and space complexity of an algorithm, based on its input size. It simply states how an algorithm scales with its input.

For example, O(n) algorithms run in linear time, which means the execution time grows in direct proportion to the input size. If you double the input, you double the time it will take: 1 second for 1 item, 2 seconds for 2 items, 4 seconds for 4 items, and so on. It scales linearly.
However, O(n²) algorithms run in quadratic time, which means that if you double the input, it quadruples the time: 1 second for 1 item, 4 seconds for 2 items, and 16 seconds for 4 items. It rises like crazy:

[Chart: linear versus quadratic growth]

This means that, for a sufficiently large input, an O(n) algorithm will always outperform an O(n²) algorithm.
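
To make that concrete, here is a rough sketch of one O(n) function and one O(n²) function (the names and the duplicate-checking task are purely illustrative):

const sumAll = (numbers) => {
  // O(n): a single pass over the input
  let total = 0;
  for (let i = 0; i < numbers.length; i++) {
    total += numbers[i];
  }
  return total;
};

const hasDuplicate = (numbers) => {
  // O(n²): for every element, compare it against every element after it
  for (let i = 0; i < numbers.length; i++) {
    for (let j = i + 1; j < numbers.length; j++) {
      if (numbers[i] === numbers[j]) {
        return true;
      }
    }
  }
  return false;
};

Double the length of numbers and sumAll does roughly twice the work, while hasDuplicate does roughly four times the work.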

Constant operations

For a sufficiently large input, an O(n) algorithm will always outperform an O(n²) algorithm.

What if the input is not sufficiently large?

Every algorithm performs some constant operations, like advancing the index while iterating through an array, or calculating the midpoint for binary search. Those operations are constant - each one takes the same amount of time regardless of the input size - so they are dropped from the time complexity analysis. Even with 10 million items, each of those constant operations has exactly the same cost.
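
For example, here is a rough, illustrative sketch of binary search (not taken from any particular library), with its constant operations pointed out:

const binarySearch = (sortedItems, target) => {
  let low = 0;
  let high = sortedItems.length - 1;
  while (low <= high) {
    // Each of these is a constant operation - it costs the same
    // whether sortedItems has 10 elements or 10 million:
    const mid = Math.floor((low + high) / 2); // calculating the midpoint
    if (sortedItems[mid] === target) {        // a single comparison
      return mid;
    }
    if (sortedItems[mid] < target) {
      low = mid + 1;  // adjusting a boundary
    } else {
      high = mid - 1; // adjusting a boundary
    }
  }
  return -1;
};

What changes with the input size is how many times the loop runs, not how much each pass through the loop costs.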

How do you compare two O(n) algorithms?
Imagine you have the following algorithms: linearSearch and linearSearchSlow, both of them O(n).

const linearSearch = (items, target) => {
  for (let i = 0; i < items.length; i++) {
    if (items[i] === target) {
      return i;
    }
  }
  return -1;
};

const linearSearchSlow = (items, target) => {
  for (let i = 0; i < items.length; i++) {
    for (let j = 0; j < 1_000_000_000; j++) {
      // do that a billion times, for every element
      simulateTheUniverse();
    }
    if (items[i] === target) {
      return i;
    }
  }
  return -1;
};

Simulating the universe is a constant operation - you iterate over every atom and calculate its position. Doing it a billion times is still a constant operation: we do it exactly a billion times, no matter how big the algorithm's input is. Since it is a constant operation, both algorithms are technically O(n). However, one of them will be a bit slower - you decide which one.

This also means that, for some inputs, an O(n²) algorithm can be faster than an O(n) one.

Obviously, that example is a bit extreme, but it helps demonstrate my point.
If you want to see me mess around to find out what a sufficiently large input is for binary search versus linear search, check out my article: When is binary search worth it?
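
If you want to get a rough feel for it yourself, a very unscientific way to time the two looks something like this, using linearSearch from above and the binarySearch sketch from the constant-operations section - treat the numbers as a hint, not a benchmark:

const measure = (label, fn, runs = 1_000_000) => {
  const start = performance.now();
  for (let i = 0; i < runs; i++) {
    fn();
  }
  console.log(`${label}: ${(performance.now() - start).toFixed(1)} ms for ${runs} runs`);
};

const items = Array.from({ length: 50 }, (_, i) => i);

// On an array this small, don't be surprised if the O(n) linear search
// beats the O(log n) binary search - each of its steps is cheaper.
measure('linearSearch', () => linearSearch(items, 49));
measure('binarySearch', () => binarySearch(items, 49));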

The worst case is not the average

Big O is useful because it gives us an idea of how our algorithm will perform at its worst:
"Hope for the best, prepare for the worst"

However, in real life, things are not so simple. Algorithms don't have just their worst cases. They have their good ones too! And on average, you get something in-between.

Take a look at QuickSort - its worst-case complexity is O(n²), yet it is one of the most efficient sorting algorithms out there. Its worst case occurs when the chosen pivot keeps ending up as the array's minimum or maximum, so every partition is lopsided - but that happens rarely enough in practice that the algorithm is still widely used. Its average-case complexity is O(n*log(n)), like the other efficient sorting algorithms.
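
For reference, here is a simplified, textbook-style QuickSort sketch - not the implementation any particular library uses - with the pivot choice that triggers the worst case called out:

const quickSort = (items) => {
  if (items.length <= 1) {
    return items;
  }
  // Naive pivot choice: always take the last element.
  // On an already-sorted array this pivot is always the maximum,
  // every partition is completely lopsided, and the whole thing degrades to O(n²).
  const pivot = items[items.length - 1];
  const smaller = [];
  const larger = [];
  for (let i = 0; i < items.length - 1; i++) {
    if (items[i] <= pivot) {
      smaller.push(items[i]);
    } else {
      larger.push(items[i]);
    }
  }
  // On typical inputs the two halves are roughly balanced, which gives O(n*log(n)).
  return [...quickSort(smaller), pivot, ...quickSort(larger)];
};

Real implementations avoid the worst case with smarter pivot selection, such as picking a random element or the median of three.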

Even so, QuickSort only becomes the quickest once the input crosses a certain size threshold, due to the overhead of its constant operations. For smaller arrays, another sorting algorithm (such as insertion sort) can be employed. In fact, this is what V8 (the engine that powers Chrome and Node.js) used to do - check out Getting things sorted in V8 on V8's blog.
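
The idea behind such a hybrid looks roughly like this - the threshold of 10 is a made-up number for illustration, real engines tune their own:

const SMALL_ARRAY_THRESHOLD = 10; // illustrative value, not V8's actual cutoff

const insertionSort = (items) => {
  // O(n²), but with very little constant work per element;
  // sorts the array in place and returns it
  for (let i = 1; i < items.length; i++) {
    const current = items[i];
    let j = i - 1;
    while (j >= 0 && items[j] > current) {
      items[j + 1] = items[j];
      j--;
    }
    items[j + 1] = current;
  }
  return items;
};

const hybridSort = (items) => {
  // Below the threshold, the "worse" O(n²) algorithm is the faster choice.
  if (items.length <= SMALL_ARRAY_THRESHOLD) {
    return insertionSort(items);
  }
  return quickSort(items); // the quickSort sketch from above
};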

Conclusion

In practice, you will rarely need to think beyond an algorithm's Big O, and with this article, I am not saying you should. All I am trying to do is provide a deeper insight into how complexity analysis works for people who are not already familiar with it.