 ## Complexity Theory 101: Introduction to Big O | by Sara A. Metwalli | Dec, 2020

One of the important terms you will come across as a data scientist or a developer is “Big O notation.” Big O is a notation used to express any computer algorithm’s complexity in terms of time and space. In other words, Big O refers to how an algorithm scales with respect to its input.

This is particularly essential for data science applications. Most of the datasets we use to train and develop machine learning models are medium in size. So, it is quite important for the data scientist to fully understand how the model’s behavior will change when bigger datasets are applied to it.

The best and most efficient way to test this behavior is using Big O notation. If you’re getting into data science from a technical background (computer science, engineering, or a related field), then you may be familiar with Big O notation. However, if you’re switching to data science from a non-technical field, then you might find Big O notation somewhat complex.

The good news is, even those with technical backgrounds sometimes find Big O to be confusing. That’s not because it’s difficult to understand, but because applying it is not always simple.

This article will provide a simple introduction to Big O and how you can derive the Big O of your code. To explain the different complexities, I will use Python code. However, the same logic is implementable in any other programming language.

Let’s start simple and work our way up. Assume we have some data entries stored in a list (in C++, Java, and other languages, it’s called an array), and I want to get the 5th item of this list.

This is a simple problem to solve in Python; we just need to access the list’s 5th position (index 4) and read its content.

```python
dataEntries = [53, 71, 40, 11, 23, 10, -7, 32]
print(dataEntries[4])
```

Accessing a specific address (index) in a list requires O(1) — read as order 1 — which means the size of the list does not matter. Whether it has 8 entries or 1,000,000 entries, I will just access location 4 to get the 5th item.

Complexity O(1) is the best. It’s not always achievable, but if it is, then your code is independent of the input size.

Other operations that have complexity O(1) are the print function and simple arithmetic: addition, subtraction, and, in the case of integers, multiplication and division. Multiplication tends to get more complex when it deals with matrices.

The function below has a complexity O(1). If I want to generalize it to add/subtract an arbitrary number of ints, then the complexity becomes O(n).

```python
def addSub2num(a, b, op):
    if op == "+":
        return a + b
    elif op == "-":
        return a - b
    else:
        print("invalid op")
```
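As a sketch of the generalization mentioned above (the function name and the list-of-operations format are my own, not from the original), processing an arbitrary number of additions and subtractions makes the complexity O(n):

```python
def addSubMany(start, operations):
    """Apply a list of ("+" or "-", value) operations to a starting number.

    The loop runs once per operation, so for n operations
    the complexity is O(n).
    """
    result = start
    for op, value in operations:
        if op == "+":
            result += value
        elif op == "-":
            result -= value
        else:
            print("invalid op")
    return result

print(addSubMany(10, [("+", 5), ("-", 3), ("+", 2)]))  # 14
```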

The code we’ve dealt with so far didn’t include any loops or iterations. Once loops appear, complexities tend to increase.

Getting back to the list `dataEntries`, instead of getting the 5th item, I now want to print all the entries in the list. To do that, I will need to use a loop, which means repeating the same process a number of times equal to how many items are in the list. In our case, 8 times.

```python
dataEntries = [53, 71, 40, 11, 23, 10, -7, 32]
for i in dataEntries:
    print(i)
```

The complexity of a loop depends on how many times the loop is triggered. If we have a list of 5 items, then the loop will iterate 5 times; if it has 1000 items, we’ll iterate 1000 times, and so on. As you might have noticed, unlike the previous examples, this code is highly dependent on the size of the input (`dataEntries`).

To represent this, we will assume the list’s size is n, and so the complexity of the code will be O(n). Another example is summing all items of a list.

```python
dataEntries = [53, 71, 40, 11, 23, 10, -7, 32]
total = 0
for i in dataEntries:
    total += i
print(total)
```

This complexity is often referred to as linear complexity. Think of it like this: since the relationship between the input size and the number of iterations is proportional — it can be described as y = ax — the complexity is linear.

So, one loop is O(n), what happens if the loops are nested?

Our code may contain nested loops: for example, summing the numbers in a matrix (a 2D list), finding duplicates in a list, or basically any code that has one loop inside another.

Let’s focus on summing the items of a 2D list:

```python
list2D = [[1, 2],
          [3, 4]]
total = 0
for row in range(len(list2D)):
    for col in range(len(list2D[row])):
        total = total + list2D[row][col]
```

Since each loop runs n times, where n is the dimension of the matrix (here n = 2), and the loops are nested, we multiply their complexities: the inner loop runs n times for each of the n iterations of the outer loop, or n² = 4 operations in total. That is O(n²).

It’s important to say that when we calculate the complexity, we are trying to figure out the worst-case scenario. For an n×n matrix, the time needed for this summation follows a polynomial `an²+bn+c`. That is why O(n²) is called quadratic — a kind of polynomial — complexity.
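The duplicate-finding example mentioned earlier follows the same nested-loop pattern; here is a minimal sketch of my own (the function name is hypothetical, not from the original):

```python
def find_duplicates(items):
    """Return every value that appears more than once in items.

    The nested loops compare each pair of positions, so for a list
    of size n the inner body runs on the order of n * n times: O(n²).
    """
    duplicates = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j] and items[i] not in duplicates:
                duplicates.append(items[i])
    return duplicates

print(find_duplicates([53, 71, 40, 71, 23, 53]))  # [53, 71]
```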

Before we get into the explanation of logarithmic complexity, note that log here is log base 2, not log base 10 as commonly used in math.

For example, assume we want to sum up the powers of 2 from 1 up to a certain upper limit n. Then we might do something like this:

```python
i = 1
total = 0
n = 100
while i < n:
    i = i * 2
    total += i
```

The difference here is that our loop skips most numbers, doubling i on each pass, and so is only triggered about log(n) times. Hence, the complexity here becomes O(log(n)).

O(log(n)) is often the complexity of divide-and-conquer algorithms such as binary search, as well as of operations on balanced binary search trees and priority queues.
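Binary search, mentioned above, is the classic O(log(n)) example; this is a minimal sketch over a sorted version of the list from earlier:

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent.

    Each iteration halves the remaining search range, so a list of
    size n needs at most about log2(n) iterations: O(log(n)).
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1   # target is in the upper half
        else:
            high = mid - 1  # target is in the lower half
    return -1

print(binary_search([-7, 10, 11, 23, 32, 40, 53, 71], 23))  # 3
```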

If we have code with complexity O(log(n)) that gets repeated on the order of n times, the total becomes O(n log(n)). Famous examples of this are merge sort and quicksort.
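A minimal merge sort sketch illustrates the O(n log(n)) pattern: the list is halved O(log(n)) times, and each level does O(n) work merging the halves back together.

```python
def merge_sort(items):
    """Sort a list: O(log(n)) levels of splitting, O(n) merge work per level."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves: O(n) work at this level.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([53, 71, 40, 11, 23, 10, -7, 32]))
```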

Going through the above examples, you might have figured out some rules for calculating Big O, but let’s sum them up:

1. Reading or writing an item in a list or a dictionary is O(1).
2. Going through an iterable is O(n).
3. Nested loops lead to O(n²) complexity.
4. Divide-and-conquer algorithms that do linear work at each level of the split (such as merge sort) have O(n log(n)) complexity.
5. We sum up the complexity of sequential loops and multiply the complexity of nested loops.
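To illustrate rule 5, here is a small sketch of my own combining a sequential loop with a nested pair; the total is O(n) + O(n²), which simplifies to O(n²) because the fastest-growing term dominates:

```python
def combined(items):
    n = len(items)
    # Sequential loop: O(n)
    total = 0
    for x in items:
        total += x
    # Nested loops: O(n²)
    pairs = 0
    for i in range(n):
        for j in range(n):
            pairs += 1
    # O(n) + O(n²) = O(n²): we keep only the dominant term.
    return total, pairs

print(combined([1, 2, 3]))  # (6, 9)
```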

Although most of the attention falls on data collection, reading, and analysis in the field of data science, being comfortable with time and space complexities (Big O) can take your skillset to the next level.

I believe every data scientist should know the basics of Big O notation. Knowing how the algorithm you chose to implement an application behaves with different sizes and shapes of input datasets can make or break your project.

> Any coder can solve a problem given enough time, solving it well, however — that’s what we want to do! — Rob Conery

Some algorithms work perfectly on small datasets yet fail at keeping up with larger datasets. Big O notation represents a simple way for data scientists to know what to expect from an algorithm and how to use it efficiently.

Big O’s basics are quite simple and straightforward, so learning it will be much easier than learning other aspects of data science. Mastering it, however, will need a lot of practice. On the bright side, as a data scientist, you build many projects, which means many chances to hone your Big O skills.