The Skills You Need To Succeed In Business Intelligence | by Adam Finer | Nov, 2020
So we’ll start with the most obvious, the basic foundation and source of all business intelligence, data. For a geek like me, data can be quite exciting. It’s nerdy, I know, but I actually often get a buzz out of getting access to a brand new data source for the first time. It’s like diving into the unknown or setting off on an adventure into uncharted territory to discover things that no one has ever discovered before.
Because, sometimes, that’s essentially what I’m doing. Manipulating and aggregating the data, and crunching the numbers to gain insights from it. To learn things that neither I nor the data’s owners knew before. Anyway, I digress a little but not without reason. I’m kind of demonstrating to you that, unless you can get passionate about data, there’s not much point in exploring the possibility of a career in data science. So that is the first skill you need. Well, more of a character trait really.
When it comes to data though there are 3 main areas you need to know about:
- What it is
- Where it is
- How to access and query it.
“To learn things that neither I nor the data’s owners knew before”
‘What it is’ means what it’s made up of and the different formats it comes in. Fairly straightforward basic stuff that’s quick to learn. To find out more about what data is made up of, check out this video from the Vitamin BI YouTube channel.
In terms of the different formats it comes in, well that depends on its source. You essentially have structured (in this I’ll include semi-structured) and unstructured data. Structured data tends to be the kind you would probably most associate with what data looks like. The kind of traditional rows and columns tabular format you might find in an Excel spreadsheet.
Unstructured data is basically everything else. And by that I mean literally anything else. Emails, videos, audio, pdfs, log files, anything. I must say that unstructured data is fairly uncommon when it comes to the realm of BI so it’s not something you would necessarily need to be an expert on (I’m certainly not).
You would however need to be very familiar with structured data, which tends to be stored in Relational Database Management Systems, or RDBMS. We’ll talk more about these shortly. What I mean by ‘familiar’ is how they function in terms the ‘relational’ model as it pertains to the storage of data. Also how to access and query these kinds of databases. More on this coming up.
This leads quite nicely into the 2nd element of data I mentioned, where it is. So, as I said, the data you will need to deal with, more often than not, is structured data that’s stored in some kind of relational database, things like MySQL, Postgres SQL, SQL Server (more on this in a minute) or it could be something as basic as an a csv (comma separated values) file or indeed an Excel file. To analyse these last 2 properly in BI tools, however, they will need to be structured just like a table in a relational database is, i.e. in a tabular rows and columns format.
Going back to relational databases, they’re either on-premise or in the cloud. So either installed on an in-house server or, as the popular meme jokes, on someone else’s computer, i.e. the cloud. The cloud is in fact much more than just that but for the purposes of this discussion that’s essentially what we’re talking about.
I just mentioned some of the different SQL database types. Well, there are cloud versions of these as well. So instead of installing one yourself on an in-house server, you can essentially rent what’s called an instance of one in the cloud.
Ok, so now we’ve covered the ‘what’ and ‘where’ of data. The third part is knowing how to access and query data. Again, we’ll break this down into the different types of data.
When it comes to relational databases, they have what’s called a host address, basically its IP address where it can be contacted. And then you need to use a username and password that’s been set up in the database in order to connect to it and query its data. To query just means to ‘ask questions of’ and this is done using the query language called SQL, also commonly called/pronounced “sequel”. It stands for Structured Query Language and it has different versions (like I mentioned before, MySQL, Postgres etc) whose syntax varies slightly but is mainly the same when it comes to querying.
When it comes to cloud data sources (other than those cloud versions of RDBMS), the way these are communicated with is usually via what’s called an API (Application Programming Interface). The most common types of these are REST and SOAP. They have what are called endpoints that let you access specific datasets and the way querying works is that you make calls to the API, that contain parameters, asking it to return the data you want.
So I guess a good question you might want to ask here is, “do I need to learn all about how APIs work?” and the answer is “yes” and “no”. Yes, you do need to have a good understanding of how they work because you may be called upon to write custom API calls to query data and, if you haven’t got a clue about them then you’ll be a bit lost but you don’t need to be an expert, know how to build them, that kind of thing.
When you’re working with these online services and their data sources, you’ll more than likely be doing so with a BI tool. And normally, the tool you’re working with will have a data connector for that specific source, which means you don’t need to formulate the individual API calls yourself. If it doesn’t have a connector then you’ve chosen the wrong tool which is a whole different story. So again, although it’s a good idea to have basic API knowledge in your tool belt, it’s not essential.
So, basic knowledge of APIs is a good idea but not essential. What is mandatory, in my eyes however, is knowing SQL for querying. Not necessarily how to build and administer databases, but how to select the data you need to work with, how to filter and pre-aggregate it and how to join different tables together into what are called ‘views’. You may be asking “but won’t the BI tool be able to do that for me as well?”. The answer to that is almost definitely yes. However, the problem is that you may need to do some work on the data before you connect it to your BI tool. Let’s say that the data you need to visualise is contained across several different tables in a database. These tables contain all historical data, let’s say millions of rows in each. But you only want to query data for a specific period of time. And you don’t need all of the dozens of fields (or columns) of every table. To just connect to all that data with your BI tool and then query that is really inefficient. It means that your BI tool is going to need to aggregate millions of rows across different tables for every query which will make your dashboards really slow to load.
And what if your BI tool doesn’t have a graphical interface that you can use to join the different tables together? You’ll need to know the SQL code to do that as well.
All this being said, BI tools are becoming more and more advanced in their capabilities and functionalities so there’s a good chance you’ll be able to do most of what I’ve talked about without knowing SQL. But what if you can’t?
Another benefit of knowing how to query data using SQL is that it helps you to understand how querying works in general. How queries are formulated to join and aggregate data.
If you’d like to learn basic SQL in 15 minutes, check out this video on my YouTube Channel (once you’ve finished the article, of course).
After you understand and are comfortable with data you’ll then need to know how to use BI tools. Or, more specifically, how to use one BI tool inside and out. And I mean how to do everything with it, become a full-on expert. Because once you have experience of pushing one BI tool to its limits, you’ll then probably be able to use any other BI tool to a fairly proficient degree within a very short space of time. This is because most of them function in a very similar way. And the reason for this is that there isn’t 100 different ways to query the same data. The result behind the data visualisations you’ll need to produce for your dashboards is identical so it stands to reason that the way to get to that result using different tools should be very similar.
Here’s a question: Have you ever used a pivot table? If you don’t know what that is then watch this video here. If you have then you pretty much know how to use a BI tool to query and visualise your data. Because most BI tools, at least the ones I’ve seen, basically have a pivot table engine at their core to query data. The way the tool’s interface presents the functionalities of the pivot table model will differ, the way things are named, how the elements of the query are placed into the pivot table etc. But it’s still a pivot table at its core.
Of course there’s more to a BI tool than visualising data by placing data fields into a pivot table. You’ll also need to know how apply manipulations to the results of queries and also create custom fields by writing formulas. Lots of BI tools have their own proprietary language for writing them that you’ll need to learn. Others will actually incorporate elements of SQL or something very similar (like Google’s Data Studio). So, another good reason to learn SQL.
Writing custom formulas is all part of the day to day of working in BI. Because a lot of the time you’ll be problem solving. Which is why I talked about pushing one BI tool to its limits. Because, once you have, you’ll become familiar with the kinds of problems you’ll need to solve. And although the data is always different, the same kinds of problems always have a tendency to reappear. So by learning how to solve these problems using one tool you’ll be better equipped and know how to approach them when using a different one.
The next skill to have is to know how to effectively visualise data and build reports or dashboards.
There is far more that goes into a dashboard than just trying to make it look as pretty or as cool as possible. So much more. Data visualisation theory is something that’s important to have a solid grasp of and there are some really great books on the subject. If you’d like to learn more then I’d suggest a great place to start would be “The Visual Display of Quantitative Information” by Edward Tufte (affiliate link).
In terms of dashboard design there are certain rules that you should follow like making sure you tell a story and not overloading dashboards with too much information. I’m not going to go into much more detail than that because I’ve written another book dedicated to this subject. If you’d like to get your hands on a copy just get in touch via my email at the end of this book.
Here are some other things that might indicate whether you’d be suited to a career in Business Intelligence.
Passion For Data
I mentioned right at the beginning about how I got excited about getting my hands on new data. It’s worth repeating that if this doesn’t resonate with you then you should probably look at a different career choice.
Penchant For Problem Solving
Solving problems is what you’ll be doing a lot of if you pursue a career in Business Intelligence or Data Science. Rarely do projects that involve wrangling data from multiple different sources go as smoothly as planned.
The good thing, however, is that the longer you’ve been doing the job and the more projects you’ve worked on, the more you will anticipate these problems and plan ahead to deal with them. Not all of them, but at least some.
This ties in with data visualisation. It’s all well and good knowing the hard and fast rules of good dashboard design. But they will only get you so far. You’ll also need to have an artistic, creative side as well to take your dashboards to the next level and have clients coming back for more or recommending your work to others.