In 2017 I was tired of my job in IT and wanted to make a change.
I wanted to become a data analyst. I had always enjoyed data. From sports to personal finance to counting calories, data and numbers just made sense to me.
If you’ve always been a “numbers guy” but don’t work as a data analyst, this guide will walk you through what you need to do to land your first data analyst job.
It’s Not Hard to Do Data Analysis
Growing up, I played fantasy football. I picked players based on how they had performed and projected how they would perform. I found value in the draft by seeing which players others were over-valuing. I even capitalized on weekly matchups. It was all data related and it was all interesting.
If you have the skills to be decent at fantasy football, you have the skills to become a data analyst.
This is not to say that you need to be a fantasy football player to become a data analyst. I’m just saying that if you enjoy data and have done data analysis for your hobbies and games in your life, you likely already have the aptitude to be a data analyst.
The trick is demonstrating those skills to future employers. You can’t put fantasy football on your resume, but you can put the skills that make you good at fantasy football on your resume.
Build a Data Portfolio!
If you have no experience being a data analyst, you have no way to show you can actually work with data. This is why you should have a portfolio of data projects that you can speak to and show to employers. Here’s mine.
A portfolio exhibits the type of work you are capable of doing. You’re literally telling the world what you can produce. A portfolio is proof. It’s a certainty. It’s something your future employer wants to see.
Build a personal blog (it doesn’t have to be fancy) and start recording your progress. After each data project, do a write-up and post it to your blog. This will be your portfolio of projects and it will showcase the skills you have. It’ll provide proof of your data abilities. Since you won’t have a resume full of data experience, you can demonstrate your ability on your blog.
If you look at my Tableau Public Profile, you’ll notice some of those early projects are super rough. I was new to Tableau. I was learning. But you can see as time progresses my dashboards get cleaner and slicker the more I learned and the more comfortable I became with the platform.
Finally, a portfolio shows you have passion. No one paid you to do these projects. No one forced you to put what you’re capable of out there. You did it out of your own volition. Hiring managers like to see that. They want someone with passion, not just someone who will punch in and punch out.
A portfolio exhibits the type of work you are capable of doing. It records your journey by showing how much you’ve grown. And a portfolio shows you have a passion for data.
With that said, let’s talk about some of the skills you’ll need to showcase in your portfolio.
Working with data can be roughly summed into three parts: Storing the data, Cleaning the data, and Presenting the data. I’m simplifying a bit, but if you’re looking for an entry/junior data analyst job this is a good framework to get you started.
There are a ton of tools used in each one of the parts. There are hundreds, and every company uses a different set of tools. But it’s okay, they are all relatively similar. If you can show you understand one of them, it’s assumed you can understand them all.
For this guide, I’m going to recommend the tools/software to work with. I’ve used or currently use all of them. And they’re all free, so that should make life easier. Trust me. If you get good at these tools and show it in your portfolio, you’ll get hired.
The next sections will break down the tools, and then I’ll be suggesting some project titles. If you don’t know what the project titles mean, good. I hope you use them as jumping off points to explore different concepts and aspects of the tools I’m describing.
Storing the Data
As a data analyst, you’ll likely have three consistent places of storing data. You’ll be keeping the data in a database of some kind, usually a SQL database. Or you’ll pull the data directly from a platform like Facebook or Google. Or you’ll have some Excel/CSV file you keep data in.
That means you’ll want to understand SQL, a platform, and Excel.
If you don’t know, SQL (pronounced “sequel”) is a language that is used to query databases. Querying a database is essentially using a set of rules to pull the data you want. So if you had a database full of information about flowers, you could create a query to return only the data on the red flowers. Being able to select the specific data you want is a vital skill for any data analyst.
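Here’s a minimal sketch of the red flowers query. It uses Python’s built-in sqlite3 module so there is nothing to install. The flowers table, its columns, and its rows are all made up for illustration.

```python
import sqlite3

# In-memory database: nothing to install or configure.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for this example.
conn.execute("CREATE TABLE flowers (name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO flowers (name, color) VALUES (?, ?)",
    [("rose", "red"), ("tulip", "red"), ("daisy", "white"), ("iris", "purple")],
)

# The query itself: keep only the rows whose color is red.
red_flowers = conn.execute(
    "SELECT name, color FROM flowers WHERE color = 'red' ORDER BY name"
).fetchall()

print(red_flowers)  # [('rose', 'red'), ('tulip', 'red')]
```

The WHERE clause is the “set of rules” part. Everything else is just setup so the query has something to run against.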
There are a lot of free resources out there to learn SQL. You’ll want to practice and become familiar with SQL. However, creating a MySQL database on your laptop can be a bit of a pain, and it isn’t very relevant to a portfolio. Most of the data projects in your portfolio will likely use Excel/CSV files.
That being said, SQL is very important and you should get comfortable with it.
In terms of learning a platform, I recommend Google Analytics. Google Analytics is a free and popular platform. Most major companies use it. If you can work with Google Analytics, you can work with all the other platforms. You can open a free demo account and work with it pretty easily. Here is my guide on how to get started with the Google Analytics Demo Account.
In your portfolio, you’ll want to show an ability to work on a platform and that you are competent with Excel. Further, if you can find a way to show an understanding of SQL, that would certainly help.
Portfolio Project Ideas:
- How to remove duplicates using SQL or Excel
- This is an important skill in cleaning data
- Pulling data from Google Analytics and merging data in SQL or Excel
- Merging data can be tricky and pulling data from a platform is a skill you will use every day. You’ll want to show you’re proficient.
- An overview of Excel Functions
- Excel may not be the most exciting software, but it is used everywhere.
- The anatomy of a SQL statement
- This goes to show your understanding of SQL.
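As a starting point for the duplicates project, here’s one possible sketch of removing exact duplicate rows with SQL. It runs through Python’s built-in sqlite3 module, and the signups table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with one row entered twice.
conn.execute("CREATE TABLE signups (email TEXT, signup_date TEXT)")
conn.executemany(
    "INSERT INTO signups (email, signup_date) VALUES (?, ?)",
    [
        ("ann@example.com", "2020-01-05"),
        ("bob@example.com", "2020-01-06"),
        ("ann@example.com", "2020-01-05"),  # exact duplicate of the first row
    ],
)

# SELECT DISTINCT returns each unique combination of the listed columns once.
unique_rows = conn.execute(
    "SELECT DISTINCT email, signup_date FROM signups ORDER BY email"
).fetchall()

print(unique_rows)
# [('ann@example.com', '2020-01-05'), ('bob@example.com', '2020-01-06')]
```

This only catches rows that match exactly. Near-duplicates, like the same email typed with different capitalization, need cleaning first.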
Cleaning the Data
Cleaning and working with data is where you will spend most of your time as a data analyst. Data is almost never collected perfectly. Formatting can be inconsistent. There could be typos. There can even be duplicate data. Cleaning data is the skill you’ll want to practice and showcase the most.
You can use Excel to clean data. However, cleaning data is why you’ll want to learn a programming language. The two programming languages most commonly used in data are R and Python. I use R. There are plenty of people who use Python. People who come to data from a computer science degree tend to work with Python.
The important thing isn’t which one you learn. It’s that you learn one of them. It’s assumed that if you can work with R, you can learn Python. And vice versa.
Both are free. I recommend R. R is more data-centric, while Python is a programming language that can do data things. In addition, I think R is a more intuitive language to learn. You can also use SQL to clean your data, and that is a fantastic skill to have. However, there is a somewhat steep learning curve to getting a SQL environment set up on a laptop. So while I recommend R, know that using SQL to clean data is very much okay, and in many places recommended.
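To make the cleaning problems above concrete, here’s a small sketch in Python (the same idea carries over to R or SQL). It handles inconsistent formatting, one known typo, and a duplicate row. The rows and the TYPO_FIXES lookup are made up for illustration.

```python
# Hypothetical raw data with stray spaces, mixed case, a typo, and a duplicate.
raw_rows = [
    {"city": " new york ", "sales": "100"},
    {"city": "New York", "sales": "100"},
    {"city": "Chicgo", "sales": "250"},
]

# Hypothetical lookup of known typos and their corrections.
TYPO_FIXES = {"Chicgo": "Chicago"}


def clean_row(row):
    """Trim whitespace, standardize capitalization, fix known typos."""
    city = row["city"].strip().title()
    city = TYPO_FIXES.get(city, city)
    return {"city": city, "sales": int(row["sales"])}


cleaned = []
seen = set()
for row in raw_rows:
    fixed = clean_row(row)
    key = (fixed["city"], fixed["sales"])
    if key not in seen:  # skip rows that are duplicates after cleaning
        seen.add(key)
        cleaned.append(fixed)

print(cleaned)
# [{'city': 'New York', 'sales': 100}, {'city': 'Chicago', 'sales': 250}]
```

Notice that the duplicate only shows up after the formatting is fixed, which is why cleaning usually comes before de-duplicating.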
Portfolio Project Ideas:
- Using Lubridate with R to convert Time and Dates
- Dates and times are the most frustrating part of cleaning data. You want to show your ability to work with different date formats. The lubridate package in R will help you in this regard.
- How I use Dplyr
- The dplyr package in R is specifically geared towards cleaning data.
- Exploring data with ggplot2.
- The ggplot2 package in R is for graphing data. Often when exploring data you’ll be using graphs to first understand what you are working with.
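The date project above is written around lubridate in R. If you go the Python route instead, a rough equivalent can be sketched with the standard library’s datetime module. The list of formats and the sample dates are assumptions for the example, and slash dates are assumed to be month first.

```python
from datetime import datetime

# Assumed set of formats that show up in the raw data.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]


def parse_date(text):
    """Try each known format until one fits, then return a date object."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")


# The same day written three different ways.
raw_dates = ["2017-03-04", "03/04/2017", "04.03.2017"]
clean_dates = [parse_date(d).isoformat() for d in raw_dates]

print(clean_dates)  # ['2017-03-04', '2017-03-04', '2017-03-04']
```

Raising an error on an unknown format is a deliberate choice here. A loud failure is easier to document in a write-up than a silently wrong date.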
Presenting the Data
Data presentation usually involves two parts: insights and dashboards/graphs. Insights tell people what the data is saying and what should be done about it. Dashboards and graphs back it up as evidence. Honestly, I’m being a bit simplistic. But this is roughly how it goes.
You can use R and Python to present data. That option does exist. People do it. That being said, there are better options. Most companies don’t use Python or R for data presentation. Most companies use a dashboarding or data visualization tool.
There are two tools you can use for this: Google Data Studio and Tableau. I recommend Tableau. It’s more professional and is more respected.
Portfolio Project Ideas:
- Creating a Custom Color Palette in Tableau
- This requires an understanding of some of the Tableau file structure.
- Layering a graph into a Tooltip in Tableau.
- This is a more advanced thing you can do in Tableau. It shows you’re more than just a beginner.
- Connecting Tableau to Google Analytics
- API connections are everywhere in data. Practicing with Google Analytics will allow you to gain experience with that.
- Month over Month Comparisons in Tableau
- Tableau does not have a built-in Month over month comparison feature. Everyone wants one. You’ll have to build it yourself.
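Before building a month over month comparison in Tableau, it helps to be clear on the arithmetic: the change from the previous month divided by the previous month’s value. Here’s a sketch of just that calculation in Python. It is not Tableau syntax, and the monthly_sessions numbers are made up.

```python
# Hypothetical monthly totals, in calendar order.
monthly_sessions = [("2020-01", 1000), ("2020-02", 1100), ("2020-03", 990)]


def month_over_month(series):
    """Return (month, percent change vs. the previous month) pairs."""
    changes = []
    # Pair each month with the one before it.
    for (_, previous), (month, current) in zip(series, series[1:]):
        pct_change = (current - previous) / previous * 100
        changes.append((month, round(pct_change, 1)))
    return changes


mom = month_over_month(monthly_sessions)
print(mom)  # [('2020-02', 10.0), ('2020-03', -10.0)]
```

The first month has no previous month to compare against, so it gets no value. You’ll have to make the same decision in your dashboard.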
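Summary of Tools
- Storing the data: SQL, Google Analytics, Excel
- Cleaning the data: R (or Python or SQL)
- Presenting the data: Tableau (or Google Data Studio)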
How to make a Data Project
I’ve been talking a lot about using your portfolio to showcase your data projects. Well, here’s how you’ll do it.
Remember to document every step you take in your portfolio and show the code you used. This is the most important part.
Get the Data
Getting data is notoriously difficult. Getting clean data is even harder. People are very secretive with their data. They often have to work hard to obtain it and they do not like sharing. Data can have sensitive information, which is just another reason it is difficult to obtain. Below are some free resources I’ve used in the past with my data projects.
You may have to get creative. If you’re spending an hour copying and pasting data into a spreadsheet, don’t worry we’ve all been there.
This is far and away the most important step in working with data. It’s the step that’s going to take the most time. It’s also the least sexy. Absolutely document this step.
At this point, you’ll have data to work with. And at this point, you’ll likely need to clean the data. It may be changing date formats. It could be moving decimal places. Use R, Python, or SQL to code the changes in the data, and then add that code with before and after screenshots to your portfolio post.
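As one example of “moving decimal places,” suppose amounts were exported in cents and you want dollars. This is a sketch in Python using the standard library’s decimal module, and the sample values are invented.

```python
from decimal import Decimal

# Hypothetical raw export: dollar amounts stored as whole cents.
raw_amounts_in_cents = ["1999", "250", "100000"]

# scaleb(-2) shifts the decimal point two places to the left.
amounts_in_dollars = [Decimal(cents).scaleb(-2) for cents in raw_amounts_in_cents]

print([str(amount) for amount in amounts_in_dollars])
# ['19.99', '2.50', '1000.00']
```

Decimal is used instead of float so that money values don’t pick up rounding noise. A before and after of a column like this is exactly the kind of thing worth showing in your post.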
Exploration & Insights
With the clean data, explore it. Put it into Tableau and start making some graphs. See if there’s anything that sticks out. Ask yourself these questions:
- Is there something weird? Is there something going against my assumptions? Why?
- Are two variables correlated? Why?
- Are there any spikes? Why?
- What are the trends? Why are they trending? Why did the trend change?
- I believe this about the data, is it true? Why?
If you work your way through these questions, you’ll likely find something interesting. And if you don’t, that’s fine. The important thing is that you can explain it.
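For the “are two variables correlated?” question, a graph is usually the first step, but you can also put a number on it. Here’s a sketch of the Pearson correlation coefficient written out by hand in Python. The ad_spend and sales figures are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: +1 is a perfect positive linear relationship,
    -1 a perfect negative one, and values near 0 mean little linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical data: weekly ad spend and weekly sales.
ad_spend = [100, 200, 300, 400, 500]
sales = [12, 19, 33, 41, 48]

r = pearson(ad_spend, sales)
print(round(r, 2))
```

A high value tells you the two move together. It doesn’t tell you why, which is the part you still have to explain.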
Creating a Data Post
You’ve got data. It’s clean. And you found some interesting stuff. Now it’s time to present it. Build a dashboard around your findings.
Take 1-3 of the interesting things you’ve found and group them together. And try to build a story around them. For example, in my post showing how happiness in Venezuela fell, you can tie the decline to the change in its economy as it became more socialistic and impoverished. Put text boxes in your dashboard so a person looking at the dashboard would learn the insights without having to do all the work you’ve done.
That’s the job of a data analyst. You’re a researcher. It’s very important that you can effectively communicate what you found in the data.
Publish Your Data Project
Final step. Publish the final project plus all of the documented steps. There are a lot of possible places to publish your projects. Place number one is your portfolio/blog. After that, you can do what you’re comfortable with. Below are where I publish some of my data projects.
Examples of Where I Publish
A few things to keep in mind.
- Keep pushing yourself. If you have four data projects that are all the same just with different data, you’re not showcasing your growth.
- Showcase different parts of the project process. It’s okay to have a simple graph if the data cleaning process was very complex. Make sure you document it and point it out.
- Explore different types of data and visualizations.
- This is just a guideline. I have one data project that only talks about how I got the data off a website. It had no graphs or visualizations. That’s fine. You want to showcase skills and growth.
Use this process of creating and documenting data projects. After you have 3 or 4 data projects that you’re proud of in your portfolio, you will likely be ready to apply for data analyst jobs. You’ll be able to talk about some of your data experience. Problems you encountered and how you fixed them. Essentially, you’ll be able to nail an interview and get hired.