Best Free Resources to Learn Data Science in 2021
Want to get into data science without paying for expensive courses? You've come to the right place. Today I will talk exclusively about the best free online resources to learn data science. I will cover resources across four categories: coding tools, Python, statistics and maths for machine learning, and machine learning algorithms. This is a bottom-up approach to learning data science.
Coding Tools
Let's talk about coding tools. There's a very good reason I'm talking about this before machine learning. As a data scientist you'll spend the majority of your day at the computer, often in the terminal. More often than not you'll need a good grasp of basic coding practices: using Git and the command line, debugging and profiling, shell scripting and basic data wrangling.
A few months ago, I found an online course called The Missing Semester by MIT. This is a free online course that aims to teach students the practical tooling that computer science degrees should teach but usually leave out. It covers the skills that you will actually use daily in your role as a data science practitioner. I know many data scientists and engineers who struggle at the beginning to use coding tools properly despite knowing a lot about the latest ML algorithms.
Python
Next up is Python. Python is the programming language most commonly used by data scientists. It's used for data wrangling, cleaning, exploratory analysis and implementing machine learning algorithms. You should approach learning Python as a three-step process.
Python Syntax
The first step is to learn Python syntax, which will expose you to coding if you have never done it before and get you familiar with the basics of the language. Treat this step like learning verbs, nouns and basic sentence structure when you learn to speak English or any other language. For this, the best place to start is the Python website, which has kindly put together a list of great introductory Python tutorials. If you prefer an interactive way of learning with do-it-yourself exercises, learnpython.org is an awesome website that covers not only the basics of Python but also NumPy and pandas, the Python packages that are the bread and butter of the data science world.
Problem Solving with Python
The second step is to learn problem solving using Python. HackerRank and LeetCode are excellent platforms to start solving toy problems. They will expose you to a wide array of coding problems and get you thinking in an application-focused manner. After all, what good is knowing Python syntax if you can't use it to solve problems?
Pet Project with Python
The final step in learning Python is creating your own pet project. A good place to start is just googling pet projects for Python. For example, one of mine a long time ago was building a home loan repayment calculator and visualiser in Python. But you can pick any problem you're passionate about, whether it be building a weather app, a Snakes and Ladders game or an interactive dashboard for your personal finances.
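To give a feel for how small a pet project can start, here is a minimal sketch of a loan repayment calculator. It assumes a fixed-rate loan with monthly repayments and uses the standard amortisation formula. The function names and the example figures are made up for illustration, so treat it as a starting point rather than a finished tool.

```python
def monthly_repayment(principal, annual_rate, years):
    """Fixed monthly repayment for a fully amortising, fixed-rate loan.

    Assumes interest is charged monthly at annual_rate / 12.
    """
    n_payments = years * 12
    monthly_rate = annual_rate / 12
    if monthly_rate == 0:
        # With no interest, the principal is simply split evenly.
        return principal / n_payments
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


def repayment_schedule(principal, annual_rate, years):
    """Yield (month, interest, principal_paid, balance) for every month."""
    payment = monthly_repayment(principal, annual_rate, years)
    monthly_rate = annual_rate / 12
    balance = principal
    for month in range(1, years * 12 + 1):
        interest = balance * monthly_rate
        principal_paid = payment - interest
        # Clamp at zero to absorb floating-point error on the last payment.
        balance = max(balance - principal_paid, 0.0)
        yield month, interest, principal_paid, balance


if __name__ == "__main__":
    # Hypothetical example: 300,000 borrowed at 4% a year over 30 years.
    payment = monthly_repayment(300_000, 0.04, 30)
    print(f"Monthly repayment: {payment:.2f}")
    for month, interest, principal_paid, balance in repayment_schedule(300_000, 0.04, 30):
        if month % 60 == 0:  # print one row every five years
            print(f"Month {month:3d}: interest {interest:8.2f}, balance {balance:10.2f}")
```

From here, a natural next step is plotting the balance over time, which turns the calculator into a simple visualiser.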
Maths for Machine Learning
Now that you know how to learn Python, let's talk about statistics. The best place to start is Khan Academy. Its statistics and probability course will teach you about data distributions, confidence intervals, hypothesis testing, regression and many more basics. I can't recommend it highly enough, and I still use it every now and then to refresh my stats knowledge.
Moving on to maths for machine learning. The best resource I could find was Mathematics for Machine Learning on Coursera. This specialisation is divided into 3 courses: Linear Algebra, Multivariate Calculus and Principal Component Analysis. I would say that the first two are the most important. The third you can learn later when you are focused on machine learning algorithms.
Machine Learning Algorithms
Finally, let's talk about machine learning. I have previously recommended the Stanford Machine Learning course on Coursera, which I think is still a great basic ML course. But recently I noticed that Kaggle has some good intro courses in machine learning across a broad range of topics. After you've done these, you can start entering competitions for the sake of learning. Kaggle has a great collaborative atmosphere where users share their solutions for you to learn from. Additionally, if you are looking to delve deeper into deep learning, a subset of machine learning, I can't recommend this book highly enough. It is the absolute bible of deep learning. Recently I've also been going through Hands-On Machine Learning with Scikit-Learn and TensorFlow, which is an awesome applications-focussed book.