They say the best time to start something is “now,” but knowing where to start is confusing for many people, especially those who want to get into data science and machine learning. In this article, the author shares tips and resources for beginners in exactly that situation.

From Towards Data Science, written by Daniel Bourke, compiled by The Heart of the Machine, with participation by Han Fang and Yiming.

“I want to learn about machine learning and artificial intelligence. Where do I start?”

Let’s start here.



Two years ago, I started teaching myself machine learning online and shared my learning process through YouTube and blogs. I had no idea what I was doing, and I had never written code before I decided to start learning machine learning.

When people find out about my work, they often send me private messages with questions. I don’t necessarily have all the answers, but I try to respond. The question I get most often is “Where do I start?”, followed by “How much math do I need to know?”


I answered a bunch of them this morning.

Someone told me he had already started learning Python and was going to learn machine learning, but didn’t know what to do next.

“I’ve learned Python, what do I do next?”

I replied with a list of learning steps and copied them here. If you want to become a machine learning practitioner but don’t know how to write code, use this article as an outline. My learning style is code first: get the code up and running, then learn the theory, math, statistics, and probability as needed, rather than starting with theory.

Remember, there are a lot of obstacles when you start learning machine learning. Take your time. Bookmark this article for easy reference.




I tend to use Python because I started with Python and continue to use it. You can use other languages as well, but all the steps in this article are based on Python.

Learn Python, data science tools, and machine learning concepts

The people who emailed me said they had already learned some Python, but this step also works for complete novices. Spend a few months learning Python programming alongside different machine learning concepts. You’ll need both.

Practice using data science tools like Jupyter and Anaconda while learning Python programming. Spend a few hours researching what they are used for and why.

Learning resources

  1. Elements of AI (https://www.elementsofai.com/) – an overview of the main concepts of artificial intelligence and machine learning.

  2. Python for Everybody on Coursera (https://bit.ly/pythoneverybodycoursera) – learn Python from scratch.

  3. Learn Python by freeCodeCamp (https://youtu.be/rfscVS0vtbw) – a single video covering all the main Python concepts.

  4. Corey Schafer’s Anaconda tutorial (https://youtu.be/YJC6ldI3hWk) – a video on using Anaconda to set up the environment you need for data science and machine learning.

  5. Dataquest’s Jupyter Notebook tutorial for beginners (https://www.dataquest.io/blog/jupyter-notebook-tutorial/) – an article that gets you up and running with Jupyter Notebook.

  6. Corey Schafer’s Jupyter Notebook tutorial (https://www.youtube.com/watch?v=HW29067qVWk) – a video on how to use Jupyter Notebook.

Learn data analysis, manipulation, and visualization using pandas, NumPy, and Matplotlib

Once you’ve picked up some Python skills, you’ll want to learn how to handle and manipulate data. To do this, you’ll need to be familiar with pandas, NumPy, and Matplotlib.

  • pandas lets you work with two-dimensional data made of rows and columns, similar to a table in an Excel file. This type of data is called structured data.

  • NumPy helps you do numerical calculations. Machine learning turns everything you can think of into numbers and then looks for patterns in those numbers.

  • Matplotlib helps you draw graphs and visualize data. It can be difficult for humans to understand a bunch of numbers in a table. We prefer to see a graph with a line running through it. Visualization can better communicate your findings.
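To make the three roles above concrete, here is a minimal sketch that uses each library once. The sales table, its column names, and the output file name `revenue.png` are invented for illustration and are not from the original article.

```python
import numpy as np
import pandas as pd
import matplotlib

matplotlib.use("Agg")  # draw off-screen so the script also runs without a display
import matplotlib.pyplot as plt

# pandas: a small table of structured data (rows and columns).
# The values below are made up for this example.
sales = pd.DataFrame(
    {
        "month": ["Jan", "Feb", "Mar", "Apr"],
        "units_sold": [120, 135, 150, 165],
        "unit_price": [2.5, 2.5, 3.0, 3.0],
    }
)
sales["revenue"] = sales["units_sold"] * sales["unit_price"]

# NumPy: numerical calculations on the underlying array of numbers.
revenue = sales["revenue"].to_numpy()
mean_revenue = np.mean(revenue)      # average revenue per month
monthly_change = np.diff(revenue)    # change from one month to the next

# Matplotlib: turn the numbers into a line chart and save it to a file.
fig, ax = plt.subplots()
ax.plot(sales["month"], sales["revenue"], marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
ax.set_title("Revenue per month")
fig.savefig("revenue.png")

print(sales)
print("Mean revenue:", mean_revenue)
print("Month-to-month change:", monthly_change)
```

In a Jupyter Notebook you would usually drop the `matplotlib.use("Agg")` line and let the chart render inline instead of saving it.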

Learning resources

  1. Applied Data Science with Python on Coursera (http://bit.ly/courseraDS) – start honing your Python skills for data science.

  2. 10 minutes to pandas (https://pandas.pydata.org/pandas-docs/stable/gettingstarted/10min.html) – a quick overview of the pandas library and some of its most useful functions.

  3. Python pandas tutorial by Codebasics (https://youtu.be/CmorAWRsCAw) – a YouTube series that walks through all the main features of pandas.

  4. NumPy tutorial by freeCodeCamp (https://youtu.be/QUT1VHiLmmI) – a YouTube video for learning NumPy.

  5. Matplotlib tutorial by sentdex (https://www.youtube.com/watch?v=q7Bo_J8x_dw&list=PLQVvvaa0QuDfefDfXb9Yf0la1fPDKluPF) – a YouTube series covering the most useful features of Matplotlib.

Learn machine learning with scikit-learn



Now that you have the skills to manipulate and visualize data, it’s time to learn how to look for patterns in it. Scikit-learn is a Python library that contains many useful machine learning algorithms, along with helper functions for checking how well a learning algorithm has learned.

Focus on learning what kinds of machine learning problems there are, such as classification and regression, and which algorithms are best suited to each. You don’t need to understand every algorithm from scratch just yet; learn how to apply them first.
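As a rough sketch of the difference between the two problem types, the example below fits one classifier and one regressor on datasets that ship with scikit-learn. The choice of `LogisticRegression`, `LinearRegression`, the iris and diabetes datasets, and the split settings are assumptions made for illustration, not recommendations from the original article.

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Classification: predict a category (the iris species) from measurements.
X_cls, y_cls = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X_cls, y_cls, test_size=0.25, random_state=42, stratify=y_cls
)
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_train, y_train)
accuracy = classifier.score(X_test, y_test)  # fraction of correct predictions

# Regression: predict a number (a disease progression score) from features.
X_reg, y_reg = load_diabetes(return_X_y=True)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_reg, y_reg, test_size=0.25, random_state=42
)
regressor = LinearRegression()
regressor.fit(Xr_train, yr_train)
r_squared = regressor.score(Xr_test, yr_test)  # R^2 on the held-out data

print(f"Classification accuracy: {accuracy:.2f}")
print(f"Regression R^2: {r_squared:.2f}")
```

Both models follow the same `fit` then `score` pattern, which is why you can start applying algorithms before you understand their internals.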

Learning resources

  1. Machine learning in Python with scikit-learn by Data School (https://www.youtube.com/watch?v=elojMnjn4kk&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A) – a YouTube playlist that teaches all the major functions of scikit-learn.

  2. A Gentle Introduction to Exploratory Data Analysis by Daniel Bourke (https://towardsdatascience.com/a-gentle-introduction-to-exploratory-data-analysis-f11d843b8184) – combines what you learned in the two steps above into one project. It comes with code and a video to help you get started with your first Kaggle competition.

  3. Daniel Formoso’s exploratory data analysis notebook with scikit-learn (https://github.com/dformoso/sklearn-classification) – a more in-depth version of the resource above, including an end-to-end project that puts it all into practice.

Learn deep learning and neural networks

Deep learning and neural networks work best on data without much structure. Two-dimensional tabular data has structure; images, video, audio files, and natural language text also have structure, but much less of it.

Tip: In most cases, you’ll want to use an ensemble of decision trees (algorithms like random forest or XGBoost) for structured data, while for unstructured data you’ll want to use deep learning or transfer learning (taking a pre-trained neural network and applying it to your problem).

You can start writing tips like this down on a sticky note and keep collecting them as you go along.
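The first half of the tip above (tree ensembles for structured data) can be tried in a few lines. This is only a sketch: the wine dataset, the 100 trees, and the 5-fold cross-validation are assumptions chosen for illustration, and the deep learning half of the tip is not covered here.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# A small structured (tabular) dataset: each row is a wine, each column a measurement.
X, y = load_wine(return_X_y=True)

# A random forest is an ensemble of decision trees that vote on the answer.
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validation: train on four fifths of the rows, test on the rest, five times.
scores = cross_val_score(forest, X, y, cv=5)
mean_accuracy = scores.mean()

print("Accuracy per fold:", scores.round(2))
print(f"Mean accuracy: {mean_accuracy:.2f}")
```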

Learning resources

  1. Andrew Ng’s deeplearning.ai on Coursera (https://bit.ly/courseradl) – a deep learning course taught by one of the most commercially successful practitioners in the field.

  2. Jeremy Howard’s fast.ai deep learning course (https://course.fast.ai/) – a hands-on approach to deep learning taught by one of the best practitioners in the industry.

Other courses and books

As you learn, ideally you practice what you’re learning with your own small projects. They don’t have to be complicated, world-changing things, just something you can point to and say “I did this with X”. Then share your work on GitHub or in a blog post. GitHub shows off your code, and blog posts show how well you can communicate the work you’ve done. Aim to publish both for each project. The best way to apply for a job is to have already done what the job requires. Sharing your work is a great way to show potential employers what you can do.


Once you’re familiar with how to use different machine learning and deep learning frameworks, you can consolidate your knowledge by building them from scratch. You won’t always need to do this in production or as a machine learning practitioner, but knowing how things work on the inside will help you build on your own work.

Learning resources

  1. How to Start Your Own Machine Learning Projects (https://towardsdatascience.com/how-to-start-your-own-machine-learning-projects-4872a41e4e9c) – starting your own project can be hard; this article gives you some pointers.

  2. Jeremy Howard’s fast.ai (https://course.fast.ai/part2) – once you have learned top-down, this course helps you fill in the gaps from the bottom up.

  3. Grokking Deep Learning by Andrew Trask (https://amzn.to/2H497My) – this book teaches you how to build neural networks from scratch and why you should know how to build them.

  4. Machine learning books recommended by Daniel Bourke (https://www.youtube.com/watch?v=7R08MPXxiFQ) – a YouTube video that goes through some of the best books on machine learning.

Answering questions

How long does each step take?

It could take you six months or more. Take it easy. Learning new things takes time. As a data scientist or machine learning engineer, the main skill you’re developing is how to ask good questions about data and then use your tools to try to find the answers.

Sometimes you’ll feel like you haven’t learned anything, or even that you’re going backwards. Ignore it. Instead of measuring progress day by day, look at how far you’ve come in a year.

Where can I learn these skills?

I’ve listed a few resources above, all online and mostly free, and there are many more.

DataCamp (http://bit.ly/datacampmrdbourke) is a very good learning website. In addition, my Machine Learning and Artificial Intelligence resources database (https://bit.ly/AIMLresources) collects both free and paid learning materials.

Remember, a big part of being a data scientist or machine learning engineer is solving problems. Treat this as your first assignment: explore each step here and create your own curriculum to help you learn.

If you want to see an example of a self-directed machine learning curriculum, look at my Self-Created AI Masters Degree (https://bit.ly/aimastersdegree). It’s what I used to go from zero coding to machine learning engineer in nine months. It’s not perfect, but it is my real experience, so you can give it a try.


What about statistics? What about math? Probability?

You’ll learn these things by doing them. Start with the code. Get the code up and running. Trying to learn all of the statistics, math, and probability before running any code is like trying to boil the ocean. It will hold you back.

Statistics, mathematics, and probability don’t matter if the code doesn’t run. Get it running first, then use your research skills to check whether it’s correct.

Certificate?

Certificates are great, but you’re not studying for them, you’re studying for skills. Don’t make the same mistake I did and assume that more credentials mean more skills. They don’t. Build a knowledge base through the courses and resources described above, and then develop your expertise through your own projects (the kind of knowledge a course can’t teach).

Reference link: towardsdatascience.com/5-beginner-…