Eight steps to machine learning

Last month, Kaggle co-founder and CTO Ben Hamner answered a series of questions about Kaggle, machine learning, and artificial intelligence on Quora. The Kaggle team has since edited his answer into this core summary: Hamner’s eight steps to machine learning.

There has never been a better time to learn machine learning and artificial intelligence. The field has developed rapidly in recent years and produced impressive results. Experts have open-sourced high-quality software tools and libraries, and new online resources and blog posts have proliferated. Machine learning has generated billions of dollars in industry revenue and created unprecedented resources and jobs. But that also means getting started with machine learning can be a bit confusing. Here’s how I would get started. If you get stuck anywhere along the way, search Kaggle (someone may have had the same problem before) or, if no one has asked yet, post a question in the Kaggle forums. It’s a great way to find direction and solve the problem.

Pick a problem that interests you

Starting with a problem you want to solve, rather than with an intimidating, unstructured list of topics, makes it easier to stay focused and learn actively. (You can Google for lists of machine learning resources; I won’t provide one here.) Solving a problem forces you to go deeper and engage with the material, rather than just passively reading about machine learning. A good first problem meets a few criteria: it covers an area that personally interests you; the data is readily available and well suited to the problem (otherwise most of your time would be wasted); and you can comfortably work with the data, or a relevant subset of it, on a single machine. Can’t find a problem? Don’t worry! We provide some great machine learning problems on Kaggle through our Getting Started competition series. Click the Titanic competition (www.kaggle.com/c/titanic) to open…

Build a quick and dirty end-to-end solution to your problem

It’s really easy to get bogged down in implementation details or in debugging a poorly performing machine learning algorithm, and you want to avoid that. Your goal here is to get something very basic working as quickly as possible that covers the problem end to end: reading the data and processing it into a form suitable for machine learning, training a basic model, producing results, and evaluating their quality.
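
To give a feel for how small such a first pass can be, here is a minimal sketch in plain Python. Everything specific in it is an assumption made for illustration: the tiny inline CSV, its column names (age, fare, survived), and the one-threshold "model" are invented and are not part of Hamner's answer. On a real problem you would more likely load the competition's data file and reach for a library such as pandas or scikit-learn.

```python
import csv
import io
import random

# Hypothetical toy data standing in for a real CSV file (values are made up).
RAW_CSV = """age,fare,survived
22,7.25,0
38,71.28,1
26,7.92,1
35,53.10,1
35,8.05,0
,8.46,0
54,51.86,0
2,21.08,0
27,11.13,1
14,30.07,1
4,16.70,1
58,26.55,1
"""


def load_rows(text):
    """Read CSV text into (age, fare, label) tuples, skipping incomplete rows."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        if all(rec[key] != "" for key in ("age", "fare", "survived")):
            rows.append((float(rec["age"]), float(rec["fare"]), int(rec["survived"])))
    return rows


def train_threshold(rows):
    """Basic model: the fare threshold that best separates the training labels."""
    best_thr, best_acc = None, -1.0
    for thr in sorted({fare for _, fare, _ in rows}):
        acc = sum((fare >= thr) == bool(label) for _, fare, label in rows) / len(rows)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr


def accuracy(threshold, rows):
    """Fraction of rows where 'fare >= threshold' matches the label."""
    return sum(int(fare >= threshold) == label for _, fare, label in rows) / len(rows)


def run_baseline(text, seed=0, test_fraction=0.25):
    """Read, clean, split, train, and evaluate in one pass."""
    rows = load_rows(text)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split repeatable
    n_test = max(1, int(len(rows) * test_fraction))
    test, train = rows[:n_test], rows[n_test:]
    threshold = train_threshold(train)
    return {
        "threshold": threshold,
        "n_train": len(train),
        "n_test": len(test),
        "train_acc": accuracy(threshold, train),
        "test_acc": accuracy(threshold, test),
    }


if __name__ == "__main__":
    print(run_baseline(RAW_CSV))
```

The model is deliberately crude. What matters at this stage is that every stage of the pipeline exists, so that any later change can be measured against a number.
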

Improve and refine your initial solution

Now that you have a functional baseline, it’s time to get creative. Try to improve each component of the initial solution and measure the impact of each change to see where it makes sense to spend your time. In many cases, collecting more data or improving the data cleaning and preprocessing steps has a higher return on investment than optimizing the machine learning model itself. Part of this step should be hands-on work with the data: examining individual rows and visualizing distributions to better understand its structure and oddities.
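
Getting hands-on with the data does not need heavy tooling. The sketch below, again plain Python, summarizes one numeric column and draws a crude text histogram. The `fares` list is a made-up example column (with one missing entry and one extreme value), and the helper names `summarize` and `text_histogram` are hypothetical; in practice a plotting library would give a nicer picture.

```python
import statistics


def summarize(values):
    """Basic statistics for a numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "median": statistics.median(present),
        "mean": statistics.mean(present),
        "max": max(present),
    }


def text_histogram(values, bins=4, width=30):
    """Render a crude histogram as lines of '#' characters, one line per bin."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    step = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
    counts = [0] * bins
    for v in present:
        idx = min(int((v - lo) / step), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        left_edge = lo + i * step
        bar = "#" * round(width * count / peak)
        lines.append(f"{left_edge:8.2f} | {bar} {count}")
    return lines


# Hypothetical column: one missing entry and one extreme value.
fares = [7.25, 71.28, 7.92, 53.10, 8.05, None, 8.46, 51.86, 21.08, 11.13, 512.33]

if __name__ == "__main__":
    print(summarize(fares))
    print("\n".join(text_histogram(fares)))
```

With this toy column, the single extreme value stretches the histogram so that almost everything lands in the first bin. That is exactly the kind of oddity worth spotting before spending time on the model.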

Write up and share your solution

The best way to get feedback on your solution is to write it up and share it. Writing forces you to think the solution through again, which leads to a better understanding of it. It also lets others understand what you did and give you feedback that helps you learn. And it kick-starts your machine learning portfolio, which helps you demonstrate your capabilities and land a job. Kaggle Datasets and Kaggle Kernels are a great way to share data and solutions, get feedback from others, see how others build on your work, and start filling out your Kaggle profile.

Repeat steps 1-4 on a series of different problems

Now that you have solved one problem that interests you, do the same thing several more times in different areas. Did you start with tabular data? Take on a problem involving less structured text next, and then one dealing with images. Were your first machine learning problems already well defined for you? Much of the creative and valuable work lies in turning a loosely defined business or research goal into a well-defined machine learning problem in the first place, so work through a problem of that kind too. Kaggle competitions and Kaggle Datasets are a good starting point for clearly defined machine learning problems and for raw data suitable for machine learning.

Seriously participate in a Kaggle contest (if you haven’t already)

Competing to give the best answer to a problem that thousands of people are working on is a huge learning opportunity: it forces you to iterate on the same problem again and again and lets you discover what really works. Each competition’s forum is a rich resource on how other people are approaching the problem and debugging their issues, the kernels offer exploratory insights into the data and simple ways to get started, and the winners’ posts (blog.kaggle.com/category/wi…) describe the approaches that won. Kaggle competitions also provide a unique opportunity to team up with other people. People in the community have different backgrounds and skills, and everyone can both teach and learn. You never know, your future colleagues may be in the Kaggle community.

Apply for a job in machine learning

A professional role lets you spend most of your time on machine learning and really improve your game. Deciding what type of position you want to pursue and building a portfolio of relevant, representative personal projects is a strong starting point. If you’re not ready to interview for a machine learning position, take on new projects and look for consulting opportunities in your current position; taking part in civic hackathons and data-related community service are additional ways to gain a foothold. Professional work requires strong programming skills. If yours need work, focused projects can improve them substantially, and that improvement brings many downstream benefits.

Valuable kinds of professional machine learning work include:

Applying machine learning in production systems
Machine learning research focused on advancing the field
Exploratory analysis that uses machine learning to inform product and business decisions

Teach machine learning to others

Teaching machine learning to others helps solidify your own understanding of its core concepts. There are many ways to teach, so choose the ones that work best for you:

Write research papers
Give presentations
Write blog posts and tutorials
Answer questions on Kaggle, Quora, and other sites
Coach and tutor individuals
Share code examples (on Kaggle Kernels and on GitHub)
Teach classes
Write books
