Machine Learning
Machine learning is about using your computer to "learn" how to deal with problems without explicitly "programming" the solution. (It is a branch of artificial intelligence.)
We take some data, train a model on that data, and use the trained model to make predictions on new data. Basically, it is a way to make the computer create a program that produces a known output for a known input, and that later gives an intelligent output for a different but similar input.
We need machine learning in cases where it would be difficult to program by hand all possible variants of a classification or prediction problem.
The basic idea of machine learning is to make the computer learn something from the data. Machine learning comes in two flavors:
Supervised Learning: You give the computer pairs of inputs and outputs, so that in the future, when new inputs are presented, it produces an intelligent output.
Unsupervised Learning: You let the computer learn from the data itself without showing it what the expected output is.
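To make the difference between the two flavors concrete, here is a minimal sketch in plain Python. The toy data, the labels "small" and "large", and the helper names `predict_nearest` and `split_in_two` are all made up for illustration. The supervised part uses a 1-nearest-neighbour rule and the unsupervised part splits the data at its largest gap; both are stand-ins chosen for brevity, not methods prescribed by this book.

```python
# Supervised learning: every input comes with its expected output (a label).
# Hypothetical toy data: one number as input, one label as output.
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.3, "large"), (4.7, "large")]


def predict_nearest(x, examples):
    """Return the label of the training example closest to x (1-nearest neighbour)."""
    _, nearest_label = min(examples, key=lambda pair: abs(pair[0] - x))
    return nearest_label


# Unsupervised learning: the same inputs, but without any labels.
unlabeled = [x for x, _ in labeled]


def split_in_two(values):
    """Split the values into two groups at the largest gap between sorted neighbours."""
    ordered = sorted(values)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


print(predict_nearest(1.1, labeled))  # label learned from the input/output pairs
print(split_in_two(unlabeled))        # groups found without ever seeing a label
```

In the supervised case the computer can name its answer because the names were in the training data. In the unsupervised case it can only say that the data falls into two groups; giving those groups a meaning is left to us.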
Image Classification: You train with images and their labels. Then, in the future, you give the computer a new image and expect it to recognise the object in it (Classification).
Market Prediction: You train the computer with historical market data and ask it to predict a future price (Regression).
Clustering: You ask the computer to separate similar data into clusters. This is essential in research and science.
High-Dimensional Visualisation: Use the computer to help us visualise high-dimensional data.
Generative Models: After a model captures the probability distribution of your input data, it will be able to generate more data. This can be very useful to make your classifier more robust.
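The regression case in the list above differs from classification in that the output is a number rather than a category. A minimal sketch, assuming made-up prices that happen to lie on a straight line, is an ordinary least-squares line fit. The function name `fit_line` and the data are hypothetical, and real market data is far noisier than this.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up "historical" data: the day number and the price on that day.
days = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(days, prices)   # training
next_price = slope * 6 + intercept          # prediction for an unseen day
print(slope, intercept, next_price)
```

Training here means finding the two numbers `slope` and `intercept`; prediction means plugging a new input into the fitted line.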
Imagine the following problem: you are working on a system that should classify whether a tumour is benign or malignant, and at first the only information you have to make a decision is the tumour size. We can see the training data distribution for this example below. Observe that the characteristic (or feature) of tumour size does not seem to be, on its own, a good indicator of whether the tumour is malignant or benign.
Now consider that we add one more feature to the problem (Age).
The intuition is that adding more features that are relevant to the problem you want to solve makes your system more robust. Complex systems like this one can have thousands of features. Two questions follow: how do you determine which features are relevant to your problem, and which algorithm best handles a very large number of features? Support Vector Machines, for example, use mathematical tricks (such as kernels) that allow you to work with a very large number of features.
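The effect of adding a relevant feature can be sketched with a few invented data points. Everything here is an assumption made for illustration: the (size, age) values, the helper names, and the choice of a nearest-centroid classifier, which is used only because it fits in a few lines (it is not a Support Vector Machine). The accuracy is measured on the training data itself, just to show how separable the classes are.

```python
# Invented samples: ((tumour size, patient age), label).
samples = [
    ((2.0, 30), "benign"), ((3.0, 35), "benign"),
    ((4.5, 25), "benign"), ((5.0, 30), "benign"),
    ((3.0, 60), "malignant"), ((4.0, 65), "malignant"),
    ((5.5, 70), "malignant"), ((6.0, 60), "malignant"),
]


def centroid(points):
    """Average point of a list of equally sized tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def accuracy(data, feature_idx):
    """Training accuracy of a nearest-centroid classifier on the chosen features."""
    def project(row):
        return tuple(row[i] for i in feature_idx)

    by_label = {}
    for features, label in data:
        by_label.setdefault(label, []).append(project(features))
    centroids = {label: centroid(pts) for label, pts in by_label.items()}

    correct = 0
    for features, label in data:
        x = project(features)
        guess = min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))
        correct += guess == label
    return correct / len(data)


print(accuracy(samples, [0]))     # size only: the two classes overlap
print(accuracy(samples, [0, 1]))  # size and age: the classes separate
```

With this particular data, size alone classifies only half of the samples correctly, while size and age together classify all of them. In practice the features would normally be rescaled first so that one of them (here, age) does not dominate the distance.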
The idea is to give the computer a set of inputs and their expected outputs, so that after training we have a model (hypothesis) that maps new data to one of the categories it was trained on.
Example: imagine that you provide a set of images labelled with one of two categories, "duck" or "not duck". After training, you should be able to take an image of a duck from the internet and have the model tell you it is a "duck".
There are many different machine learning algorithms. In this book we will concentrate on neural networks, but there is no single best algorithm: it all depends on the problem you need to solve and the amount of data available.
This is a super simple recipe (it covers maybe 50% of possible cases). We will explain the "how" later, but it gives some hint on how to think when dealing with a machine learning problem.
First, check whether your model works well on the training data. If not, make the model more complex (deeper, or with more neurons).
If it does, evaluate it on the "test" data. If it does not work well there, the model is overfitting, and the most reliable cure for overfitting is to get more data (moving the test data into the training data does not count).
By the way, even the biggest public image dataset (ImageNet) is not big enough for the 1000-class ImageNet competition.
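The recipe can be written down as a small decision function. This is only a sketch: the name `next_step`, the `target` accuracy of 0.9, and the allowed train/test gap of 0.05 are arbitrary assumptions, not values from this book, and what counts as "works well" depends on your problem.

```python
def next_step(train_accuracy, test_accuracy, target=0.9, max_gap=0.05):
    """Suggest what to do next, following the simple recipe.

    target and max_gap are hypothetical thresholds chosen for illustration.
    """
    # Step 1: does the model work well on the training data?
    if train_accuracy < target:
        return "underfit: make the model more complex"
    # Step 2: does it also work well on the held-out test data?
    if train_accuracy - test_accuracy > max_gap:
        return "overfit: get more training data"
    return "ok: the model generalises"


print(next_step(0.60, 0.55))  # poor on the training data
print(next_step(0.99, 0.70))  # good on training, poor on test
print(next_step(0.95, 0.93))  # good on both
```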
In the next chapter we will learn the basics of linear algebra needed in artificial intelligence.