3 Introduction

Lecture

While “machine learning” is relatively new, the process of learning itself is not. All of us already know how to learn: by learning from our mistakes. By repeating what succeeds and avoiding what fails, we learn by doing, by experience, by trial and error. Machines learn similarly.

Take, for example, the process of studying for an exam. Some study methods work well, but others do not. The “data” are the practice problems, and the “labels” are the answers (A, B, C, D, or E). We want to build a mental “model” that reads a question and predicts its answer.

We all know that memorizing answers without understanding the concepts is ineffective; in statistics and machine learning, this is called “overfitting.” Conversely, learning only the high-level concepts without enough detail is “underfitting.”
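The contrast between memorizing and understanding can be made concrete. Below is a toy illustration of my own (not from the lecture): the task is to label integers as “even” or “odd,” and the “memorizer” stores the training examples verbatim while the “rule learner” captures the underlying concept.

```python
# Training data: a few integers with their even/odd labels.
train = {2: "even", 3: "odd", 4: "even", 7: "odd"}

def memorizer(x):
    # Scores 100% on the practice set but is clueless on unseen numbers:
    # the memorization analogue of overfitting.
    return train.get(x, "unknown")

def rule_model(x):
    # Learns the underlying concept (parity), so it generalizes.
    return "even" if x % 2 == 0 else "odd"

print(memorizer(10))   # "unknown": perfect on practice, fails on new data
print(rule_model(10))  # "even": understands the concept
```

Underfitting would be the opposite failure, e.g. a model that answers “even” for every input: too simple to capture even the pattern present in the practice set.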

The more practice problems we do, the larger the training data set, and the better the predictions. Even so, when we see new problems that never appeared in the practice exams, we often struggle. Quizzing ourselves on questions we held back estimates how prepared we really are, a process analogous to “holdout testing” or, in a more elaborate form, “cross-validation.”
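A minimal sketch of holdout testing, reusing the toy even/odd task (names and the 80/20 split are illustrative choices, not prescribed by the lecture): part of the data is set aside and never used for study, so accuracy on it estimates performance on genuinely new problems.

```python
import random

random.seed(0)  # reproducible shuffle

# Labeled data: integers tagged "even" or "odd".
data = [(x, "even" if x % 2 == 0 else "odd") for x in range(100)]
random.shuffle(data)

# Reserve 20% of the data as a holdout set the model never trains on.
split = int(0.8 * len(data))
train_set, holdout_set = data[:split], data[split:]

def model(x):
    # A model that learned the parity rule from the training set.
    return "even" if x % 2 == 0 else "odd"

# Accuracy on the holdout set estimates real preparedness.
accuracy = sum(model(x) == y for x, y in holdout_set) / len(holdout_set)
print(f"holdout accuracy: {accuracy:.2f}")
```

Cross-validation extends this idea by rotating which slice of the data plays the holdout role and averaging the resulting accuracies.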

We can state our objective clearly: get as many correct answers as possible! We want to predict the solution to every problem correctly. Said another way, we are trying to minimize the error, as measured by a “loss function.”
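One loss function that matches the exam analogy directly is the zero-one loss: each answer costs 0 if correct and 1 if wrong, and the average is the fraction of mistakes. A brief sketch (the example answers are my own):

```python
def zero_one_loss(predictions, answers):
    # Average number of mistakes: 0 for a correct answer, 1 for a wrong one.
    return sum(p != a for p, a in zip(predictions, answers)) / len(answers)

preds   = ["A", "C", "B", "D", "E"]
answers = ["A", "B", "B", "D", "C"]
print(zero_one_loss(preds, answers))  # 0.4: two of five answers were wrong
```

Minimizing this loss is the same as maximizing the number of correct answers, which is why the two phrasings of the objective are equivalent.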

Different study methods work well for different people. Some cover material quickly, while others slowly absorb every detail. Likewise, a model has settings called “hyperparameters,” such as the “learning rate,” that control how it learns. Fitting the model to data is called “training,” and the only way to know which hyperparameters work best is to train with them and compare the results.
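To see what the learning rate controls, consider a hedged sketch (my own example, not from the lecture) of gradient descent minimizing the simple loss f(w) = (w - 3)², whose minimum is at w = 3. A moderate learning rate converges; one that is too large overshoots and diverges.

```python
def gradient(w):
    # Derivative of the loss f(w) = (w - 3)**2.
    return 2 * (w - 3)

def train(learning_rate, steps=50, w=0.0):
    # Gradient descent: repeatedly step against the gradient.
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w

print(train(0.1))   # converges near the optimum w = 3
print(train(1.1))   # learning rate too large: updates overshoot and diverge
```

This mirrors the study analogy: move too cautiously and progress is slow, move too aggressively and you never settle on the right answer.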