Other Things To Notice#

Besides what we’ve seen so far#

Beyond the topics covered so far, there are a few more things worth knowing if you want training a machine learning model to go smoothly.

How hard can training be?#

Let’s face it: training is hard, mostly because machine learning systems are hard to debug. Print out all the values in the model? You still won’t understand what went wrong, because those numbers are meaningless on their own. Trial and error? That’s extremely time-consuming, and it will keep your computer running hot.

But fret not! Most training issues can be categorized into the following types:

Gradient issues#

Gradient issues are the most common. They happen when your model’s gradients get out of control and leave the desired range: exploding (too large), vanishing (too small), or, if you’re unlucky, stuck near a saddle point where the gradient is close to zero in every direction.
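As a toy illustration (not from any particular framework): in a deep chain of layers, backprop multiplies one derivative per layer, so per-layer derivatives consistently above or below 1.0 compound into exploding or vanishing gradients. The depth and derivative values below are made up for illustration.

```python
# Toy sketch: the gradient reaching the first layer of a deep chain is
# (roughly) a product of per-layer derivatives, so values consistently
# above or below 1.0 explode or vanish. Depth and values are illustrative.

def chain_gradient(per_layer_derivative: float, depth: int) -> float:
    """Multiply one derivative per layer, as backprop does through a chain."""
    grad = 1.0
    for _ in range(depth):
        grad *= per_layer_derivative
    return grad

exploding = chain_gradient(1.5, depth=50)  # derivatives > 1 compound upward
vanishing = chain_gradient(0.5, depth=50)  # derivatives < 1 shrink to nothing

print(f"exploding: {exploding:.3e}")
print(f"vanishing: {vanishing:.3e}")
```

This is also why tricks like gradient clipping and careful weight initialization exist: they keep that product within a usable range.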

Learning rate#

The learning rate controls how much each update changes your model. When it’s too small, training crawls; when it’s too large, training can oscillate or even diverge.
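A minimal sketch of this on the toy objective f(w) = w², whose gradient is 2w. The three learning rates are made-up illustrative values.

```python
# Gradient descent on f(w) = w**2 (gradient 2*w), showing how the
# learning rate changes the outcome. All rates here are illustrative.

def descend(lr: float, steps: int, w: float = 1.0) -> float:
    """Return the distance from the optimum (w = 0) after `steps` updates."""
    for _ in range(steps):
        w -= lr * 2 * w  # one gradient descent step
    return abs(w)

slow     = descend(lr=0.001, steps=100)  # too small: barely moves toward 0
good     = descend(lr=0.1,   steps=100)  # reasonable: converges quickly
diverged = descend(lr=1.1,   steps=100)  # too large: overshoots and blows up

print(slow, good, diverged)
```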

Optimizer#

An optimizer is responsible for updating the model’s parameters. If the wrong optimizer is selected, training can be deceptively slow and ineffective.
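As a hedged sketch of why the choice matters, here is plain SGD next to SGD with momentum on the same toy objective f(w) = w². Momentum accumulates a velocity across steps, which speeds up progress along consistent gradient directions; the hyperparameter values are illustrative.

```python
# Plain SGD vs. SGD + momentum on f(w) = w**2 (gradient 2*w).
# Values are illustrative, not tuned recommendations.

def sgd(w: float, lr: float, steps: int) -> float:
    for _ in range(steps):
        w -= lr * 2 * w
    return abs(w)

def sgd_momentum(w: float, lr: float, beta: float, steps: int) -> float:
    v = 0.0
    for _ in range(steps):
        v = beta * v + 2 * w  # accumulate a velocity from past gradients
        w -= lr * v
    return abs(w)

plain = sgd(1.0, lr=0.01, steps=100)
mom   = sgd_momentum(1.0, lr=0.01, beta=0.9, steps=100)
print(plain, mom)  # momentum ends up much closer to the optimum here
```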

Batch size#

Batch size determines how many samples each gradient estimate averages over. When the batch is too small, the estimate is noisy; when it’s too large, each step is expensive and the extra precision buys you little. Either way, it comes down to probability: averaging over a batch shrinks the noise roughly with the square root of the batch size.
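A toy demonstration of that probability argument: each “per-sample gradient” below is just a noisy reading of a true value of 1.0, and the spread of the batch average shrinks as the batch grows. All numbers are illustrative.

```python
import random

# Averaging noisy per-sample "gradients" over a batch: the spread of the
# batch estimate shrinks roughly as 1/sqrt(batch_size). Illustrative only.

random.seed(0)

def batch_gradient(batch_size: int) -> float:
    """Average `batch_size` noisy readings of a true gradient of 1.0."""
    samples = [1.0 + random.gauss(0, 1) for _ in range(batch_size)]
    return sum(samples) / batch_size

def spread(batch_size: int, trials: int = 200) -> float:
    """Standard deviation of the batch estimate across many trials."""
    estimates = [batch_gradient(batch_size) for _ in range(trials)]
    mean = sum(estimates) / trials
    return (sum((e - mean) ** 2 for e in estimates) / trials) ** 0.5

small = spread(batch_size=4)    # noisy estimate of the true gradient
large = spread(batch_size=256)  # much tighter estimate
print(small, large)
```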

Overfitting and underfitting#

Sometimes your model is too complicated for the task: it looks dope in training but is useless on real-world data. That’s overfitting. Or the model is overly simple for the task and doesn’t seem to learn anything even during training. That’s underfitting.
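A toy contrast between the two regimes, with made-up data where the true relationship is y = x plus noise: a lookup table that memorizes the training labels (maximal overfitting) gets zero training error but pays for it at test time, while a simple least-squares line generalizes.

```python
import random

# Overfitting in miniature: a lookup-table "model" memorizes noisy training
# labels, while a simple line fits the trend. Data and noise are illustrative.

random.seed(0)
xs = [float(i) for i in range(200)]
train_ys = [x + random.gauss(0, 1) for x in xs]  # noisy training labels
test_ys  = [x + random.gauss(0, 1) for x in xs]  # fresh noise at test time

def mse(pred, truth):
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

# Overfit: memorize the exact training label for each x.
memorized = dict(zip(xs, train_ys))
mem_train = mse([memorized[x] for x in xs], train_ys)  # exactly 0.0
mem_test  = mse([memorized[x] for x in xs], test_ys)   # pays for the noise

# Simple: least-squares line through the origin, y = slope * x.
slope = sum(x * y for x, y in zip(xs, train_ys)) / sum(x * x for x in xs)
lin_test = mse([slope * x for x in xs], test_ys)

print(mem_train, mem_test, lin_test)  # perfect train score, worse test score
```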

Summary#

Most of these issues come down to how intolerant machine learning systems are of numbers that are too big or too small, so carefully selecting and tuning hyperparameters is key to solving many of the ML issues you’ll encounter in training.
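Tuning can be as simple as sweeping a few candidate values and keeping the best one. Here is a minimal sketch of that idea for the learning rate on a toy problem; the candidate values are illustrative, not recommendations.

```python
# A minimal hyperparameter sweep: try several learning rates on a toy
# problem and keep whichever lands closest to the optimum. Illustrative only.

def final_distance(lr: float, steps: int = 50, w: float = 1.0) -> float:
    """Distance from the optimum of f(w) = w**2 after gradient descent."""
    for _ in range(steps):
        w -= lr * 2 * w
    return abs(w)

candidates = [0.001, 0.01, 0.1, 0.5, 1.1]
best_lr = min(candidates, key=final_distance)
print(best_lr)
```

Real tuning tools (grid search, random search, Bayesian optimization) are fancier versions of this same loop: propose values, measure, keep the best.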