Overfitting occurs when the model attempts to match the training set too closely. On fresh data, the overfitted model is unable to produce accurate predictions.
Important details:
The model will attempt to match the data too closely and will pick up on noise in the data when the training data set is limited or the given model is complex.
An overfitted model picks up patterns that are unique to the training set and overlooks the generic patterns.
Regularization can reduce overfitting.
Overfitting can also be decreased by training on a large and diversed training data points.
Overfitting can be detected by high variation .i.e, if the test data has a high error rate while the training data has a low error rate.
A high variance model will overfit the data and is flexible in capturing every detail—relevant or not—and noise in the data.
A high variance model is also indicated as: Training error << Validation error.
More training data will improve the generalization of the given model and avoids overfitting.