Prediction in R
A model that has been fitted to a data set can be used to predict the outcome variable for either the same data set or a different one, provided that the data include the same prediction variables that were used to fit the model.
Each regression and classification model implements a predict function, which performs two basic steps:
- A CheckPredictionVariables function tests whether all fitted variables are included in the prediction data set and raises an error if any are missing. It also generates a warning when a categorical prediction variable takes a new class (known as a factor level in R) that was not used for fitting, and sets values of the new classes to NA.
- The predict method of the underlying R package is called, and predictions are returned for the full set of data, that is, the data before any subset filter is applied. Some predictions may be NA, either because of new factor levels or because of missing values, depending on how missing data are treated.
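
The two steps above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the actual CheckPredictionVariables implementation: the helper name check_prediction_variables, the toy data, and the use of lm as the underlying model are all hypothetical, and the sketch assumes that categorical variables enter the model formula directly (not wrapped in a call such as factor()).

```r
# Hypothetical helper, NOT the real CheckPredictionVariables function.
# Step 1: validate the prediction data against the fitted model.
check_prediction_variables <- function(fit, newdata) {
  # Names of the variables on the right-hand side of the fitted formula.
  predictors <- all.vars(delete.response(terms(fit)))
  missing.vars <- setdiff(predictors, names(newdata))
  if (length(missing.vars) > 0)
    stop("Prediction data is missing variables: ",
         paste(missing.vars, collapse = ", "))

  # lm and glm store the factor levels seen during fitting in fit$xlevels.
  for (v in names(fit$xlevels)) {
    known <- fit$xlevels[[v]]
    values <- as.character(newdata[[v]])
    is.new <- !is.na(values) & !(values %in% known)
    if (any(is.new)) {
      warning("Variable '", v, "' has levels not used for fitting: ",
              paste(unique(values[is.new]), collapse = ", "),
              ". They have been set to NA.")
      values[is.new] <- NA
    }
    # Rebuild the factor with exactly the levels used for fitting.
    newdata[[v]] <- factor(values, levels = known)
  }
  newdata
}

# Toy data (made up for illustration).
train <- data.frame(y = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
                    x = c(1, 2, 3, 4, 5, 6),
                    g = factor(c("a", "b", "a", "b", "a", "b")))
fit <- lm(y ~ x + g, data = train)

# The second row uses level "c", which was not seen during fitting.
new.data <- data.frame(x = c(2.5, 3.5, 4.5),
                       g = factor(c("a", "c", "b")))
checked <- check_prediction_variables(fit, new.data)

# Step 2: call the underlying predict method on every row.
# na.pass keeps rows with missing predictors, so their predictions are NA.
predictions <- predict(fit, newdata = checked, na.action = na.pass)
```

Here the row with the unseen level keeps its place in the output but receives an NA prediction, so the result has one value per row of the prediction data.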