I have a dataset with approximately 200,000 observations, 10 predictors, and a continuous target. I have split the data into a training set (70%) and a test set (30%). I want to compare a GLM model against a random forest model by comparing their performance on the test set, so I would like to calculate the training and test errors for the GLM. So far I've tried 10-fold cross-validation on the training set:
```r
library(caret)

# 10-fold cross-validation on the training set
train.control <- trainControl(method = "cv", number = 10)
model <- train(y ~ ., data = train, method = "glm", trControl = train.control)
```
Then in the summary it says RMSE = 0.2915827. Is this the training error? How do I get the test error from this?
The RMSE reported in the summary of the trained model is the training error. Think about it in terms of what that model object has seen: it has only ever been fit on the training set.
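If the model was fit with caret's `train()`, you can also read the cross-validated training RMSE directly off the fitted object rather than scanning the printed summary. A minimal sketch, assuming your fitted object is named `model`:

```r
# Resampling metrics stored on a caret::train object.
# 'model' is assumed to be the object returned by train() above.
model$results              # data frame with RMSE, Rsquared, MAE per tuning setting
mean(model$resample$RMSE)  # average RMSE across the 10 CV folds
```

Note this is the *cross-validated* estimate (averaged over held-out folds), which is usually a more honest training-side figure than the error of the final model on the data it was fit to.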
To get a test error comparable to the training RMSE, use the `predict()` function and compute the RMSE by hand:
```r
# Note: predict() for a caret model takes 'newdata', not 'data'
predictions <- predict(model, newdata = test)
testRMSE <- sqrt(mean((predictions - test$y)^2))
testRMSE
```
where `test` is your test-set data frame and `y` is the column you are predicting.
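Since the goal is to compare a GLM against a random forest, the same pattern extends to both models. A sketch, assuming data frames named `train` and `test` with target column `y` (the `rf` method also requires the `randomForest` package to be installed):

```r
library(caret)

set.seed(42)  # for reproducible CV folds
train.control <- trainControl(method = "cv", number = 10)

glm.fit <- train(y ~ ., data = train, method = "glm", trControl = train.control)
rf.fit  <- train(y ~ ., data = train, method = "rf",  trControl = train.control)

# Helper: RMSE of a fitted model on new data
rmse <- function(fit, newdata) {
  p <- predict(fit, newdata = newdata)
  sqrt(mean((p - newdata$y)^2))
}

c(glm = rmse(glm.fit, test), rf = rmse(rf.fit, test))
```

Comparing the two test RMSEs on the same held-out 30% gives a fair head-to-head, since neither model saw that data during fitting or cross-validation.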