The constant term in linear regression analysis seems to be such a simple thing. Also known as the y intercept, it is simply the value at which the fitted line crosses the y-axis. While the concept is simple, I’ve seen a lot of confusion about interpreting the constant. That’s not surprising because the value of the constant term is almost always meaningless! Paradoxically, while the value is generally meaningless, it is crucial to include the constant term in most regression models!

In this post, I’ll show you everything you need to know about the constant in linear regression analysis. I’ll use fitted line plots to illustrate the concepts because they really bring the math to life. However, a 2D fitted line plot can only display the results from simple regression, which has one predictor variable and the response. The concepts hold true for multiple linear regression, but I can’t graph the higher dimensions that are required.

Zero Settings for All of the Predictor Variables Is Often Impossible

I’ve often seen the constant described as the mean response value when all predictor variables are set to zero. However, a zero setting for all predictors in a model is often an impossible or nonsensical combination, as it is in the following example.

In my last post about the interpretation of regression p-values and coefficients, I used a fitted line plot to illustrate a weight-by-height regression analysis. Below, I’ve changed the scale of the y-axis on that fitted line plot, but the regression results are the same as before.

If you follow the blue fitted line down to where it intercepts the y-axis, it is a fairly negative value. From the regression equation, we see that the intercept value is -114.3. If height is zero, the regression equation predicts that weight is -114.3 kilograms!

Clearly this constant is meaningless and you shouldn’t even try to give it meaning. No human can have zero height or a negative weight!

Now imagine a multiple regression analysis with many predictors. It becomes even more unlikely that ALL of the predictors can realistically be set to zero. If all of the predictors can’t be zero, it is impossible to interpret the value of the constant. Don’t even try!

Zero Settings for All of the Predictor Variables Can Be Outside the Data Range

Even if it’s possible for all of the predictor variables to equal zero, that data point might be outside the range of the observed data. You should never use a regression model to make a prediction for a point that is outside the range of your data because the relationship between the variables might change.

The value of the constant is a prediction for the response value when all predictors equal zero. If you didn’t collect data in this all-zero range, you can’t trust the value of the constant. The weight-by-height example illustrates this concept.
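
To make these points concrete, here is a minimal Python sketch using NumPy. It is an illustration under stated assumptions, not the analysis behind the fitted line plot: the data are synthetic, and only the intercept of -114.3 comes from the post. The slope, the height range, the noise level, and the helper names `predict` and `in_observed_range` are all made up for the example. The sketch fits a simple regression, shows that the constant is the model's prediction at a height of zero, and checks whether zero falls inside the range of the data.

```python
import numpy as np

# Hypothetical data: heights in meters, weights in kilograms.
# Only the intercept (-114.3) is taken from the post; the slope, the
# height range, and the noise level are assumptions for illustration.
rng = np.random.default_rng(0)
height = np.linspace(1.30, 1.70, 50)
weight = -114.3 + 106.5 * height + rng.normal(0.0, 2.0, size=height.size)

# Fit weight = constant + slope * height by ordinary least squares.
# np.polyfit returns coefficients from the highest degree down.
slope, constant = np.polyfit(height, weight, deg=1)


def predict(h):
    """Predicted weight (kg) at height h (m) from the fitted line."""
    return constant + slope * h


def in_observed_range(h):
    """True if h lies within the heights used to fit the model."""
    return bool(height.min() <= h <= height.max())


# The constant is the prediction when the predictor equals zero.
print(f"constant: {constant:.1f} kg")
print(f"prediction at height 0: {predict(0.0):.1f} kg")

# A height of zero is far outside the observed data, so that prediction
# is an extrapolation and should not be read as a real weight.
print("height 0 inside observed range:", in_observed_range(0.0))
print("height 1.5 inside observed range:", in_observed_range(1.5))
```

A range check like `in_observed_range` is one simple way to guard against extrapolation before trusting any prediction from the fitted line, including the constant itself.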