Chapter 1. Scatterplots and Regression

 

  1. Regression is the study of dependence.
  2. It asks how Y changes on average as the value of X is varied.
  3. Linear regression is an important instance of regression methodology and is the most commonly used.
  4. Virtually all other regression methods build upon an understanding of how linear regression works.
  5. The goal of regression is to understand how the values of Y change as X is varied over its range of possible values.
  6. One important function of the scatterplot is to decide if we might reasonably assume that the response on the vertical axis is independent of the predictor on the horizontal axis.
  7. The extreme values on the left and right of the horizontal axis are points that are likely to be important in fitting regression models and are called leverage points.
  8. Points that are separated from the rest in the vertical direction are potential outliers.
  9. Outliers are more easily discovered in residual plots.
  10. A residual plot gains resolution, because once the trend is removed the vertical axis only has to span the deviations from the fit.
  11. A residual plot is obtained by removing the expected linear or nonlinear trend from the data.
  12. Inheritance of height data (mheight, mother's height, and dheight, daughter's height), n = 1375:
    1. The relation looks reasonably linear.
  13. Forbes data (atmospheric pressure and boiling point of water), n = 17:
    1. The relation looks linear at first, but the residuals are not random.
    2. There is a small systematic deviation of the experimental values from the fitted OLS straight line.
    3. Based on physical theory, log(pres) is expected to be linearly related to bp.
    4. log10(pres) plotted against bp is observed to be reasonably linear.
    5. The choice of base has no material effect on the appearance of the graph or on the fitted regression model, but the interpretation of the parameters can depend on the choice of base.
    6. Transformation of variables is key to extending the usefulness of linear regression models.
  14. Length at age of smallmouth bass (length at capture in mm vs. age at capture), n = 439:
    1. Only fish of age 8 or less are considered for the plot.
    2. Annular rings on the scales are used to determine the age of a fish.
    3. The data are cross-sectional, meaning that all observations were taken at the same time.
    4. In a longitudinal study, the same fish would be measured each year, possibly requiring many years of taking measurements.
    5. The relation is not expected to be linear.
  15. Predicting the weather (early winter snowfall vs. late winter snowfall):
    1. Interest in this regression problem lies in testing the hypothesis that the two variables are uncorrelated (fit a horizontal line at the mean) against the alternative that they are correlated (fit the OLS line).
  16. Turkey growth (weight gain vs. dose):
    1. A straight line does not seem to be a reasonable representation of the average dependence of the response on the predictor.
  17. The OLS-estimated straight line is an estimate of the mean function, under the assumption that the mean function is linear.
  18. A nonlinear mean function might be more appropriate for growth models.
  19. We may have a parametric model for the mean function and use the data to estimate its parameters.
  20. The variance function also characterizes the graph; in many problems we assume, at least at first, that the variance function is constant.
  21. The null plot has a horizontal straight line as its mean function, constant variance function, and no separated points.
  22. Smoothers for the mean function: we can estimate E(Y|X=x) with a simple nonparametric smoother obtained by averaging the repeated observations at each value of X.
  23. Smoothers can also be defined when we do not have repeated observations at values of the predictor, by averaging the observed data for all values of X close to, but not necessarily equal to, x.
  24. The marginal relationships between the response and each of the predictors are not sufficient to understand the joint relationship between the response and more than one predictor at a time.
  25. The interrelationships between the predictors are also important.
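
The notes on residual plots and on the log transformation in the Forbes example can be illustrated with a small Python sketch. It does not use the Forbes measurements: the `bp` grid and the constants that generate `pres` are made up, and the data are noise-free so that the pattern is easy to see. The helper names `fit_ols` and `residuals` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def fit_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

def residuals(x, y):
    """Observed y minus the fitted OLS line, i.e. y with the linear trend removed."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    intercept, slope = fit_ols(x, y)
    return y - (intercept + slope * x)

# Synthetic, noise-free data (assumption: NOT the Forbes measurements).
# pres is exactly exponential in bp, so log10(pres) is exactly linear in bp.
bp = np.linspace(194.0, 212.0, 17)
pres = 10.0 ** (0.5 + 0.009 * bp)  # made-up constants

# Straight line fitted to the raw response: residuals are systematically curved.
res_raw = residuals(bp, pres)

# Straight line fitted to the transformed response: residuals vanish.
res_log10 = residuals(bp, np.log10(pres))

# Changing the base of the logarithm only rescales the fitted coefficients.
_, slope_log10 = fit_ols(bp, np.log10(pres))
_, slope_ln = fit_ols(bp, np.log(pres))

print("raw residuals at left, middle, right:", res_raw[0], res_raw[8], res_raw[-1])
print("largest |residual| after log10:", np.max(np.abs(res_log10)))
print("slope ratio ln / log10:", slope_ln / slope_log10)
```

In this sketch the raw residuals are positive at both ends and negative in the middle, which is the kind of systematic deviation a residual plot makes visible, while the slopes under the two bases differ only by the factor ln(10).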

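The two smoothers described in the notes can be sketched in a few lines of Python. The function names `mean_by_x` and `window_mean`, the `halfwidth` parameter, and the six data values are all assumptions made for illustration; the numbers are not taken from the smallmouth bass data.

```python
import numpy as np

def mean_by_x(x, y):
    """Estimate E(Y|X=x) by averaging the repeated observations at each distinct x."""
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    levels = np.unique(x)  # sorted distinct values of the predictor
    means = np.array([y[x == v].mean() for v in levels])
    return levels, means

def window_mean(x, y, x0, halfwidth):
    """Estimate E(Y|X=x0) by averaging y over all points with |x - x0| <= halfwidth."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    near = np.abs(x - x0) <= halfwidth
    if not near.any():
        return float("nan")  # no observations close enough to x0
    return float(y[near].mean())

# Made-up values for illustration only.
age = np.array([1, 1, 2, 2, 2, 3])
length = np.array([60.0, 80.0, 100.0, 110.0, 120.0, 150.0])

levels, means = mean_by_x(age, length)
print(levels, means)

# No fish has age 2.5, but nearby ages (2 and 3) still give an estimate.
print(window_mean(age, length, x0=2.5, halfwidth=0.6))
```

The window version trades resolution for stability: a wider `halfwidth` averages more points and gives a smoother but less local estimate of the mean function.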