Posts

Showing posts from July, 2023

PCA Exercise: Room Temperature Data

Data set reference: http://openmv.net/info/room-temperature
Description: Temperature measurements, in Kelvin, taken from 4 corners of a room.
Data source: Simulated data.
Data shape: 144 rows and 4 columns.
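As a sketch of how this exercise might begin, the snippet below standardizes the four temperature columns and runs PCA with scikit-learn. The local filename room-temperature.csv and the presence of exactly four numeric columns are assumptions based on the data shape above, not part of the original post.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Assumed local copy of the data set from http://openmv.net/info/room-temperature
df = pd.read_csv("room-temperature.csv")

# Keep only the numeric temperature columns (the 4 corners of the room);
# any date/time column in the file is dropped here.
X = df.select_dtypes("number")           # expected shape: (144, 4)

# PCA is scale-sensitive, so standardize each corner's readings first.
Z = StandardScaler().fit_transform(X)

pca = PCA()                              # keep all 4 components
scores = pca.fit_transform(Z)

# Fraction of total variance captured by each principal component.
print(pca.explained_variance_ratio_)
```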

Robust covariance matrix estimation in SPSS

In SPSS, this option produces the parameter estimates along with robust, or heteroskedasticity-consistent (HC), standard errors. When Parameter estimates with robust standard errors is selected, the following methods are available for robust covariance matrix estimation:

HC0: Based on the original asymptotic, or large-sample, robust, empirical, or "sandwich" estimator of the covariance matrix of the parameter estimates. The middle part of the sandwich contains squared OLS (ordinary least squares) or squared WLS (weighted least squares) residuals.

HC1: A finite-sample modification of HC0 that multiplies it by N/(N−p), where N is the sample size and p is the number of non-redundant parameters in the model.

HC2: A modification of HC0 that divides the squared residual by 1−h, where h is the leverage for the case.

HC3: A modification of HC0 that approximates a jackknife estimator; squared residuals are divided by the square of 1−h.

HC4: A modification of HC0 that divides the squared residuals by (1−h)^δ, where δ is the minimum of 4 and N·h/p.
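To make the definitions above concrete, here is a minimal NumPy sketch that builds each estimator directly from its "sandwich" formula for an ordinary least-squares fit. The simulated data and model are made up for illustration; in SPSS these estimators are selected through the dialog described above, not computed by hand.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 2                                  # sample size, number of parameters
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one predictor
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)      # OLS estimates
e = y - X @ beta                              # OLS residuals
XtX_inv = np.linalg.inv(X.T @ X)
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # leverages (hat-matrix diagonal)

def sandwich(w):
    """Sandwich estimator with 'meat' diag(w): (X'X)^-1 X' diag(w) X (X'X)^-1."""
    return XtX_inv @ (X * w[:, None]).T @ X @ XtX_inv

HC0 = sandwich(e**2)                          # squared OLS residuals in the middle
HC1 = (n / (n - p)) * HC0                     # finite-sample correction N/(N-p)
HC2 = sandwich(e**2 / (1 - h))                # divide by 1-h
HC3 = sandwich(e**2 / (1 - h)**2)             # divide by (1-h)^2 (jackknife-like)
delta = np.minimum(4, n * h / p)
HC4 = sandwich(e**2 / (1 - h)**delta)         # divide by (1-h)^delta

for name, V in [("HC0", HC0), ("HC1", HC1), ("HC2", HC2),
                ("HC3", HC3), ("HC4", HC4)]:
    print(name, np.sqrt(np.diag(V)))          # robust standard errors
```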

One Way ANOVA

Summary of the analysis technique: ANOVA examines differences in the mean values of the dependent variable associated with the effect of controlled independent variables, after taking into account the influence of uncontrolled independent variables. ANOVA requires a dependent variable that is metric (measured on an interval or ratio scale) and one or more independent variables, which must be categorical. In ANOVA, the categorical independent variables are called factors, and a particular combination of factor levels, or categories, is called a treatment. (https://www.statisticssolutions.com/anova-in-spss/)

To test whether the means differ, an ANOVA test compares the explained variance (caused by the input factor) to the unexplained variance (caused by the error source). If the ratio of explained variance to unexplained variance is high, the means are statistically different. (https://www.ibm.com/docs/en/cognos-analytics/11.1.0)
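A short illustration of that explained-versus-unexplained variance ratio, using SciPy on made-up treatment groups (the data are hypothetical, chosen only to show the computation):

```python
import numpy as np
from scipy import stats

# Three toy treatment groups (made-up data for illustration).
g1 = np.array([5.1, 4.9, 5.4, 5.0, 5.2])
g2 = np.array([5.9, 6.1, 5.8, 6.3, 6.0])
g3 = np.array([5.0, 5.3, 4.8, 5.1, 5.2])
groups = [g1, g2, g3]

# F = (explained variance per factor df) / (unexplained variance per error df)
F, p_value = stats.f_oneway(*groups)
print(f"F = {F:.3f}, p = {p_value:.4f}")

# The same ratio built from sums of squares, to make the decomposition explicit.
all_y = np.concatenate(groups)
grand = all_y.mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)  # explained
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)        # unexplained
df_between, df_within = len(groups) - 1, len(all_y) - len(groups)
print((ss_between / df_between) / (ss_within / df_within))          # matches F
```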

Define contrast in SPSS

The following contrast types are provided in SPSS:

Indicator: Contrasts indicate the presence or absence of category membership. By default, the Reference group is the first category (represented in the contrast matrix as a row of zeros).

Deviation: Compares the mean of each level (except a reference category) to the mean of all of the levels (the grand mean). The levels of the factor can be in any order. Reference group lets you select either the first or last group as the reference; the Preview pane displays information based on your selection.

Simple: Compares the mean of each level to the mean of a specified level. This type of contrast is useful when there is a control group. Reference group lets you select either the first or last group as the reference; the Preview pane displays information based on your selection.

Difference: Compares the mean of each level (except the first) to the mean of the previous levels (sometimes called reverse Helmert contrasts). The Preview pane displays information based on your selection.
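The weight matrices below sketch what the Deviation, Simple, and Difference contrasts compare, for a hypothetical four-level factor with the first level as the reference group. This is illustrative NumPy, not SPSS output, and the level count is an assumption:

```python
import numpy as np

k = 4                   # a four-level factor; level 1 is the reference group
means = np.eye(k)       # row i = weight vector picking out the mean of level i+1

# Deviation: each level (except the reference) vs. the grand mean.
deviation = np.array([means[i] - np.full(k, 1 / k) for i in range(1, k)])

# Simple: each level vs. the reference level (useful with a control group).
simple = np.array([means[i] - means[0] for i in range(1, k)])

# Difference (reverse Helmert): each level (except the first) vs. the mean
# of all previous levels.
difference = np.array([means[i] - means[:i].mean(axis=0) for i in range(1, k)])

for name, L in [("Deviation", deviation), ("Simple", simple),
                ("Difference", difference)]:
    print(name, L, sep="\n")   # each row: weights applied to the level means
```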

Chapter 2. Simple Linear Regression

The simple linear regression model consists of a mean function and a variance function.

Mean function: E(Y|X=x) = β₀ + β₁x. The values of the parameters are usually unknown and must be estimated from data.

Variance function: Var(Y|X=x) = σ². In SLR, the variance function is assumed to be constant, with a positive value σ² that is usually unknown.

yᵢ, the observed value of the ith response, will typically not equal its expected value E(Y|X=xᵢ) because σ² > 0:

yᵢ = E(Y|X=xᵢ) + eᵢ, where eᵢ is the statistical error (an implicit equation for eᵢ).

eᵢ can be defined explicitly as eᵢ = yᵢ − E(Y|X=xᵢ) = yᵢ − (β₀ + β₁xᵢ).

The errors eᵢ depend on the unknown parameters of the mean function and so are not observable quantities; they are random variables. Assumption on eᵢ: E(eᵢ|xᵢ) = 0 (the mean of the statistical errors is 0), so a scatterplot of the eᵢ against the xᵢ would be a null plot, with no patterns. Assumption on eᵢ: they are independent. Assumption on eᵢ: expected t…
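A small simulation may help fix these definitions. The sketch below generates data from a known mean function (the parameter values β₀ = 2, β₁ = 0.5, and σ = 1 are made up), estimates β₀ and β₁ by least squares, and checks that the fitted residuals, which estimate the unobservable eᵢ, average to roughly zero:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 30)
beta0, beta1, sigma = 2.0, 0.5, 1.0               # assumed true parameters
y = beta0 + beta1 * x + rng.normal(0, sigma, size=x.size)  # yᵢ = E(Y|X=xᵢ) + eᵢ

# Least-squares estimates of the mean-function parameters.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Fitted residuals estimate the unobservable statistical errors eᵢ.
e_hat = y - (b0 + b1 * x)
print(b0, b1)          # estimates of β₀ and β₁
print(e_hat.mean())    # ≈ 0, consistent with the assumption E(eᵢ|xᵢ) = 0
```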

Chapter 1. Scatterplots and Regression

Regression is the study of dependence: how Y changes, on average, as the value of X is varied. Linear regression is an important instance of regression methodology and the most commonly used; virtually all other regression methods build on an understanding of how linear regression works. The goal of regression is to understand how the values of Y change as X is varied over its range of possible values.

One important function of the scatterplot is to decide whether we might reasonably assume that the response on the vertical axis is independent of the predictor on the horizontal axis. The extreme values at the left and right of the horizontal axis are points that are likely to be important in fitting regression models and are called leverage points. Separated points on the vertical axis are potential outliers. Outliers are more easily discovered in residual plots, which gain resolution: a residual plot is obtained by removing the expected linear or nonlinear trend, so the remaining variation can be seen at a finer scale.
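The gain in resolution is easy to see with simulated data: the sketch below draws a raw scatterplot next to a residual plot from which the fitted linear trend has been removed (the data-generating values are invented for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 3 + 0.4 * x + rng.normal(0, 0.5, size=x.size)  # simulated (X, Y) pairs

b1, b0 = np.polyfit(x, y, 1)       # fit the linear trend (slope, intercept)
residuals = y - (b0 + b1 * x)      # remove the expected trend

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.scatter(x, y)                  # raw scatterplot: the trend dominates
ax1.set(xlabel="X", ylabel="Y", title="Scatterplot")
ax2.scatter(x, residuals)          # residual plot: zoomed to the scatter
ax2.axhline(0, color="gray")
ax2.set(xlabel="X", ylabel="Residual", title="Residual plot")
plt.show()
```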