Posts

Showing posts from June, 2023

Scope of inference

The scope of any inference is constrained by whether the study has a random sample (RS) and/or random assignment (RA). Table 1-1 contains the four possible combinations of these two characteristics of a given study. Random assignment allows causal inferences about the differences that are observed: the difference in treatment levels causes the differences in the mean responses. Random sampling (or at least some sort of representative sample) allows inferences to be made to the population of interest. If we do not have RA, then causal inferences cannot be made. If we do not have a representative sample, then our inferences are limited to the sampled subjects.

Reference: Hypothesis testing (general) - Statistics with R (montana.edu)
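The four RS/RA combinations can be sketched as a small lookup. This is only an illustration of the rules stated in the text; the function name and the wording of the returned strings are assumptions, not taken from the referenced book.

```python
def scope_of_inference(random_sample: bool, random_assignment: bool) -> str:
    """Return the scope of inference for a study design (RS and/or RA)."""
    # RA decides whether causal language is justified.
    kind = "causal" if random_assignment else "association only (not causal)"
    # RS (or a representative sample) decides how far the result generalizes.
    reach = "population of interest" if random_sample else "sampled subjects only"
    return f"{kind}; applies to {reach}"


# Print all four combinations of random sampling and random assignment.
for rs in (True, False):
    for ra in (True, False):
        print(f"RS={rs!s:5} RA={ra!s:5} -> {scope_of_inference(rs, ra)}")
```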

Snapshot approach to rational subgroups (Xbar, R chart)

A fundamental idea in the use of control charts is the collection of sample data according to what Shewhart called the rational subgroup concept. To illustrate this concept, suppose that we are using an Xbar control chart to detect changes in the process mean. Then the rational subgroup concept means that subgroups or samples should be selected so that, if assignable causes are present, the chance for differences between subgroups will be maximized, while the chance for differences due to these assignable causes within a subgroup will be minimized. The rational subgroup concept is very important. The proper selection of samples requires careful consideration of the process, with the objective of obtaining as much useful information as possible from the control chart analysis. In the snapshot approach to rational subgroups, each sample consists of units that were produced at the same time (or as closely together as possible), for example five consecutive units. This approach is
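As a rough illustration of how snapshot subgroups feed an Xbar and R chart, the sketch below computes the usual control limits from subgroup means and ranges. The data are made up, the function name is hypothetical, and the constants A2, D3 and D4 are the standard tabulated values for subgroups of size 5.

```python
from statistics import mean

# Standard control-chart constants for subgroups of size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the Xbar chart and for the R chart."""
    xbars = [mean(g) for g in subgroups]             # subgroup means
    ranges = [max(g) - min(g) for g in subgroups]    # subgroup ranges
    xbarbar, rbar = mean(xbars), mean(ranges)
    xbar_chart = (xbarbar - A2 * rbar, xbarbar, xbarbar + A2 * rbar)
    r_chart = (D3 * rbar, rbar, D4 * rbar)
    return xbar_chart, r_chart


# Hypothetical data: each row is five consecutive units taken at one time.
snapshots = [
    [10.1, 10.0, 9.9, 10.2, 10.0],
    [10.0, 10.1, 10.1, 9.9, 10.0],
    [10.3, 10.2, 10.4, 10.3, 10.2],
]
xbar_chart, r_chart = xbar_r_limits(snapshots)
print("Xbar chart (LCL, CL, UCL):", xbar_chart)
print("R chart    (LCL, CL, UCL):", r_chart)
```

Because each subgroup is taken at one instant, the average range reflects only short-term variation, which is what sets the width of the Xbar limits here.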

Random sample approach to rational subgroups (Xbar, R chart)

A fundamental idea in the use of control charts is the collection of sample data according to what Shewhart called the rational subgroup concept. To illustrate this concept, suppose that we are using an Xbar control chart to detect changes in the process mean. Then the rational subgroup concept means that subgroups or samples should be selected so that, if assignable causes are present, the chance for differences between subgroups will be maximized, while the chance for differences due to these assignable causes within a subgroup will be minimized. The rational subgroup concept is very important. The proper selection of samples requires careful consideration of the process, with the objective of obtaining as much useful information as possible from the control chart analysis. In the random sample approach to rational subgroups, a random sample of a defined number of units (here, 5 units) is collected from all process output produced over the sampling interval. This method of rational subgrouping
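A minimal sketch of the random sample approach, assuming the full output of each sampling interval is available as a list. The function name, the seed, and the data are hypothetical and serve only to show the sampling step.

```python
import random


def random_subgroups(output_by_interval, n=5, seed=0):
    """Draw one random subgroup of n units from each interval's full output."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return [rng.sample(units, n) for units in output_by_interval]


# Hypothetical process output: 20 units produced in each of 3 intervals.
output_by_interval = [
    [10.0 + 0.01 * i for i in range(20)],
    [10.1 + 0.01 * i for i in range(20)],
    [10.2 + 0.01 * i for i in range(20)],
]
subgroups = random_subgroups(output_by_interval)
for k, g in enumerate(subgroups, start=1):
    print(f"interval {k}: range = {max(g) - min(g):.2f}")
```

Unlike a snapshot, each subgroup here can span the whole interval, so any drift within the interval shows up in the subgroup range.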

Whether to pool variance or not..

Copied from: Unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test | Behavioral Ecology | Oxford Academic (oup.com)

The Student's t-test performs badly when the variances are actually unequal, both in terms of Type I and Type II errors. Unequal variances are less problematic if sample sizes are similar. We see that the Type I error rate of the unequal variance t-test never deviates far from the nominal 5% value, whereas the Type I error rate for the Student's t-test was over 3 times the nominal rate when the higher variance was associated with the smaller sample size, and less than a quarter of the nominal rate when the higher variance was associated with the larger sample size. Thus, the unequal variance t-test performs as well as, or better than, the Student's t-test in terms of control of both Type I and Type II error rates whenever the underlying distributions are normal. The unequal variance t-test has no perform
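To make the difference between the two tests concrete, here is a small sketch that computes the test statistic and degrees of freedom both ways. It uses only the standard formulas (pooled variance for Student's test, the Welch-Satterthwaite approximation for the unequal variance test); the function names and sample values are hypothetical, and no p-values are computed.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for the unequal variance (Welch) t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)  # squared std errors
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def student_t(a, b):
    """Return (t, df) for Student's t-test with a pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical samples: the smaller sample has the larger variance.
small_wide = [4.0, 9.0, 1.0, 12.0]
large_tight = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.1, 5.0, 4.9, 5.0]
print("Welch  :", welch_t(small_wide, large_tight))
print("Student:", student_t(small_wide, large_tight))
```

With equal sample sizes and equal sample variances the two functions return the same t statistic, which matches the remark that unequal variances matter less when sample sizes are similar.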

Pre-treatment of data (Prior to PCA).

Because PCA is a maximum variance projection method, it follows that a variable with a large variance is more likely to be expressed in the modeling than a low-variance variable. In order to give variables equal weight in the data analysis, we standardize them. Standardization is also known as "Scaling" or "Weighting", and means that the length of each co-ordinate axis in the variable space is regulated according to a pre-determined criterion. The first time a dataset is analyzed, it is recommended to set each variable axis to equal length. The most common criterion is that each variable axis be scaled to the same variance (Unit Variance). In Unit Variance (UV) scaling, for each variable (k-th column) one calculates the standard deviation (Sk) and obtains the scaling weight as the inverse standard deviation (1/Sk). Subsequently, each column of X is multiplied by 1/Sk. Each scaled variable then has equal (unit) variance. UV scaling is also called 'Au
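The UV scaling step described above can be sketched as follows. The variable names and values are hypothetical; the code only multiplies each column by 1/Sk, as in the text, and does not center the data.

```python
from statistics import stdev


def uv_scale(columns):
    """Scale each variable (column) by 1/Sk so that it has unit variance."""
    scaled = []
    for col in columns:
        weight = 1.0 / stdev(col)          # scaling weight 1/Sk
        scaled.append([x * weight for x in col])
    return scaled


# Hypothetical data: two variables on very different scales.
X = {
    "temperature": [20.0, 22.0, 21.0, 25.0, 23.0],
    "pressure": [1000.0, 1500.0, 1200.0, 2000.0, 1700.0],
}
for name, col in zip(X, uv_scale(list(X.values()))):
    print(f"{name}: sd before = {stdev(X[name]):.2f}, sd after = {stdev(col):.2f}")
```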

Vignettes for BND, BsND

Reference
https://demonstrations.wolfram.com/TheBivariateNormalDistribution/
https://www.statology.org/bivariate-normal-distribution-in-r/
https://blog.revolutionanalytics.com/2016/08/simulating-form-the-bivariate-normal-distribution-in-r-1.html
https://bookdown.org/kevin_davisross/probsim-book/bivariate-normal-distributions.html
https://online.stat.psu.edu/stat505/book/export/html/656
https://community.jmp.com/t5/Discussions/Generating-random-data-based-on-correlation-matrix/td-p/8918
https://search.r-project.org/CRAN/refmans/fMultivar/html/bvdist-norm2d.html

PCA on Bivariate Data, Comparison of using Covariance and Correlation

On PCA as Transformation

PCA transforms the original correlated variables, xs, into uncorrelated transformed variables, pcs. The rotation required to explain maximum variance is based on the eigenvectors.

Selecting the type of matrix used to calculate the principal components (transformed variables):
Covariance: Use when your variables use the same scale, or when your variables have different scales but you want to give more emphasis to variables with higher variances.
Correlation: Use when your variables have different scales and you want to weight all the variables equally.

Using Covariance

Data Set: Method 1 vs Method 2 in chapter 1 of "A user's Guide to PCA by Edward Jackson". Generate the covariance matrix, S, from the original variables/observations. Get the eigenvalue diagonal matrix, L. Get the eigenvector matrix, U. Get the mean-centered variables, CentX. Calculate the PCs, Y, as a linear combination of the centered variables, CentX, using the entries of the eigenvector matrix, U, as coef
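The covariance-based steps (S, L, U, CentX, Y) can be sketched for the bivariate case without any linear algebra library, because the eigenvectors of a symmetric 2x2 matrix form a rotation whose angle has a closed form. The data below are made up (they are not the Method 1 vs Method 2 data set), and the function name is hypothetical.

```python
from math import atan2, cos, sin
from statistics import mean, variance


def pca_2d_covariance(x1, x2):
    """PCA of two variables from their covariance matrix S.

    Returns (L, U, Y): eigenvalues, eigenvectors (as columns), and PC scores.
    """
    n = len(x1)
    c1 = [v - mean(x1) for v in x1]                  # CentX, column 1
    c2 = [v - mean(x2) for v in x2]                  # CentX, column 2
    s11, s22 = variance(x1), variance(x2)            # diagonal of S
    s12 = sum(a * b for a, b in zip(c1, c2)) / (n - 1)
    # For a symmetric 2x2 matrix the eigenvectors are a rotation by theta.
    theta = 0.5 * atan2(2 * s12, s11 - s22)
    U = [[cos(theta), -sin(theta)],
         [sin(theta), cos(theta)]]
    # Y = CentX @ U: each PC is a linear combination of the centered variables.
    Y = [(a * U[0][0] + b * U[1][0], a * U[0][1] + b * U[1][1])
         for a, b in zip(c1, c2)]
    # The eigenvalues equal the variances of the PC scores.
    L = [variance([y[0] for y in Y]), variance([y[1] for y in Y])]
    return L, U, Y


# Hypothetical bivariate data.
x1 = [2.0, 4.0, 6.0, 8.0, 10.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0]
L, U, Y = pca_2d_covariance(x1, x2)
print("eigenvalues:", L)
print("first eigenvector:", (U[0][0], U[1][0]))
```

To base the analysis on the correlation matrix instead, one option is to divide each variable by its standard deviation first and pass the scaled variables to the same function.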

Area of Parallelogram and Determinant of matrix, and linear transformation

Above from ref. 11; below from ref. 12.

A determinant is a property of a square matrix. The value of the determinant has many implications for the matrix. A determinant of 0 implies that the matrix is singular, and thus not invertible. A system of linear equations can be solved by forming a matrix from the coefficients and using determinants; this method is called Cramer's rule, and it can only be used when the determinant is not equal to 0. Geometrically, for a 2-by-2 matrix, the determinant represents the signed area of the parallelogram formed by the column vectors taken as Cartesian coordinates. There are many methods used for computing the determinant. Some matrices, such as diagonal or triangular matrices, can have their determinants computed by taking the product of the elements on the main diagonal. For a 2-by-2 matrix, the determinant is calculated by subtracting the product of the reverse diagonal from the product of the main diagonal, which is known as the Leibniz formula. The determinant of the product of matrices
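A short sketch of the 2-by-2 case, assuming matrices are stored as nested lists. The helper names and example matrices are hypothetical. The last line checks the standard multiplicative property, det(AB) = det(A) det(B), on one example.

```python
def det2(m):
    """Determinant of a 2x2 matrix [[a, b], [c, d]]: a*d - b*c."""
    (a, b), (c, d) = m
    return a * d - b * c


def matmul2(m, n):
    """Product of two 2x2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[3, 1], [0, 2]]   # columns (3, 0) and (1, 2) span a parallelogram
B = [[0, 1], [1, 0]]   # swaps the axes, so it flips orientation
print(det2(A))               # signed area of the parallelogram: 6
print(det2(B))               # -1: area 1, orientation reversed
print(det2(matmul2(A, B)))   # equals det2(A) * det2(B)
```

Since A is triangular, its determinant is also just the product of its diagonal entries, 3 * 2.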

Vignettes for Matrix concepts, related operations

Geometric Transformations: compute the matrix of a rotation transformation and visualize it. Wolfram|Alpha Examples: Geometric Transformations (wolframalpha.com)

Reference
Change of Basis – GeoGebra
Eigenvalues of a Matrix Calculator - Online Eigen Values λ Finder (dcode.fr)
Eigenvectors of a Matrix Calculator (with Eigenvalues) - Online (dcode.fr)
Matrix Visualization | Sarah Greer (sygreer.com)
Matrix visualizer (utexas.edu): gives rotation by angle.
Eigenvalue Calculator: Wolfram|Alpha (wolframalpha.com)
Determinant Calculator: Wolfram|Alpha (wolframalpha.com)

Covariance matrix as product of rotation matrix and scaling matrix/factor

  Reference A geometric interpretation of the covariance matrix (visiondummy.com)
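The idea in the title can be sketched in two dimensions: with a rotation matrix R and a diagonal scaling matrix S, the covariance matrix is Sigma = R S S R^T. The function name, the angle, and the scale factors below are hypothetical choices for illustration.

```python
from math import cos, sin, radians


def covariance_from_rotation_scaling(theta_deg, sx, sy):
    """Build the 2x2 covariance matrix Sigma = R S S R^T."""
    c, s = cos(radians(theta_deg)), sin(radians(theta_deg))
    vx, vy = sx ** 2, sy ** 2            # S S is diagonal with the variances
    # Multiply out R diag(vx, vy) R^T with R = [[c, -s], [s, c]].
    return [[c * c * vx + s * s * vy, c * s * (vx - vy)],
            [c * s * (vx - vy), s * s * vx + c * c * vy]]


# No rotation: the covariance matrix is diagonal.
print(covariance_from_rotation_scaling(0, 2.0, 1.0))
# Rotating by 45 degrees introduces a nonzero covariance term.
print(covariance_from_rotation_scaling(45, 2.0, 1.0))
```

The rotation changes the individual entries but not the trace or the determinant, so the total variance and the squared area scaling stay the same.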