rdfs:comment
  Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components, and several related procedures are called principal component analysis (PCA). Robust and L1-norm-based variants of standard PCA have also been proposed.
 & Williams, L.J. (2010). "Principal component analysis". Wiley Interdisciplinary Reviews: Computational Statistics. 2 (4): 433–459. arXiv:1108.4372. doi:10.1002/wics.101. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (and therefore contain the data variance), then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors;
 In machine learning, principal component analysis (PCA) is a method to project data from a higher-dimensional space into a lower-dimensional space by maximizing the variance of each dimension. Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components.
 In statistics, principal component analysis (PCA) is a method to project data from a higher-dimensional space into a lower-dimensional space by maximizing the variance of each dimension. Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA).
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data.
 Principal component analysis (PCA) is the process of computing the directions (principal components) that align with most of the variation in a set of data points in a multidimensional space, and using some or all of these components to perform a change of basis on the data. Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line, or equivalently that aligns with the largest possible amount of the data variance. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. The ordered collection o
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. Similarly, a direction for each subsequent best-fitting line can be chosen from directions perpendicular to the earlier best-fitting lines. This process defines an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. Similarly, a direction for each subsequent best-fitting line can be chosen from directions perpendicular to the earlier best-fitting lines. This process defines an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional real space and each i from 1 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i - 1 best-fitting lines. These directions comprise an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Usually, PCA refers to the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional real space and each i from 1 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i - 1 best-fitting lines. These directions comprise an orthonormal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Usually, PCA refers to the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional real space and each i from 1 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i - 1 best-fitting lines. These directions comprise an orthonormal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components, and several related procedures are called principal component analysis (PCA). Usually, PCA refers to the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or minimizes the sum of squared residuals orthogonal to the direction; each subsequent principal component is a direction orthogonal to the first that best fits the residual. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible.
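 The two criteria named above (maximal variance of the projected data and minimal sum of squared orthogonal residuals) pick out the same direction, which follows from the Pythagorean theorem. As a sketch, assume the data points x_1, ..., x_n have been mean-centred and w is a unit direction vector; these symbols are introduced here for illustration only and are not part of the original text.

```latex
\[
  \|x_i\|^2 \;=\; (x_i^{\top} w)^2 \;+\; \bigl\|\, x_i - (x_i^{\top} w)\, w \,\bigr\|^2
  \qquad \text{for each } i \text{ and any unit vector } w,
\]
\[
  \sum_{i=1}^{n} \bigl\|\, x_i - (x_i^{\top} w)\, w \,\bigr\|^2
  \;=\; \sum_{i=1}^{n} \|x_i\|^2 \;-\; \sum_{i=1}^{n} (x_i^{\top} w)^2 .
\]
```

 The first sum on the right does not depend on w, so choosing w to minimize the total squared residual on the left is the same as choosing w to maximize the sum of squared projections, which for centred data is proportional to the variance of the projected data.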
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or minimizes the sum of squared errors to the vector; the subsequent principal components are calculated similarly and are orthogonal to the previous principal components. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The principal components orthonormal basis
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or minimizes the sum of squared errors to the vector; the subsequent principal components are calculated similarly and are orthogonal to the previous principal components. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The principal components form the orthonor
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or minimizes the total squared distance from the data points to the line along that vector; the subsequent principal components are calculated similarly and are orthogonal to the previous principal components. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The principal components form the
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or minimizes the sum of squared residuals orthogonal to the direction (that is, the sum or average of squared distances from the line through the origin in the direction of the direction vector). Each subsequent principal component is a direction orthogonal to the first that best fits the residual. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obta
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or minimizes the sum of squared residuals orthogonal to the direction (that is, minimizes the sum or average of squared distances from the line through the origin in the direction of the direction vector). Each subsequent principal component is a direction orthogonal to the first that best fits the residual. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal componen
 The first principal component of a set of data points in a multidimensional space is the direction of a line that best fits the data, in that it maximizes the variance of the projected data or minimizes the sum of squared residuals orthogonal to the direction (that is, minimizes the sum or average of squared distances from the line through the origin in the direction of the direction vector). Each subsequent principal component is a direction orthogonal to the first that best fits the residual. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal compo
 The first principal component of a set of data points in a multidimensional space is the direction of a line that best fits the data, in that it maximizes the variance of the projected data or minimizes the sum of squared distances from points to the line. Each subsequent principal component is a direction of a line that minimizes the sum of squared distances and is orthogonal to all preceding principal components. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variati
 The first principal component of a set of data points in a multidimensional space is the direction of a line that best fits the data, in that it maximizes the variance of the projected data or, equivalently, minimizes the sum of squared distances from points to the line. Each subsequent principal component is a direction of a line that minimizes the sum of squared distances and is orthogonal to all preceding principal components. Principal component analysis or PCA is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variati
 The principal components of a collection of points in a real coordinate space are a sequence of vectors where the i-th element is the direction of a line that best fits the data and is orthogonal to the first i - 1 elements. Here, a best-fitting line is defined as a line that minimizes the average squared distance from a point to the line. These directions comprise an orthonormal basis in which different individual dimensions of the data are uncorrelated. PCA is the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real coordinate space are a sequence of vectors where the i-th element is the direction of a line that best fits the data while being orthogonal to the first i - 1 elements. Here, a best-fitting line is defined as a line that minimizes the average squared distance from a point to the line. These directions comprise an orthonormal basis in which different individual dimensions of the data are uncorrelated. PCA is the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real n-space are a sequence of n direction vectors where the i-th element is the direction of a line that best fits the data while being orthogonal to the first i - 1 elements. Here, a best-fitting line is defined as a line that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors where the i-th element is the direction of a line that best fits the data while being orthogonal to the first i - 1 elements. Here, a best-fitting line is defined as a line that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as a line that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as a line that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data. In some cases, retaining the first few principal components leads to noise reduction, discovering systematic variation, or approximating latent variable models. As one of the most popula
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest.
 The principal components of a collection of points in a real p-space are a sequence of p direction vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i - 1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared distance from the points to the line. These directions constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Principal component analysis (PCA) is the process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest.
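 The one-direction-at-a-time definition above can be sketched in code. This is a minimal illustration rather than a reference implementation: it assumes the points are given as rows of numbers and uses power iteration on the sample covariance matrix, projecting out earlier directions at every step, which is only one of several ways to obtain the components. The function name `principal_components`, its parameters, and the sample data are hypothetical.

```python
import math
import random


def principal_components(points, n_components, iterations=500, seed=0):
    """Return n_components orthonormal directions, found one at a time.

    Illustrative sketch: power iteration on the sample covariance matrix,
    with each new direction kept orthogonal to the ones already found.
    """
    n, p = len(points), len(points[0])
    means = [sum(row[j] for row in points) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in points]
    # Sample covariance matrix (p x p) of the mean-centred data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def orthogonalize(w, basis):
        # Subtract the part of w that lies along each earlier component.
        for c in basis:
            d = dot(w, c)
            w = [a - d * b for a, b in zip(w, c)]
        return w

    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        w = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for _ in range(iterations):
            w = orthogonalize(w, components)
            v = [dot(row, w) for row in cov]  # multiply by the covariance matrix
            norm = math.sqrt(dot(v, v))
            if norm < 1e-12:
                break  # no variance left outside the earlier components
            w = [a / norm for a in v]
        # Final clean-up so the result is orthonormal up to rounding.
        w = orthogonalize(w, components)
        norm = math.sqrt(dot(w, w))
        components.append([a / norm for a in w])
    return components


# Hypothetical sample: ten points in the plane lying roughly along a line.
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
          (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]
first, second = principal_components(points, 2)
```

 Here `first` points roughly along the diagonal trend of the sample and `second` is perpendicular to it; the sign of each direction is arbitrary, since a direction and its negative describe the same line.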

has abstract
  Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components, and several related procedures are called principal component analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either in the following two steps:
1. calculating the data covariance (or correlation) matrix of the original data
2. performing eigenvalue decomposition on the covariance matrix
or by singular value decomposition of a design matrix. Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (and therefore contain the data variance), then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for eigenvectors of a slightly different matrix.
PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
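The two computational routes described above (eigenvalue decomposition of the covariance matrix, or singular value decomposition of the mean-centred data matrix) can be compared side by side. The following is a minimal sketch using NumPy, which the original text does not mention; the synthetic data, the mixing matrix, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 observations of 3 correlated variables.
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 2.0, 0.0],
                   [0.5, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Mean centring: subtract each variable's mean from its values.
Xc = X - X.mean(axis=0)

# Route 1: covariance matrix, then eigenvalue decomposition.
cov = np.cov(Xc, rowvar=False)            # sample covariance (divides by n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]         # reorder to descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: singular value decomposition of the centred data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_variances = s**2 / (Xc.shape[0] - 1)  # squared singular values -> variances

# Dimensionality reduction: keep the scores on the first two components.
scores = Xc @ eigvecs[:, :2]
```

Under these assumptions the rows of `Vt` match the columns of `eigvecs` up to sign, and `svd_variances` matches `eigvals`. The SVD route avoids forming the covariance matrix explicitly, which is often preferred numerically, while the covariance route makes the two-step description above more visible.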
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components, and several related procedures are called principal component analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either by singular value decomposition of a design matrix or in the following two steps:
1. calculating the data covariance (or correlation) matrix of the original data
2. performing eigenvalue decomposition on the covariance matrix
Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (and therefore contain the data variance), then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA).
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components, and several related procedures are called principal component analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either by singular value decomposition of a design matrix or by the following two steps:
* calculating the data covariance (or correlation) matrix of the original data
* performing eigenvalue decomposition on the covariance matrix
Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
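The two-step route described above (covariance matrix, then eigenvalue decomposition, applied to mean-centered data) can be sketched in Python with NumPy. This is a minimal illustration rather than a reference implementation: the function name `pca_eig`, the `standardize` flag, and the synthetic data are assumptions introduced here and do not come from the source text.

```python
import numpy as np

def pca_eig(X, standardize=False):
    """PCA via eigendecomposition of the covariance matrix (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    # Mean centering: subtract each variable's mean from its values.
    Xc = X - X.mean(axis=0)
    if standardize:
        # Optional z-scoring: scale each variable to unit sample variance.
        Xc = Xc / Xc.std(axis=0, ddof=1)
    # Step 1: covariance matrix (the correlation matrix when standardized).
    cov = np.cov(Xc, rowvar=False)
    # Step 2: eigendecomposition; eigh is intended for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues; reorder to descending variance.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Component scores: the centered data expressed in the new basis.
    scores = Xc @ eigvecs
    return eigvals, eigvecs, scores

# Synthetic example data (an assumption, for illustration only).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

eigvals, components, scores = pca_eig(X)
# Covariance of the scores: diagonal if the new dimensions are uncorrelated.
score_cov = np.cov(scores, rowvar=False)
```

In this sketch the columns of `components` are the unit-length principal directions, and `score_cov` should be diagonal up to rounding, which is the "uncorrelated dimensions" property stated in the text.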
 & Williams, L.J. (2010). "Principal component analysis". Wiley Interdisciplinary Reviews: Computational Statistics. 2 (4): 433–459. arXiv:1108.4372. doi:10.1002/wics.101.</ref> The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
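The two scaling conventions for scores and loadings described above can be compared numerically. The sketch below takes one common reading of the text, in which the variance-carrying loadings are the eigenvectors scaled by the square roots of the eigenvalues; that reading, the variable names, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data with unequal variances (illustration only).
X = rng.normal(size=(100, 3)) * np.array([2.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Convention A: unit-length weights (eigenvectors); scores carry the variance.
raw_scores = Xc @ eigvecs

# Convention B: unit-variance scores; loadings carry the variance.
std_scores = raw_scores / np.sqrt(eigvals)
loadings = eigvecs * np.sqrt(eigvals)

# Either pairing reproduces the centered data.
recon_a = raw_scores @ eigvecs.T
recon_b = std_scores @ loadings.T
```

Under convention B the squared column norms of `loadings` equal the eigenvalues, which is one way to read "loadings must contain the data variance".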
 In machine learning, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional space by maximizing the variance of each dimension. Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either by singular value decomposition of a design matrix or by the following two steps:
* calculating the data covariance (or correlation) matrix of the original data
* performing eigenvalue decomposition on the covariance matrix
Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional space by maximizing the variance of each dimension. Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either by singular value decomposition of a design matrix or by the following two steps:
* calculating the data covariance (or correlation) matrix of the original data
* performing eigenvalue decomposition on the covariance matrix
Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either by singular value decomposition of a design matrix or by the following two steps:
* calculating the data covariance (or correlation) matrix of the original data
* performing eigenvalue decomposition on the covariance matrix
Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either by singular value decomposition of a design matrix or by the following two steps:
* calculating the data covariance (or correlation) matrix of the original data
* performing eigenvalue decomposition on the covariance matrix
Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is often used to visualize genetic distance and relatedness between populations. PCA is done either by singular value decomposition of a design matrix or by the following two steps:
* calculating the data covariance (or correlation) matrix of the original data
* performing eigenvalue decomposition on the covariance matrix
Usually the original data is normalized before performing the PCA. The normalization of each attribute consists of mean centering – subtracting its variable's measured mean from each data value so that its empirical mean (average) is zero. Some fields, in addition to normalizing the mean, do so for each variable's variance (to make it equal to 1); see z-scores. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction, by projecting each data point onto only the first few principal components. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component is the direction that maximizes the variance of the projected data and is orthogonal to the first i − 1 principal components. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using an eigendecomposition of the data covariance matrix or a singular value decomposition (SVD) of the data matrix. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction, by projecting each data point onto only the first few principal components. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction that maximizes the variance of the projected data and is orthogonal to the first i − 1 principal components. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using an eigendecomposition of the data covariance matrix or a singular value decomposition (SVD) of the data matrix. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score). If component scores are standardized to unit variance, loadings must contain the data variance in them (and that is the magnitude of eigenvalues). If component scores are not standardized (therefore they contain the data variance) then loadings must be unit-scaled ("normalized"), and these weights are called eigenvectors; they are the cosines of orthogonal rotation of variables into principal components or back. PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a projection of this object when viewed from its most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction, by projecting each data point onto only the first few principal components. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction that maximizes the variance of the projected data and is orthogonal to the first i − 1 principal components. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
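The variance-maximization definition above, and the step from it to the eigenvector statement, can be written out as a short derivation. The notation is an assumption introduced here: $x_n$ are the mean-centered data points, $\Sigma$ is their sample covariance matrix, and $w_i$ is the unit vector along the $i$-th principal component. The last line shows the stationarity condition for the first component only; later components follow analogously under the added orthogonality constraints.

```latex
\begin{align}
\Sigma &= \frac{1}{N-1}\sum_{n=1}^{N} x_n x_n^{\top},
  \qquad \sum_{n=1}^{N} x_n = 0 \\
w_1 &= \arg\max_{\|w\|=1} \; w^{\top}\Sigma\, w \\
w_i &= \arg\max_{\substack{\|w\|=1 \\ w \,\perp\, w_1,\dots,w_{i-1}}} \; w^{\top}\Sigma\, w \\
\nabla_w\!\left[\, w^{\top}\Sigma\, w - \lambda\,(w^{\top}w - 1) \right] = 0
  \;&\Longrightarrow\; \Sigma\, w = \lambda\, w
\end{align}
```

Here $w^{\top}\Sigma\,w$ is the variance of the data projected onto $w$. At any solution of $\Sigma w = \lambda w$ with $\|w\| = 1$ that variance equals $\lambda$, so the maximizer is an eigenvector belonging to the largest eigenvalue.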
 Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction, by projecting each data point onto only the first few principal components. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Principal component analysis (PCA) is the process of computing the directions (principal components) that align with most of the variation in a set of data points in a multidimensional space, and using some or all of these components to perform a change of basis on the data.Given a collection of points in two, three, or higher dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line, or equivalently that aligns with the largest possible amount of the data variance. The next bestfitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. The ordered collection of basis vectors are called principal components. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction i.e. by projecting each data point onto only the first few principal components. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The principal component can be taken as a direction orthogonal to the first principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvectorbased multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. The next best-fitting line can be similarly chosen from directions perpendicular to the first. Repeating this process yields an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared perpendicular distance from a point to the line. Similarly, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. This process defines an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. Similarly, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. This process defines an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Often, PCA refers to the process of computing the principal components and using some or all of them to perform a change of basis on the data. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional space and each i from 2 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. These directions comprise an orthogonal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Usually, PCA refers to the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional space and each i from 2 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. These directions comprise an orthonormal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Usually, PCA refers to the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses. PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional space and each i from 2 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. These directions comprise an orthonormal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called Principal Components, and several related procedures are called Principal Component Analysis (PCA). Usually, PCA refers to the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional space and each i from 2 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. These directions comprise an orthonormal basis in which different individual dimensions of the data are uncorrelated. These basis vectors are called principal components, and several related procedures are called principal component analysis (PCA). Usually, PCA refers to the process of computing the principal components and using them to perform a change of basis on the data, sometimes only using the first few principal components and ignoring the rest. PCA is mostly used as a tool in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The i-th principal component can be taken as a direction orthogonal to the first i − 1 principal components that maximizes the variance of the projected data. From either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or, equivalently, minimizes the sum of squared residuals orthogonal to the direction; each subsequent principal component is a direction orthogonal to all preceding components that best fits the residual. Principal component analysis (PCA) is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional space and each i from 2 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. These directions form an orthonormal basis in which different individual dimensions of the data are uncorrelated. From either the maximum-variance or minimum-squared-residual objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or, equivalently, minimizes the sum of squared errors with respect to the vector; the subsequent principal components are calculated similarly and are orthogonal to the previous principal components. Principal component analysis (PCA) is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The principal components form an orthonormal basis in which different individual dimensions of the data are uncorrelated. From either the maximum-variance or minimum-squared-residual objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or, equivalently, minimizes the sum of squared errors with respect to the vector; the subsequent principal components are calculated similarly and are orthogonal to the previous principal components. Principal component analysis (PCA) is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The principal components form the orthonormal basis vectors that define the new space, in which the dimensions are uncorrelated. From either the maximum-variance or minimum-squared-residual objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or, equivalently, minimizes the total squared distance from the data points to the line along that vector; the subsequent principal components are calculated similarly and are orthogonal to the previous principal components. Principal component analysis (PCA) is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. The principal components form the orthonormal basis vectors that define the new space, in which the dimensions are uncorrelated. From either the maximum-variance or minimum-squared-distance objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or, equivalently, minimizes the sum of squared residuals orthogonal to the direction (that is, the sum or average of squared distances from the line through the origin along the direction vector). Each subsequent principal component is a direction orthogonal to all preceding components that best fits the residual. Principal component analysis (PCA) is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional space and each i from 2 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. These directions form an orthonormal basis in which different individual dimensions of the data are uncorrelated. From either the maximum-variance or minimum-squared-residual objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
 The first principal component of a set of data points in a multidimensional space is the direction vector that best fits the data, in that it maximizes the variance of the projected data or, equivalently, minimizes the sum of squared residuals orthogonal to the direction (that is, minimizes the sum or average of squared distances from the line through the origin along the direction vector). Each subsequent principal component is a direction orthogonal to all preceding components that best fits the residual. Principal component analysis (PCA) is the process of finding or using such components. PCA is a statistical tool used in exploratory data analysis and in predictive modeling. It is commonly used for dimensionality reduction by projecting each data point onto only the first several principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible. Given a collection of points in two-, three-, or higher-dimensional space, a "best fitting" line can be defined as one that minimizes the average squared distance from a point to the line. For a collection of points in n-dimensional space and each i from 2 to n, a direction for the i-th best-fitting line can be chosen from directions perpendicular to the first i − 1 best-fitting lines. These directions form an orthonormal basis in which different individual dimensions of the data are uncorrelated. From either the maximum-variance or minimum-squared-residual objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, principal components are often computed using eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix. PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). 
CCA defines coordinate systems that optimally describe the cross-covariance between two datasets while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.
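 As an illustration of the two computational routes mentioned above (eigendecomposition of the data covariance matrix and singular value decomposition of the data matrix), the following is a minimal sketch in Python with NumPy. The function names pca_eig and pca_svd, the synthetic data, and the choice of the unbiased (n − 1) covariance normalization are assumptions made for this example and do not come from the source text.

```python
import numpy as np

def pca_eig(X, k):
    """Top-k principal components via eigendecomposition of the covariance matrix.

    pca_eig is a hypothetical helper name used only for this sketch.
    """
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # sample covariance (n - 1 normalization)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    return eigvals[order], eigvecs[:, order]

def pca_svd(X, k):
    """Top-k principal components via SVD of the centered data matrix.

    pca_svd is a hypothetical helper name used only for this sketch.
    """
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)      # singular values -> component variances
    return variances[:k], Vt[:k].T           # rows of Vt are the principal directions

# Synthetic, illustrative data: 200 points in 3 dimensions with unequal spread.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

vals_e, comps_e = pca_eig(X, 2)
vals_s, comps_s = pca_svd(X, 2)

# Dimensionality reduction: project the centered data onto the first two components.
scores = (X - X.mean(axis=0)) @ comps_e
```

 Under these assumptions the two routes should agree up to the sign of each component, and the projected coordinates in scores should be uncorrelated with one another, which matches the description of the principal components as an orthonormal basis in which the dimensions of the data are uncorrelated.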
