Transformations

Data transformation is the process of changing the data in some way. More formally, a transformation involves creating a new variable or set of variables from an existing variable or set of variables.

Objectives of transformation

Data transformation is undertaken with the following objectives:

Making it easier to see patterns in the data (e.g., Log transformations and Principal Components Analysis).
Making it easier to communicate patterns in the data (e.g., the Net Promoter Score).
Addressing violations of the assumptions of statistical tests (e.g., Ranks, Log transformations).
Improving the validity of regression models (e.g., Basis Functions).
Reducing the amount of data (e.g., Principal Components Analysis).
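
As an illustration of a transformation that exists mainly to communicate a pattern, the Net Promoter Score collapses a set of 0 to 10 likelihood-to-recommend ratings into a single number: the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 to 6). The following is a minimal sketch; the function name and the sample ratings are made up for illustration.

```python
def net_promoter_score(ratings):
    """Return the NPS for a list of 0-10 likelihood-to-recommend ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 to 6
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical sample: 4 promoters, 3 detractors, 3 passives
ratings = [10, 9, 8, 7, 6, 3, 10, 9, 5, 8]
nps = net_promoter_score(ratings)
print(nps)
```

Ten ratings are reduced to one figure, which is easier to report than the full distribution but discards most of the detail in it.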

Standard transformations of a categorical variable

A categorical variable can be transformed in one of two ways:

It can be turned into a numeric variable, by coming up with some rules about the numeric interpretation of categories. For example:
- Replacing the category 18 to 24 with 21 and 25 to 29 with 27 (this is a type of Recoding known as Midpoint Recoding).
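
Midpoint recoding can be sketched as a simple lookup from category label to number. The two age bands and their midpoints come from the example above; the dictionary name, the function name, and the choice to return `None` for unmapped categories are assumptions made for illustration.

```python
# Category label -> numeric midpoint (bands taken from the example above)
AGE_MIDPOINTS = {"18 to 24": 21, "25 to 29": 27}


def midpoint_recode(categories, midpoints):
    """Replace each category label with its midpoint; unmapped labels become None."""
    return [midpoints.get(c) for c in categories]


ages = ["18 to 24", "25 to 29", "18 to 24"]
recoded = midpoint_recode(ages, AGE_MIDPOINTS)
print(recoded)
```

Once recoded, the variable can be averaged or used as a numeric predictor, on the assumption that each midpoint is a reasonable stand-in for everyone in its band.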

The categories of a categorical variable can be combined. Most commonly, small categories are merged into larger categories. For example:
- When a question asks for reasons for a particular behavior, any reasons that are selected by a small number of respondents can be classified as Other.
- Variables that collect data on Rating Scales may be converted to Binary Variables to make further analysis simpler.
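
Both ways of combining categories can be sketched in a few lines. The function names, the `min_count` cutoff for what counts as a small category, and the rating threshold are hypothetical choices for illustration; in practice they depend on the survey and the analysis.

```python
from collections import Counter


def merge_small_categories(values, min_count, other_label="Other"):
    """Relabel any category chosen fewer than min_count times as other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


def to_binary(ratings, threshold):
    """Convert rating-scale values to 1 (at or above threshold) or 0 (below)."""
    return [1 if r >= threshold else 0 for r in ratings]


# Hypothetical reasons given by seven respondents
reasons = ["Price", "Price", "Quality", "Price", "Gift", "Quality", "Habit"]
merged = merge_small_categories(reasons, min_count=2)
print(merged)

# Hypothetical 5-point satisfaction ratings; 4 and 5 are treated as "satisfied"
satisfaction = [1, 4, 5, 3, 2, 5]
binary = to_binary(satisfaction, threshold=4)
print(binary)
```

Here Gift and Habit were each selected once, so they are merged into Other, while the ratings are reduced to two categories.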

Standard transformations of numeric variables

Univariate

- Ranks
- Log transformations
- Trimming
- Winsorizing

Multivariate

- Principal Components Analysis
- Cluster Analysis
- t-SNE
- Basis functions, such as:
  - Dummy Variables
  - Polynomials
  - Orthogonal polynomials

See also

- Coding
- Recoding
