Data Set Stacking

This feature is available in Anything > Data > Data Set > Stack.

This feature stacks a data set in the Displayr cloud drive. The data set to be stacked needs to be uploaded to the cloud drive (accessed via the user icon button > Displayr cloud drive). The stacked data set is also written to the cloud drive.

Stacking is easiest to specify with common labels, which are words in variable labels that identify which variables to stack together. It is often possible to stack an entire data set with one or more sets of common labels. Common labels can be manually specified, automatically deduced from the input data set, or deduced from a specified set of variables.
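The idea of grouping variables by common labels can be sketched in Python as follows. This is only an illustration, not Displayr's actual algorithm: the function `group_by_common_labels`, the example variable names and labels, and the matching rule (a variable joins a group when its label contains a common label, and variables in a group share the text before and after it) are all assumptions made for this sketch.

```python
def group_by_common_labels(variable_labels, common_labels):
    """Group variable names whose labels differ only by a common label.

    variable_labels: dict mapping variable name -> variable label.
    common_labels: list of words (hypothetical example: brand names).
    Returns one list of variable names per group, ordered as common_labels.
    """
    groups = {}
    for name, label in variable_labels.items():
        for common in common_labels:
            if common in label:
                # Text before and after the common label identifies the group.
                prefix, _, suffix = label.partition(common)
                groups.setdefault((prefix, suffix), {})[common] = name
                break
    # Keep only groups that have a variable for every common label.
    return [
        [found[c] for c in common_labels]
        for found in groups.values()
        if all(c in found for c in common_labels)
    ]


# Hypothetical labels: two questions asked about two brands.
labels = {
    "Q1_A": "Awareness: Coke",
    "Q1_B": "Awareness: Pepsi",
    "Q2_A": "Liking: Coke",
    "Q2_B": "Liking: Pepsi",
    "Age": "Age",
}
print(group_by_common_labels(labels, ["Coke", "Pepsi"]))
# [['Q1_A', 'Q1_B'], ['Q2_A', 'Q2_B']]
```

In this sketch each inner list would become one stacked variable, and Age is left unstacked because its label contains no common label.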

Variables that cannot be stacked using common labels can be stacked manually, either by specifying the names of the set of variables to be stacked, or by specifying the names of the variables in each stacking observation. A set of consecutive variables can be specified using a range consisting of the names of the first and last variables separated by a dash (-). For example, variables Q1_A, Q1_B, Q1_C, Q1_D can be specified as Q1_A-Q1_D. A set of variables with common prefixes and/or suffixes can be specified using a wildcard character (*). For example, variables Q1_A, Q1_B, Q1_C, Q1_D can be specified as Q1_*.
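To make the range and wildcard notation concrete, here is a minimal Python sketch of how such a specification might be expanded against the variable names of a data set. The function `expand_spec` is hypothetical and is not part of Displayr; it assumes that ranges follow the order of variables in the data set and that a name containing a dash is treated as a plain name when it exists in the data set.

```python
from fnmatch import fnmatchcase


def expand_spec(spec, variable_names):
    """Expand a comma-separated spec with ranges (-) and wildcards (*)."""
    selected = []
    for token in (t.strip() for t in spec.split(",")):
        if not token:
            continue
        if "*" in token:
            # Wildcard: keep every name matching the pattern, in data set order.
            # Note: fnmatchcase also treats ? and [ ] as special characters.
            matches = [v for v in variable_names if fnmatchcase(v, token)]
        elif token in variable_names:
            matches = [token]
        elif "-" in token:
            # Range: first and last names, inclusive, in data set order.
            first, last = (part.strip() for part in token.split("-", 1))
            start = variable_names.index(first)
            end = variable_names.index(last)
            matches = variable_names[start:end + 1]
        else:
            raise ValueError(f"Unknown variable: {token}")
        for name in matches:
            if name not in selected:
                selected.append(name)
    return selected


names = ["ID", "Q1_A", "Q1_B", "Q1_C", "Q1_D", "Q2"]
print(expand_spec("Q1_A-Q1_D", names))  # ['Q1_A', 'Q1_B', 'Q1_C', 'Q1_D']
print(expand_spec("Q1_*", names))       # ['Q1_A', 'Q1_B', 'Q1_C', 'Q1_D']
```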

Example

The output below shows the variables of a data set that have been stacked using common labels (blue) and stacked manually (pink):

Options

Input data set The name of the SPSS .sav data file in the Displayr cloud drive that is to be stacked.

Stacked data set The name of the stacked SPSS data file to be saved to the Displayr cloud drive. This is optional; if no name is supplied, one is generated from the input data file name.

Stack with common labels A choice between Automatically, Using a set of variables to stack as reference, Using manually input common labels and Disabled. If Automatically is chosen, a set of common labels is automatically chosen based on the variable labels in the input data set and variables with these common labels are stacked together. For Using a set of variables to stack as reference, see option Reference variables to stack below. For Using manually input common labels, see option Common label below. If Disabled is chosen, no stacking is performed using common labels.

Reference variables to stack These are text input controls shown when Using a set of variables to stack as reference is selected for Stack with common labels. Each text input should contain the comma-separated names of the reference variables to be used to determine a set of common labels which are used for stacking. Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*). Multiple sets of reference variables can be specified for multiple sets of common labels.

Common label These are text input controls shown when Using manually input common labels is selected for Stack with common labels. These should contain the common labels to be used for stacking. Multiple sets of common labels can be specified.

Manually specify stacking by A choice between Variable (see Manually stacked variable below) and Observation (see Manual stacking observation below). Both methods can describe the same stacking, but depending on how the variables to be stacked are named and ordered, one method can be much easier to specify than the other.

Manually stacked variable These are text input controls shown when Variable is selected for Manually specify stacking by. Each text input should contain the comma-separated names of the variables to be stacked together into one variable. Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*).

Manual stacking observation These are text input controls shown when Observation is selected for Manually specify stacking by. Each text input should contain the comma-separated names of the variables to be stacked in an observation. Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*).
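The relationship between the two manual methods can be illustrated with a short Python sketch. It assumes, as an interpretation of the descriptions above, that specifying by Variable lists the source variables of each stacked variable, while specifying by Observation lists the source variables that make up each observation, so one layout is the transpose of the other. The variable names are hypothetical.

```python
# By Variable: one entry per stacked variable.
by_variable = [
    ["Q1_A", "Q1_B", "Q1_C"],  # stacked into the first stacked variable
    ["Q2_A", "Q2_B", "Q2_C"],  # stacked into the second stacked variable
]

# By Observation: one entry per observation (the transpose of the above).
by_observation = [list(obs) for obs in zip(*by_variable)]
print(by_observation)
# [['Q1_A', 'Q2_A'], ['Q1_B', 'Q2_B'], ['Q1_C', 'Q2_C']]
```

Under this reading, specifying by Variable suits cases where each stacked variable's sources sit next to each other (so a range such as Q1_A-Q1_C works), and specifying by Observation suits data sets where the variables are ordered observation by observation.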

Non-stacked variables to include These are text input controls which should contain the names of the non-stacked variables to be included in the final output (they would otherwise be excluded). Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*).

Include original case variable in stacked data set Whether to include a variable containing the original case numbers.

Include observation variable in stacked data set Whether to include a variable containing the observation numbers.
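The effect of stacking, including the original case and observation variables and any non-stacked variables that are kept, can be sketched in Python as follows. This is an illustration of the general wide-to-long transformation, not Displayr's implementation; the function `stack_rows` and the output variable names `original_case` and `observation` are assumptions made for this example.

```python
def stack_rows(rows, stacked_groups, keep=()):
    """Stack wide rows into one row per case and observation.

    rows: list of dicts, one per original case.
    stacked_groups: dict mapping stacked variable name -> source variable
        names, one per observation (all lists the same length).
    keep: names of non-stacked variables to carry into the output.
    """
    n_observations = len(next(iter(stacked_groups.values())))
    stacked = []
    for case_number, row in enumerate(rows, start=1):
        for obs in range(n_observations):
            new_row = {"original_case": case_number, "observation": obs + 1}
            for name in keep:
                new_row[name] = row[name]  # repeated for every observation
            for stacked_name, sources in stacked_groups.items():
                new_row[stacked_name] = row[sources[obs]]
            stacked.append(new_row)
    return stacked


rows = [
    {"ID": 101, "Q1_A": 5, "Q1_B": 3, "Q2_A": 1, "Q2_B": 0},
    {"ID": 102, "Q1_A": 4, "Q1_B": 2, "Q2_A": 0, "Q2_B": 1},
]
groups = {"Q1": ["Q1_A", "Q1_B"], "Q2": ["Q2_A", "Q2_B"]}
for stacked_row in stack_rows(rows, groups, keep=["ID"]):
    print(stacked_row)
```

Each of the two original cases yields two rows in the output, one per observation, and the non-stacked variable ID is repeated on each of them.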

Automatic updating Whether to automatically update the stacked data set. This is useful when the input data set is regularly updated.

Update period The time unit for regular updates. Shown when Automatic updating is selected.

Frequency The multiple of the Update period for regular updating. Shown when Automatic updating is selected.

Start date and time The date and time of the first update in the format dd-mm-yyyy hh:mm or mm-dd-yyyy hh:mm. Shown when Automatic updating is selected.

US date format Whether the Start date and time is expressed in US format i.e. mm-dd-yyyy hh:mm. Shown when Automatic updating is selected.

Time zone An optional time zone for the Start date and time. If none is supplied, UTC is used. The format must be Continent/City, e.g. America/Los_Angeles. See Wikipedia for a list of time zones. Shown when Automatic updating is selected.
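How the Start date and time, US date format and Time zone options interact can be sketched in Python as follows. The function `parse_start` is hypothetical and only mirrors the formats described above; it does not show how Displayr parses these inputs. It uses the standard zoneinfo module, which relies on time zone data being available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def parse_start(text, us_format=False, time_zone=None):
    """Parse 'dd-mm-yyyy hh:mm' (or 'mm-dd-yyyy hh:mm' if us_format)."""
    fmt = "%m-%d-%Y %H:%M" if us_format else "%d-%m-%Y %H:%M"
    # Fall back to UTC when no time zone is given.
    tz = ZoneInfo(time_zone) if time_zone else timezone.utc
    return datetime.strptime(text, fmt).replace(tzinfo=tz)


# The same text means different dates depending on the US date format option.
print(parse_start("03-04-2024 09:30").date())                  # 2024-04-03
print(parse_start("03-04-2024 09:30", us_format=True).date())  # 2024-03-04
```

A time zone would be supplied as, for example, `parse_start("03-04-2024 09:30", time_zone="America/Los_Angeles")`.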