The Advanced tab of the Categorized Means with Error Plots dialog contains options for Means with Error Plots in addition to those on the Categorized Means with Error Plots - Quick tab. Use the options on this tab to specify the variables and select the type of graph you want to create. More options are available for computing the graph as well as for its display. Some of the options on this tab are used
Graph type. In the Graph type box, select the type of Means with Error Plot to be plotted (Whiskers or High-Low Close).
Layout. Select the type of layout for the graph(s).
Separate. Select this option button to produce a Separate plot layout (where each subset of cases is displayed in a separate graph) for the categorized plots.
Overlaid. Select this option button to produce an Overlaid plot layout (where all subsets are overlaid in one graph and identified by patterns and colors) for the categorized plots.
Variables. Click the Variables button to display a standard variable selection dialog in which you can select the Dependent variable, the Grouping variable, and the X- and (optional) Y-Category variables for creating the graph. If more than one dependent variable is selected, then a sequence of graphs (one for each dependent variable) will be produced using the same set of grouping variables. The selection that you make will then be displayed in the area of the dialog below the Variables button.
The dependent variable values will be used in calculating the respective statistics that define the components of the graph (e.g., means, medians, standard deviations, etc.), while the grouping variable will be used to categorize the data, using the method of categorization as selected via the options in the Grouping intervals group. Note that the selected grouping variables do not have to be categorical variables (e.g., contain codes); you can use one of the methods of categorization to categorize continuous variables. The selection of grouping variables is not necessary if the categories are defined via the Multiple Subsets method in the X-Categories, Y-Categories, and Intervals group boxes.
X-Categories / Y-Categories. Categorization is used in two classes of graphs in STATISTICA: categorized graphs (e.g., Categorized Scatterplots) and graphs that include grouping or categorized variables (e.g., 2D Histograms, or 2D Box Plots).
Select Integer mode, Unique values, or Categories to specify the method of categorization for each of the variables selected via the Change Variable button, or use the Boundaries, Codes, or Multiple subsets options. For more information about each of these methods, see methods of categorization.
Intervals. Use the options in this group box to choose the method of categorization for the selected Grouping variable. Each of the methods is discussed in methods of categorization.
Graph icon. The graph icon in the lower section, left side of the Advanced tab represents the currently selected Graph type (Whiskers or High-Low Close) and the Middle Point options (see below). It also previews the selected Value (Conf. Interval, Non-outlier range, Min-max, or Constant) that will define the Mean with Error Plot that you are about to create as specified in the Whisker group box.
Middle point. The options in the Middle point group box are used to select the statistic that will be used as middle point in the Means with Error Plots.
Value. From the Value drop-down box, select the statistic (Mean or Median) that will be used to determine the center (middle) points in the plot (for each variable and group).
Style. Use the Style drop-down box to specify how the middle point should be represented in the Whiskers or High-Low Close plot. You can choose the selected middle point to appear as a line (select Line) or as a point (select Point).
Pooled variance. The Pooled variance check box is available when you select Mean as the Middle point Value. The setting of this check box determines how the standard deviations and standard errors (for the means) are computed from grouped data. When the Pooled variance check box is selected, STATISTICA computes the pooled within-group (category) variance for all groups (categories), and uses this value as an estimate of σ (Sigma) in computing the standard errors for the means (see, for example, Milliken and Johnson, 1984). Specifically, STATISTICA computes:
$$s_{\text{pooled}}^2 = \frac{1}{n-k}\left[s_1^2 (n_1 - 1) + \dots + s_k^2 (n_k - 1)\right]$$
In this equation, k is the number of groups in the plot, $s_i^2$ is the variance in the i'th category or group, $n_i$ is the number of valid observations in the i'th category or group, and n is the overall number of valid observations in the plot.
The standard error of the mean for the i'th group is then computed as:
$$\text{s.e.}(\text{mean}_i) = \frac{s_{\text{pooled}}}{\sqrt{n_i}}$$
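As an illustration only, the pooled variance and the resulting standard errors can be sketched in Python as follows. The function name pooled_standard_errors and the sample data are hypothetical and not part of STATISTICA; the sketch assumes each group holds at least two valid observations and that the group variances are ordinary sample variances (divisor n_i - 1).

```python
from math import sqrt
from statistics import variance


def pooled_standard_errors(groups):
    """Return (pooled variance, list of standard errors of the mean per group).

    Hypothetical helper: `groups` is a list of lists of valid observations.
    """
    k = len(groups)                      # number of groups in the plot
    n = sum(len(g) for g in groups)      # overall number of valid observations
    # Sum of s_i^2 * (n_i - 1) over all groups (sample variance, divisor n_i - 1).
    weighted = sum(variance(g) * (len(g) - 1) for g in groups)
    pooled_var = weighted / (n - k)
    s_pooled = sqrt(pooled_var)
    # Standard error of the mean for group i: s_pooled / sqrt(n_i).
    return pooled_var, [s_pooled / sqrt(len(g)) for g in groups]


# Illustrative data only.
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
pooled_var, errors = pooled_standard_errors(groups)
```

Note that with a pooled estimate, groups differ in their standard errors only through their sample sizes.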
Whisker. The options in the Whisker group box are used to select the options for computing the range of Whiskers or High-Low Close, i.e., to define the error ranges.
Value. Use the Value drop-down box to choose how the range of the Whiskers or High-Low Close is computed (Conf. Interval, Non-outlier range, Min-max, or Constant). If you select Conf. Interval, then the range will be displayed as the confidence interval around the mean value. If you select Non-outlier range, then STATISTICA determines which points in the data are outliers (see Outliers and Extremes), and then uses the highest and lowest data points that are closest to the outliers (but are not outliers) to determine the range in the plot. On the other hand, the Min-max option uses the minimum and maximum values of the data to determine the range, without considering whether or not these values are outliers. If you choose the Constant option, then the specified constant will be added to and subtracted from the chosen center point (mean or median) to define the range around that center point.
Probability/Coefficient. If you set the Value option (see above) to Conf. Interval, then you also need to specify a value between 0.15 and 0.99 in the Probability edit field. This value will be used to determine the length of the Whiskers or High-Low Close around the Mean value, based on the standard error for the respective means and the standard normal (z) value associated with the chosen probability. If you set Value to Non-outlier range or Min-max (see above), you also need to specify a value in the Coefficient edit field by which the selected Value will be multiplied to determine the range. If you set Value to Constant, the value of the Coefficient itself determines the range (no multiplier is used). By default, the value of the Coefficient is 1.
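The following Python sketch illustrates how three of the range types described above could be computed. It is not STATISTICA's implementation: the function name whisker_range and its parameters are hypothetical, the confidence interval is assumed to be two-sided, and the standard error defaults to the unpooled sample standard deviation divided by the square root of n. The Non-outlier range and the effect of the Coefficient on Min-max are left out because they depend on definitions not given here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def whisker_range(values, kind, probability=0.95, coefficient=1.0, std_error=None):
    """Return (low, high) for a hypothetical whisker around the mean of `values`."""
    center = mean(values)
    if kind == "conf_interval":
        if std_error is None:
            # Assumption: unpooled standard error of the mean.
            std_error = stdev(values) / sqrt(len(values))
        # Assumption: two-sided interval, so the z value leaves
        # (1 - probability) / 2 in each tail of the standard normal.
        z = NormalDist().inv_cdf(0.5 + probability / 2)
        return center - z * std_error, center + z * std_error
    if kind == "constant":
        # The coefficient itself is added to and subtracted from the center.
        return center - coefficient, center + coefficient
    if kind == "min_max":
        # Raw minimum and maximum, ignoring whether they are outliers.
        return min(values), max(values)
    raise ValueError(f"unsupported range type: {kind}")


# Illustrative data only.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
ci_low, ci_high = whisker_range(data, "conf_interval", probability=0.95, std_error=1.0)
const_range = whisker_range(data, "constant", coefficient=2.0)
minmax_range = whisker_range(data, "min_max")
```

A pooled standard error, when the Pooled variance check box is selected, could be passed in through the std_error argument.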
Connect middle points. Select the Connect middle points check box to connect the selected middle points (Means, Medians, trimmed Means, or trimmed Medians) of the Whiskers or High-Low Close.
Display raw data. Select this check box to display the raw data points.
Jitter. Use the options in this group box to jitter the data points, i.e., to shift each data point away from its original position at the center of the graph so that overlapping points are easier to identify and brush.
Off. If you select Off, no jitter is applied to the raw data points, outliers, and extremes.
Sequential. If you select Sequential, the jitter is applied sequentially to the raw data points, outliers, and extremes. The jitter is applied such that the first case in the data set is maximally shifted to the left and the last case is shifted maximally to the right.
Random. If you select Random, each data point is randomly shifted within the available range.
Width. With this option, you can specify the maximum jitter width defined as percentage of box width. Possible percentages range from 0 to 250.
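A minimal Python sketch of the three jitter modes is shown below. The function name jitter_offsets and its parameters are hypothetical; the sketch assumes that the jitter width is the total horizontal span centered on the point's original position, that sequential jitter spaces the cases evenly across that span, and that random jitter is drawn uniformly. STATISTICA's exact spacing and random distribution are not specified here.

```python
import random


def jitter_offsets(n_cases, mode, width_percent, box_width=1.0, seed=None):
    """Return a horizontal offset for each of `n_cases` data points.

    Hypothetical helper: `width_percent` is the jitter width as a percentage
    of `box_width` (0 to 250 in the dialog).
    """
    half = box_width * width_percent / 100.0 / 2.0
    if mode == "off" or n_cases == 0:
        return [0.0] * n_cases
    if mode == "sequential":
        if n_cases == 1:
            return [0.0]
        # First case is shifted maximally left, last case maximally right;
        # even spacing in between is an assumption.
        step = 2.0 * half / (n_cases - 1)
        return [-half + i * step for i in range(n_cases)]
    if mode == "random":
        rng = random.Random(seed)
        # Assumption: uniform shift within the available range.
        return [rng.uniform(-half, half) for _ in range(n_cases)]
    raise ValueError(f"unsupported jitter mode: {mode}")


sequential = jitter_offsets(5, "sequential", width_percent=100)
randomized = jitter_offsets(5, "random", width_percent=100, seed=0)
none = jitter_offsets(5, "off", width_percent=100)
```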
Outliers. The Outliers group box is used to control the display of outliers and extremes. Select either Off, Outliers, Extreme, or
Coefficient. If you select Outliers, Extreme, or
Fit. You can fit an equation to the points in the plots by selecting one of the predefined functions in this dialog.
Trim distr. extremes. Use the Trim distr. extremes box to specify the percentage of cases to be "trimmed" from the extremes (i.e., tails) of the distributions of cases for the selected dependent variables. For example, if you specify 10%, then for a variable with 100 cases, STATISTICA removes the 10 cases with the lowest values and the 10 cases with the highest values for the respective variable from the graph, and uses only the 80 remaining ("middle") cases. If you enter a value for Trim distr. extremes for mean-based Means with Error Plots, then the so-called "trimmed means" will be used in the graph.
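The trimming described above can be sketched in Python as follows. The function name trim_extremes is hypothetical, and truncating the number of trimmed cases to a whole number is an assumption; how STATISTICA rounds when the percentage does not yield a whole number of cases is not stated here.

```python
from statistics import mean


def trim_extremes(values, percent):
    """Drop `percent` percent of the cases from each tail and return the rest.

    Hypothetical helper illustrating the 10%-of-100-cases example.
    """
    ordered = sorted(values)
    # Assumption: the number of cases cut from each tail is truncated
    # to a whole number.
    cut = int(len(ordered) * percent / 100)
    if 2 * cut >= len(ordered):
        raise ValueError("trimming would remove every case")
    return ordered[cut:len(ordered) - cut]


# 100 cases with values 1..100; trimming 10% from each tail keeps 11..90.
cases = list(range(1, 101))
kept = trim_extremes(cases, 10)
trimmed_mean = mean(kept)
```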