N-Way Tabulation
This view classifies the observations in the current sample into cells defined by the series in the group. You can display the cell counts in various forms and examine statistics for independence among the series in the group. Selecting this view opens the tabulation dialog.
Many of the settings will be familiar from our discussion of one-way tabulation in “One-Way Tabulation”.
Group into Bins If
If one or more of the series in the group is continuous and takes many distinct values, the number of cells becomes excessively large. This option provides two ways to automatically bin the values of the series into subgroups, along with a limit on the number of bins.
• Number of values option bins the series if the series takes more than the specified number of distinct values.
• Average count option bins the series if the average count for each distinct value of the series is less than the specified number.
• Maximum number of bins specifies the approximate maximum number of subgroups into which the series is binned. The number of bins may be chosen to be smaller than this number in order to make the bins approximately the same size.
The default setting is to bin a series into approximately 5 subgroups if the series takes more than 100 distinct values or if the average count is less than 2. If you do not want to bin the series, unmark both options.
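The default rule above can be sketched in a few lines of Python. This is only an illustration of the decision described in the text, not EViews code; the function name `should_bin` and its parameter names are hypothetical, and the sketch does not attempt to reproduce how EViews chooses the bin boundaries.

```python
def should_bin(values, max_distinct=100, min_avg_count=2):
    """Decide whether a series would be binned under the rule described above.

    Pass None for a threshold to mimic unmarking that option.
    """
    distinct = len(set(values))
    if distinct == 0:
        return False
    # "Number of values": more distinct values than the threshold.
    too_many_values = max_distinct is not None and distinct > max_distinct
    # "Average count": average observations per distinct value below the threshold.
    too_sparse = min_avg_count is not None and len(values) / distinct < min_avg_count
    return too_many_values or too_sparse


binary_series = [0, 1] * 50          # 2 distinct values, 50 observations each
continuous_series = [x / 7 for x in range(150)]  # 150 distinct values

print(should_bin(binary_series))      # not binned
print(should_bin(continuous_series))  # binned
```

Passing `None` for both thresholds corresponds to unmarking both options, in which case no series is binned.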
NA Handling
By default, EViews drops observations from the contingency table where any of the series in the group has a missing value. The Treat NA as category option includes all observations and counts NAs in the contingency table as an explicit category.
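The difference between the two NA treatments can be illustrated with a small Python sketch. It assumes each observation is a tuple with `None` marking a missing value; the function `cell_counts` and the `"NA"` label are hypothetical names chosen for this example.

```python
from collections import Counter


def cell_counts(rows, na_as_category=False):
    """Count observations per cell; None marks a missing value."""
    counts = Counter()
    for row in rows:
        if any(v is None for v in row):
            if not na_as_category:
                continue  # default behavior: drop the whole observation
            # otherwise keep the observation and label the missing value
            row = tuple("NA" if v is None else v for v in row)
        counts[tuple(row)] += 1
    return counts


rows = [(0, 1), (0, 1), (1, None), (None, 0), (1, 1)]
dropped = cell_counts(rows)                    # observations with NAs removed
kept = cell_counts(rows, na_as_category=True)  # NAs form their own category
print(sum(dropped.values()), sum(kept.values()))
```

With the default, only complete observations contribute to the table; with the option set, every observation is counted and the table gains an extra category for each series that has missing values.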
Layout
This option controls the display style of the tabulation. The Table mode displays the categories of the first two series in tables for each category of the remaining series in the group. The List mode displays the table in a more compact, hierarchical form. The Sparse Labels option omits repeated category labels to make the list less cluttered. Note that some of the conditional statistics are not displayed in list mode.
Output
To understand the options for output, consider a group with three series. Let (i, j, k) index the bin of the first, second, and third series, respectively. The number of observations in the (i, j, k)-th cell is denoted as $n_{ijk}$, with a total of $N = \sum_i \sum_j \sum_k n_{ijk}$ observations.
• Overall% is the percentage of the total number of observations accounted for by the cell count.
• Table% is the percentage of the total number of observations in the conditional table accounted for by the cell count.
• Row% is the percentage of the number of observations in the row accounted for by the cell count.
• Column% is the percentage of the number of observations in the column accounted for by the cell count.
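The four percentages can be computed from a dictionary of cell counts, as in the following sketch. It assumes that the first series forms the rows, the second the columns, and the third indexes the conditional tables; the function name and the example counts are made up for illustration.

```python
def cell_percentages(n, cell):
    """Return Overall%, Table%, Row%, and Column% for one (i, j, k) cell."""
    i, j, k = cell
    count = n.get(cell, 0)
    total = sum(n.values())
    # Totals within the conditional table for level k of the third series.
    table_total = sum(c for (_, _, kk), c in n.items() if kk == k)
    row_total = sum(c for (ii, _, kk), c in n.items() if ii == i and kk == k)
    col_total = sum(c for (_, jj, kk), c in n.items() if jj == j and kk == k)
    return {
        "Overall%": 100 * count / total,
        "Table%": 100 * count / table_total,
        "Row%": 100 * count / row_total,
        "Column%": 100 * count / col_total,
    }


# Hypothetical 2 x 2 x 2 cell counts keyed by (i, j, k).
n = {
    (0, 0, 0): 10, (0, 1, 0): 20,
    (1, 0, 0): 30, (1, 1, 0): 40,
    (0, 0, 1): 5, (0, 1, 1): 15,
    (1, 0, 1): 25, (1, 1, 1): 55,
}
pct = cell_percentages(n, (0, 0, 0))
print(pct)
```

Only Overall% uses the grand total; the other three are computed within the conditional table that contains the cell.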
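The overall expected count in the (i, j, k)-th cell is the number expected if all series in the group were independent of each other. This expectation is estimated by the product of the three marginal proportions, scaled by the total number of observations:
$$\hat{n}_{ijk} = \left(\sum_{j',k'} \frac{n_{ij'k'}}{N}\right) \left(\sum_{i',k'} \frac{n_{i'jk'}}{N}\right) \left(\sum_{i',j'} \frac{n_{i'j'k}}{N}\right) N \tag{12.21}$$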
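As a rough illustration, the overall expected counts can be computed from the marginal totals of each series. The sketch below assumes the usual estimate under mutual independence (the product of the three marginal proportions times the total count); the function name and the example counts are hypothetical.

```python
def overall_expected(n):
    """Expected count for every (i, j, k) cell under mutual independence."""
    total = sum(n.values())
    first, second, third = {}, {}, {}
    for (i, j, k), c in n.items():
        first[i] = first.get(i, 0) + c    # marginal total of the first series
        second[j] = second.get(j, 0) + c  # marginal total of the second series
        third[k] = third.get(k, 0) + c    # marginal total of the third series
    return {
        (i, j, k): first[i] * second[j] * third[k] / total ** 2
        for i in first for j in second for k in third
    }


# Hypothetical 2 x 2 x 2 cell counts keyed by (i, j, k).
n = {
    (0, 0, 0): 10, (0, 1, 0): 20,
    (1, 0, 0): 30, (1, 1, 0): 40,
    (0, 0, 1): 5, (0, 1, 1): 15,
    (1, 0, 1): 25, (1, 1, 1): 55,
}
expected = overall_expected(n)
print(expected[(0, 0, 0)])
```

The expected counts always sum to the total number of observations, which is a convenient sanity check.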
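The table expected count $\tilde{n}_{ijk}$ is estimated by computing the expected count for the conditional table. For a given table $k^*$, this expected value is estimated by:
$$\tilde{n}_{ijk^*} = \left(\sum_{j'} \frac{n_{ij'k^*}}{N_{k^*}}\right) \left(\sum_{i'} \frac{n_{i'jk^*}}{N_{k^*}}\right) N_{k^*} \tag{12.22}$$
where $N_{k^*}$ is the total number of observations in the $k^*$ table.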
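The table expected count uses only the observations in one conditional table. A minimal sketch, assuming the third series indexes the conditional tables and that the expected count is the row total times the column total divided by the table total (the function name and counts are hypothetical):

```python
def table_expected(n, k):
    """Expected counts for the two-way table at level k of the third series."""
    rows, cols = {}, {}
    for (i, j, kk), c in n.items():
        if kk == k:
            rows[i] = rows.get(i, 0) + c  # row totals within the table
            cols[j] = cols.get(j, 0) + c  # column totals within the table
    table_total = sum(rows.values())
    return {(i, j): rows[i] * cols[j] / table_total for i in rows for j in cols}


# Hypothetical 2 x 2 x 2 cell counts keyed by (i, j, k).
n = {
    (0, 0, 0): 10, (0, 1, 0): 20,
    (1, 0, 0): 30, (1, 1, 0): 40,
    (0, 0, 1): 5, (0, 1, 1): 15,
    (1, 0, 1): 25, (1, 1, 1): 55,
}
expected_k0 = table_expected(n, 0)
expected_k1 = table_expected(n, 1)
print(expected_k0[(0, 0)], expected_k1[(0, 0)])
```

Unlike the overall expected count, this value depends only on the margins of the conditional table, so it generally differs from table to table.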
Chi-square Tests
If you select the Chi-square tests option, EViews reports $\chi^2$ statistics for testing the independence of the series in the group. The test statistics are based on the distance between the actual cell count and the count expected under independence.
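• Overall independence. EViews reports the following two test statistics for overall independence among all series in the group:
$$\text{Pearson } \chi^2 = \sum_{i,j,k} \frac{(\hat{n}_{ijk} - n_{ijk})^2}{\hat{n}_{ijk}}, \qquad \text{Likelihood ratio} = 2 \sum_{i,j,k} n_{ijk} \log\left(\frac{n_{ijk}}{\hat{n}_{ijk}}\right) \tag{12.23}$$
where $n_{ijk}$ and $\hat{n}_{ijk}$ are the actual and overall expected count in each cell. Under the null hypothesis of independence, the two statistics are asymptotically distributed $\chi^2$ with $IJK - (I-1) - (J-1) - (K-1) - 1$ degrees of freedom, where $I$, $J$, and $K$ are the numbers of categories for each series.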
These test statistics are reported at the top of the contingency table, as in the top portion of the tabulation output for the group containing LWAGE, UNION, and MARRIED in the workfile “Cps88.WF1”.
The three series LWAGE, UNION, and MARRIED have $I$, $J$, and $K$ categories, respectively. Note the WARNING message: if there are many cells with expected value less than 5, the small sample distribution of the test statistic under the null hypothesis may deviate considerably from the asymptotic $\chi^2$ distribution.
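The two overall test statistics and their degrees of freedom can be sketched as follows. This is an illustrative implementation of the standard Pearson and likelihood ratio statistics for three series, not EViews code; it assumes every category of every series has at least one observation, treats empty cells as contributing zero to the likelihood ratio, and uses a hypothetical function name and made-up counts.

```python
import math


def independence_tests(n):
    """Return (Pearson statistic, likelihood ratio statistic, degrees of freedom)."""
    total = sum(n.values())
    margins = [{}, {}, {}]  # marginal totals of the first, second, third series
    for cell, c in n.items():
        for axis, level in enumerate(cell):
            margins[axis][level] = margins[axis].get(level, 0) + c
    pearson = 0.0
    lr = 0.0
    for i, n_i in margins[0].items():
        for j, n_j in margins[1].items():
            for k, n_k in margins[2].items():
                expected = n_i * n_j * n_k / total ** 2
                actual = n.get((i, j, k), 0)
                pearson += (actual - expected) ** 2 / expected
                if actual > 0:  # an empty cell adds nothing to the LR sum
                    lr += 2 * actual * math.log(actual / expected)
    I, J, K = (len(m) for m in margins)
    df = I * J * K - (I - 1) - (J - 1) - (K - 1) - 1
    return pearson, lr, df


# Counts built as products of marginal weights, so the series are exactly independent.
a, b, c = (1, 2), (1, 3), (2, 5)
independent = {(i, j, k): a[i] * b[j] * c[k]
               for i in range(2) for j in range(2) for k in range(2)}
pearson, lr, df = independence_tests(independent)
print(pearson, lr, df)
```

For counts that factor exactly into marginal weights, both statistics are zero up to rounding; any departure from independence makes them positive.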
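• Measures of association. If you display in table mode, EViews presents measures of association for each conditional table. These measures are analogous to the correlation coefficient; the larger the measure, the stronger the association between the row series and the column series in the table. In addition to the Pearson $\chi^2$ for the table, the following three measures of association are reported:
$$\text{Phi coefficient} = \sqrt{\tilde{\chi}^2 / \tilde{N}} \tag{12.24}$$
$$\text{Cramér's } V = \sqrt{\tilde{\chi}^2 / \left((\min\{r, c\} - 1)\,\tilde{N}\right)} \tag{12.25}$$
$$\text{Contingency coefficient} = \sqrt{\tilde{\chi}^2 / (\tilde{\chi}^2 + \tilde{N})} \tag{12.26}$$
where $\tilde{\chi}^2$ is the Pearson statistic for the table, $\min\{r, c\}$ is the smaller of the number of row categories $r$ or column categories $c$ of the table, and $\tilde{N}$ is the number of observations in the table. Cramér's V and the contingency coefficient are bounded between 0 and 1, as is the phi coefficient when the table has only two rows or two columns; for larger tables the phi coefficient can exceed 1. In every case, a higher number indicates a stronger relation between the two series in the table. While the correlation coefficient only measures the linear association between two series, these nonparametric measures are robust to departures from linearity.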
Bear in mind that these measures of association are computed for each two-way table. The conditional tables are presented at the top, and the unconditional tables are reported at the bottom of the view.
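For a single two-way table, the three measures of association follow directly from the Pearson statistic of that table. The sketch below uses the standard definitions of the phi coefficient, Cramér's V, and the contingency coefficient; the function name and the example tables are hypothetical, and every row and column is assumed to contain at least one observation.

```python
import math


def association_measures(table):
    """table is a list of rows of counts; returns (phi, Cramer's V, contingency)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Pearson statistic for this two-way table.
    chi2 = 0.0
    for r, row in enumerate(table):
        for c, actual in enumerate(row):
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (actual - expected) ** 2 / expected
    smaller_dim = min(len(row_totals), len(col_totals))
    phi = math.sqrt(chi2 / total)
    cramers_v = math.sqrt(chi2 / ((smaller_dim - 1) * total))
    contingency = math.sqrt(chi2 / (chi2 + total))
    return phi, cramers_v, contingency


perfect = [[10, 0], [0, 10]]    # row category fully determines column category
unrelated = [[10, 20], [30, 60]]  # rows are proportional, so no association
print(association_measures(perfect))
print(association_measures(unrelated))
```

In a table with n-way data, the same function would be applied once per conditional table and once to each unconditional two-way table.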