Prof. Candace Clark
Sociology 240, Social Statistics
We decide at which level we think a variable is measured by thinking
about its categories. We try to think of
how the categories are related to each other and what patterns we can
find. Sometimes the categories are
numbers, and sometimes they are words. Sometimes the categories
have an inherent order to them, and
sometimes they do not.
The categories of the variable:
| Level | are names | have an inherent order from more to less or higher to lower | are numbers with equal intervals between them | are numbers that have a theoretical zero point |
| Nominal level | X | | | |
| Ordinal level | X | X | | |
| Interval level | X | X | X | |
| Ratio level | X | X | X | X |
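The cumulative logic of the table can be sketched in a few lines of Python. This is only an illustration: the function name and its four yes/no arguments are hypothetical, chosen to mirror the table's columns.

```python
# Hypothetical helper: each argument answers one column of the table.
def level_of_measurement(are_names, ordered, equal_intervals, true_zero):
    """Return the highest level whose criterion, and every lower one, is met."""
    criteria = [are_names, ordered, equal_intervals, true_zero]
    levels = ["nominal", "ordinal", "interval", "ratio"]
    highest = None
    for met, level in zip(criteria, levels):
        if not met:
            break  # a higher level requires all of the lower criteria too
        highest = level
    return highest


# Names only: nominal.
print(level_of_measurement(True, False, False, False))
# Ordered with a zero point but unequal intervals: still only ordinal.
print(level_of_measurement(True, True, False, True))
```

The loop stops at the first criterion that fails, so a zero point cannot raise a variable's level unless the categories also have equal intervals.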
Almost any method of measuring attitudes results in ordinal-level variables,
even if the variables include only
two categories. For example, we could categorize the variable
"attitudes toward capital punishment" into those
who favor and those who oppose; and those who favor capital punishment
hold more favorable attitudes, while
those who oppose hold less favorable attitudes.
Statistics that allow us to analyze ordinal-level variables are different
from statistics for nominal-level variables
and for higher-level variables, as we will see later on. In order
to use these more powerful statistics, we might
try to reconceptualize (and rename) a variable so we can consider it
to be ordinal-level.
For example, if our job were to categorize the reasons for calls made
to 911 emergency dispatchers on a
particular day, we might come up with the following categories: noisy
neighbors, fender-bender, heart attack. One
way to think of these categories is as just names of problems, and
the name of the variable could be "Type of
Problem." In this case, we would be conceiving of our variable
as nominal-level. But another way to think of
the categories is to order them from least severe (noisy neighbors)
to most severe (heart attack). The name of
the variable, as it is being conceived this time, would be "Severity
of Problem," and it would be an ordinal-level
variable.
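The two ways of conceiving the 911 variable can be sketched in Python. The list of calls is made up for illustration, and the severity ranks are an assumed ordering, not measured quantities.

```python
from collections import Counter

# Hypothetical calls received on one day.
calls = ["heart attack", "noisy neighbors", "fender-bender", "noisy neighbors"]

# Nominal view ("Type of Problem"): the categories are just names,
# so all we can do is count how often each one occurs.
counts = Counter(calls)

# Ordinal view ("Severity of Problem"): the ranks record order only,
# from least severe (1) to most severe (3).
severity_rank = {"noisy neighbors": 1, "fender-bender": 2, "heart attack": 3}
ordered_calls = sorted(calls, key=severity_rank.get)
most_severe = max(calls, key=severity_rank.get)

print(counts)
print(ordered_calls)
print(most_severe)
```

Sorting and picking a maximum only make sense once we have decided to treat the categories as ordered.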
___/___/___/___/___/___/___/___/___/___/
0   1   2   3   4   5   6   7   8   9   10
Take "temperature in degrees," the best example of an interval-level
variable. Temperature is measured in
degrees, and the degrees are not words (cold, super-cold, warm, etc.),
but numbers corresponding to levels of
mercury in a thermometer. The distance, or interval, between
1 degree and 2 degrees is exactly equal to the
distance between 2 degrees and 3 degrees, and indeed, between 78 and
79 degrees or 99 and 100 degrees. The
intervals between any two adjacent categories are equal (exactly 1
degree). In addition, what makes
temperature in degrees an interval-level measure is that it does not
have what it takes to be a ratio-level
measure. It does not have a theoretical zero point. Actually,
a thermometer does have a zero, but the zero
does not indicate a lack or absence of the variable, temperature.
Zero indicates "cold." And one method of
measuring temperature (e.g., Fahrenheit) has a different spot for zero
than others (e.g., Celsius). These are
arbitrary zero points that are not intended to indicate a total lack
of temperature. (The Kelvin scale used in physics does place its zero
at absolute zero, but everyday Fahrenheit and Celsius thermometers do
not.) With no true zero point, temperature in degrees Fahrenheit or
Celsius must be considered only an interval-level variable.
Almost no variables used in social science are interval-level variables,
with the exception of time measured in
calendar years. The interval between the categories 1902 and
1903 is one year, the same as the interval
between 1766 and 1767 or between 2002 and 2003. So this variable
has equal intervals. But what about a zero
point? When did time start? Can we imagine an absence of
time? Philosophers or astronomers may have
answers for these questions, but in practical terms, there is no zero
point. Hence, time in years would be an
interval-level variable. But for practical purposes, we will
ignore interval-level variables and concentrate on
nominal-, ordinal-, and ratio-level measures.
After the first three criteria are met, we then determine if the variable
has a zero point. If so, we consider the
variable to be ratio-level. The zero point makes a ratio-level
variable more precise than an interval-level variable.
The zero point means we can sensibly multiply and divide the categories
of a ratio-level variable. For instance,
we can say that someone who has $100 has twice as much income as someone
who has $50 and half as much
income as another person who has $200. With age, a person who
is ten years old is twice as old as a five-year-
old and one third as old as someone who is 30. A person who has
one child has half as many children as those
with two children. These statements make sense.
But if we tried to do the same thing with a nominal-level variable,
we would end up with gibberish. It would
not make sense to say that a Protestant had twice as much religious
affiliation as a Jew, or that a Latino had
three times as much ethnicity as an African-American. When the
categories are merely names, we can attach
code numbers to them; but those code numbers cannot be manipulated
mathematically in the same way as the
categories of a ratio-level variable.
The same problem occurs with ordinal-level variables. It
would not make sense to say that an upper-class
person has twice as much social class as a working-class person.
Again, we can attach code numbers to the
categories, but we cannot sensibly multiply and divide the codes.
Even with interval-level variables, we cannot legitimately create ratios
or make precise comparisons. The key
difference between an interval- and a ratio-level variable lies in
the zero point. For this reason, we cannot say
that 100 degrees is twice as warm as 50 degrees or that 20 degrees
is half as warm as 40 degrees. Even though
the categories have equal distances between them, there is no zero
point. Also consider the variable Time,
measured in years. Various calendars have arbitrarily designated
one year or another to be the year 0. But a
true zero point would indicate the absence of time. Now, this
is a concept even Einstein would have trouble
with!
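A short Python sketch shows why the arbitrary zero point matters. It uses the standard Fahrenheit-to-Celsius conversion; the dollar amounts are the ones from the income example, and the variable names are ours.

```python
def fahrenheit_to_celsius(f):
    """Standard conversion between the two everyday temperature scales."""
    return (f - 32) * 5 / 9


warm_f, cool_f = 100.0, 50.0
warm_c = fahrenheit_to_celsius(warm_f)
cool_c = fahrenheit_to_celsius(cool_f)

# Interval-level: the ratio depends on which arbitrary zero we use.
ratio_f = warm_f / cool_f   # 2.0 in Fahrenheit
ratio_c = warm_c / cool_c   # not 2.0 in Celsius

# The intervals, however, stay equal: a one-degree Fahrenheit step
# is the same size in Celsius whether it is taken near 50 or near 100.
step_low = fahrenheit_to_celsius(51) - fahrenheit_to_celsius(50)
step_high = fahrenheit_to_celsius(100) - fahrenheit_to_celsius(99)

# Ratio-level: changing the unit (dollars to cents) keeps the zero in the
# same place, so "twice as much" survives the conversion.
ratio_dollars = 100 / 50
ratio_cents = (100 * 100) / (50 * 100)

print(ratio_f, ratio_c)
print(step_low, step_high)
print(ratio_dollars, ratio_cents)
```

Because the "twice as warm" claim holds on one scale and fails on the other, it tells us about the scale rather than about temperature.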
Note that having a zero point is not the only criterion that makes
a variable ratio-level. With the variable
"Fear of Crime," there could be people who have no fear. So,
this variable could have a zero point. But that fact
would not make "Fear of Crime" a ratio-level variable, because the
intervals between the categories are not
precise enough to be equal. Remember that the categories of "Fear
of Crime" were: very afraid, somewhat
afraid, and not afraid. What are the intervals between these
categories? We cannot say that somewhat afraid is
one "fear unit" above not afraid. We don't know precisely what
the interval or distance between these
categories is. We only know that one category is higher or lower
than the others. Even if we assign code
numbers to these categories (e.g., 1 = not afraid, 2 = somewhat afraid,
and 3 = very afraid), we cannot make
the variable any more precise. It wouldn't make sense to say,
"John's fear of crime is 3" the way we might say,
"John's number of children is 3." The code numbers we assign
to ordinal-level (or nominal-level) variables are
useful for having the computer deal with our data, but we should not
make the mistake of assuming that the
intervals between such code numbers are equal. In sum, to qualify
for a particular level of
measurement, a variable's categories must meet all the criteria for
the lower levels too.
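To see why the intervals between code numbers should not be trusted, here is a small Python sketch. The two groups of respondents and the second coding scheme (1, 2, 10) are hypothetical. Both schemes keep the categories in the same order, yet they disagree about the means. The median category, one summary that relies only on order, stays the same.

```python
from statistics import mean, median

# Two codings that preserve the same order but space the categories differently.
codes_a = {"not afraid": 1, "somewhat afraid": 2, "very afraid": 3}
codes_b = {"not afraid": 1, "somewhat afraid": 2, "very afraid": 10}

# Hypothetical answers from two small groups of respondents.
group_1 = ["not afraid", "somewhat afraid", "very afraid"]
group_2 = ["somewhat afraid", "somewhat afraid", "somewhat afraid"]


def summarize(group, codes):
    """Return (mean of the codes, name of the median category)."""
    values = [codes[answer] for answer in group]
    label_for = {code: name for name, code in codes.items()}
    return mean(values), label_for[median(values)]


for codes in (codes_a, codes_b):
    print(summarize(group_1, codes), summarize(group_2, codes))
```

Under the first coding the two groups have the same mean; under the second, group 1 appears far more fearful. Nothing about the respondents changed, only the arbitrary spacing of the codes.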
Some statistics require us to make ratios with, multiply, and divide
a variable's categories. These statistics can
only be used with ratio-level variables. If a variable is interval-level
or lower, we need different statistics to
summarize the variable. The statistics reserved for ratio-level
variables are more powerful and yield more
information than statistics for nominal- or ordinal-level variables.
Thus researchers try to measure variables at
the ratio level whenever they can. For instance, one could measure
Education in terms of the categories: less
than high school, high school only, some college, college degree, and
advanced degree. But the categories of
this variable do not have equal intervals between them. Thus,
it is an ordinal-level measure of education. If,
however, we asked how many years of school the respondents had completed,
we would have categories such
as 0, 1, . . . 11, 12, . . . 16, and so forth. These categories
have an inherent order from less to more education,
the intervals between the categories are equal, and it is possible
to have 0 years of school. We would have a
ratio-level measure of education, which would be amenable to analysis
with more powerful statistics.
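The relationship between the two measures of education can be sketched in Python. The cutoffs below (12 years for high school, 16 for a college degree) are assumptions made for illustration; a real survey would define its own.

```python
def education_category(years):
    """Collapse years of school (ratio-level) into ordered categories (ordinal-level).

    The cutoffs are illustrative assumptions, not an official scheme.
    """
    if years < 12:
        return "less than high school"
    if years == 12:
        return "high school only"
    if years < 16:
        return "some college"
    if years == 16:
        return "college degree"
    return "advanced degree"


for years in (0, 11, 12, 13, 15, 16, 18):
    print(years, education_category(years))
```

We can always move from years to categories, but not back: respondents with 13 and 15 years both land in "some college," and the category alone cannot tell us which is which. This is one reason to collect the ratio-level measure when possible.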
You can see that just because a variable could be measured at the ratio
level does not mean that it has been.
Take, for example, "Family Income." Researchers could interview
a sample of people and ask them to indicate
their annual family income in dollars. Or, at least, they could
try. It is very unlikely that people really know
exactly what their annual family income is, down to the dollar.
Another problem is that many Americans do
not like to tell people their incomes. Obtaining a ratio-level
measure of family income would be quite difficult.
Therefore, most researchers ask respondents to indicate where their
income falls in specified ranges of dollars,
as in this hypothetical example:
A. $0 to $19,999
B. $20,000 to $39,999
C. $40,000 to $59,999
D. $60,000 to $79,999
E. $80,000 or higher
Given that the researcher has used this set of categories, at what level
did s/he measure family income? Look
carefully at the categories. They have names (e.g., "$0 to $19,999"),
and the categories follow an order from
least income to most income. But we cannot say the intervals
between the categories are equal. Although
most of them are equal, the last category could include a family earning
$80,001, and it could also include the
Microsoft magnate Bill Gates, whose income is in the millions every
year. So the highest level of measurement
at which we can think of this variable, as it is measured here, is
ordinal-level. The categories are ordered from
lowest income to highest, but the intervals between the categories
are unequal.
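A brief Python sketch of the hypothetical income ranges makes the problem with the open-ended top category concrete. The function name and the sample incomes are ours, chosen only for illustration.

```python
def income_bracket(income):
    """Assign an annual family income in dollars to categories A through E."""
    if income < 0:
        raise ValueError("income cannot be negative")
    upper_bounds = [(20_000, "A"), (40_000, "B"), (60_000, "C"), (80_000, "D")]
    for upper, label in upper_bounds:
        if income < upper:
            return label
    return "E"  # open-ended: $80,000 or higher, with no upper limit


# Hypothetical incomes, including two very different families in category E.
for income in (15_000, 39_999, 40_000, 80_001, 5_000_000):
    print(income, income_bracket(income))
```

Categories A through D each span $20,000, but category E has no upper bound, so a family just over $80,000 and a family earning millions receive the same code.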
A common practice among statistical analysts is to convert nominal-level
variables to what are called dummy
variables so they can be used as if they were ratio-level. A
dummy variable has two categories, one
coded 0 and the other coded 1. (Dummy variables are
used only as independent variables, not
as dependent variables.) For instance, with the variable
"Sex," instead of coding males as 1 and females as 2 (or
the other way around), we could code males as 0 and females as 1.
Now we have to rethink our conception of
the variable. It is no longer "sex," but "femaleness."
Males have 0 femaleness.
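Dummy coding can be sketched in Python as follows. The list of respondents is hypothetical. One convenient consequence of 0/1 coding is that the mean of the dummy variable equals the proportion of cases coded 1.

```python
# Hypothetical respondents.
respondents = ["male", "female", "female", "male", "female"]

# Dummy coding: females are coded 1, males 0 ("femaleness").
femaleness = [1 if sex == "female" else 0 for sex in respondents]

# The mean of a 0/1 variable is the proportion of cases coded 1.
proportion_female = sum(femaleness) / len(femaleness)

print(femaleness)
print(proportion_female)
```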
1. At what level are the following variables measured?
F. Income
   less than $10,000
   $10,000 to $29,999
   $30,000 to $59,999
   $60,000 or more
G. Sex
   Male
   Female
H. Attitude toward gun laws
   Very favorable
   Somewhat favorable
   Somewhat unfavorable
   Very unfavorable
   No answer
I. Ideal number of children
J. Family Income in dollars
K. Candidate voted for in 2000 election
   Gore
   Bush
   Other
   Not applicable, didn't vote
   No answer
2. For each of the following variables, can you think of ways to measure it
at the ratio level? If not, why not, and what is the highest level at which
it can be measured? If so, how?
A. Fear of becoming a victim of crime in the area around one's home
B. Right or left handedness
C. Knowledge of statistics
D. Attitude toward abortion
E. Division of labor in the household
F. Defendants' risk of flight from prosecution