Lecture 2
Types of Data
Types of data
(levels of measurement)
• Data (variables) can be classified into four main types: nominal, ordinal, interval and ratio.
• Understanding the type of data you are using is important because some statistical measures are only meaningful for some types of data.
Nominal scales
• Nominal data is the simplest form of data, in which data falls into unordered categories. Essentially, numbers are used to represent non-numeric categories. Nominal data can have two or more categories.
• Where there are only two categories, the variable (or indicator) is generally referred to as dichotomous or binary.
Nominal scales
• The most common example of this type of variable is SEX. There are two categories, male and female, and there is no logical order in the categories. That is, one category is not higher, bigger, or better than the other. It is standard practice for males to be given the value 1 and females the value 2, although there is no statistical or mathematical reason for this.
Nominal scales
• Nominal variables can have more than two categories. An example is religion. We could label religions any way we wanted, for example, in order of size or alphabetically.
• Some more examples:
Religion: 1=Buddhist, 2=Catholic, 3=Protestant, 4=Muslim, 5=Other, 6=None
Examples from the 1997 Survey
Direction: East, West, South, North
Occupation: Government leaders, Clerical workers, Industrial workers, Professionals and technicians, Commercial employees, Farmers, Service workers, Unknown
Province: Beijing, Tianjin, Hebei, Shanxi, Neimenggu, Heilongjiang, Jilin, Liaoning, ……
Nominal scales
• The category ordering helps us learn them, but means nothing in terms of the variable itself. We could list “West” before “East” or “Shanghai” before “Beijing” and lose no information.
• Feminists could place “Women” before “Men”, assigning 1 to “Women” and 2 to “Men”.
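As a small illustration of why nominal codes are arbitrary, the sketch below counts the same responses under two different coding schemes. The responses and both codebooks are made up for illustration; they are not taken from the 1997 Survey.

```python
from collections import Counter

# Hypothetical responses (illustrative only, not survey data).
responses = ["Men", "Women", "Women", "Men", "Women"]

# Two equally valid coding schemes for the same nominal variable.
coding_a = {"Men": 1, "Women": 2}
coding_b = {"Women": 1, "Men": 2}

def frequencies(responses, coding):
    """Code the responses, count the codes, then map codes back to labels."""
    labels = {code: label for label, code in coding.items()}
    counts = Counter(coding[r] for r in responses)
    return {labels[code]: n for code, n in counts.items()}

freq_a = frequencies(responses, coding_a)
freq_b = frequencies(responses, coding_b)

# The category counts are identical under both schemes ...
print(freq_a == freq_b)

# ... but the "mean" of the codes changes with the scheme,
# so it tells us nothing about the variable itself.
mean_a = sum(coding_a[r] for r in responses) / len(responses)
mean_b = sum(coding_b[r] for r in responses) / len(responses)
print(mean_a, mean_b)
```

Frequencies (and the mode) survive any relabelling, which is why they are the natural summaries for nominal data.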
Ordinal scales
• Ordinal data are similar to nominal data in that they are labels for non-numeric data. The only difference is that they have a logical order. However, the magnitude of the numeric label is still not important.
Ordinal scales
• The term “ordinal” refers to order or ordering. This suggests a quantifiable ranking from most to least or some other logical sequence or ordering of a variable's categories. When the quantification is by rank order, it suggests a sequence but not yet an exact amount of a variable.
• To say that in terms of population size China ranks first; India, second; and the former U.S.S.R., third gives us an ordering. But we do not know how much larger China is than India. We still do not know the exact population sizes.
Ordinal scales
• Level of education can be an ordinal variable. We could classify education as
1=Less than secondary
2=Secondary
3=Tertiary
• Category 3 is higher than categories 1 and 2, but the differences between 1 and 2 and between 2 and 3 cannot be assumed to be equal.
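A minimal sketch of what the ordering does and does not allow, using the three education codes above. The list of coded responses is invented for illustration.

```python
import statistics

# Codes from the classification above; the labels carry the meaning.
education_levels = {1: "Less than secondary", 2: "Secondary", 3: "Tertiary"}

# Hypothetical coded responses (illustrative only).
codes = [1, 3, 2, 2, 1, 3, 3]

# Ordering is meaningful, so the median category is meaningful too.
# median_low always returns a value that actually occurs in the data.
median_code = statistics.median_low(codes)
print(education_levels[median_code])

# Comparisons between categories are also meaningful.
tertiary_ranks_higher = 3 > 1
print(tertiary_ranks_higher)

# The mean treats the gaps 1->2 and 2->3 as equal, which the scale does not
# guarantee, so it should be interpreted with caution, if at all.
mean_code = statistics.mean(codes)
print(mean_code)
```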
• Individuals can be placed into ranked categories ordered highest to lowest (or lowest to highest):
Height: Very short, Short, Medium, Tall, Very tall
Economic status: Poor, Modest income, Lower-middle income, Upper-middle income, Wealthy
Sex-selection should be banned
Do you:
Strongly disagree? Disagree? Unsure? Agree? Strongly agree?
In the 1997 Survey
Ordinal scales
• Both nominal and ordinal data are referred to as categorical variables. The numeric codes of either type should not be used in mathematical calculations or transformations, because they are labels rather than quantities.
Interval scales
• For interval data both ordering and magnitude are important. The numbers given to interval data represent actual measurable quantities. In addition, interval data can only take on specified values. Examples are age, children ever born, the number of road fatalities in a given period or the number of doctors in an area.
Interval scales
• For example, the number of times a woman has given birth provides a variable with a range starting from 0, where the difference between 1 and 2 is the same as the difference between 4 and 5, that is, one child. A larger number indicates that a woman has had more children.
Examples from the 1997 Survey
201. At what age did you have your first menstruation?
313. Number of months into pregnancy at last induced abortion
315. Number of days of rest after you had induced abortion
Years of schooling: 16, 12, 12, 9, 6, 0
Ratio scales
• Educational level as measured in years of schooling would appear to be measured at the interval level, but it is really an even higher level of measurement known as the ratio level.
Ratio scales
• Interval and ratio levels are nearly identical. The difference between the two is the nature of the meaning of zero. In interval data, zero is an arbitrary point, whereas in ratio data, zero is an absolute zero, meaning a complete lack of the variable being measured.
The temperature example
• In the commonly used Fahrenheit and Celsius scales, zero is arbitrarily selected and it is possible to have below-zero temperatures. In the Kelvin scale, however, zero K (about -273 °C) is called absolute zero. In theory, it cannot get any colder than absolute zero, so there are no below-zero readings on the Kelvin scale.
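The practical consequence can be sketched in a few lines: differences are meaningful on both scales, but ratios are only meaningful on the scale with an absolute zero. The two readings are hypothetical, and the conversion uses the standard offset of 273.15 (the slide rounds it to 273).

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvin (0 K is -273.15 degrees Celsius)."""
    return celsius + 273.15

# Hypothetical readings in degrees Celsius (illustrative only).
cool, warm = 10.0, 20.0

# Differences are meaningful on an interval scale and survive the conversion.
diff_celsius = warm - cool
diff_kelvin = celsius_to_kelvin(warm) - celsius_to_kelvin(cool)

# Ratios are only meaningful on a scale with an absolute zero.
ratio_celsius = warm / cool  # numerically 2, but 20 C is not "twice as hot" as 10 C
ratio_kelvin = celsius_to_kelvin(warm) / celsius_to_kelvin(cool)

print(diff_celsius, round(diff_kelvin, 2))
print(ratio_celsius, round(ratio_kelvin, 3))
```

The Celsius ratio is an artefact of where zero happens to sit; on the Kelvin scale the same two temperatures differ by only a few percent.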
Ratio scales
• In the social sciences, many variables above the ordinal level of measurement are actually ratio level. We encounter few examples where zero is arbitrarily assigned.
• More examples of ratio data are monetary units such as dollars, cents, baht, lira, and measurements of size such as weight, height, or age.
Discrete and continuous variables
• The four types of measurement scales above can be grouped into two broader categories, discrete and continuous.
• Discrete variables are those that are measured on a nominal or ordinal scale. They classify persons, objects or events according to the quality of their attributes. Discrete variables are often called categorical variables.
Discrete and continuous variables
• Continuous variables are those that are measured on an interval or ratio scale. They classify persons, objects or events according to the magnitude or quantity of their attributes.
• Note: Interval data can be recoded into categorical data which have two or more categories. Be careful that you do not recode interval data into categories which make them lose their meaning or become less useful in the analysis.
• If you have a mixture of different types of data, you may have to recode interval data into categorical data in order to use certain statistical procedures which require data to be categorical. However, there are some procedures that will work with a mixture of numeric and categorical data. You have to decide what procedures are appropriate for your analysis, based on the topic of your study, the type of data you are analysing and whether the meaning of the data will be affected by recoding.
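One way such a recoding might look is sketched below for age in completed years. The cut points, group labels and ages are all hypothetical; in practice the groups should be chosen to suit the analysis.

```python
from bisect import bisect_right

# Hypothetical cut points: the lower bounds of the 2nd, 3rd and 4th groups.
cut_points = [20, 30, 40]
labels = ["Under 20", "20-29", "30-39", "40 and over"]

def recode_age(age):
    """Return the age-group label for an age in completed years."""
    # bisect_right counts how many cut points are <= age,
    # which is exactly the index of the group the age falls into.
    return labels[bisect_right(cut_points, age)]

# Hypothetical ages (illustrative only).
ages = [17, 20, 29, 30, 45]
age_groups = [recode_age(a) for a in ages]
print(age_groups)
```

Note what is lost: ages 20 and 29 end up in the same group and can no longer be told apart, which is the loss of meaning the slides warn about.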
• You can also combine categories in ordinal or nominal variables into two categories to form dichotomous variables. Certain statistical procedures require the dependent variable to be dichotomous. Categorical variables also have to be recoded into dichotomous variables known as dummy variables for statistical procedures that require independent variables to be all interval data.
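A minimal sketch of both recodings, using the religion codes from the nominal example earlier in the lecture. The coded responses are invented, and leaving out one reference category (here "None") is a common convention assumed for this sketch, not something the slides specify.

```python
# Codes from the religion example earlier in the lecture.
religion_labels = {1: "Buddhist", 2: "Catholic", 3: "Protestant",
                   4: "Muslim", 5: "Other", 6: "None"}

def make_dummies(codes, reference):
    """Return one 0/1 column per category, omitting the reference category."""
    categories = [c for c in sorted(religion_labels) if c != reference]
    return {
        religion_labels[c]: [1 if code == c else 0 for code in codes]
        for c in categories
    }

# Hypothetical coded responses (illustrative only).
codes = [1, 4, 6, 2, 1]

# Dummy variables: a respondent coded 6 ("None") is 0 in every column.
dummies = make_dummies(codes, reference=6)
for name, column in dummies.items():
    print(name, column)

# Collapsing to one dichotomous variable: 1 = any religion, 0 = none.
has_religion = [0 if code == 6 else 1 for code in codes]
print(has_religion)
```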