Lecture 2: Types of Data

Types of data (levels of measurement)
• Data (variables) can be classified into four main types: nominal, ordinal, interval and ratio.
• Understanding the type of data you are using is important because some statistical measures are only meaningful for some types of data.

Nominal scales
• Nominal data are the simplest form of data, in which the data fall into unordered categories. Essentially, numbers are used to represent non-numeric categories. Nominal data can have two or more categories.
• Where there are only two categories, the variable (or indicator) is generally referred to as dichotomous or binary.
• The most common example of this type of variable is SEX. There are two categories, male and female, and there is no logical order in the categories. That is, one category is not higher, bigger, or better than the other. It is standard practice for males to be given the value 1 and females the value 2, although there is no statistical or mathematical reason for this.
• Nominal variables can have more than two categories. An example is religion. We could label religions any way we wanted, for example, in order of size or alphabetically.
• Some more examples:
  Religion: 1 = Buddhist, 2 = Catholic, 3 = Protestant, 4 = Muslim, 5 = Other, 6 = None

  Direction   Occupation                      Province
  East        Government leaders              Beijing
  West        Clerical workers                Tianjin
  South       Industrial workers              Hebei
  North       Professionals and technicians   Shanxi
              Commercial employees            Neimenggu
              Farmers                         Heilongjiang
              Service workers                 Jilin
              Unknown                         Liaoning
                                              ……

Examples from the 1997 Survey

Nominal scales
• The category ordering helps us learn them, but means nothing in terms of the variable itself. We could list "West" before "East" or "Shanghai" before "Beijing" and lose no information.
• Feminists could place "Women" before "Men", assigning 1 to "Women" and 2 to "Men".

Ordinal scales
• Ordinal data are similar to nominal data in that they are labels for non-numeric data. The only difference is that they have a logical order. However, the magnitude of the numeric label is still not important.
• The term "ordinal" refers to order or ordering. This suggests a quantifiable ranking from most to least or some other logical sequence or ordering of a variable's categories. When the quantification is by rank order, it suggests a sequence but not yet an exact amount of a variable.
• To say that in terms of population size China ranks first; India, second; and the former U.S.S.R., third gives us an ordering. But we do not know how much larger China is than India. We still do not know the exact population sizes.
• Level of education can be an ordinal variable. We could classify education as
  1 = Less than secondary
  2 = Secondary
  3 = Tertiary
• Category 3 is higher than categories 1 and 2, but the differences between 1 and 2 and between 2 and 3 are not equal.
• Individuals can be placed into ranked categories ordered highest to lowest (or lowest to highest):
  Height: Very short, Short, Medium, Tall, Very tall
  Economic status: Poor, Modest income, Lower-middle income, Upper-middle income, Wealthy
• Sex-selection should be banned. Do you:
  Strongly disagree? Disagree? Unsure? Agree? Strongly agree?

In the 1997 Survey

Ordinal scales
• Both nominal and ordinal data are referred to as categorical variables. Neither type of data can be used in mathematical calculations or transformations.

Interval scales
• For interval data both ordering and magnitude are important. The numbers given to interval data represent actual measurable quantities. In addition, interval data can only take on specified values. Examples are age, children ever born, the number of road fatalities in a given period or the number of doctors in an area.
• For example, the number of times a woman has given birth provides a variable with a range starting from 0, where the difference between 1 and 2 is the same as the difference between 4 and 5, that is, one child. A larger number indicates that a woman has had more children.

Examples from the 1997 Survey
  201. At what age did you have your first menstruation?
  313. Number of months into pregnancy at last induced abortion
  315. Number of days of rest after you had induced abortion

Years of schooling
  16, 12, 12, 9, 6, 0

Ratio scales
• Educational level as measured in years of schooling would appear to be measured at the interval level, but it is really an even higher level of measurement known as the ratio level.
• Interval and ratio levels are nearly identical. The difference between the two lies in the meaning of zero. In interval data, zero is an arbitrary point, whereas in ratio data, zero is an absolute zero, meaning a complete lack of the variable being measured.

The temperature example
• In the commonly used Fahrenheit and Celsius scales, zero is arbitrarily selected and it is possible to have below-zero temperatures. On the Kelvin scale, however, 0 K (about -273 °C) is called absolute zero. In theory, it cannot get any colder than absolute zero, so there are no below-zero readings on the Kelvin scale.

Ratio scales
• In the social sciences, many variables above the ordinal level of measurement are actually ratio level. We encounter few examples where zero is arbitrarily assigned.
• More examples of ratio data are monetary units such as dollars, cents, baht and lira, and measurements such as weight, height or age.

Discrete and continuous variables
• The four types of measurement scales above can be grouped into two broader categories, discrete and continuous.
• Discrete variables are those that are measured on a nominal or ordinal scale.
  They classify persons, objects or events according to the quality of their attributes. Discrete variables are often called categorical variables.
• Continuous variables are those that are measured on an interval or ratio scale. They classify persons, objects or events according to the magnitude or quantity of their attributes.

• Note: Interval data can be recoded into categorical data which have two or more categories. Be careful that you do not recode interval data into categories which make them lose their meaning or become less useful in the analysis.

• If you have a mixture of different types of data, you may have to recode interval data into categorical data in order to use certain statistical procedures which require data to be categorical. However, there are some procedures that will work with a mixture of numeric and categorical data. You have to decide what procedures are appropriate for your analysis, based on the topic of your study, the type of data you are analysing and whether the meaning of the data will be affected by recoding.

• You can also combine categories in ordinal or other categorical variables into two categories to form dichotomous variables. Certain statistical procedures require the dependent variable to be dichotomous. Categorical variables also have to be recoded into dichotomous variables known as dummy variables for statistical procedures that require independent variables to be all interval data.
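
The recoding steps described in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the cut points (fewer than 9 years, 9 to 12 years, more than 12 years) are assumed, because the lecture names the three education categories and lists the years-of-schooling values but does not say where the boundaries between categories fall.

```python
# Education categories as coded in the lecture (an ordinal variable).
EDUCATION_LABELS = {1: "Less than secondary", 2: "Secondary", 3: "Tertiary"}


def recode_schooling(years):
    """Recode years of schooling (ratio level) into an ordinal code 1-3.

    The boundaries below are assumptions made for this sketch only.
    """
    if years < 9:       # assumed boundary: fewer than 9 years
        return 1
    if years <= 12:     # assumed boundary: 9 to 12 years
        return 2
    return 3            # more than 12 years


def dichotomize(code, threshold):
    """Combine ordinal categories into two: 1 if code >= threshold, else 0."""
    return int(code >= threshold)


def dummy_variables(code, categories, reference):
    """Return one 0/1 dummy variable per category, omitting the reference."""
    return {c: int(code == c) for c in categories if c != reference}


# Years-of-schooling values shown in the lecture.
years_of_schooling = [16, 12, 12, 9, 6, 0]

# Ratio data recoded into ordinal categories.
codes = [recode_schooling(y) for y in years_of_schooling]

# Ordinal categories combined into a dichotomous variable
# (secondary or higher versus less than secondary).
secondary_or_more = [dichotomize(c, threshold=2) for c in codes]

# Ordinal categories recoded into dummy variables, with category 1
# ("Less than secondary") as the omitted reference category.
dummies = [dummy_variables(c, sorted(EDUCATION_LABELS), reference=1) for c in codes]

for years, code, row in zip(years_of_schooling, codes, dummies):
    print(years, EDUCATION_LABELS[code], row)
```

Note that each step loses information: once 16, 12 and 9 years are recoded, the two people with 12 years and the person with 9 years become indistinguishable. This is the reason for the warning above about recoding interval data into categories that make them less useful in the analysis.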