Univariate Analysis
Univariate analysis, or analysis of a single variable, refers to a set of statistical
techniques that can describe the general properties of one variable. Univariate statistics
include: (1) frequency distribution, (2) central tendency, and (3) dispersion. The frequency
distribution of a variable is a summary of the frequency (or percentages) of individual values
or ranges of values for that variable. For instance, we can measure how many times a sample of
respondents attend religious services (as a measure of their “religiosity”) using a categorical
scale: never, once per year, several times per year, about once a month, several times per
month, several times per week, and an optional category for “did not answer.” If we count the
number (or percentage) of observations within each category (except “did not answer” which is
really a missing value rather than a category), and display it in the form of a table as shown in
Figure 14.1, what we have is a frequency distribution. This distribution can also be depicted in
the form of a bar chart, as shown on the right panel of Figure 14.1, with the horizontal axis
representing each category of that variable and the vertical axis representing the frequency or
percentage of observations within each category.
Figure 14.1. Frequency distribution of religiosity
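The counting procedure described above can be sketched in a few lines of Python. The responses below are hypothetical values made up for illustration, not the data behind Figure 14.1, and `frequency_table` is a helper name introduced here only for this sketch.

```python
from collections import Counter

# Hypothetical survey responses (illustrative only, not the data in Figure 14.1).
# None stands for "did not answer" and is treated as a missing value.
responses = [
    "never", "once per year", "several times per year", "never",
    "about once a month", "several times per month", None,
    "once per year", "never", "several times per week",
]

# Category order follows the scale described in the text.
categories = [
    "never", "once per year", "several times per year",
    "about once a month", "several times per month", "several times per week",
]

valid = [r for r in responses if r is not None]  # drop missing values
counts = Counter(valid)


def frequency_table(counts, categories, total):
    """Return (category, count, percentage) rows in scale order."""
    return [
        (c, counts.get(c, 0), 100.0 * counts.get(c, 0) / total)
        for c in categories
    ]


table = frequency_table(counts, categories, len(valid))
for category, count, pct in table:
    # A crude text bar chart: one '#' per observation.
    print(f"{category:<26}{count:>3}{pct:>7.1f}%  {'#' * count}")
```

Percentages are computed over the valid responses only, which matches the treatment of "did not answer" as a missing value rather than a category.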
With very large samples where observations are independent and random, the
frequency distribution tends to follow a plot that looks like a bell-shaped curve (a smoothed
bar chart of the frequency distribution) similar to that shown in Figure 14.2, where most
observations are clustered toward the center of the range of values, and fewer and fewer
observations lie toward the extreme ends of the range. Such a curve is called a normal distribution.
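One way to see a bell shape emerge is a small simulation. The sketch below is an illustration under assumptions that are not in the text: each observation is taken to be the total of ten independent dice rolls (the helper `dice_total` is hypothetical), and the sample size and seed are arbitrary.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the run is repeatable


def dice_total(num_dice=10):
    """One observation: the total of several independent dice rolls."""
    return sum(random.randint(1, 6) for _ in range(num_dice))


totals = [dice_total() for _ in range(20_000)]
counts = Counter(totals)

# Print every fifth possible total as a rough text histogram.
for total in range(10, 61, 5):
    bar = "#" * (counts[total] // 20)
    print(f"{total:>3} {counts[total]:>5} {bar}")
```

Totals near the center of the possible range (around 35) occur far more often than totals near either extreme, which is the clustering pattern described above.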
Central tendency is an estimate of the center of a distribution of values. There are
three major estimates of central tendency: mean, median, and mode. The arithmetic mean
(often simply called the “mean”) is the simple average of all values in a given distribution.
Consider a set of eight test scores: 15, 22, 21, 18, 36, 15, 25, 15. The arithmetic mean of these
values is (15 + 22 + 21 + 18 + 36 + 15 + 25 + 15)/8 = 20.875. Other types of means include the
geometric mean (the nth root of the product of n numbers in a distribution) and the harmonic mean (the
reciprocal of the arithmetic mean of the reciprocals of the values in a distribution), but these
means are not very popular for statistical analysis of social research data.
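The three kinds of mean can be checked with Python's standard `statistics` module. This is a minimal sketch using the eight test scores from the text; `geometric_mean` requires Python 3.8 or later.

```python
import statistics

scores = [15, 22, 21, 18, 36, 15, 25, 15]

# Arithmetic mean: sum of the values divided by how many there are.
arithmetic = statistics.mean(scores)

# Geometric mean: nth root of the product of the n values.
geometric = statistics.geometric_mean(scores)

# Harmonic mean: reciprocal of the arithmetic mean of the reciprocals.
harmonic = statistics.harmonic_mean(scores)

print(arithmetic)  # 20.875
print(geometric, harmonic)
```

For positive values, the harmonic mean is never larger than the geometric mean, which in turn is never larger than the arithmetic mean.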
The second measure of central tendency, the median, is the middle value within a range
of values in a distribution. This is computed by sorting all values in a distribution in increasing
order and selecting the middle value. In case there are two middle values (if there is an even
number of values in a distribution), the average of the two middle values represents the median.
In the above example, the sorted values are: 15, 15, 15, 18, 21, 22, 25, 36. The two middle
values are 18 and 21, and hence the median is (18 + 21)/2 = 19.5.
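The sort-and-pick-the-middle procedure can be written out directly. In this sketch, `median_by_sorting` is a helper name introduced for illustration; the standard library's `statistics.median` does the same job and is used here only as a cross-check.

```python
import statistics

scores = [15, 22, 21, 18, 36, 15, 25, 15]


def median_by_sorting(values):
    """Sort the values and return the middle one (or the mean of the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: a single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two


ordered_scores = sorted(scores)
median_score = median_by_sorting(scores)

print(ordered_scores)             # [15, 15, 15, 18, 21, 22, 25, 36]
print(median_score)               # 19.5
print(statistics.median(scores))  # 19.5
```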
Lastly, the mode is the most frequently occurring value in a distribution of values. In
the previous example, the most frequently occurring value is 15, which is the mode of the above
set of test scores. Note that any value that is estimated from a sample, such as the mean, median,
mode, or any of the later estimates, is called a statistic.
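Finding the mode amounts to counting how often each value occurs. The sketch below uses `collections.Counter` for the counts; `statistics.multimode` (Python 3.8+) is shown as well because it returns every tied value when a distribution has more than one mode.

```python
from collections import Counter
import statistics

scores = [15, 22, 21, 18, 36, 15, 25, 15]

counts = Counter(scores)

# most_common(1) returns a list with the single (value, frequency) pair
# that has the highest frequency.
mode_value, mode_frequency = counts.most_common(1)[0]

# multimode returns a list of all values tied for the highest frequency.
modes = statistics.multimode(scores)

print(mode_value, mode_frequency)  # 15 3
print(modes)                       # [15]
```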
Dispersion refers to the way values are spread around the central tendency, for
example, how tightly or how widely the values are clustered around the mean. Two common
measures of dispersion are the range and standard deviation. The range is the difference
between the highest and lowest values in a distribution. The range in our previous example is
36-15 = 21.
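As a quick check of the arithmetic, the range can be computed from the minimum and maximum. `value_range` is a helper name chosen for this sketch.

```python
scores = [15, 22, 21, 18, 36, 15, 25, 15]


def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


score_range = value_range(scores)
print(score_range)  # 21
```

Because only the two extreme values enter the calculation, changing any of the other six scores leaves the range unchanged.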
The range is particularly sensitive to the presence of outliers. For instance, if the
highest value in the above distribution were 85 and the other values remained the same, the range
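would be 85 - 15 = 70. Standard deviation, the second measure of dispersion, addresses this
limitation by using a formula that takes into account how close or how far each value is from the
distribution mean, rather than only the two extreme values:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}}$$
where σ is the standard deviation, x_i is the ith observation (or value), µ is the arithmetic mean, n
is the total number of observations, and Σ means summation across all observations. (Dividing by
n - 1 gives the sample standard deviation; dividing by n instead gives the population version.) The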
square of the standard deviation is called the variance of a distribution. In a normally
distributed frequency distribution, about 68% of the observations lie within one
standard deviation of the mean.
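The standard deviation and variance of the eight test scores can be computed step by step, and the 68% figure can be checked by simulation. This is a sketch under stated assumptions: `sample_std` is a helper written for illustration that divides by n - 1 (the sample form; dividing by n would give the population form), and the simulated normal data, sample size, and seed are arbitrary choices.

```python
import math
import random
import statistics

scores = [15, 22, 21, 18, 36, 15, 25, 15]


def sample_std(values):
    """Standard deviation with an n - 1 denominator (assumed sample form)."""
    mu = sum(values) / len(values)
    squared_deviations = [(x - mu) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))


sigma = sample_std(scores)
variance = sigma ** 2  # the variance is the square of the standard deviation

print(round(sigma, 3), round(variance, 3))

# Check the one-standard-deviation rule on simulated normal data.
random.seed(0)  # fixed seed so the run is repeatable
draws = [random.gauss(0.0, 1.0) for _ in range(100_000)]
draw_mean = sum(draws) / len(draws)
draw_std = sample_std(draws)

# Fraction of draws lying within one standard deviation of their mean.
within_one_sd = sum(1 for x in draws if abs(x - draw_mean) <= draw_std) / len(draws)
print(round(within_one_sd, 3))  # close to 0.68
```

The hand-written function agrees with `statistics.stdev`, which also uses the n - 1 denominator; `statistics.pstdev` is the standard-library counterpart that divides by n.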
Sunday, 13 March 2016