Bivariate Analysis
Bivariate analysis examines how two variables are related to each other. The most
common bivariate statistic is the bivariate correlation (often simply called “correlation”),
which is a number between -1 and +1 denoting the strength and direction of the relationship
between two variables. Suppose that we wish to study how age is related to self-esteem in a
sample of 20 respondents, i.e., as age increases, does self-esteem increase, decrease, or remain
unchanged? If self-esteem increases, we have a positive correlation between the two variables;
if self-esteem decreases, we have a negative correlation; and if it remains the same, we have a
zero correlation. To calculate the value of this correlation, consider the hypothetical dataset
shown in Table 14.1.
Figure 14.2. Normal distribution
Table 14.1. Hypothetical data on age and self-esteem
The two variables in this dataset are age (x) and self-esteem (y). Age is a ratio-scale
variable, while self-esteem is an average score computed from a multi-item self-esteem scale
measured using a 7-point Likert scale, ranging from “strongly disagree” to “strongly agree.” The
histogram of each variable is shown on the left side of Figure 14.3. The formula for calculating
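bivariate correlation is:
$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n - 1)\, s_x s_y}$$
where $r_{xy}$ is the correlation, $\bar{x}$ and $\bar{y}$ are the sample means of x and y, n is the sample size, and $s_x$ and $s_y$ are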
the standard deviations of x and y. The manually computed value of correlation between age
and self-esteem, using the above formula as shown in Table 14.1, is 0.79. This figure indicates
that age has a strong positive correlation with self-esteem, i.e., self-esteem tends to increase
with increasing age, and decrease with decreasing age. Such a pattern can also be seen by
visually comparing the age and self-esteem histograms shown in Figure 14.3, where it appears
that the tops of the two histograms generally follow each other. Note here that the vertical axes
in Figure 14.3 represent actual observation values, and not the frequency of observations (as
was the case in Figure 14.1), and hence, these are not frequency distributions but rather histograms.
The bivariate scatter plot in the right panel of Figure 14.3 is essentially a plot of self-esteem on
the vertical axis against age on the horizontal axis. This plot roughly resembles an upward
sloping line (i.e., positive slope), which is also indicative of a positive correlation. If the two
variables were negatively correlated, the scatter plot would slope down (negative slope),
implying that an increase in age would be related to a decrease in self-esteem and vice versa. If
the two variables were uncorrelated, the scatter plot would approximate a horizontal line (zero
slope), implying that an increase in age would have no systematic bearing on self-esteem.
Figure 14.3. Histogram and correlation plot of age and self-esteem
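The correlation calculation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard Pearson formula with sample standard deviations (n - 1 in the denominator). The function name `pearson_r` is hypothetical, and the data values are made up for the example: they are not the values from Table 14.1, so the result will not equal 0.79.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: sum((xi - mx) * (yi - my)) / ((n - 1) * sx * sy)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (n - 1 in the denominator).
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    # Sum of cross-products of deviations from the means.
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / ((n - 1) * s_x * s_y)

# Hypothetical illustrative values, NOT the data from Table 14.1.
age = [21, 25, 30, 34, 40, 46, 52, 58]
self_esteem = [3.2, 3.0, 3.9, 4.1, 4.0, 4.8, 5.1, 5.0]

r = pearson_r(age, self_esteem)
print(f"r = {r:.2f}")
```

Because self-esteem rises with age in these made-up values, the printed correlation is positive. Reversing one of the lists would flip its sign.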
After computing bivariate correlation, researchers are often interested in knowing
whether the correlation is significant (i.e., a real one) or caused by mere chance. Answering
such a question would require testing the following hypothesis:
H0: r = 0
H1: r ≠ 0
H0 is called the null hypothesis, and H1 is called the alternative hypothesis (sometimes
also represented as Ha). Although they may seem like two hypotheses, H0 and H1 actually
represent a single hypothesis since they are direct opposites of each other. We are interested in
testing H1 rather than H0. Also note that H1 is a non-directional hypothesis since it does not
specify whether r is greater than or less than zero. A directional hypothesis would be specified as
H0: r ≤ 0; H1: r > 0 (if we are testing for a positive correlation). Significance testing of a directional
hypothesis is done using a one-tailed t-test, while that for a non-directional hypothesis is done
using a two-tailed t-test.
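To make the one-tailed versus two-tailed distinction concrete, the sketch below computes a t statistic for the correlation reported in the text (r = 0.79, n = 20). The text does not give a test formula, so this assumes the standard one for a Pearson correlation, t = r·sqrt(n − 2)/sqrt(1 − r²), with n − 2 degrees of freedom. The function name is hypothetical, and the critical values are copied from a standard t-table for 18 degrees of freedom rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: r = 0, with n - 2 degrees of freedom."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.79, 20  # values reported in the text
t = correlation_t_statistic(r, n)

# Critical values of Student's t for df = 18 at alpha = 0.05,
# taken from a standard t-table (assumed, not computed here).
T_CRIT_TWO_TAILED = 2.101  # non-directional test, H1: r != 0
T_CRIT_ONE_TAILED = 1.734  # directional test, H1: r > 0

two_tailed_reject = abs(t) > T_CRIT_TWO_TAILED
one_tailed_reject = t > T_CRIT_ONE_TAILED

print(f"t = {t:.2f} with {n - 2} degrees of freedom")
print("two-tailed:", "reject H0" if two_tailed_reject else "fail to reject H0")
print("one-tailed:", "reject H0" if one_tailed_reject else "fail to reject H0")
```

The one-tailed critical value is smaller than the two-tailed one because all of α is placed in a single tail. A directional hypothesis is therefore easier to support when the effect lies in the predicted direction.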
In statistical testing, the alternative hypothesis cannot be tested directly. Rather, it is
tested indirectly by rejecting the null hypothesis with a certain level of probability. Statistical
testing is always probabilistic, because we can never be sure that our inferences, based on sample
data, apply to the population, since our sample never equals the population. The probability
of obtaining a result at least as extreme as the one observed purely by chance, i.e., if the null
hypothesis were true, is called the p-value. The p-value is compared
with the significance level (α), which represents the maximum level of risk that we are willing
to take that our inference is incorrect. For most statistical analyses, α is set to 0.05. A p-value
less than α = 0.05 indicates that we have enough statistical evidence to reject the null hypothesis,
and thereby, indirectly accept the alternative hypothesis. If p > 0.05, then we do not have
adequate statistical evidence to reject the null hypothesis or accept the alternative hypothesis.
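The decision rule above can be illustrated with a short Python sketch that turns the reported correlation (r = 0.79, n = 20) into a two-tailed p-value and compares it with α. Several things here are assumptions rather than part of the text: the standard t statistic for a correlation, t = r·sqrt(n − 2)/sqrt(1 − r²), the hypothetical function names, and the choice to integrate the Student's t density numerically with Simpson's rule so that only the standard library is needed. A statistics package would normally be used for this instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p_value(t, df, steps=20000):
    """P(|T| >= |t|) under H0, using Simpson's rule on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1 - 2 * central_area)

ALPHA = 0.05
r, n = 0.79, 20  # values reported in the text
df = n - 2
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
p = two_tailed_p_value(t, df)

# Reject H0 only when the p-value falls below the significance level.
decision = "reject H0" if p < ALPHA else "fail to reject H0"
print(f"t = {t:.2f}, p = {p:.5f}, decision: {decision}")
```

For a directional (one-tailed) test in the predicted direction, the p-value would be half of the two-tailed value.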
Sunday, 13 March 2016