can estimate parameters of this line, such as its slope and intercept, from the GLM. From high school
algebra, recall that straight lines can be represented using the mathematical equation y =
mx + c, where m is the slope of the straight line (how much does y change for unit change in x)
and c is the intercept term (what is the value of y when x is zero). In GLM, this equation is
represented formally as:
y = β0 + β1 x + ε
where β0 is the intercept term, β1 is the slope, and ε is the error term. ε represents the deviation
of actual observations from their estimated values, since most observations are close to the line
but do not fall exactly on the line (i.e., the GLM is not perfect). Note that a linear model can have
more than two predictors. To visualize a linear model with two predictors, imagine a three-dimensional
cube, with the outcome (y) along the vertical axis, and the two predictors (say, x1
and x2) along the two horizontal axes along the base of the cube. A line that describes the
relationship between two or more variables is called a regression line, β0 and β1 (and other beta
values) are called regression coefficients, and the process of estimating regression coefficients is
called regression analysis. The GLM for regression analysis with n predictor variables is:
y = β0 + β1 x1 + β2 x2 + β3 x3 + … + βn xn + ε
In the above equation, predictor variables xi may represent independent variables or
covariates (control variables). Covariates are variables that are not of theoretical interest but
may have some impact on the dependent variable y and should be controlled, so that the
residual effects of the independent variables of interest are detected more precisely. Covariates
capture systematic errors in a regression equation while the error term (ε) captures random
errors. Though most variables in the GLM tend to be interval or ratio-scaled, this does not have
to be the case. Some predictor variables may even be nominal variables (e.g., gender: male or
female), which are coded as dummy variables. These are variables that can assume one of only
two possible values: 0 or 1 (in the gender example, “male” may be designated as 0 and “female”
as 1 or vice versa). A set of n nominal variables is represented using n–1 dummy variables. For
instance, industry sector, consisting of the agriculture, manufacturing, and service sectors, may
be represented using a combination of two dummy variables (x1, x2), with (0, 0) for agriculture,
(0, 1) for manufacturing, and (1, 0) for service. It does not matter which level of a nominal
variable is coded as 0 and which level as 1, because 0 and 1 values are treated as two distinct
groups (such as treatment and control groups in an experimental design), rather than as
numeric quantities, and the statistical parameters of each group are estimated separately.
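For readers who work in Python, a minimal sketch of such dummy coding using the pandas library is shown below; the data frame and level names are hypothetical, and the specific 0/1 assignment produced by the library may differ from the one described above, which, as noted, does not matter.

import pandas as pd

# Hypothetical data: industry sector as a three-level nominal variable
df = pd.DataFrame({"sector": ["agriculture", "manufacturing", "service", "agriculture"]})

# drop_first=True keeps n-1 = 2 dummy variables for the 3 levels;
# the dropped level (agriculture) becomes the baseline coded (0, 0)
dummies = pd.get_dummies(df["sector"], prefix="sector", drop_first=True)
print(dummies)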
The GLM is a very powerful statistical tool because it is not one single statistical method,
but rather a family of methods that can be used to conduct sophisticated analysis with different
types and quantities of predictor and outcome variables. If we have a dummy predictor
variable, and we are comparing the effects of the two levels (0 and 1) of this dummy variable on
the outcome variable, we are doing an analysis of variance (ANOVA). If we are doing ANOVA
while controlling for the effects of one or more covariate, we have an analysis of covariance
(ANCOVA). We can also have multiple outcome variables (e.g., y1, y2, …, yn), which are
represented using a “system of equations” consisting of a different equation for each outcome
variable (each with its own unique set of regression coefficients). If multiple outcome variables
are modeled as being predicted by the same set of predictor variables, the resulting analysis is
called multivariate regression. If we are doing ANOVA or ANCOVA analysis with multiple
outcome variables, the resulting analysis is a multivariate ANOVA (MANOVA) or multivariate
ANCOVA (MANCOVA) respectively. If we model the outcome in one regression equation as a
predictor in another equation in an interrelated system of regression equations, then we have a
very sophisticated type of analysis called structural equation modeling. The most important
problem in GLM is model specification, i.e., how to specify a regression equation (or a system of
equations) to best represent the phenomenon of interest. Model specification should be based
on theoretical considerations about the phenomenon being studied, rather than what fits the
observed data best. The role of data is in validating the model, and not in its specification.
Two-Group Comparison
One of the simplest inferential analyses is comparing the post-test outcomes of
treatment and control group subjects in a randomized post-test only control group design, such
as whether students enrolled in a special program in mathematics perform better than those in
a traditional math curriculum. In this case, the predictor variable is a dummy variable
(1=treatment group, 0=control group), and the outcome variable, performance, is ratio scaled
(e.g., score of a math test following the special program). The analytic technique for this simple
design is a one-way ANOVA (one-way because it involves only one predictor variable), and the
statistical test used is called a Student’s t-test (or t-test, in short).
The t-test was introduced in 1908 by William Sealy Gosset, a chemist working for the
Guinness Brewery in Dublin, Ireland, to monitor the quality of stout – a dark beer popular with
19th century porters in London. Because his employer did not want to reveal the fact that it was
using statistics for quality control, Gosset published the test in Biometrika using his pen name
“Student”, and the test involved calculating the value of
t, a letter that Fisher later used frequently to denote the difference between two groups.
Hence, the name Student’s t-test, although Student’s identity was known to fellow statisticians.
The t-test examines whether the means of two groups are statistically different from
each other (non-directional or two-tailed test), or whether one group has a statistically larger
(or smaller) mean than the other (directional or one-tailed test). In our example, if we wish to
examine whether students in the special math curriculum perform better than those in
traditional curriculum, we have a one-tailed test. This hypothesis can be stated as:
H0: μ1 ≤ μ2 (null hypothesis)
H1: μ1 > μ2 (alternative hypothesis)
where μ1 represents the mean population performance of students exposed to the special
curriculum (treatment group) and μ2 is the mean population performance of students with
traditional curriculum (control group). Note that the null hypothesis is always the one with the
“equal” sign, and the goal of all statistical significance tests is to reject the null hypothesis.
How can we infer about the difference in population means using data from samples
drawn from each population? From the hypothetical frequency distributions of the treatment
and control group scores in Figure 15.2, the control group appears to have a bell-shaped
(normal) distribution with a mean score of 45 (on a 0-100 scale), while the treatment group
appears to have a mean score of 65. These means look different, but they are really sample
means (x̄), which may differ from their corresponding population means (μ) due to sampling
error. Sample means are probabilistic estimates of population means within a certain
confidence interval (the 95% CI is the sample mean ± two standard errors, where the standard
error is the standard deviation of the distribution of sample means drawn from infinite samples
of the population). Hence, inference about the difference in population means depends not only on the sample
mean scores, but also on the standard error or the degree of spread in the frequency
distribution of the sample means. If the spread is large (i.e., the two bell-shaped curves have a
lot of overlap), then the 95% CI of the two means may also be overlapping, and we cannot
conclude with high probability (p<0.05) that their corresponding population means are
significantly different. However, if the curves have narrower spreads (i.e., they are less
overlapping), then the CI of each mean may not overlap, and we reject the null hypothesis and
say that the population means of the two groups are significantly different at p<0.05.
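For illustration, the two-group comparison described above can be run in Python with the scipy library; the scores below are invented, and the one-tailed option (alternative="greater") assumes a reasonably recent version of SciPy.

from scipy import stats

# Hypothetical math test scores for the two groups
treatment = [65, 70, 62, 68, 71, 66, 64, 69]  # special math program
control = [45, 50, 42, 47, 44, 48, 46, 43]    # traditional curriculum

# One-tailed t-test: is the treatment mean larger than the control mean?
t_stat, p_value = stats.ttest_ind(treatment, control, alternative="greater")
print(t_stat, p_value)  # reject H0 if p_value < 0.05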
Quantitative Analysis:
Inferential Statistics
Inferential statistics are the statistical procedures that are used to reach conclusions
about associations between variables. They differ from descriptive statistics in that they are
explicitly designed to test hypotheses. Numerous statistical procedures fall in this category,
most of which are supported by modern statistical software such as SPSS and SAS. This chapter
provides a short primer on only the most basic and frequent procedures; readers are advised to
consult a formal text on statistics or take a course on statistics for more advanced procedures.
Basic Concepts
British philosopher Karl Popper said that theories can never be proven, only disproven.
As an example, how can we prove that the sun will rise tomorrow? Popper said that just
because the sun has risen every single day that we can remember does not necessarily mean
that it will rise tomorrow, because inductively derived theories are only conjectures that may or
may not be predictive of future phenomena. Instead, he suggested that we may assume a
theory that the sun will rise every day without necessarily proving it, and if the sun does not
rise on a certain day, the theory is falsified and rejected. Likewise, we can only reject
hypotheses based on contrary evidence but can never truly accept them because presence of
evidence does not mean that we may not observe contrary evidence later. Because we cannot
truly accept a hypothesis of interest (alternative hypothesis), we formulate a null hypothesis as
the opposite of the alternative hypothesis, and then use empirical evidence to reject the null
hypothesis to demonstrate indirect, probabilistic support for our alternative hypothesis.
A second problem with testing hypothesized relationships in social science research is
that the dependent variable may be influenced by an infinite number of extraneous variables
and it is not feasible to measure and control for all of these extraneous effects. Hence, even if
two variables may seem to be related in an observed sample, they may not be truly related in
the population, and therefore inferential statistics are never certain or deterministic, but always
probabilistic.
How do we know whether a relationship between two variables in an observed sample
is significant, and not a matter of chance? Sir Ronald A. Fisher, one of the most prominent
statisticians in history, established the basic guidelines for significance testing. He said that a
statistical result may be considered significant if it can be shown that the probability of it
occurring by chance alone is 5% or less. In inferential statistics, this probability is called the p-
value, 5% is called the significance level (α), and the desired relationship between the p-value
and α is denoted as: p≤0.05. The significance level is the maximum level of risk that we are
willing to accept as the price of our inference from the sample to the population. If the p-value
is less than 0.05 or 5%, it means that we accept at most a 5% chance of being incorrect in rejecting the
null hypothesis, i.e., of committing a Type I error. If p>0.05, we do not have enough evidence to reject
the null hypothesis or accept the alternative hypothesis.
We must also understand three related statistical concepts: sampling distribution,
standard error, and confidence interval. A sampling distribution is the theoretical
distribution of an infinite number of samples from the population of interest in your study.
However, because a sample is never identical to the population, every sample always has some
inherent level of error, called the standard error. If this standard error is small, then statistical
estimates derived from the sample (such as sample mean) are reasonably good estimates of the
population. The precision of our sample estimates is defined in terms of a confidence interval
(CI). A 95% CI is defined as a range of plus or minus two standard deviations of the mean
estimate, as derived from different samples in a sampling distribution. Hence, when we say that
our observed sample estimate has a CI of 95%, what we mean is that we are confident that 95%
of the time, the population parameter is within two standard deviations of our observed sample
estimate. Jointly, the p-value and the CI give us a good idea of the probability of our result and
how close it is to the corresponding population parameter.
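A small numeric sketch of these concepts in Python follows; the sample values are made up, and the 95% CI uses the chapter's plus-or-minus-two-standard-errors approximation.

import numpy as np

sample = np.array([45, 52, 48, 50, 47, 53, 49, 51, 46, 50])  # hypothetical scores

mean = sample.mean()
# Standard error: estimated as the sample standard deviation divided by sqrt(n)
se = sample.std(ddof=1) / np.sqrt(len(sample))

# Approximate 95% confidence interval (mean plus or minus two standard errors)
ci_low, ci_high = mean - 2 * se, mean + 2 * se
print(mean, se, (ci_low, ci_high))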
General Linear Model
Most inferential statistical procedures in social science research are derived from a
general family of statistical models called the general linear model (GLM). A model is an
estimated mathematical equation that can be used to represent a set of data, and linear refers to
a straight line. Hence, a GLM is a system of equations that can be used to represent linear
patterns of relationships in observed data.
Figure 15.1. Two-variable linear model
The simplest type of GLM is a two-variable linear model that examines the relationship
between one independent variable (the cause or predictor) and one dependent variable (the
effect or outcome). Let us assume that these two variables are age and self-esteem respectively.
The bivariate scatterplot for this relationship is shown in Figure 15.1, with age (predictor)
along the horizontal or x-axis and self-esteem (outcome) along the vertical or y-axis. From the
scatterplot, it appears that individual observations representing combinations of age and self-esteem
generally seem to be scattered around an imaginary upward sloping straight line.
The easiest way to test for the above hypothesis is to look up critical values of r from
statistical tables available in any standard text book on statistics or on the Internet (most
software programs also perform significance testing). The critical value of r depends on our
desired significance level (α = 0.05), the degrees of freedom (df), and whether the desired test is
a one-tailed or two-tailed test. The degrees of freedom is the number of values that can vary
freely in any calculation of a statistic. In the case of correlation, the df simply equals n – 2, or for the
data in Table 14.1, df is 20 – 2 = 18. There are two different statistical tables for one-tailed and
two-tailed tests. In the two-tailed table, the critical value of r for α = 0.05 and df = 18 is 0.44. For
our computed correlation of 0.79 to be significant, it must be larger than the critical value of
0.44 or less than -0.44. Since our computed value of 0.79 is greater than 0.44, we conclude that
there is a significant correlation between age and self-esteem in our data set, or in other words,
the odds are less than 5% that this correlation is a chance occurrence. Therefore, we can reject
the null hypothesis that r ≤ 0, which is an indirect way of saying that the alternative hypothesis
r > 0 is probably correct.
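The critical value of 0.44 quoted above can be reproduced from the t distribution; the sketch below does so in Python with scipy, taking α = 0.05, df = 18, and the computed r of 0.79 from the text.

import math
from scipy import stats

alpha, df = 0.05, 18                         # two-tailed test, df = n - 2 = 18
t_crit = stats.t.ppf(1 - alpha / 2, df)      # critical t for a two-tailed test
r_crit = t_crit / math.sqrt(t_crit**2 + df)  # convert critical t into critical r
print(round(r_crit, 2))                      # approximately 0.44

r = 0.79
print(abs(r) > r_crit)                       # True: significant at p < 0.05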
Most research studies involve more than two variables. If there are n variables, then we
will have a total of n*(n-1)/2 possible correlations between these n variables. Such correlations
are easily computed using a software program like SPSS, rather than manually using the
formula for correlation (as we did in Table 14.1), and represented using a correlation matrix, as
shown in Table 14.2. A correlation matrix is a matrix that lists the variable names along the
first row and the first column, and depicts bivariate correlations between pairs of variables in
the appropriate cell in the matrix. The values along the principal diagonal (from the top left to
the bottom right corner) of this matrix are always 1, because any variable is always perfectly
correlated with itself. Further, since correlations are non-directional, the correlation between
variables V1 and V2 is the same as that between V2 and V1. Hence, the lower triangular matrix
(values below the principal diagonal) is a mirror reflection of the upper triangular matrix
(values above the principal diagonal), and therefore, we often list only the lower triangular
matrix for simplicity. If the correlations involve variables measured using interval scales, then
this specific type of correlation is called the Pearson product moment correlation.
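Producing such a correlation matrix takes one line in pandas; the sketch below uses randomly generated stand-ins for variables V1 through V3 rather than the data of Table 14.2.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"V1": rng.normal(size=50),
                   "V2": rng.normal(size=50),
                   "V3": rng.normal(size=50)})

# Pearson correlation matrix; the principal diagonal is always 1
print(df.corr())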
Another useful way of presenting bivariate data is cross-tabulation (often abbreviated
to cross-tab, and sometimes called more formally as a contingency table). A cross-tab is a table
that describes the frequency (or percentage) of all combinations of two or more nominal or
categorical variables. As an example, let us assume that we have the following observations of
gender and grade for a sample of 20 students, as shown in Figure 14.3. Gender is a nominal
variable (male/female or M/F), and grade is a categorical variable with three levels (A, B, and
C). A simple cross-tabulation of the data may display the joint distribution of gender and grades
(i.e., how many students of each gender are in each grade category, as a raw frequency count or
as a percentage) in a 2 x 3 matrix. This matrix will help us see if A, B, and C grades are equally
distributed across male and female students. The cross-tab data in Table 14.3 shows that the
distribution of A grades is biased heavily toward female students: in a sample of 10 male and 10
female students, four female students received the A grade compared to only one male student.
In contrast, the distribution of C grades is biased toward male students: three male students
received a C grade, compared to only one female student. However, the distribution of B grades
was somewhat uniform, with six male students and five female students. The last row and the
last column of this table are called marginal totals because they indicate the totals across each
category and are displayed along the margins of the table.
Table 14.2. A hypothetical correlation matrix for eight variables
Table 14.3. Example of cross-tab analysis
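A cross-tab of this kind can be built in Python with pandas, as sketched below; the gender and grade values are illustrative and not the actual observations behind Table 14.3.

import pandas as pd

# Hypothetical gender and grade observations for a handful of students
data = pd.DataFrame({"gender": ["M", "F", "M", "F", "F", "M", "F", "M"],
                     "grade": ["B", "A", "C", "A", "B", "B", "C", "A"]})

# Joint frequency counts with marginal totals along the row and column margins
table = pd.crosstab(data["gender"], data["grade"], margins=True, margins_name="Total")
print(table)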
Although we can see a distinct pattern of grade distribution between male and female
students in Table 14.3, is this pattern real or “statistically significant”? In other words, do the
above frequency counts differ from what may be expected from pure chance? To answer
this question, we should compute the expected count of observation in each cell of the 2 x 3
cross-tab matrix. This is done by multiplying the marginal column total and the marginal row
total for each cell and dividing it by the total number of observations. For example, for the
male/A grade cell, expected count = 5 * 10 / 20 = 2.5. In other words, we were expecting 2.5
male students to receive an A grade, but in reality, only one student received the A grade.
Whether this difference between expected and actual count is significant can be tested using a
chi-square test. The chi-square statistic can be computed as the average difference between
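A sketch of the expected-count and chi-square computation in Python with scipy appears below; the cell counts are reconstructed from the description of Table 14.3 given above and should be treated as assumptions about that table.

import numpy as np
from scipy import stats

# Rows: male, female; columns: grades A, B, C (counts as described in the text)
observed = np.array([[1, 6, 3],
                     [4, 5, 1]])

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(expected[0, 0])  # expected male/A count = 5 * 10 / 20 = 2.5
print(chi2, p_value)   # the difference is significant if p_value < 0.05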
Bivariate Analysis
Bivariate analysis examines how two variables are related to each other. The most
common bivariate statistic is the bivariate correlation (often, simply called “correlation”),
which is a number between -1 and +1 denoting the strength of the relationship between two
variables. Let’s say that we wish to study how age is related to self-esteem in a sample of 20
respondents, i.e., as age increases, does self-esteem increase, decrease, or remain unchanged.
If self-esteem increases, then we have a positive correlation between the two variables, if self-esteem
decreases, we have a negative correlation, and if it remains the same, we have a zero
correlation. To calculate the value of this correlation, consider the hypothetical dataset shown
in Table 14.1.
Figure 14.2. Normal distribution
Table 14.1. Hypothetical data on age and self-esteem
The two variables in this dataset are age (x) and self-esteem (y). Age is a ratio-scale
variable, while self-esteem is an average score computed from a multi-item self-esteem scale
measured using a 7-point Likert scale, ranging from “strongly disagree” to “strongly agree.” The
histogram of each variable is shown on the left side of Figure 14.3. The formula for calculating
bivariate correlation is:
rxy = Σ (xi – x̄)(yi – ȳ) / [(n – 1) sx sy]
where rxy is the correlation, x̄ and ȳ are the sample means of x and y, and sx and sy are
the standard deviations of x and y. The manually computed value of correlation between age
and self-esteem, using the above formula as shown in Table 14.1, is 0.79. This figure indicates
that age has a strong positive correlation with self-esteem, i.e., self-esteem tends to increase
with increasing age, and decrease with decreasing age. Such a pattern can also be seen by
visually comparing the age and self-esteem histograms shown in Figure 14.3, where it appears
that the top of the two histograms generally follow each other. Note here that the vertical axes
in Figure 14.3 represent actual observation values, and not the frequency of observations (as
was in Figure 14.1), and hence, these are not frequency distributions but rather histograms.
The bivariate scatter plot in the right panel of Figure 14.3 is essentially a plot of self-esteem on
the vertical axis against age on the horizontal axis. This plot roughly resembles an upward
sloping line (i.e., positive slope), which is also indicative of a positive correlation. If the two
variables were negatively correlated, the scatter plot would slope down (negative slope),
implying that an increase in age would be related to a decrease in self-esteem and vice versa. If
the two variables were uncorrelated, the scatter plot would approximate a horizontal line (zero
slope), implying that an increase in age would have no systematic bearing on self-esteem.
Figure 14.3. Histogram and correlation plot of age and self-esteem
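The correlation formula given above can be checked directly in Python; the age and self-esteem values below are invented and are not the data of Table 14.1.

import numpy as np

age = np.array([21, 25, 30, 35, 40, 45, 50, 55])                  # hypothetical
self_esteem = np.array([4.1, 4.3, 4.8, 5.0, 5.4, 5.6, 6.0, 6.1])  # hypothetical

n = len(age)
# rxy = sum((xi - xbar)(yi - ybar)) / ((n - 1) * sx * sy)
r_manual = np.sum((age - age.mean()) * (self_esteem - self_esteem.mean())) / (
    (n - 1) * age.std(ddof=1) * self_esteem.std(ddof=1))

# Should match numpy's built-in Pearson correlation
print(r_manual, np.corrcoef(age, self_esteem)[0, 1])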
After computing bivariate correlation, researchers are often interested in knowing
whether the correlation is significant (i.e., a real one) or caused by mere chance. Answering
such a question would require testing the following hypothesis:
H0: r = 0
H1: r ≠ 0
H0 is called the null hypothesis, and H1 is called the alternative hypothesis (sometimes,
also represented as Ha). Although they may seem like two hypotheses, H0 and H1 actually
represent a single hypothesis since they are direct opposites of each other. We are interested in
testing H1 rather than H0. Also note that H1 is a non-directional hypothesis since it does not
specify whether r is greater than or less than zero. Directional hypotheses will be specified as
H0: r ≤ 0; H1: r > 0 (if we are testing for a positive correlation). Significance testing of a directional
hypothesis is done using a one-tailed t-test, while that of a non-directional hypothesis is done
using a two-tailed t-test.
In statistical testing, the alternative hypothesis cannot be tested directly. Rather, it is
tested indirectly by rejecting the null hypotheses with a certain level of probability. Statistical
testing is always probabilistic, because we are never sure if our inferences, based on sample
data, apply to the population, since our sample never equals the population. The probability
that a statistical inference is caused by pure chance is called the p-value. The p-value is compared
with the significance level (α), which represents the maximum level of risk that we are willing
to take that our inference is incorrect. For most statistical analysis, α is set to 0.05. A p-value
less than α=0.05 indicates that we have enough statistical evidence to reject the null hypothesis,
and thereby, indirectly accept the alternative hypothesis. If p>0.05, then we do not have
adequate statistical evidence to reject the null hypothesis or accept the alternative hypothesis.
Univariate Analysis
Univariate analysis, or analysis of a single variable, refers to a set of statistical
techniques that can describe the general properties of one variable. Univariate statistics
include: (1) frequency distribution, (2) central tendency, and (3) dispersion. The frequency
distribution of a variable is a summary of the frequency (or percentages) of individual values
or ranges of values for that variable. For instance, we can measure how many times a sample of
respondents attend religious services (as a measure of their “religiosity”) using a categorical
scale: never, once per year, several times per year, about once a month, several times per
month, several times per week, and an optional category for “did not answer.” If we count the
number (or percentage) of observations within each category (except “did not answer” which is
really a missing value rather than a category), and display it in the form of a table as shown in
Figure 14.1, what we have is a frequency distribution. This distribution can also be depicted in
the form of a bar chart, as shown on the right panel of Figure 14.1, with the horizontal axis
representing each category of that variable and the vertical axis representing the frequency or
percentage of observations within each category.
Figure 14.1. Frequency distribution of religiosity
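A frequency distribution of this kind can be tabulated in Python with pandas, as sketched below; the religiosity responses are fabricated for illustration.

import pandas as pd

responses = pd.Series(["never", "once per year", "about once a month", "never",
                       "several times per year", "never", "several times per week",
                       "about once a month", "once per year", "never"])

# Counts and percentages per category (the frequency distribution)
counts = responses.value_counts()
percentages = responses.value_counts(normalize=True) * 100
print(pd.DataFrame({"count": counts, "percent": percentages.round(1)}))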
With very large samples where observations are independent and random, the
frequency distribution tends to follow a plot that looks like a bell-shaped curve (a smoothed
bar chart of the frequency distribution) similar to that shown in Figure 14.2, where most
observations are clustered toward the center of the range of values, and fewer and fewer
observations toward the extreme ends of the range. Such a curve is called a normal distribution.
Central tendency is an estimate of the center of a distribution of values. There are
three major estimates of central tendency: mean, median, and mode. The arithmetic mean
(often simply called the “mean”) is the simple average of all values in a given distribution.
Consider a set of eight test scores: 15, 22, 21, 18, 36, 15, 25, 15. The arithmetic mean of these
values is (15 + 22 + 21 + 18 + 36 + 15 + 25 + 15)/8 = 20.875. Other types of means include
geometric mean (nth root of the product of n numbers in a distribution) and harmonic mean (the
reciprocal of the arithmetic mean of the reciprocals of the values in a distribution), but these
means are not very popular for statistical analysis of social research data.
The second measure of central tendency, the median, is the middle value within a range
of values in a distribution. This is computed by sorting all values in a distribution in increasing
order and selecting the middle value. In case there are two middle values (if there is an even
number of values in a distribution), the average of the two middle values represents the median.
In the above example, the sorted values are: 15, 15, 15, 18, 21, 22, 25, 36. The two middle
values are 18 and 21, and hence the median is (18 + 21)/2 = 19.5.
Lastly, the mode is the most frequently occurring value in a distribution of values. In
the previous example, the most frequently occurring value is 15, which is the mode of the above
set of test scores. Note that any value that is estimated from a sample, such as mean, median,
mode, or any of the later estimates, is called a statistic.
Dispersion refers to the way values are spread around the central tendency, for
example, how tightly or how widely are the values clustered around the mean. Two common
measures of dispersion are the range and standard deviation. The range is the difference
between the highest and lowest values in a distribution. The range in our previous example is
36-15 = 21.
The range is particularly sensitive to the presence of outliers. For instance, if the
highest value in the above distribution was 85 and the other values remained the same, the range
would be 85-15 = 70. Standard deviation, the second measure of dispersion, corrects for such
outliers by using a formula that takes into account how close or how far each value lies from the
distribution mean:
σ = √[ Σ (xi – µ)² / n ]
where σ is the standard deviation, xi is the ith observation (or value), µ is the arithmetic mean, n
is the total number of observations, and Σ means summation across all observations. The
square of the standard deviation is called the variance of a distribution. In a normally
distributed frequency distribution, it is seen that 68% of the observations lie within one
standard deviation of the mean.
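These central tendency and dispersion measures can be verified with Python's standard statistics module, using the eight test scores from the example above; pstdev applies the population formula with µ and n, matching the formula given here.

import statistics

scores = [15, 22, 21, 18, 36, 15, 25, 15]

print(statistics.mean(scores))    # 20.875
print(statistics.median(scores))  # 19.5 (average of the middle values 18 and 21)
print(statistics.mode(scores))    # 15
print(max(scores) - min(scores))  # range = 21
print(statistics.pstdev(scores))  # standard deviation (population formula)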
Quantitative Analysis:
Descriptive Statistics
Numeric data collected in a research project can be analyzed quantitatively using
statistical tools in two different ways. Descriptive analysis refers to statistically describing,
aggregating, and presenting the constructs of interest or associations between these constructs.
Inferential analysis refers to the statistical testing of hypotheses (theory testing). In this
chapter, we will examine statistical techniques used for descriptive analysis, and the next
chapter will examine statistical techniques for inferential analysis. Much of today’s quantitative
data analysis is conducted using software programs such as SPSS or SAS. Readers are advised
to familiarize themselves with one of these programs for understanding the concepts described
in this chapter.
Data Preparation
In research projects, data may be collected from a variety of sources: mail-in surveys,
interviews, pretest or posttest experimental data, observational data, and so forth. This data
must be converted into a machine-readable, numeric format, such as in a spreadsheet or a text
file, so that it can be analyzed by computer programs like SPSS or SAS. Data preparation
usually follows the following steps.
Data coding. Coding is the process of converting data into numeric format. A codebook
should be created to guide the coding process. A codebook is a comprehensive document
containing detailed description of each variable in a research study, items or measures for that
variable, the format of each item (numeric, text, etc.), the response scale for each item (i.e.,
whether it is measured on a nominal, ordinal, interval, or ratio scale; whether such scale is a
five-point, seven-point, or some other type of scale), and how to code each value into a numeric
format. For instance, if we have a measurement item on a seven-point Likert scale with anchors
ranging from “strongly disagree” to “strongly agree”, we may code that item as 1 for strongly
disagree, 4 for neutral, and 7 for strongly agree, with the intermediate anchors in between.
Nominal data such as industry type can be coded in numeric form using a coding scheme such
as: 1 for manufacturing, 2 for retailing, 3 for financial, 4 for healthcare, and so forth (of course,
such numeric codes for nominal data cannot be treated as quantities in statistical analysis). Ratio scale data such as age, income, or test
scores can be coded as entered by the respondent. Sometimes, data may need to be aggregated
into a different form than the format used for data collection. For instance, for measuring a
construct such as “benefits of computers,” if a survey provided respondents with a checklist of
benefits that they could select from (i.e., they could choose as many of those benefits as they
wanted), then the total number of checked items can be used as an aggregate measure of
benefits. Note that many other forms of data, such as interview transcripts, cannot be
converted into a numeric format for statistical analysis. Coding is especially important for large
complex studies involving many variables and measurement items, where the coding process is
conducted by different people, to help the coding team code data in a consistent manner, and
also to help others understand and interpret the coded data.
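A minimal sketch of applying such a codebook in Python follows; the 1-7 codes and the end anchors come from the description above, while the intermediate labels and the raw responses are assumptions made for illustration.

# Codebook entry for a seven-point Likert item (label -> numeric code)
likert_codes = {
    "strongly disagree": 1, "disagree": 2, "somewhat disagree": 3,
    "neutral": 4, "somewhat agree": 5, "agree": 6, "strongly agree": 7,
}

# Hypothetical raw responses to one survey item
raw_responses = ["agree", "neutral", "strongly agree", "disagree"]

coded = [likert_codes[r] for r in raw_responses]
print(coded)  # [6, 4, 7, 2]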
Data entry. Coded data can be entered into a spreadsheet, database, text file, or
directly into a statistical program like SPSS. Most statistical programs provide a data editor for
entering data. However, these programs store data in their own native format (e.g., SPSS stores
data as .sav files), which makes it difficult to share that data with other statistical programs.
Hence, it is often better to enter data into a spreadsheet or database, where they can be
reorganized as needed, shared across programs, and subsets of data can be extracted for
analysis. Smaller data sets with less than 65,000 observations and 256 items can be stored in a
spreadsheet such as Microsoft Excel, while larger datasets with millions of observations will
require a database. Each observation can be entered as one row in the spreadsheet and each
measurement item can be represented as one column. The entered data should be frequently
checked for accuracy, via occasional spot checks on a set of items or observations, during and
after entry. Furthermore, while entering data, the coder should watch out for obvious evidence
of bad data, such as the respondent selecting the “strongly agree” response to all items
irrespective of content, including reverse-coded items. If so, such data can be entered but
should be excluded from subsequent analysis.
Missing values. Missing data is an inevitable part of any empirical data set.
Respondents may not answer certain questions if they are ambiguously worded or too
sensitive. Such problems should be detected earlier during pretests and corrected before the
main data collection process begins. During data entry, some statistical programs automatically
treat blank entries as missing values, while others require a specific numeric value such as -1 or
999 to be entered to denote a missing value. During data analysis, the default mode of handling
missing values in most software programs is to simply drop the entire observation containing
even a single missing value, in a technique called listwise deletion. Such deletion can
significantly shrink the sample size and make it extremely difficult to detect small effects.
Hence, some software programs allow the option of replacing missing values with an estimated
value via a process called imputation. For instance, if the missing value is one item in a multi-item
scale, the imputed value may be the average of the respondent’s responses to the remaining
items on that scale. If the missing value belongs to a single-item scale, many researchers use the
average of other respondents’ responses to that item as the imputed value. Such imputation
may be biased if the missing value is of a systematic nature rather than a random nature. Two
methods that can produce relatively unbiased estimates for imputation are the maximum
likelihood procedures and multiple imputation methods, both of which are supported in
popular software programs such as SPSS and SAS.
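A small sketch of the simple imputation described above (averaging the respondent's remaining items) in Python with pandas; the items and values are hypothetical, and, as noted, maximum likelihood or multiple imputation procedures are generally preferable.

import numpy as np
import pandas as pd

# Hypothetical responses to a three-item scale; the respondent at index 1 skipped item2
df = pd.DataFrame({"item1": [5, 6, 4],
                   "item2": [4, np.nan, 5],
                   "item3": [5, 7, 4]})

# Impute the missing value with the mean of that respondent's remaining items
row_means = df.mean(axis=1)  # average across items for each respondent
df_imputed = df.apply(lambda row: row.fillna(row_means[row.name]), axis=1)
print(df_imputed)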
Data transformation. Sometimes, it is necessary to transform data values before they
can be meaningfully interpreted. For instance, reverse coded items, where items convey the
opposite meaning of that of their underlying construct, should be reversed (e.g., in a 1-7 interval
scale, 8 minus the observed value will reverse the value) before they can be compared or
combined with items that are not reverse coded. Other kinds of transformations may include
creating scale measures by adding individual scale items, creating a weighted index from a set
of observed measures, and collapsing multiple values into fewer categories (e.g., collapsing
incomes into income ranges).
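The reverse-coding and scale-construction transformations described above might look like the following in Python with pandas; the item names and values are made up.

import pandas as pd

df = pd.DataFrame({"item1": [2, 5, 7],       # regular item, 1-7 scale
                   "item2_rev": [6, 3, 1]})  # reverse-coded item, 1-7 scale

# Reverse the reverse-coded item: on a 1-7 scale, 8 minus the observed value
df["item2"] = 8 - df["item2_rev"]

# Create a scale measure by adding the individual items
df["scale_score"] = df[["item1", "item2"]].sum(axis=1)
print(df)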
Hermeneutic Analysis
Hermeneutic analysis is a special type of content analysis where the researcher tries to
“interpret” the subjective meaning of a given text within its socio-historic context. Unlike
grounded theory or content analysis, which ignore the context and meaning of text documents
during the coding process, hermeneutic analysis is a truly interpretive technique for analyzing
qualitative data. This method assumes that written texts narrate an author’s experience within
a socio-historic context, and should be interpreted as such within that context. Therefore, the
researcher continually iterates between singular interpretation of the text (the part) and a
holistic understanding of the context (the whole) to develop a fuller understanding of the
phenomenon in its situated context, which German philosopher Martin Heidegger called the
hermeneutic circle. The word hermeneutic (singular) refers to one particular method or strand
of interpretation.
20 Schilling, J. (2006). “On the Pragmatics of Qualitative Assessment: Designing the Process for Content Analysis,” European Journal of Psychological Assessment (22:1), 28-37.
More generally, hermeneutics is the study of interpretation and the theory and practice
of interpretation. Derived from religious studies and linguistics, traditional hermeneutics, such
as biblical hermeneutics, refers to the interpretation of written texts, especially in the areas of
literature, religion and law (such as the Bible). In the 20th century, Heidegger suggested that a
more direct, non-mediated, and authentic way of understanding social reality is to experience it,
rather than simply observe it, and proposed philosophical hermeneutics, where the focus shifted
from interpretation to existential understanding. Heidegger argued that texts are the means by
which readers can not only read about an author’s experience, but also relive the author’s
experiences. Contemporary or modern hermeneutics, developed by Heidegger’s students such
as Hans-Georg Gadamer, further examined the limits of written texts for communicating social
experiences, and went on to propose a framework of the interpretive process, encompassing all
forms of communication, including written, verbal, and non-verbal, and exploring issues that
restrict the communicative ability of written texts, such as presuppositions, language structures
(e.g., grammar, syntax, etc.), and semiotics (the study of written signs such as symbolism,
metaphor, analogy, and sarcasm). The term hermeneutics is sometimes used interchangeably
and inaccurately with exegesis, which refers to the interpretation or critical explanation of
written texts only, especially religious texts.
Conclusions
Finally, standard software programs, such as ATLAS.ti.5, NVivo, and QDA Miner, can be
used to automate coding processes in qualitative research methods. These programs can
quickly and efficiently organize, search, sort, and process large volumes of text data using user-defined
rules. To guide such automated analysis, a coding schema should be created, specifying
the keywords or codes to search for in the text, based on an initial manual examination of
sample text data. The schema can be structured hierarchically to group codes into
higher-order codes or constructs. The coding schema should be validated using a different
sample of texts for accuracy and adequacy. However, if the coding schema is biased or
incorrect, the resulting analysis of the entire population of text may be flawed and non-interpretable.
Moreover, software programs cannot decipher the meaning behind certain
words or phrases or the context within which these words or phrases are used (such as those in
sarcasms or metaphors),
selectively sampled to validate the central category and its relationships to other categories
(i.e., the tentative theory). Selective coding limits the range of analysis and makes it proceed faster.
At the same time, the coder must watch out for other categories that may emerge from the new
data that may be related to the phenomenon of interest (open coding), which may lead to
further refinement of the initial theory. Hence, open, axial, and selective coding may proceed
simultaneously. Coding of new data and theory refinement continues until theoretical
saturation is reached, i.e., when additional data does not yield any marginal change in the core
categories or the relationships.
The “constant comparison” process implies continuous rearrangement, aggregation, and
refinement of categories, relationships, and interpretations based on increasing depth of
understanding, and an iterative interplay of four stages of activities: (1) comparing
incidents/texts assigned to each category (to validate the category), (2) integrating categories
and their properties, (3) delimiting the theory (focusing on the core concepts and ignoring less
relevant concepts), and (4) writing theory (using techniques like memoing, storylining, and
diagramming that are discussed in the next chapter). Having a central category does not
necessarily mean that all other categories can be integrated nicely around it. In order to
identify key categories that are conditions, action/interactions, and consequences of the core
category, Strauss and Corbin (1990) recommend several integration techniques, such as
storylining, memoing, or concept mapping. In storylining, categories and relationships are
used to explicate and/or refine a story of the observed phenomenon. Memos are theorized
write-ups of ideas about substantive concepts and their theoretically coded relationships as
they evolve during grounded theory analysis, and are important tools to keep track of and refine
ideas that develop during the analysis. Memoing is the process of using these memos to
discover patterns and relationships between categories using two-by-two tables, diagrams,
figures, or other illustrative displays. Concept mapping is a graphical representation of
concepts and relationships between those concepts (e.g., using boxes and arrows). The major
concepts are typically laid out on one or more sheets of paper, blackboards, or using graphical
software programs, linked to each other using arrows, and readjusted to best fit the observed
data.
After a grounded theory is generated, it must be refined for internal consistency and
logic. Researchers must ensure that the central construct has the stated characteristics and
dimensions, and if not, the data analysis may be repeated. Researchers must then ensure that
the characteristics and dimensions of all categories show variation. For example, if behavior
frequency is one such category, then the data must provide evidence of both frequent
performers and infrequent performers of the focal behavior. Finally, the theory must be
validated by comparing it with raw data. If the theory contradicts observed evidence, the
coding process may be repeated to reconcile such contradictions or unexplained variations.
Content Analysis
Content analysis is the systematic analysis of the content of a text (e.g., who says what,
to whom, why, and to what extent and with what effect) in a quantitative or qualitative manner.
Content analysis is typically conducted as follows. First, when there are many texts to analyze
(e.g., newspaper stories, financial reports, blog postings, online reviews, etc.), the researcher
begins by sampling a selected set of texts from the population of texts for analysis. This process
is not random, but instead, texts that have more pertinent content should be chosen selectively.
Second, the researcher identifies and applies rules to divide each text into segments or “chunks”
that can be treated as separate units of analysis. This process is called unitizing. For example,
assumptions, effects, enablers, and barriers in texts may constitute such units. Third, the
researcher constructs and applies one or more concepts to each unitized text segment in a
process called coding. For coding purposes, a coding scheme is used based on the themes the
researcher is searching for or uncovers as she classifies the text. Finally, the coded data is
analyzed, often both quantitatively and qualitatively, to determine which themes occur most
frequently, in what contexts, and how they are related to each other.
A simple type of content analysis is sentiment analysis – a technique used to capture
people’s opinion or attitude toward an object, person, or phenomenon. Reading online
messages about a political candidate posted on an online forum and classifying each message as
positive, negative, or neutral is an example of such an analysis. In this case, each message
represents one unit of analysis. This analysis will help identify whether the sample as a whole
is positively or negatively disposed or neutral towards that candidate. Examining the content of
online reviews in a similar manner is another example. Though this analysis can be done
manually, for very large data sets (millions of text records), natural language processing and
text analytics based software programs are available to automate the coding process, and
maintain a record of how people’s sentiments fluctuate over time.
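As a toy illustration of automated sentiment coding, and not the procedure of any particular text-analytics product, a simple keyword-count rule in Python might look like this; the word lists are deliberately simplistic assumptions.

# Assumed keyword lists, for illustration only
POSITIVE = {"good", "great", "excellent", "support", "love"}
NEGATIVE = {"bad", "poor", "terrible", "oppose", "hate"}

def classify(message):
    words = message.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

messages = ["I love this candidate", "Terrible speech and poor ideas", "No opinion yet"]
print([classify(m) for m in messages])  # ['positive', 'negative', 'neutral']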
A frequent criticism of content analysis is that it lacks a set of systematic procedures
that would allow the analysis to be replicated by other researchers. Schilling (2006)20
addressed this criticism by organizing different content analytic procedures into a spiral model.
This model consists of five levels or phases in interpreting text: (1) convert recorded tapes into
raw text data or transcripts for content analysis, (2) convert raw data into condensed protocols,
(3) convert condensed protocols into a preliminary category system, (4) use the preliminary
category system to generate coded protocols, and (5) analyze coded protocols to generate
interpretations about the phenomenon of interest.
Content analysis has several limitations. First, the coding process is restricted to the
information available in text form. For instance, if a researcher is interested in studying
people’s views on capital punishment, but no such archive of text documents is available, then
the analysis cannot be done. Second, sampling must be done carefully to avoid sampling bias.
For instance, if your population is the published research literature on a given topic, then you
have systematically omitted unpublished research or the most recent work that is yet to be
published.
Qualitative Analysis
Qualitative analysis is the analysis of qualitative data such as text data from interview
transcripts. Unlike quantitative analysis, which is statistics driven and largely independent of
the researcher, qualitative analysis is heavily dependent on the researcher’s analytic and
integrative skills and personal knowledge of the social context where the data is collected. The
emphasis in qualitative analysis is “sense making” or understanding a phenomenon, rather than
predicting or explaining. A creative and investigative mindset is needed for qualitative analysis,
based on an ethically enlightened and participant-in-context attitude, and a set of analytic
strategies. This chapter provides a brief overview of some of these qualitative analysis
strategies. Interested readers are referred to more authoritative and detailed references such
as Miles and Huberman’s (1984)17 seminal book on this topic.
Grounded Theory
How can you analyze a vast set of qualitative data acquired through participant
observation, in-depth interviews, focus groups, narratives of audio/video recordings, or
secondary documents? One of these techniques for analyzing text data is grounded theory –
an inductive technique of interpreting recorded data about a social phenomenon to build
theories about that phenomenon. The technique was developed by Glaser and Strauss (1967)18
in their method of constant comparative analysis of grounded theory research, and further
refined by Strauss and Corbin (1990)19 to illustrate specific coding techniques – a
process of classifying and categorizing text data segments into a set of codes (concepts),
categories (constructs), and relationships. The interpretations are “grounded in” (or based on)
observed empirical data, hence the name. To ensure that the theory is based solely on observed
evidence, the grounded theory approach requires that researchers suspend any preexisting
theoretical expectations or biases before data analysis, and let the data dictate the formulation
of the theory.
Strauss and Corbin (1998) describe three coding techniques for analyzing text data:
open, axial, and selective. Open coding is a process aimed at identifying concepts or key ideas
that are hidden within textual data, which are potentially related to the phenomenon of interest.
17 Miles M. B., Huberman A. M. (1984). Qualitative Data Analysis: A Sourcebook of New Methods. Newbury Park, CA: Sage Publications.
18 Glaser, B. and Strauss, A. (1967). The Discovery of Grounded Theory: Strategies for Qualitative Research, Chicago: Aldine.
19 Strauss, A. and Corbin, J. (1990). Basics of Qualitative Research: Grounded Theory Procedures and Techniques, Beverly Hills, CA: Sage Publications.
The researcher examines the raw textual data line by line to identify discrete events, incidents,
ideas, actions, perceptions, and interactions of relevance that are coded as concepts (hence
called in vivo codes). Each concept is linked to specific portions of the text (coding unit) for later
validation. Some concepts may be simple, clear, and unambiguous while others may be
complex, ambiguous, and viewed differently by different participants. The coding unit may vary
with the concepts being extracted. Simple concepts such as “organizational size” may include
just a few words of text, while complex ones such as “organizational mission” may span several
pages. Concepts can be named using the researcher’s own naming convention or standardized
labels taken from the research literature. Once a basic set of concepts is identified, these
concepts can then be used to code the remainder of the data, while simultaneously looking for
new concepts and refining old concepts. While coding, it is important to identify the
recognizable characteristics of each concept, such as its size, color, or level (e.g., high or low), so
that similar concepts can be grouped together later. This coding technique is called “open”
because the researcher is open to and actively seeking new concepts relevant to the
phenomenon of interest.
Next, similar concepts are grouped into higher order categories. While concepts may
be context-specific, categories tend to be broad and generalizable, and ultimately evolve into
constructs in a grounded theory. Categories are needed to reduce the amount of concepts the
researcher must work with and to build a “big picture” of the issues salient to understanding a
social phenomenon. Categorization can be done in phases, by combining concepts into
subcategories, and then subcategories into higher order categories. Constructs from the
existing literature can be used to name these categories, particularly if the goal of the research
is to extend current theories. However, caution must be taken while using existing constructs,
as such constructs may bring with them commonly held beliefs and biases. For each category,
its characteristics (or properties) and dimensions of each characteristic should be identified.
The dimension represents a value of a characteristic along a continuum. For example, a
“communication media” category may have a characteristic called “speed”, which can be
dimensionalized as fast, medium, or slow. Such categorization helps differentiate between
different kinds of communication media and enables researchers to identify patterns in the data,
such as which communication medium is used for which types of tasks.
The second phase of grounded theory is axial coding, where the categories and
subcategories are assembled into causal relationships or hypotheses that can tentatively
explain the phenomenon of interest. Although distinct from open coding, axial coding can be
performed simultaneously with open coding. The relationships between categories may be
clearly evident in the data or may be more subtle and implicit. In the latter instance,
researchers may use a coding scheme (often called a “coding paradigm”, but different from the
paradigms discussed in Chapter 3) to understand which categories represent conditions (the
circumstances in which the phenomenon is embedded), actions/interactions (the responses of
individuals to events under these conditions), and consequences (the outcomes of actions/
interactions). As conditions, actions/interactions, and consequences are identified, theoretical
propositions start to emerge, and researchers can start explaining why a phenomenon occurs,
under what conditions, and with what consequences.
The third and final phase of grounded theory is selective coding, which involves
identifying a central category or a core variable and systematically and logically relating this
central category to other categories. The central category can evolve from existing categories
or can be a higher order category that subsumes previously coded categories.
Rigor in Interpretive Research
While positivist research employs a “reductionist” approach by simplifying social reality
into parsimonious theories and laws, interpretive research attempts to interpret social reality
through the subjective viewpoints of the embedded participants within the context where the
reality is situated. These interpretations are heavily contextualized, and are naturally less
generalizable to other contexts. However, because interpretive analysis is subjective and
sensitive to the experiences and insight of the embedded researcher, it is often considered less
rigorous by many positivist (functionalist) researchers. Because interpretive research is based
on a different set of ontological and epistemological assumptions about social phenomena than
positivist research, the positivist notions of rigor, such as reliability, internal validity, and
generalizability, do not apply in a similar manner. However, Lincoln and Guba (1985)16 provide
an alternative set of criteria that can be used to judge the rigor of interpretive research.
Dependability. Interpretive research can be viewed as dependable or authentic if two
researchers assessing the same phenomenon using the same set of evidence independently
arrive at the same conclusions or the same researcher observing the same or a similar
phenomenon at different times arrives at similar conclusions. This concept is similar to that of
reliability in positivist research, with agreement between two independent researchers being
similar to the notion of inter-rater reliability, and agreement between two observations of the
same phenomenon by the same researcher akin to test-retest reliability. To ensure
dependability, interpretive researchers must provide adequate details about their phenomenon
of interest and the social context in which it is embedded so as to allow readers to
independently authenticate their interpretive inferences.
Credibility. Interpretive research can be considered credible if readers find its
inferences to be believable. This concept is akin to that of internal validity in functionalistic
research. The credibility of interpretive research can be improved by providing evidence of the
researcher’s extended engagement in the field, by demonstrating data triangulation across
subjects or data collection techniques, and by maintaining meticulous data management and
analytic procedures, such as verbatim transcription of interviews, accurate records of contacts
and interviews, and clear notes on theoretical and methodological decisions, that can allow an
independent audit of data collection and analysis if needed.
Confirmability. Confirmability refers to the extent to which the findings reported in
interpretive research can be independently confirmed by others (typically, participants). This
is similar to the notion of objectivity in functionalistic research. Since interpretive research
rejects the notion of an objective reality, confirmability is demonstrated in terms of “inter-
16 Lincoln, Y. S., and Guba, E. G. (1985). Naturalistic Inquiry. Beverly Hills, CA: Sage Publications.
subjectivity”, i.e., if the study’s participants agree with the inferences derived by the researcher.
For instance, if a study’s participants generally agree with the inferences drawn by a researcher
about a phenomenon of interest (based on a review of the research paper or report), then the
findings can be viewed as confirmable.
Transferability. Transferability in interpretive research refers to the extent to which
the findings can be generalized to other settings. This idea is similar to that of external validity
in functionalistic research. The researcher must provide rich, detailed descriptions of the
research context (“thick description”) and thoroughly describe the structures, assumptions, and
processes revealed from the data so that readers can independently assess whether, and to what extent, the reported findings are transferable to other settings.
Interpretive Data Collection
Data is collected in interpretive research using a variety of techniques. The most
frequently used technique is interviews (face-to-face, telephone, or focus groups). Interview
types and strategies are discussed in detail in a previous chapter on survey research. A second
technique is observation. Observational techniques include direct observation, where the
researcher is a neutral and passive external observer and is not involved in the phenomenon of
interest (as in case research), and participant observation, where the researcher is an active
participant in the phenomenon and her inputs or mere presence influence the phenomenon
being studied (as in action research). A third technique is documentation, where external and
internal documents, such as memos, electronic mails, annual reports, financial statements,
newspaper articles, and websites, may be used to provide further insight into the phenomenon of
interest or to corroborate other forms of evidence.
Interpretive Research Designs
Case research. As discussed in the previous chapter, case research is an intensive
longitudinal study of a phenomenon at one or more research sites for the purpose of deriving
detailed, contextualized inferences and understanding the dynamic process underlying a
phenomenon of interest. Case research is a unique research design in that it can be used in an
interpretive manner to build theories or in a positivist manner to test theories. The previous
chapter on case research discusses both techniques in depth and provides illustrative
exemplars. Furthermore, the case researcher is a neutral observer (direct observation) in the
social setting rather than an active participant (participant observation). As with any other
interpretive approach, drawing meaningful inferences from case research depends heavily on
the observational skills and integrative abilities of the researcher.
Action research. Action research is a qualitative but positivist research design aimed
at theory testing rather than theory building (discussed in this chapter for lack of a more appropriate place). This is an interactive design that assumes that complex social phenomena are best
understood by introducing changes, interventions, or “actions” into those phenomena and
observing the outcomes of such actions on the phenomena of interest. In this method, the
researcher is usually a consultant or an organizational member embedded into a social context
(such as an organization), who initiates an action in response to a social problem, and examines
how her action influences the phenomenon while also learning and generating insights about
the relationship between the action and the phenomenon. Examples of actions may include
organizational change programs, such as the introduction of new organizational processes,
procedures, people, or technology or replacement of old ones, initiated with the goal of
improving an organization’s performance or profitability in its business environment. The
researcher’s choice of actions must be based on theory, which should explain why and how such
actions may bring forth the desired social change. The theory is validated by the extent to
which the chosen action is successful in remedying the targeted problem. Simultaneous
problem solving and insight generation is the central feature that distinguishes action research
from other research methods (which may not involve problem solving) and from consulting
(which may not involve insight generation). Hence, action research is an excellent method for
bridging research and practice.
There are several variations of the action research method. The most popular of these is participatory action research, designed by Susman and Evered (1978)13. This
method follows an action research cycle consisting of five phases: (1) diagnosing, (2) action
planning, (3) action taking, (4) evaluating, and (5) learning (see Figure 10.1). Diagnosing
involves identifying and defining a problem in its social context. Action planning involves
identifying and evaluating alternative solutions to the problem, and deciding on a future course
of action (based on theoretical rationale). Action taking is the implementation of the planned
course of action. The evaluation stage examines the extent to which the initiated action is
13 Susman, G.I. and Evered, R.D. (1978). “An Assessment of the Scientific Merits of Action Research,”
Administrative Science Quarterly, (23), 582-603.
successful in resolving the original problem, i.e., whether theorized effects are indeed realized
in practice. In the learning phase, the experiences and feedback from action evaluation are used
to generate insights about the problem and suggest future modifications or improvements to
the action. Based on action evaluation and learning, the action may be modified or adjusted to
address the problem better, and the action research cycle is repeated with the modified action
sequence. It is suggested that the entire action research cycle be traversed at least twice so that
learning from the first cycle can be implemented in the second cycle. The primary mode of data
collection is participant observation, although other techniques such as interviews and
documentary evidence may be used to corroborate the researcher’s observations.
Figure 10.1. Action research cycle
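The cycle in Figure 10.1 can be sketched as a simple loop traversed twice, using placeholder phase functions and a hypothetical problem; the sketch only illustrates the sequencing of the phases, not the substantive field work each phase entails.

```python
# A minimal sketch of the five-phase action research cycle described above,
# expressed as a loop that is traversed at least twice. The phase functions are
# placeholders; in practice each phase is a substantive field activity.

def diagnose(problem):
    return f"definition of '{problem}' in its social context"

def plan_action(diagnosis):
    return f"theoretically grounded intervention for {diagnosis}"

def take_action(plan):
    return f"implemented: {plan}"

def evaluate(action, problem):
    # In a real study this would compare theorized and realized effects.
    return {"problem": problem, "resolved": False, "observations": action}

def learn(evaluation):
    return f"insights and suggested modifications based on {evaluation['observations']}"

problem = "low adoption of a new work process"   # hypothetical problem
for cycle in range(1, 3):                        # at least two full cycles
    diagnosis = diagnose(problem)
    plan = plan_action(diagnosis)
    action = take_action(plan)
    evaluation = evaluate(action, problem)
    lessons = learn(evaluation)
    print(f"cycle {cycle}: {lessons}")
```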
Ethnography. The ethnographic research method, derived largely from the field of
anthropology, emphasizes studying a phenomenon within the context of its culture. The
researcher must be deeply immersed in the social culture over an extended period of time
(usually 8 months to 2 years) and should engage, observe, and record the daily life of the
studied culture and its social participants within their natural setting. The primary mode of
data collection is participant observation, and data analysis involves a “sense-making”
approach. In addition, the researcher must take extensive field notes, and narrate her
experience in descriptive detail so that readers may experience the same culture as the
researcher. In this method, the researcher has two roles: to rely on her unique knowledge and engagement to generate insights (theory), and to convince the scientific community of the trans-situational nature of the studied phenomenon.
The classic example of ethnographic research is Jane Goodall’s study of primate
behaviors, where she lived with chimpanzees in their natural habitat at Gombe National Park in
Tanzania, observed their behaviors, interacted with them, and shared their lives. During that
process, she learnt and chronicled how chimpanzees seek food and shelter, how they socialize
with each other, their communication patterns, their mating behaviors, and so forth. A more
contemporary example of ethnographic research is Myra Bluebond-Langer’s (1996)14 study of
decision making in families with children suffering from life-threatening illnesses, and the
physical, psychological, environmental, ethical, legal, and cultural issues that influence such
decision-making. The researcher followed the experiences of approximately 80 children with
14 Bluebond-Langer, M. (1996). In the Shadow of Illness: Parents and Siblings of the Chronically Ill Child.
Princeton, NJ: Princeton University Press.
incurable illnesses and their families for a period of over two years. Data collection involved
participant observation and formal/informal conversations with children, their parents and
relatives, and health care providers to document their lived experience.
Phenomenology. Phenomenology is a research method that emphasizes the study of
conscious experiences as a way of understanding the reality around us. It is based on the ideas of the German philosopher Edmund Husserl, who argued in the early 20th century that human experience is the source of all knowledge. Phenomenology is concerned with the systematic
reflection and analysis of phenomena associated with conscious experiences, such as human
judgment, perceptions, and actions, with the goal of (1) appreciating and describing social
reality from the diverse subjective perspectives of the participants involved, and (2)
understanding the symbolic meanings (“deep structure”) underlying these subjective
experiences. Phenomenological inquiry requires that researchers eliminate any prior
assumptions and personal biases, empathize with the participant’s situation, and tune into
existential dimensions of that situation, so that they can fully understand the deep structures
that drive the conscious thinking, feeling, and behavior of the studied participants.
Benefits and Challenges of Interpretive Research
Interpretive research has several unique advantages. First, it is well-suited for exploring hidden reasons behind complex, interrelated, or multifaceted social processes, such as inter-firm relationships or inter-office politics, where quantitative evidence may be biased, inaccurate, or otherwise difficult to obtain. Second, it is often helpful for theory construction in areas with no or insufficient a priori theory. Third, it is also appropriate for
studying context-specific, unique, or idiosyncratic events or processes. Fourth, interpretive
research can also help uncover interesting and relevant research questions and issues for
follow-up research.
At the same time, interpretive research also has its own set of challenges. First, this
type of research tends to be more time and resource intensive than positivist research in data
collection and analytic efforts. Too little data can lead to false or premature assumptions, while
too much data may not be effectively processed by the researcher. Second, interpretive
research requires well-trained researchers who are capable of seeing and interpreting complex
social phenomena from the perspectives of the embedded participants and reconciling the
diverse perspectives of these participants, without injecting their personal biases or
preconceptions into their inferences. Third, not all participants or data sources may be equally
credible, unbiased, or knowledgeable about the phenomenon of interest, or may have
undisclosed political agendas, which may lead to misleading or false impressions. Inadequate
trust between participants and researcher may hinder full and honest self-representation by
participants, and such trust building takes time. It is the job of the interpretive researcher to
“see through the smoke” (hidden or biased agendas) and understand the true nature of the
problem. Fourth, given the heavily contextualized nature of inferences drawn from interpretive
research, such inferences do not lend themselves well to replicability or generalizability.
Finally, interpretive research may sometimes fail to answer the research questions of interest
or predict future behaviors.
Characteristics of Interpretive Research
All interpretive research must adhere to a common set of principles, as described below.
Naturalistic inquiry: Social phenomena must be studied within their natural setting.
Because interpretive research assumes that social phenomena are situated within and cannot
be isolated from their social context, interpretations of such phenomena must be grounded
within their socio-historical context. This implies that contextual variables should be observed
and considered in seeking explanations of a phenomenon of interest, even though context
sensitivity may limit the generalizability of inferences.
Researcher as instrument: Researchers are often embedded within the social context
that they are studying, and are considered part of the data collection instrument in that they
must use their observational skills, their trust with the participants, and their ability to extract
the correct information. Further, their personal insights, knowledge, and experiences of the
social context are critical to accurately interpreting the phenomenon of interest. At the same
time, researchers must be fully aware of their personal biases and preconceptions, and not let
such biases interfere with their ability to present a fair and accurate portrayal of the
phenomenon.
Interpretive analysis: Observations must be interpreted through the eyes of the
participants embedded in the social context. Interpretation must occur at two levels. The first
level involves viewing or experiencing the phenomenon from the subjective perspectives of the
social participants. The second level is to understand the meaning of the participants’
experiences in order to provide a “thick description” or a rich narrative story of the
phenomenon of interest that can communicate why participants acted the way they did.
Use of expressive language: Documenting the verbal and non-verbal language of
participants and the analysis of such language are integral components of interpretive analysis.
The study must ensure that the story is viewed through the eyes of a person, and not a machine,
and must depict the emotions and experiences of that person, so that readers can understand
and relate to that person. Use of imagery, metaphors, sarcasm, and other figures of speech is
very common in interpretive analysis.
Temporal nature: Interpretive research is often not concerned with searching for
specific answers, but with understanding or “making sense of” a dynamic social process as it
unfolds over time. Hence, such research requires an immersive involvement of the researcher
at the study site for an extended period of time in order to capture the entire evolution of the
phenomenon of interest.
Hermeneutic circle: Interpretation is an iterative process of moving back
and forth from pieces of observations (text) to the entirety of the social phenomenon (context)
to reconcile their apparent discord and to construct a theory that is consistent with the diverse
subjective viewpoints and experiences of the embedded participants. Such iterations between
the understanding/meaning of a phenomenon and observations must continue until
“theoretical saturation” is reached, whereby any additional iteration does not yield any more
insight into the phenomenon of interest.
The last chapter introduced interpretive research, or more specifically, interpretive case
research. This chapter will explore other kinds of interpretive research. Recall that positivist
or deductive methods, such as laboratory experiments and survey research, are those that are
specifically intended for theory (or hypotheses) testing, while interpretive or inductive
methods, such as action research and ethnography, are intended for theory building. Unlike a
positivist method, where the researcher starts with a theory and tests theoretical postulates
using empirical data, in interpretive methods, the researcher starts with data and tries to derive
a theory about the phenomenon of interest from the observed data.
The term “interpretive research” is often used loosely and synonymously with
“qualitative research”, although the two concepts are quite different. Interpretive research is a
research paradigm (see Chapter 3) that is based on the assumption that social reality is not
singular or objective, but is rather shaped by human experiences and social contexts (ontology),
and is therefore best studied within its socio-historic context by reconciling the subjective
interpretations of its various participants (epistemology). Because interpretive researchers
view social reality as being embedded within and impossible to abstract from their social
settings, they “interpret” the reality through a “sense-making” process rather than a hypothesis
testing process. This is in contrast to the positivist or functionalist paradigm that assumes that
reality is relatively independent of its context, can be abstracted from that context, and
studied in a decomposable functional manner using objective techniques such as standardized
measures. Whether a researcher should pursue interpretive or positivist research depends on
paradigmatic considerations about the nature of the phenomenon under consideration and the
best way to study it.
However, qualitative versus quantitative research refers to empirical or data-oriented
considerations about the type of data to collect and how to analyze them. Qualitative research
relies mostly on non-numeric data, such as interviews and observations, in contrast to
quantitative research which employs numeric data such as scores and metrics. Hence,
qualitative research is not amenable to statistical procedures such as regression analysis, but is
coded using techniques like content analysis. Sometimes, coded qualitative data is tabulated
quantitatively as frequencies of codes, but this data is not statistically analyzed. Many puritan
interpretive researchers reject this coding approach as a futile effort to seek consensus or
objectivity in a social phenomenon which is essentially subjective.
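For illustration, the sketch below tabulates hypothetical interview codes as simple frequencies, the kind of descriptive (non-statistical) quantification of coded qualitative data mentioned above.

```python
# A minimal sketch of tabulating coded qualitative data as code frequencies.
# The interview codes are hypothetical; the tabulation is descriptive only and
# involves no statistical testing.

from collections import Counter

# Each interview has been content-analyzed into a list of codes (hypothetical data).
coded_interviews = {
    "interview_01": ["trust", "workload", "trust", "autonomy"],
    "interview_02": ["workload", "workload", "conflict"],
    "interview_03": ["trust", "conflict", "autonomy"],
}

# Tabulate frequencies of each code across all interviews.
frequencies = Counter(code for codes in coded_interviews.values() for code in codes)

for code, count in frequencies.most_common():
    print(f"{code:10s} {count}")
```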
Although interpretive research tends to rely heavily on qualitative data, quantitative
data may add more precision and clearer understanding of the phenomenon of interest than
qualitative data. For example, Eisenhardt (1989), in her interpretive study of decision making in
high-velocity firms (discussed in the previous chapter on case research), collected numeric data
on how long it took each firm to make certain strategic decisions (which ranged from 1.5
months to 18 months), how many decision alternatives were considered for each decision, and
surveyed her respondents to capture their perceptions of organizational conflict. Such numeric
data helped her clearly distinguish the high-speed decision making firms from the low-speed
decision makers, without relying on respondents’ subjective perceptions, which then allowed
her to examine the number of decision alternatives considered by and the extent of conflict in
high-speed versus low-speed firms. Interpretive research should attempt to collect both qualitative and quantitative data pertaining to the phenomenon of interest, and so should positivist research. The joint use of qualitative and quantitative data, often called a “mixed-mode design”, may lead to unique insights and is highly prized in the scientific community.
Interpretive research has its roots in anthropology, sociology, psychology, linguistics,
and semiotics, and has been available since the early 19th century, long before positivist
techniques were developed. Many positivist researchers view interpretive research as
erroneous and biased, given the subjective nature of the qualitative data collection and
interpretation process employed in such research. However, the failure of many positivist
techniques to generate interesting insights or new knowledge has resulted in a resurgence of interest in interpretive research since the 1970s, albeit with exacting methods and stringent
criteria to ensure the reliability and validity of interpretive inferences.
Distinctions from Positivist Research
In addition to fundamental paradigmatic differences in ontological and epistemological
assumptions discussed above, interpretive and positivist research differ in several other ways.
First, interpretive research employs a theoretical sampling strategy, where study sites,
respondents, or cases are selected based on theoretical considerations such as whether they fit
the phenomenon being studied (e.g., sustainable practices can only be studied in organizations
that have implemented sustainable practices), whether they possess certain characteristics that
make them uniquely suited for the study (e.g., a study of the drivers of firm innovations should
include some firms that are high innovators and some that are low innovators, in order to draw
contrast between these firms), and so forth. In contrast, positivist research employs random
sampling (or a variation of this technique), where cases are chosen randomly from a population,
for purposes of generalizability. Hence, convenience samples and small samples are considered
acceptable in interpretive research as long as they fit the nature and purpose of the study, but
not in positivist research.
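The contrast between the two sampling logics can be sketched as follows, using hypothetical firm records: random selection for generalizability versus criterion-based selection for theoretical fit.

```python
# A minimal sketch contrasting the two sampling logics described above. The firm
# records and criteria are hypothetical; the point is that theoretical sampling
# selects cases for their fit with the phenomenon, not by chance.

import random

firms = [
    {"name": "Firm A", "innovation": "high", "size": "large"},
    {"name": "Firm B", "innovation": "low",  "size": "large"},
    {"name": "Firm C", "innovation": "high", "size": "small"},
    {"name": "Firm D", "innovation": "low",  "size": "small"},
]

# Positivist logic: random sampling from the population for generalizability.
random_sample = random.sample(firms, k=2)

# Interpretive logic: theoretical sampling to maximize contrast on the variable
# of interest (here, one high innovator and one low innovator of similar size).
high = next(f for f in firms if f["innovation"] == "high" and f["size"] == "large")
low = next(f for f in firms if f["innovation"] == "low" and f["size"] == "large")
theoretical_sample = [high, low]

print("random sample:     ", [f["name"] for f in random_sample])
print("theoretical sample:", [f["name"] for f in theoretical_sample])
```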
Second, the role of the researcher receives critical attention in interpretive research. In
some methods such as ethnography, action research, and participant observation, the
researcher is considered part of the social phenomenon, and her specific role and involvement
in the research process must be made clear during data analysis. In other methods, such as case
research, the researcher must take a “neutral” or unbiased stance during the data collection and
analysis processes, and ensure that her personal biases or preconceptions do not taint the
nature of subjective inferences derived from interpretive research. In positivist research,
however, the researcher is considered to be external to and independent of the research context
and is not presumed to bias the data collection and analytic procedures.
Third, interpretive analysis is holistic and contextual, rather than being reductionist and
isolationist. Interpretive interpretations tend to focus on language, signs, and meanings from the perspective of the participants involved in the social phenomenon, in contrast to the statistical techniques employed heavily in positivist research.
Positivist Case Research Exemplar
Case research can also be used in a positivist manner to test theories or hypotheses.
Such studies are rare, but Markus (1983)12 provides an exemplary illustration in her study of
technology implementation at the Golden Triangle Company (a pseudonym). The goal of this
study was to understand why a newly implemented financial information system (FIS),
12 Markus, M. L. (1983). “Power, Politics, and MIS Implementation,” Communications of the ACM (26:6),
430-444.
intended to improve the productivity and performance of accountants at GTC, was supported by
accountants at GTC’s corporate headquarters but resisted by divisional accountants at GTC
branches. Given the uniqueness of the phenomenon of interest, this was a single-case research
study.
To explore the reasons behind user resistance of FIS, Markus posited three alternative
explanations: (1) system-determined theory: resistance was caused by factors related to an
inadequate system, such as its technical deficiencies, poor ergonomic design, or lack of user
friendliness, (2) people-determined theory: resistance was caused by factors internal to users,
such as the accountants’ cognitive styles or personality traits that were incompatible with using
the system, and (3) interaction theory: resistance was caused not by factors intrinsic to the system or the people, but by the interaction between the two sets of factors. Specifically,
interaction theory suggested that the FIS engendered a redistribution of intra-organizational
power, and accountants who lost organizational status, relevance, or power as a result of FIS
implementation resisted the system while those gaining power favored it.
In order to test the three theories, Markus predicted alternative outcomes expected
from each theoretical explanation and analyzed the extent to which those predictions matched
with her observations at GTC. For instance, the system-determined theory suggested that since
user resistance was caused by an inadequate system, fixing the technical problems of the
system would eliminate resistance. The computer running the FIS system was subsequently
upgraded with a more powerful operating system, online processing (from initial batch
processing, which delayed immediate processing of accounting information), and a simplified
software for new account creation by managers. One year after these changes were made, the
resistant users were still resisting the system and felt that it should be replaced. Hence, the
system-determined theory was rejected.
The people-determined theory predicted that replacing individual resisters or co-opting
them with less resistant users would reduce their resistance toward the FIS. Subsequently, GTC
started a job rotation and mobility policy, moving accountants in and out of the resistant
divisions, but resistance not only persisted, but in some cases increased! In one specific
instance, one accountant, who was one of the system’s designers and advocates when he
worked for corporate accounting, started resisting the system after he was moved to the
divisional controller’s office. Failure to realize the predictions of the people-determined theory
led to the rejection of this theory.
Finally, the interaction theory predicted that neither changing the system nor changing the people (i.e., user education or job rotation policies) would reduce resistance as long as the power
imbalance and redistribution from the pre-implementation phase were not addressed. Before
FIS implementation, divisional accountants at GTC felt that they owned all accounting data
related to their divisional operations. They maintained this data in thick, manual ledger books,
controlled others’ access to the data, and could reconcile unusual accounting events before
releasing those reports. Corporate accountants relied heavily on divisional accountants for
access to the divisional data for corporate reporting and consolidation. Because the FIS system
automatically collected all data at source and consolidated them into a single corporate
database, it obviated the need for divisional accountants, loosened their control and autonomy
over their division’s accounting data, and made their jobs somewhat irrelevant. Corporate
accountants could now query the database and access divisional data directly without going
through the divisional accountants, analyze and compare the performance of individual
divisions, and report unusual patterns and activities to the executive committee, resulting in
further erosion of the divisions’ power. Though Markus did not empirically test this theory, her
observations about the redistribution of organizational power, coupled with the rejection of the
two alternative theories, led to the justification of interaction theory.
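The pattern-matching logic of this analysis can be sketched as follows; the predictions and outcomes are paraphrased from the narrative above and encoded only for illustration.

```python
# A minimal sketch of the pattern-matching logic used in this kind of positivist
# case analysis: each rival theory implies a prediction, which is compared with
# what was actually observed at the case site.

predictions = {
    "system-determined": "fixing technical problems eliminates resistance",
    "people-determined": "rotating or co-opting resisters reduces resistance",
    "interaction":       "resistance persists until the power redistribution is addressed",
}

# Whether each prediction matched the observed outcome at the case site.
observed_match = {
    "system-determined": False,   # resistance persisted after the system upgrade
    "people-determined": False,   # resistance persisted (even increased) after job rotation
    "interaction":       True,    # consistent with the observed power dynamics
}

for theory, prediction in predictions.items():
    verdict = "retained" if observed_match[theory] else "rejected"
    print(f"{theory:18s} -> {verdict}: {prediction}")
```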
Comparisons with Traditional Research
Positivist case research, aimed at hypotheses testing, is often criticized by natural
science researchers as lacking in controlled observations, controlled deductions, replicability,
and generalizability of findings – the traditional principles of positivist research. However,
these criticisms can be overcome through appropriate case research designs. For instance, the
problem of controlled observations refers to the difficulty of obtaining experimental or
statistical control in case research. However, case researchers can compensate for such lack of
controls by employing “natural controls.” The natural control in Markus’ (1983) study was the corporate accountant who was initially one of the system’s advocates but started resisting it once he moved to the divisional controller’s office. In this instance, the change in his behavior may be
attributed to his new divisional position. However, such natural controls cannot be anticipated
in advance, and case researchers may overlook them unless they are proactively looking for such controls. Incidentally, natural controls are also used in natural science disciplines such as astronomy, geology, and human biology, as when researchers wait for comets to pass close enough to the earth in order to make inferences about their composition.
The problem of controlled deduction refers to the lack of adequate quantitative
evidence to support inferences, given the mostly qualitative nature of case research data.
Despite the lack of quantitative data for hypotheses testing (e.g., t-tests), controlled deductions
can still be obtained in case research by generating behavioral predictions based on theoretical
considerations and testing those predictions over time. Markus employed this strategy in her
study by generating three alternative theoretical hypotheses for user resistance, and rejecting
two of those predictions when they did not match with actual observed behavior. In this case,
the hypotheses were tested using logical propositions rather than using mathematical tests,
which are just as valid as statistical inferences since mathematics is a subset of logic.
Third, the problem of replicability refers to the difficulty of observing the same
phenomenon given the uniqueness and idiosyncrasy of a given case site. However, using
Markus’ three theories as an illustration, a different researcher can test the same theories at a
different case site, where three different predictions may emerge based on the idiosyncratic
nature of the new case site, and the three resulting predictions may be tested accordingly. In
other words, it is possible to replicate the inferences of case research, even if the case research
site or context may not be replicable.
Fourth, case research tends to examine unique and non-replicable phenomena that may
not be generalized to other settings. Generalizability in natural sciences is established through
additional studies. Likewise, additional case studies conducted in different contexts with
different predictions can establish generalizability of findings if such findings are observed to
be consistent across studies.
Lastly, British philosopher Karl Popper described four requirements of scientific
theories: (1) theories should be falsifiable, (2) they should be logically consistent, (3) they
should have adequate predictive ability, and (4) they should provide better explanation than
rival theories. In case research, the first three requirements can be strengthened by increasing the
degrees of freedom of observed findings, such as by increasing the number of case sites, the
number of alternative predictions, and the number of levels of analysis examined. This was
accomplished in Markus’ study by examining the behavior of multiple groups (divisional
accountants and corporate accountants) and providing multiple (three) rival explanations.
Popper’s fourth condition was accomplished in this study when one hypothesis was found to
match observed evidence better than the two rival hypotheses.
Reviewing the prior literature on executive decision-making, Eisenhardt found several
patterns, although none of these patterns were specific to high-velocity environments. The
literature suggested that in the interest of expediency, firms that make faster decisions obtain
input from fewer sources, consider fewer alternatives, make limited analysis, restrict user
participation in decision-making, centralize decision-making authority, and have limited internal
conflicts. However, Eisenhardt contended that these views may not necessarily explain how
decision makers make decisions in high-velocity environments, where decisions must be made
quickly and with incomplete information, while maintaining high decision quality.
To examine this phenomenon, Eisenhardt conducted an inductive study of eight firms in
the personal computing industry. The personal computing industry was undergoing dramatic
changes in technology with the introduction of the UNIX operating system, RISC architecture,
and 64KB random access memory in the 1980’s, increased competition with the entry of IBM
into the personal computing business, and growing customer demand with double-digit
demand growth, and therefore fit the profile of the high-velocity environment. This was a
multiple case design with replication logic, where each case was expected to confirm or
disconfirm inferences from other cases. Case sites were selected based on their access and
proximity to the researcher; however, all of these firms operated in the high-velocity personal
computing industry in California’s Silicon Valley area. The collocation of firms in the same
industry and the same area ruled out any “noise” or variance in dependent variables (decision
speed or performance) attributable to industry or geographic differences.
The study employed an embedded design with multiple levels of analysis: decision
(comparing multiple strategic decisions within each firm), executive teams (comparing
different teams responsible for strategic decisions), and the firm (overall firm performance).
Data was collected from five sources:
Initial interviews with Chief Executive Officers: CEOs were asked questions about their
firm’s competitive strategy, distinctive competencies, major competitors, performance,
and recent/ongoing major strategic decisions. Based on these interviews, several
strategic decisions were selected in each firm for further investigation. Four criteria
were used to select decisions: (1) the decisions involved the firm’s strategic positioning,
(2) the decisions had high stakes, (3) the decisions involved multiple functions, and (4)
the decisions were representative of strategic decision-making process in that firm.
Interviews with divisional heads: Each divisional head was asked sixteen open-ended
questions, ranging from their firm’s competitive strategy, functional strategy, top
management team members, frequency and nature of interaction with team, typical
decision making processes, how each of the previously identified decision was made,
and how long it took them to make those decisions. Interviews lasted between 1.5 and 2
hours, and sometimes extended to 4 hours. To focus on facts and actual events rather
than respondents’ perceptions or interpretations, a “courtroom” style questioning was
employed, such as when did this happen, what did you do, etc. Interviews were
conducted by two people, and the data was validated by cross-checking facts and
impressions made by the interviewer and note-taker. All interview data was recorded; however, notes were also taken during each interview, which ended with the interviewer’s overall impressions. Using a “24-hour rule”, detailed field notes were completed within 24 hours of the interview, so that no data or impressions were lost to recall.
Questionnaires: Executive team members at each firm completed a survey
questionnaire that captured quantitative data on the extent of conflict and power
distribution in their firm.
Secondary data: Industry reports and internal documents such as demographics of the
executive teams (responsible for strategic decisions), financial performance of firms,
and so forth, were examined.
Personal observation: Lastly, the researcher attended a 1-day strategy session and a
weekly executive meeting at two firms in her sample.
Data analysis involved a combination of quantitative and qualitative techniques.
Quantitative data on conflict and power were analyzed for patterns across firms/decisions.
Qualitative interview data was combined into decision climate profiles, using profile traits (e.g.,
impatience) mentioned by more than one executive. For within-case analysis, decision stories
were created for each strategic decision by combining executive accounts of the key decision
events into a timeline. For cross-case analysis, pairs of firms were compared for similarities
and differences, categorized along variables of interest such as decision speed and firm
performance. Based on these analyses, tentative constructs and propositions were derived
inductively from each decision story within firm categories. Each decision case was revisited to
confirm the proposed relationships. The inferred propositions were compared with findings
from the existing literature to examine and reconcile differences with the extant literature and to
generate new insights from the case findings. Finally, the validated propositions were
synthesized into an inductive theory of strategic decision-making by firms in high-velocity
environments.
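As an illustration of the within-case “decision story” technique described above, the sketch below orders hypothetical decision events into a timeline and computes the decision’s duration; the events and dates are invented.

```python
# A minimal sketch, with invented dates, of assembling a within-case "decision
# story": executive accounts of key decision events are merged into a timeline,
# and decision duration is computed from the first to the last event.

from datetime import date

# Hypothetical key events for one strategic decision at one firm.
events = [
    (date(1985, 3, 1),  "CEO raises need for a new product platform"),
    (date(1985, 1, 15), "competitor announces rival product"),
    (date(1985, 4, 20), "executive team weighs three alternatives in parallel"),
    (date(1985, 5, 10), "decision made and announced"),
]

# Build the decision story by ordering events chronologically.
timeline = sorted(events)
for when, what in timeline:
    print(f"{when}: {what}")

# Decision speed: elapsed time from the first triggering event to the decision.
duration_days = (timeline[-1][0] - timeline[0][0]).days
print(f"decision duration: {duration_days / 30:.1f} months (approx.)")
```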
Inferences derived from this multiple case research contradicted several decision-making patterns expected from the existing literature. First, fast decision makers in high-velocity environments used more information, not less information as suggested by the
previous literature. However, these decision makers used more real-time information (an
insight not available from prior research), which helped them identify and respond to problems,
opportunities, and changing circumstances faster. Second, fast decision makers examined more
(not fewer) alternatives. However, they considered these multiple alternatives in a
simultaneous manner, while slower decision makers examined fewer alternatives in a sequential
manner. Third, fast decision makers did not centralize decision making or restrict inputs from
others, as the literature suggested. Rather, these firms used a two-tiered decision process in
which experienced counselors were asked for inputs in the first stage, followed by a rapid
comparison and decision selection in the second stage. Fourth, fast decision makers did not
have less conflict, as expected from the literature, but employed better conflict resolution
techniques to reduce conflict and improve decision-making speed. Finally, fast decision makers
exhibited superior firm performance by virtue of their built-in cognitive, emotional, and
political processes that led to rapid closure of major decisions.
Conducting Case Research
Most case research studies tend to be interpretive in nature. Interpretive case research
is an inductive technique where evidence collected from one or more case sites is systematically
analyzed and synthesized to allow concepts and patterns to emerge for the purpose of building
new theories or expanding existing ones. Eisenhardt (1989)10 proposes a “roadmap” for
building theories from case research, a slightly modified version of which is described below.
For positivist case research, some of the following stages may need to be rearranged or
modified; however, sampling, data collection, and data analytic techniques should generally
remain the same.
Define research questions. Like any other scientific research, case research must also
start with defining research questions that are theoretically and practically interesting, and
identifying some intuitive expectations about possible answers to those research questions or
preliminary constructs to guide initial case design. In positivist case research, the preliminary
constructs are based on theory, while no such theory or hypotheses should be considered ex
ante in interpretive research. These research questions and constructs may be changed in
interpretive case research later on, if needed, but not in positivist case research.
Select case sites. The researcher should use a process of “theoretical sampling” (not
random sampling) to identify case sites. In this approach, case sites are chosen based on
theoretical, rather than statistical, considerations, for instance, to replicate previous cases, to
extend preliminary theories, or to fill theoretical categories or polar types. Care should be
taken to ensure that the selected sites fit the nature of research questions, minimize extraneous
variance or noise due to firm size, industry effects, and so forth, and maximize variance in the
dependent variables of interest. For instance, if the goal of the research is to examine how some
firms innovate better than others, the researcher should select firms of similar size within the
10 Eisenhardt, K. M. (1989). “Building Theories from Case Research,” Academy of Management Review
(14:4), 532-550.
same industry to reduce industry or size effects, and select some more innovative and some less
innovative firms to increase variation in firm innovation. Instead of cold-calling or writing to a
potential site, it is better to contact someone at the executive level inside each firm who has the
authority to approve the project or someone who can identify a person of authority. During
initial conversations, the researcher should describe the nature and purpose of the project, any
potential benefits to the case site, how the collected data will be used, the people involved in
data collection (other researchers, research assistants, etc.), desired interviewees, and the
amount of time, effort, and expense required of the sponsoring organization. The researcher
must also assure confidentiality, privacy, and anonymity of both the firm and the individual
respondents.
Create instruments and protocols. Since the primary mode of data collection in case
research is interviews, an interview protocol should be designed to guide the interview process.
This is essentially a list of questions to be asked. Questions may be open-ended (unstructured)
or closed-ended (structured) or a combination of both. The interview protocol must be strictly
followed, and the interviewer must not change the order of questions or skip any question
during the interview process, although some deviations are allowed to probe further into
respondent’s comments that are ambiguous or interesting. The interviewer must maintain a
neutral tone, not lead respondents in any specific direction, say by agreeing or disagreeing with
any response. More detailed interviewing techniques are discussed in the chapter on surveys.
In addition, additional sources of data, such as internal documents and memorandums, annual
reports, financial statements, newspaper articles, and direct observations should be sought to
supplement and validate interview data.
Select respondents. Select interview respondents at different organizational levels,
departments, and positions to obtain divergent perspectives on the phenomenon of interest. A
random sampling of interviewees is most preferable; however, a snowball sample is acceptable,
as long as a diversity of perspectives is represented in the sample. Interviewees must be
selected based on their personal involvement with the phenomenon under investigation and
their ability and willingness to answer the researcher’s questions accurately and adequately,
and not based on convenience or access.
Start data collection. It is usually a good idea to electronically record interviews for
future reference. However, such recording must only be done with the interviewee’s consent.
Even when interviews are being recorded, the interviewer should take notes to capture
important comments or critical observations, behavioral responses (e.g., respondent’s body
language), and the researcher’s personal impressions about the respondent and his/her
comments. After each interview is completed, the entire interview should be transcribed
verbatim into a text document for analysis.
Conduct within-case data analysis. Data analysis may follow or overlap with data
collection. Overlapping data collection and analysis has the advantage of adjusting the data
collection process based on themes emerging from data analysis, or to further probe into these
themes. Data analysis is done in two stages. In the first stage (within-case analysis), the
researcher should examine emergent concepts separately at each case site and patterns
between these concepts to generate an initial theory of the problem of interest. The researcher
can interpret the data subjectively to “make sense” of the research problem in conjunction with
using her personal observations or experience at the case site. Alternatively, a coding strategy
such as Glaser and Strauss’ (1967) grounded theory approach, using techniques such as open
coding, axial coding, and selective coding, may be used to derive a chain of evidence and
inferences. These techniques are discussed in detail in a later chapter. Homegrown techniques,
such as graphical representation of data (e.g., network diagram) or sequence analysis (for
longitudinal data) may also be used. Note that there is no predefined way of analyzing the
various types of case data, and the data analytic techniques can be modified to fit the nature of
the research project.
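As one example of a “homegrown” technique such as a network diagram, the sketch below counts how often hypothetical coded concepts co-occur within interview segments, producing a simple weighted concept network.

```python
# A minimal sketch of one "homegrown" technique mentioned above: a simple network
# representation of how coded concepts co-occur within interview segments. The
# coded segments are hypothetical; a real study might render this as a diagram.

from itertools import combinations
from collections import Counter

# Each analyzed segment carries the set of concepts coded in it (hypothetical).
segments = [
    {"time pressure", "workaround use"},
    {"workaround use", "system distrust"},
    {"time pressure", "workaround use", "system distrust"},
]

# Count how often each pair of concepts appears in the same segment.
edges = Counter()
for concepts in segments:
    for pair in combinations(sorted(concepts), 2):
        edges[pair] += 1

for (a, b), weight in edges.most_common():
    print(f"{a} -- {b} (co-occurs {weight}x)")
```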
Conduct cross-case analysis. Multi-site case research requires cross-case analysis as the
second stage of data analysis. In such analysis, the researcher should look for similar concepts
and patterns between different case sites, ignoring contextual differences that may lead to
idiosyncratic conclusions. Such patterns may be used for validating the initial theory, or for
refining it (by adding or dropping concepts and relationships) to develop a more inclusive and
generalizable theory. This analysis may take several forms. For instance, the researcher may
select categories (e.g., firm size, industry, etc.) and look for within-group similarities and
between-group differences (e.g., high versus low performers, innovators versus laggards).
Alternatively, she can compare firms in a pair-wise manner listing similarities and differences
across pairs of firms.
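The pair-wise comparison strategy can be sketched as follows, with hypothetical firms and attributes; each pair is examined for within-pair similarities and differences.

```python
# A minimal sketch of pair-wise cross-case comparison: firms are compared two at
# a time on a few variables of interest, listing similarities and differences.
# The firm attributes below are hypothetical.

from itertools import combinations

cases = {
    "Firm A": {"decision_speed": "fast", "performance": "high", "conflict": "high"},
    "Firm B": {"decision_speed": "fast", "performance": "high", "conflict": "low"},
    "Firm C": {"decision_speed": "slow", "performance": "low",  "conflict": "low"},
}

for (name1, attrs1), (name2, attrs2) in combinations(cases.items(), 2):
    similarities = [k for k in attrs1 if attrs1[k] == attrs2[k]]
    differences = [k for k in attrs1 if attrs1[k] != attrs2[k]]
    print(f"{name1} vs {name2}: similar on {similarities}, differ on {differences}")
```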
Build and test hypotheses. Based on emergent concepts and themes that are
generalizable across case sites, tentative hypotheses are constructed. These hypotheses should
be compared iteratively with observed evidence to see if they fit the observed data, and if not,
the constructs or relationships should be refined. Also the researcher should compare the
emergent constructs and hypotheses with those reported in the prior literature to make a case
for their internal validity and generalizability. Conflicting findings must not be rejected, but
rather reconciled using creative thinking to generate greater insight into the emergent theory.
When further iterations between theory and data yield no new insights or changes in the
existing theory, “theoretical saturation” is reached and the theory building process is complete.
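The stopping rule of theoretical saturation can be sketched as a simple loop: iteration stops when a further pass through the evidence yields no new concepts. The per-pass concepts below are invented for illustration.

```python
# A minimal sketch of the iterative stopping rule described above: coding passes
# continue until an additional pass yields no new concepts, i.e., "theoretical
# saturation". The per-pass concepts are hypothetical.

# Concepts surfaced in successive passes through the case evidence (hypothetical).
passes = [
    {"real-time information", "parallel alternatives"},
    {"parallel alternatives", "two-tier advice process"},
    {"two-tier advice process"},          # nothing new beyond earlier passes
]

theory = set()
for i, concepts in enumerate(passes, start=1):
    new_concepts = concepts - theory
    theory |= new_concepts
    print(f"pass {i}: {len(new_concepts)} new concept(s)")
    if not new_concepts:
        print("theoretical saturation reached")
        break
```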
Write case research report. In writing the report, the researcher should describe very
clearly the detailed process used for sampling, data collection, data analysis, and hypotheses
development, so that readers can independently assess the reasonableness, strength, and
consistency of the reported inferences. A high level of clarity in research methods is needed to
ensure that the findings are not biased by the researcher’s preconceptions.
Interpretive Case Research Exemplar
Perhaps the best way to learn about interpretive case research is to examine an
illustrative example. One such example is Eisenhardt’s (1989)11 study of how executives make
decisions in high-velocity environments (HVE). Readers are advised to read the original paper
published in Academy of Management Journal before reading the synopsis in this chapter. In
this study, Eisenhardt examined how executive teams in some HVE firms are able to make fast decisions while those in other firms cannot, and whether faster decisions improve or worsen firm
performance in such environments. An HVE was defined as one where demand, competition, and technology change so rapidly and discontinuously that the available information is often inaccurate, unavailable, or obsolete. The implicit assumptions were that (1) it is hard to make
fast decisions with inadequate information in HVE, and (2) fast decisions may not be efficient
and may result in poor firm performance.