Inter-rater reliability. Inter-rater reliability, also called inter-observer reliability, is a
measure of consistency between two or more independent raters (observers) of the same
construct. Usually, this is assessed in a pilot study, and can be done in two ways, depending on
the level of measurement of the construct. If the measure is categorical, a set of all categories is
defined, raters check off which category each observation falls in, and the percentage of
agreement between the raters is an estimate of inter-rater reliability. For instance, if there are
two raters rating 100 observations into one of three possible categories, and their ratings match
for 75% of the observations, then inter-rater reliability is 0.75. If the measure is interval or
ratio scaled (e.g., classroom activity is measured once every 5 minutes by two raters on a 1
to 7 response scale), then a simple correlation between the measures from the two raters can
serve as an estimate of inter-rater reliability.
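As an illustration, the following sketch in Python (using NumPy) computes both estimates from made-up ratings; the data and variable names are hypothetical and not drawn from the text.

```python
import numpy as np

# Categorical case: two raters sort ten observations into categories A, B, or C;
# the proportion of matching ratings estimates inter-rater reliability
rater1 = ["A", "B", "A", "C", "B", "A", "A", "C", "B", "A"]
rater2 = ["A", "B", "C", "C", "B", "A", "B", "C", "B", "A"]
agreement = sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)
print(f"Percentage agreement: {agreement:.2f}")  # 0.80

# Interval/ratio case: two raters score the same episodes on a 1-7 scale;
# a simple correlation between their scores estimates inter-rater reliability
scores1 = np.array([4, 5, 3, 6, 7, 2, 5, 4])
scores2 = np.array([5, 5, 3, 6, 6, 3, 4, 4])
print(f"Inter-rater correlation: {np.corrcoef(scores1, scores2)[0, 1]:.2f}")
```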
Test-retest reliability. Test-retest reliability is a measure of consistency between two
measurements (tests) of the same construct administered to the same sample at two different
points in time. If the observations have not changed substantially between the two tests, then
the measure is reliable. The correlation in observations between the two tests is an estimate of
test-retest reliability. Note here that the time interval between the two tests is critical.
Generally, the longer the time gap, the greater the chance that the two observations will
differ during this time (due to random error), and the lower the test-retest reliability will be.
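A minimal sketch of the test-retest estimate, assuming hypothetical scores for the same eight respondents at two points in time:

```python
import numpy as np

# Scores from the same respondents at time 1 and time 2 (hypothetical data)
test1 = np.array([12, 15, 9, 20, 17, 11, 14, 18])
test2 = np.array([13, 14, 10, 19, 18, 10, 15, 17])

# Test-retest reliability is the correlation between the two administrations
print(f"Test-retest reliability: {np.corrcoef(test1, test2)[0, 1]:.2f}")
```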
Split-half reliability. Split-half reliability is a measure of consistency between two
halves of a construct measure. For instance, if you have a ten-item measure of a given
construct, randomly split those ten items into two sets of five (unequal halves are allowed if the
total number of items is odd), and administer the entire instrument to a sample of respondents.
Then, calculate the total score for each half for each respondent, and the correlation between
the total scores in each half is a measure of split-half reliability. The longer the instrument,
the more likely it is that the two halves of the measure will be similar (since random errors are
minimized as more items are added), and hence, this technique tends to systematically
overestimate the reliability of longer instruments.
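A sketch of this procedure, using simulated responses rather than real data (the latent-trait simulation is an assumption made for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated responses: eight respondents, ten items on a 1-7 scale, where each
# item reflects a common latent trait plus random measurement error
trait = rng.normal(4.0, 1.0, size=(8, 1))
responses = np.clip(np.rint(trait + rng.normal(0.0, 0.8, size=(8, 10))), 1, 7)

# Randomly split the ten items into two halves of five and total each half
order = rng.permutation(10)
half1 = responses[:, order[:5]].sum(axis=1)
half2 = responses[:, order[5:]].sum(axis=1)

# Split-half reliability is the correlation between the two half-scores
print(f"Split-half reliability: {np.corrcoef(half1, half2)[0, 1]:.2f}")
```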
Internal consistency reliability. Internal consistency reliability is a measure of
consistency between different items of the same construct. If a multiple-item construct
measure is administered to respondents, the extent to which respondents rate those items in a
similar manner is a reflection of internal consistency. This reliability can be estimated in terms
of average inter-item correlation, average item-to-total correlation, or more commonly,
Cronbach’s alpha. As an example, if you have a scale with six items, you will have fifteen
different item pairings, and fifteen correlations between these six items. Average inter-item
correlation is the average of these fifteen correlations. To calculate average item-to-total
correlation, you have to first create a “total” item by adding the values of all six items, compute
the correlations between this total item and each of the six individual items, and finally, average
the six correlations. Neither of the two above measures takes into account the number of items
in the measure (six items in this example). Cronbach’s alpha, a reliability measure designed by
Lee Cronbach in 1951, factors in scale size in reliability estimation, and is calculated using the
following formula:
$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma^2_{Y_i}}{\sigma^2_X}\right)$$
where K is the number of items in the measure, $\sigma^2_X$ is the variance (square of standard
deviation) of the observed total scores, and $\sigma^2_{Y_i}$ is the observed variance for item i. The
standardized Cronbach’s alpha can be computed using a simpler formula:
$$\alpha = \frac{K\bar{r}}{1 + (K-1)\bar{r}}$$
where K is the number of items and $\bar{r}$ is the average inter-item correlation, i.e., the mean of the
K(K-1)/2 coefficients in the upper-triangular (or lower-triangular) correlation matrix.
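The following sketch computes all three internal consistency estimates from the formulas above, again using simulated responses (the six-item data and sample size of thirty are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated six-item scale answered by thirty respondents on a 1-7 scale
trait = rng.normal(4.0, 1.0, size=(30, 1))
items = np.clip(np.rint(trait + rng.normal(0.0, 1.0, size=(30, 6))), 1, 7)
K = items.shape[1]

# Average inter-item correlation: mean of the K(K-1)/2 = 15 pairwise correlations
corr = np.corrcoef(items, rowvar=False)
r_bar = corr[np.triu_indices(K, k=1)].mean()

# Average item-to-total correlation: correlate each item with the total score
total = items.sum(axis=1)
item_total = np.mean([np.corrcoef(items[:, i], total)[0, 1] for i in range(K)])

# Cronbach's alpha, following the first formula above
alpha = (K / (K - 1)) * (1 - items.var(axis=0, ddof=1).sum() / total.var(ddof=1))

# Standardized alpha, following the simpler formula above
alpha_std = (K * r_bar) / (1 + (K - 1) * r_bar)

print(f"Average inter-item correlation: {r_bar:.2f}")
print(f"Average item-to-total correlation: {item_total:.2f}")
print(f"Cronbach's alpha: {alpha:.2f} (standardized: {alpha_std:.2f})")
```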