Content Analysis: A Methodology for Structuring and Analyzing Written Material--Transfer Paper 10.1.3

Published by the Government Accountability Office on 1989-03-01.

United States General Accounting Office

March 1989

Content Analysis:
A Methodology for
Structuring and
Analyzing Written
Material

Transfer Paper 10.1.3

          In this paper, we define and describe the evaluation method called “con-
          tent analysis.” It is a set of procedures for transforming nonstructured
          information into a format that allows analysis. From reading this paper,
          GAO analysts should gain an understanding of the basic concepts and
          procedures used in content analysis and also an ability to recognize the
           appropriate circumstances for using this evaluation method in their

          Although we have focused on techniques that make quantitative analysis
          possible, this is not necessarily the objective of all content analyses.
          We have presented the techniques that are the most applicable to GAO's
          work. In chapter 1, we define content analysis and compare it to similar
          procedures already used in GAO. In chapter 2, we discuss the procedures
          for using content analysis. In chapter 3, we explain the advantages and
          disadvantages of content analysis and describe some of its potential
          applications in program evaluation.

          The paper is designed to be self-instructional. References are provided
          throughout the text for readers who want more information on specific
          topics, and these references are keyed to the bibliography.

          Research for this document began with a survey of the numerous books
          and articles on content analysis and its past applications. We also inter-
          viewed users of content analysis to gain information about its advan-
          tages and disadvantages, and we interviewed selected GAO staff who
          have participated in evaluations in which content analysis might have
          been appropriate. The foundation for this document is a paper written
          by William Carter while a student intern with GAO. The document was
          prepared by Teresa Spisak, formerly of the Institute for Program Evalu-
          ation (now PEMD), and was originally published in 1982 as Transfer
          Paper 3. It is being reissued now with only minor changes, including
          some updating of bibliographic materials.

          Content Analysis is one of a series of papers issued by PEMD. The pur-
          pose of the series is to provide GAO evaluators with a clear and compre-
          hensive background of the basic concepts of audit and evaluation
          methodology. Additionally, transfer papers explain both general and

specific applications and procedures for using the evaluation methodol-
ogy. Other papers in this series include Causal Analysis, Designing Eval-

Questionnaires, Using Statistical Sampling, and Case Study Evaluations.

Eleanor Chelimsky
Assistant Comptroller General for Program Evaluation and Methodology

Chapter 1
What Is Content Analysis?

Chapter 2
What Are the Procedures in Content Analysis?
                        Deciding to Use Content Analysis
                        Determining What Material Should Be Included
                        Selecting Units of Analysis
                        Developing Coding Categories
                        Coding the Material
                        Analyzing and Interpreting the Results
                        Writing the Report
                        Summary

Chapter 3
Why Should GAO Analysts Use Content Analysis?
                        What Content Analysis Can Do
                        Pitfalls in Using Content Analysis
                        Potential Applications in Program Evaluation
                        Conclusion

Figures
                        Figure 2.1: Steps in Content Analysis
                        Figure 2.2: Requirements for Content Categories
                        Figure 2.3: Matrix Category Format
                        Figure 2.4: Category Format Measuring Space
                        Figure 2.5: Two Category Formats Measuring Frequency of Statements
                        Figure 2.6: Measuring Frequency of and Position Taken on Specific Proposals
                        Figure 2.7: Category Format Measuring Attitude Intensity
                        Figure 2.8: Guidelines for Contents of Coding Instructions for Trained Coders
                        Figure 2.9: Issues Addressed by HUD’s Evaluation Units
                        Figure 2.10: Minimum Documentation for a Content Analysis Study

GAO        U.S. General Accounting Office
HUD        U.S. Department of Housing and Urban Development
PEMD       Program Evaluation and Methodology Division

Chapter 1

What Is Content Analysis?

               GAO staff often collect large quantities of written material during their
               jobs. Workpapers, agency documents, transcripts of meetings, previous
                evaluations, and the like all contain useful information that is difficult
               to combine and analyze because it is diverse and unstructured. Content
                analysis is a set of procedures for collecting and organizing this

               One way to begin structuring written material so that it can be analyzed
               is to summarize and list the major issues that are contained in it. Then
               the frequency with which these issues occur can be counted. Both activi-
               ties are usually performed at some point in GAO jobs, and both are part
               of content analysis.

               For example, in assessing HUD's evaluation system to determine whether
               program offices were duplicating efforts, GAO analysts collected budget
               information, interviews, and evaluation reports. (GAO, 1978)1 They
               began analyzing the information by identifying 31 major issues for hous-
               ing and urban development. Then they reviewed 38 HUD evaluation
               reports from two offices, categorizing the issues addressed in each
               report and looking for overlaps between the offices. Simplifying and cat-
               egorizing written information are part of content analysis.

               In addition to requiring summaries of written material and enumera-
               tions of the frequency of statements or issues, GAO projects often require
               more complex analyses. Sometimes trends have to be examined over
               time, across different situations, or among different groups. The infor-
               mation that is needed to make these types of analysis may not exist in
               computer files. With content analysis, information from written material
               can be structured so that these types of analysis can be made even with-
               out computer files.

               Content analysis is a set of procedures for collecting and organizing
               information in a standardized format that allows analysts to make infer-
               ences about the characteristics and meaning of written and other
               recorded material. Simple formats can be developed for summarizing
               information or counting the frequency of statements. More complex for-
               mats can be created for analyzing trends or detecting subtle differences
               in the intensity of statements.

               Among the procedures of content analysis that we discuss in the next
               chapter are defining and sampling the written or recorded material to be

               1Interlinear bibliographic references are cited in full in the bibliography.


analyzed, developing standardized categories, coding the material with
rigorous reliability checks, analyzing and interpreting the information,
and validating and reporting the results. Although in this paper we have
focused on procedures that make quantitative analysis possible, this is
not necessarily the objective of all forms of content analysis.

Chapter 2

What Are the Procedures in Content Analysis?

                                        The steps to be followed in content analysis are summarized in figure
                                        2.1. Steps 1, 2, and 6 (deciding whether or not the methodology is
                                        appropriate, determining what material should be analyzed, and
                                        analyzing and interpreting the results) are integral aspects of all projects.
                                        However, steps 3, 4, and 5 (choosing the units of analysis, developing
                                        coding categories, and coding the material) are unique to content
                                        analysis, and therefore we will explain these in greater detail.

Figure 2.1: Steps in Content Analysis
                                        1. Decide to use content analysis.

                                        2. Determine what material should be included in content analysis.

                                        3. Select units of analysis.

                                        4. Develop coding categories.

                                        5. Code the material.

                                        6. Analyze and interpret the results.

Deciding to Use Content Analysis

                                        At step 1, analysts should consider a number of factors in deciding
                                        whether or not to use content analysis. These include a project’s
                                        objectives, data availability, and the kinds of analyses required.

                                        Objectives are precisely worded questions that the project staff are try-
                                        ing to answer. (GAO, December 1988, p. 10-4) The questions should be
                                        based on a clear understanding of project needs and the available data.
                                        Precisely worded questions provide the focus for data collection, analy-
                                        sis, and reporting. In general, content analysis can be used to answer
                                        “What?” but not “Why?” That is, it helps analysts describe or summa-
                                        rize the content of written material, the attitudes or perceptions of its
                                        writer, or its effects on its audience.


                          The content of material can be summarized by listing or by counting the
                          issues or statements within it, as we indicated in chapter 1. The author’s
                          attitudes and perceptions can also be described. For example, if analysts
                          wanted to assess the effects of various programs on the lives of older
                          people, content analysis of open-ended interview responses could be
                          used to identify their outlook on life and their attitudes about loneliness
                          or security. Content analysis can also be useful in describing the effects
                          of messages on their recipients. For example, the effect of Voice of
                          America broadcasts has been assessed by analyzing Soviet newspapers
                          and transcripts of radio broadcasts. (Inkeles, 1952)

The Kinds of Material Available

                          Content analysis can be used to study any recorded material as long as
                          the information is available to be reanalyzed for reliability checks.
                          Although it is used most frequently to analyze written material, content
                          analysis can be used to study any recorded communication, including
                          television programs, movies, and photographs. It can be used to analyze
                          congressional testimony, legislation, regulations, other public docu-
                          ments, workpapers, case studies, reports, answers to survey questions,
                          news releases, newspapers, books, journal articles, and letters. A speech
                          or a discussion, however, cannot be analyzed unless it has been tran-
                          scribed or taped.

                          Before using content analysis, project staff should assess the written
                          material’s quality. Does the available material accurately represent
                          what was written or said? A garbled tape recording or written material
                          with sections missing is not a sound basis for content analysis. Findings
                          and conclusions from content analysis can never be more accurate than
                          the material that has been analyzed.

The Kinds of Comparison Required

                          Content analysis can be used for making numerical comparisons among
                          and within documents. For example, staff who want to describe or
                          summarize the content of written material can use content analysis to
                          compare documents derived from a single source, such as from one federal
                          agency, by comparing issues or statements over time, in different
                          situations, or across differing groups. The relationship of two or more
                          statements or issues within a single document or set of documents can also be
                          analyzed. Alternatively, statements or issues from two or more different
                          sources can be compared.


Determining What Material Should Be Included

                     Sampling is necessary if the body of material, the “universe,” is too
                     extensive to be analyzed in its entirety. Thus, at step 2, analysts who
                     want to make valid conclusions and generalizations about a universe
                     should select from that universe a sample that is representative of it.1

                     Selecting samples for content analysis usually involves sampling docu-
                     ments. For example, in a hypothetical project evaluating changes in the
                     eligibility requirements in a food stamp program, more than 500 partici-
                     pants might be interviewed. By arranging the interview transcripts
                     alphabetically and then selecting every tenth transcript for content
                     analysis, the project staff might be able to draw a systematic sample.
                     Other types of sampling design may also be used. (Babbie, 1973, pp. 91-
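
                     The systematic sample described above can be sketched in a few lines of
                     Python. This is only an illustration under stated assumptions: the
                     transcript identifiers are hypothetical, the universe is set at exactly
                     500 transcripts, and the function name systematic_sample is ours. In
                     practice the starting position is commonly chosen at random within the
                     first interval; here it defaults to the first transcript.

```python
def systematic_sample(transcripts, interval=10, start=0):
    """Sort the transcripts alphabetically, then keep every `interval`-th one.

    `start` is the position of the first selected transcript (0-based).
    """
    ordered = sorted(transcripts)
    return ordered[start::interval]


# Hypothetical universe: 500 interview transcripts identified by label.
transcripts = [f"participant_{i:03d}" for i in range(500)]

sample = systematic_sample(transcripts, interval=10)
print(len(sample))   # 50
print(sample[:3])    # ['participant_000', 'participant_010', 'participant_020']
```

                     Selecting every tenth transcript from 500 yields a sample of 50, which
                     the analysts would then code in place of the full universe.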

Selecting Units of Analysis

                     In content analysis, the researcher designates the units of analysis,
                     called “recording units,” and the units of context. This is step 3. Context
                     units set limits on the portion of written material that is to be examined
                     for categories of words or statements. Context units can be the same as
                     the units sampled, although they are not always the same.

                     Since it is not always practical to use long documents as context units,
                     chapters, sections, paragraphs, or even sentences may be better choices.
                     This is especially true when attempts are made to identify subtle differ-
                     ences in content. For example, a meeting transcript can be analyzed to
                     determine the extent to which the meeting’s participants supported or
                     opposed various issues. In this case, the analysts would choose
                     sentences as the context unit if entire statements were relatively long
                     and tended, as sometimes happens, to contain conflicting information. It
                     may be typical for a given speaker to oppose an issue at the beginning of
                     a statement but to shift to support of it at the end. To identify such
                     shifts in position, analysts need to examine a small context unit such as
                     the sentence.

                     A recording unit is the specific segment of the context unit in the writ-
                     ten material that is placed in a category. It may be a word, a group of
                     words (such as those that identify a theme), a sentence, a paragraph, or
                     an entire document. It can never be larger than the context unit. In the
                     HUD study we cited earlier, analysts used the groups of words that

                     1Readers unfamiliar with basic sampling theory and methods should refer to GAO, December 1988,
                     pp. 11-16 to 11-19 and 11-26 to 11-36.


                    embodied the discussion of the issues as recording units. Their context
                    units were the evaluation studies.

Developing Coding Categories

                    Categories provide the structure for grouping recording units. Step 4,
                    formulating categories, is the heart of content analysis. Berelson, an
                    early user of content analysis, emphasized the importance of this step
                    when he cautioned that

                    “Content analysis stands or falls by its categories. Particular studies have been pro-
                    ductive to the extent that the categories were clearly formulated and well adapted
                    to the problem and to the content.” (Berelson, 1962, p. 147)

                    Figure 2.2 lists standard requirements that categories should meet.
                    Adhering to these requirements helps keep an analysis systematic and
                    objective, which leads to results that are amenable to statistical


Figure 2.2: Requirements for Content Categories
                                       1. Categories should be exhaustive, so that all relevant items in the
                                       material being studied can be placed within a category.

                                       2. Categories should be mutually exclusive, so that no item can be
                                       coded in more than one category.

                                       3. Categories should be independent, so that a recording unit’s
                                       category assignment is not affected by the category assignment
                                       of other recording units.
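
                                       The first two requirements can be checked mechanically once a draft
                                       category scheme exists. The following Python sketch is an illustration
                                       only: the function name check_categories, the second category, and the
                                       items "issue x" and "issue y" are hypothetical, and the overlap is
                                       introduced deliberately to show what a violation looks like.
                                       Independence, the third requirement, depends on how coders make their
                                       decisions and cannot be verified this way.

```python
def check_categories(categories, items):
    """Find items that break exhaustiveness or mutual exclusivity.

    categories: dict mapping a category name to the set of items it covers
    items: every relevant item found in the material
    Returns (unassigned, overlapping) as two lists.
    """
    unassigned = []   # items covered by no category: scheme is not exhaustive
    overlapping = []  # items covered by several categories: not mutually exclusive
    for item in items:
        homes = [name for name, members in categories.items() if item in members]
        if not homes:
            unassigned.append(item)
        elif len(homes) > 1:
            overlapping.append(item)
    return unassigned, overlapping


categories = {
    "Housing Assistance Issues": {
        "block grants",
        "housing dispersion",
        "public housing modernization",
    },
    # Hypothetical second category with a deliberate overlap.
    "Hypothetical Category B": {"issue x", "block grants"},
}
issues = [
    "block grants",
    "housing dispersion",
    "public housing modernization",
    "issue x",
    "issue y",  # hypothetical issue that no category covers
]

unassigned, overlapping = check_categories(categories, issues)
print(unassigned)   # ['issue y']
print(overlapping)  # ['block grants']
```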

Category Formats

                                       Categories can be conceptualized in numerous ways. Some common
                                       category formats are groupings, scales, and matrices.2 Structured category
                                       formats increase coding efficiency, especially when the number of cate-
                                       gories is large.

                                       In our HUD example, analysts chose groups of issues as categories. They
                                       grouped 31 issues into three general categories. For example, issues such
                                       as dispersion of housing, block grants, and public housing modernization
                                       were placed in the category “Housing Assistance Issues.”

                                       Scales provide for the rank ordering of information. In the HUD example,
                                       had the analysts wanted to know the extent to which the reports they
                                       were examining supported the issues, they could have used a scale such
                                       as “supports, is uncommitted, opposes.”

                                       Matrices are useful formats when analysts seek more information about
                                       issues than simply whether they are present or absent. The group and
                                       scale categories we discussed above could be combined into a matrix for-
                                       mat such as that shown in figure 2.3.

                                        2Krippendorff discusses these and more sophisticated formats such as trees, loops, chains, cubes, and
                                        partition lattices. (Krippendorff, 1980, pp. 91-98)


Figure 2.3: Matrix Category Format

                                                                          Degree of support for issue
                                     Issue                                Supports    Opposes    Uncommitted
                                     I. Housing assistance
                                        A. Block grants
                                        B. Housing dispersion
                                        C. Public housing modernization

Quantification Levels

                                     Categories can be used to measure three quantification levels: space,
                                     frequency, and intensity. To explain the differences between these
                                     quantification levels and how they relate to constructing categories, we
                                     use a hypothetical analysis of handgun control legislation for which the
                                     analyst has as major sources of information newspaper articles, public
                                     documents, and transcripts of interviews with public officials.

                                     At the least rigorous level of quantification, the hypothetical analyst
                                     can measure the amount of space in the newspaper articles devoted to
                                     positions supporting or opposing the issue. The analyst then can use this
                                     measurement to compare the relative strength of issues supporting and
                                     opposing handgun control.

                                     In selecting newspapers, the analyst also has to control for factors that
                                     may influence the articles’ content or editorial viewpoint. The category
                                     format shown in figure 2.4 uses the newspapers’ location (rural versus
                                     urban) for this purpose. For each issue of each newspaper in the
                                     sample, the analyst adds together the number of column inches from all
                                     news articles and editorials to find the total amount of space for each
                                     position. By also coding the name, location, and date of each newspaper,
                                     the analyst can examine trends across time and can compare rural and
                                     urban viewpoints.


Figure 2.4: Category Format Measuring Space

                                                                                        Number of column inches
                                        Newspaper       Date        Location    Supporting    Opposing    Uncommitted
                                        “Times”         11/12/81    Urban       4             0           2
                                        “Examiner”      11/18/81    Rural       0             5           2

                                        Such measurement is rapid and relatively easy, but it provides only very
                                        general information. Furthermore, analysts who use this level of quanti-
                                        fication have to assume that the differences they find in amounts of
                                        space are valid indicators of relative emphasis or importance.
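
                                        The space measurement described above amounts to summing column
                                        inches by position for each issue of each newspaper. The Python
                                        sketch below is an illustration only. The lengths of the individual
                                        articles are hypothetical; we chose them so that the totals agree
                                        with figure 2.4. The function name space_by_position is also ours.

```python
from collections import defaultdict

# Each record: (newspaper, date, location, position, column_inches).
# Individual article lengths are hypothetical; totals match figure 2.4.
articles = [
    ("Times", "11/12/81", "Urban", "supporting", 3),
    ("Times", "11/12/81", "Urban", "supporting", 1),
    ("Times", "11/12/81", "Urban", "uncommitted", 2),
    ("Examiner", "11/18/81", "Rural", "opposing", 5),
    ("Examiner", "11/18/81", "Rural", "uncommitted", 2),
]


def space_by_position(records):
    """Total the column inches per position for each newspaper issue."""
    totals = defaultdict(
        lambda: {"supporting": 0, "opposing": 0, "uncommitted": 0}
    )
    for paper, date, location, position, inches in records:
        totals[(paper, date, location)][position] += inches
    return dict(totals)


totals = space_by_position(articles)
for issue, inches in totals.items():
    print(issue, inches)
```

                                        Because the newspaper name, date, and location are kept as part of
                                        each key, the same totals can later be grouped by date or by location
                                        to compare trends and rural versus urban viewpoints.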

                                        At the next level of quantification, the analyst can code the frequency
                                        of recording units by tallying the number of times each issue or state-
                                        ment occurs in the text. Formats for measuring frequency can be very
                                        simple, as in figure 2.5, or more complex, as in figure 2.6, depending on
                                        the information needs of the project.

Figure 2.5: Two Category Formats Measuring Frequency of Statements

                                        Format 1
                                                                                        Number of column inches
                                        Newspaper       Date        Location    Supporting    Opposing    Uncommitted
                                        “Times”         11/21/81    Urban       2             0           1
                                        “Examiner”      11/18/81    Rural       0             4           0

                                        Format 2
                                        Newspaper       Date        Location    Statement attribution    Position
                                        “Times”         11/12/81    Urban       State politician         Supports
                                        “Times”         11/12/81    Urban       Editorial                Supports
                                        “Times”         11/12/81    Urban       U.S. Senator             Uncommitted
                                        “Examiner”      11/18/81    Rural       Citizens’ group          Opposes
                                        “Examiner”      11/18/81    Rural       State politician         Opposes


Figure 2.6: Measuring Frequency of and Position Taken on Specific Proposals

                                         Category Format
                                                                                                     Supports    Opposes    Uncommitted/no
                                         Proposals for handgun control                               (01)        (02)       position (03)
                                         Banning handgun sales (01)
                                         Banning importation of unassembled gun parts (02)
                                         Handgun registration (03)
                                         Stricter controls on handgun purchases (04)
                                         Stronger penalties for using handguns to commit crimes (05)
                                         More stringent enforcement of existing control (06)
                                         Other (07)

                                         Coding Format
                                         Source                          Date      Column    Row
                                         Presidential advisory panel     8/6/81    01        02
                                         Presidential advisory panel     8/6/81    01        07
                                         Presidential advisory panel     8/6/81    01        04

                                         Figure 2.5 presents two simple formats for measuring the number of
                                         statements supporting, opposing, and uncommitted to handgun control.
                                         Format 1 is similar to the format for measuring space but instead meas-
                                         ures the number of articles that appear over a given period of time. For-
                                         mat 2 identifies the speaker and allows the analyst to compare positions
                                         by different individuals over time and by different locations.
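
                                         A tally of records laid out as in format 2 can be sketched in
                                         Python with a counter. The five records below are the rows of
                                         format 2 in figure 2.5; the variable names and the choice of
                                         groupings (by location and by speaker) are ours and are meant only
                                         as an illustration.

```python
from collections import Counter

# Rows of format 2 in figure 2.5:
# (newspaper, date, location, statement attribution, position).
statements = [
    ("Times", "11/12/81", "Urban", "State politician", "Supports"),
    ("Times", "11/12/81", "Urban", "Editorial", "Supports"),
    ("Times", "11/12/81", "Urban", "U.S. Senator", "Uncommitted"),
    ("Examiner", "11/18/81", "Rural", "Citizens' group", "Opposes"),
    ("Examiner", "11/18/81", "Rural", "State politician", "Opposes"),
]

# Frequency of each position within each location.
by_location = Counter((loc, pos) for _, _, loc, _, pos in statements)

# Frequency of each position for each kind of speaker.
by_speaker = Counter((who, pos) for _, _, _, who, pos in statements)

print(by_location[("Urban", "Supports")])            # 2
print(by_location[("Rural", "Opposes")])             # 2
print(by_speaker[("State politician", "Supports")])  # 1
print(by_speaker[("State politician", "Opposes")])   # 1
```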

                                         Figure 2.6 shows a more elaborate means of measuring frequency, with
                                         separate formats for category and for coding. This approach could be
                                         used to analyze information from all three data sources in the hypothetical
                                         example: newspapers, public documents, and interview transcripts.
                                         In the figure, the categories describe positions on specific proposals for
                                         handgun control. The positions can be coded by assigning them four dig-
                                         its that indicate the positions taken (columns) on the proposals (rows).


To show how this works, we can examine the recommendations in the
following statement from a New York Times article published on August
6, 1981, coded as shown in figure 2.6.

“The eight-member (Presidential advisory) panel . . . recommended legislation for-
bidding the importing of pistol parts, requiring citizens to report the theft or loss of
a pistol, and establishing a waiting period before a pistol is purchased to permit the
authorities to determine if the purchaser has a criminal record.”

The recommendation for legislation forbidding the importing of pistol
parts is coded as column 01 (“supports”), row 02 (“banning importation
of unassembled gun parts”). The second recommendation, “requiring cit-
izens to report the theft or loss of a pistol,” is coded as “other” (0107)
since it is not in the list of specific proposals.
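
As a minimal sketch of the four-digit coding just described, the following Python fragment joins a two-digit column (position) code with a two-digit row (proposal) code. Only the labels quoted above (column 01 "supports," row 02, and row 07 "other") come from the text; the table names and the `make_code` helper are hypothetical.

```python
# Hypothetical lookup tables; only these entries are named in the text.
POSITIONS = {"supports": "01"}
PROPOSALS = {
    "banning importation of unassembled gun parts": "02",
    "other": "07",
}


def make_code(position, proposal):
    """Return a four-digit code: column (position) digits, then row (proposal) digits.

    Proposals that are not on the list of specific proposals fall into "other".
    """
    row = PROPOSALS.get(proposal, PROPOSALS["other"])
    return POSITIONS[position] + row


codes = [
    make_code("supports", "banning importation of unassembled gun parts"),
    make_code("supports", "requiring citizens to report the theft or loss of a pistol"),
]
print(codes)  # ['0102', '0107']
```

Keeping the column and row digits in separate tables means that a new proposal can be added as a row without touching the position codes.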

In general, analysts incorporate two assumptions in their research
designs when they construct frequency measures. First, they assume
that the frequency with which a statement occurs in the text is a valid
indication of value or importance. Second, they assume that all content
units can be given equal weight and therefore that each one can be com-
pared directly with every other.

At the third level of quantification, analysts code for intensity. Frequen-
cies are counted, but each coded statement or issue is also adjusted by a
weight that measures relative intensity.3 This measurement level allows
much more sensitive data analysis.

One drawback of intensity coding, however, is that it requires coders to
recognize more subtle differences in the material than they need to
when coding for space or frequency. Furthermore, it is difficult to list all
criteria that coders have to consider in making their decisions. For
example, coders may have to consider the relative intensity of the mean-
ing of verbs (“disagree” versus “doubt”) or their tenses (past, present,
future), of the meaning of adverbial modifiers (“often” versus “some-
times”), or of the meaning of statements that express what is probable
(using “may”) versus what is imperative (using “must”).

Since it helps analysts compare subtle differences in words, this level of
quantification is the most useful for analyzing direct quotations and the
contents of official documents, such as public laws and regulations, in
which words are understood to have been chosen carefully to convey a
precise message. In the gun control example, therefore, only the inter-
view transcripts would be analyzed at this level.

3 Three methods of calculating and assigning weights are discussed in North et al., 1963, pp. M-103.

                                        Figure 2.7 illustrates how attitude intensity can be coded. Using two
                                        hypothetical interview responses, it shows how replies can be fitted into
                                        the category form “subject, verb, common meaning term.” Each reply
                                        may contain more than one statement (or recording unit) to be coded.
                                        Therefore, values ranging from +3 to -3, depending on direction and
                                        intensity, are assigned to the verb and the common meaning term in
                                        each statement. In this case, a plus is assigned to verbs and common
                                        meaning terms that appear to support gun control. Each statement’s two
                                        values (the value of its verb and the value of its common meaning
                                        term) are multiplied, and then the products for all the statements in the
                                        response are summed, yielding a total score for each response.

Figure 2.7: Category Format Measuring Attitude Intensity

                                        Response 1
                                        “Personally, I’m for gun control, but I doubt that a general gun control bill would meet with
                                        very much success.”

                                        Subject   Verb    Value   Common meaning term                      Value   Product
                                        I         am      +3      for gun control                          +3      +9
                                        I         doubt   -2      bill would meet with very much success   +3      -6
                                        Total                                                                      +3

                                        Response 2
                                        “I urge the government to tighten its controls on handguns sold to residents.”

                                        Subject   Verb    Value   Common meaning term                      Value   Product
                                        I         urge    +3      government to tighten its                +3      +9
                                        Total                                                                      +9

                                         In the example in Figure 2.7, response 1 contains two statements while
                                         response 2 contains only one. The qualifying statement in the first
                                         response lowers its intensity so that, overall, the second response is
                                         given a higher intensity rating.
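
The arithmetic in figure 2.7 can be sketched in a few lines of Python. The values below are the ones shown in the figure; the data layout and the `intensity_score` function name are our own assumptions, offered only as an illustration.

```python
# Each statement is (verb value, common-meaning-term value), both from -3 to +3.
RESPONSES = {
    "response 1": [(+3, +3), (-2, +3)],  # "am ... for gun control"; "doubt ... success"
    "response 2": [(+3, +3)],            # "urge ... government to tighten"
}


def intensity_score(statements):
    """Multiply the two values in each statement, then sum the products."""
    for verb, term in statements:
        if not (-3 <= verb <= 3 and -3 <= term <= 3):
            raise ValueError("values must lie between -3 and +3")
    return sum(verb * term for verb, term in statements)


scores = {name: intensity_score(stmts) for name, stmts in RESPONSES.items()}
print(scores)  # {'response 1': 3, 'response 2': 9}
```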


Coding the Material                      Material can be coded either manually or by computers, depending on
                                         the resources available and the format of the material. This is step 5 in
                                         content analysis. If the material is already computerized, the analyst
                                         should explore the possibility of obtaining a computer program to do the
                                         coding. After deciding how the material will be coded, the analyst writes
                                         the necessary instructions. Figure 2.8 spells out the minimum require-
                                         ments for instructions for trained coders.

Figure 2.8: Guidelines for Contents of
Coding Instructions for Trained Coders
                                         1. Definition of recording units, including procedures for identifying

                                         2. Descriptions of the variables and categories.

                                         3. Outline of the cognitive procedures used in placing data in

                                         4. Instructions for using and administering data sheets.
                                         Source: Adapted from K. Krippendorff, Content Analysis: An Introduction to Its Methodology (Beverly
                                         Hills, Calif.: Sage Publications, 1980), p. 174.

Pretesting                               Pretesting is an important step before actual coding begins. It involves
                                         coding a small portion of the material to be analyzed or some other simi-
                                         lar material. From the pretests, the analyst tests and revises the coding
                                         categories and instructions, and does this several times in some cases.
                                         Pretesting is necessary whether computers are used for content analysis
                                         or the analysis is done by hand. Computer analysis requires test com-
                                         puter runs to ensure that the program is functioning as planned.

                                         A pretest enables the analyst to determine whether (1) the categories
                                          are clearly specified and meet the requirements in figure 2.2, (2) the
                                          coding instructions are adequate, and (3) the coders are suitable for the
                                         job. These determinations are made by assessing reliability among cod-
                                          ers and consistency in individual coding decisions (as we discuss below).
                                          Once the analyst has been assured that the material can be coded with
                                          high reliability, the pretests are over, and the coding can begin.


                           Data can, of course, be coded with the help of computer programs.
                           (Weber, 1986) This solves the reliability problem but generates others.
                           For one, all the material to be coded must be entered on a computer tape
                           or disk, even though this may be impractical. For another, computer
                           programs that perform content analysis require very specific categories.

                           For example, using a computer usually confines analysts to words as
                           recording units, but this means that every word being coded has to be
                           listed in the computer’s memory as in a dictionary. Preparing a diction-
                           ary, however, may be far more difficult than formulating categories.
                           Furthermore, because a word takes on different meanings in different
                           contexts (a subtlety that computers cannot discern but people can),
                           the results of computer coding may lack validity.
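
A minimal Python sketch can illustrate dictionary-based coding with words as recording units, along with the validity problem noted above. The dictionary entries, category names, and the `code_words` function are hypothetical and are not taken from any particular content analysis program.

```python
import re
from collections import Counter

# Hypothetical dictionary: every word to be coded must be listed in advance.
DICTIONARY = {
    "ban": "restriction",
    "forbid": "restriction",
    "control": "restriction",
    "right": "opposition",
    "freedom": "opposition",
}


def code_words(text):
    """Count dictionary categories, using single words as recording units."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(DICTIONARY[w] for w in words if w in DICTIONARY)


print(code_words("The panel would ban imports and control sales."))
# Counter({'restriction': 2})

# The lookup cannot see context: "right" here means "correct", not a legal
# right, yet it is still counted under "opposition".
print(code_words("The panel was right to act."))
# Counter({'opposition': 1})
```

The second call shows why results coded this way may need a manual check before they are treated as valid.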

                           Computers should not be completely discounted, however, because they
                           do have advantages. They are valuable in a number of situations. Com-
                           puters can save time and permit analysis of large amounts of data when
                           the word is the optimal unit of analysis. Because computers can
                           “remember” many more definitions than people can, they are useful
                           when categories are numerous. They are also valuable when data will be
                           reused. Thus, the cost of preparing a data base for a series of studies for
                           computer analysis may be offset by the benefit of having easily manage-
                           able data in the future. (Holsti, 1969, pp. 161-64)

Checking for Reliability   A check for reliability tells analysts the extent to which a measuring
                           procedure can produce the same results on repeated trials. (Carmines
                           and Zeller, 1979, p. 11) In content analysis, this means determining the
                           similarity with which two or more people categorize the same material.
                           Analysts have to assess reliability while pretesting the coding categories
                           and instructions and also throughout the coding process.

                           To check for reliability, an analyst compares the way independent cod-
                           ers have coded the same material.4 For example, two coders might be
                           given ten items to code individually. The analyst compares their coding
                           decisions and determines the extent to which they agree.

                           4 Many reliability formulas have been developed for computing the percentage agreement among cod-
                           ers. See Kaplan and Golden, 1949; Krippendorff, 1980; Robinson, 1967; and Spiegelman et al., 1967.
                           Scott’s formula is considered useful for two coders because it takes into account the extent of
                           intercoder agreement that may result by chance. See Scott, 1966; see also Holsti, 1969, pp. 140-41.
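
The comparison of two coders can be sketched in Python as follows. The paper does not reproduce Scott's formula, so the version below is the commonly stated form of Scott's pi (observed agreement corrected for chance agreement based on pooled category proportions); the function names and the ten coding decisions are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items that the two coders placed in the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def scotts_pi(coder_a, coder_b):
    """Scott's pi: observed agreement corrected for agreement expected by chance.

    Expected agreement is the sum of squared category proportions, with the
    proportions pooled over both coders' decisions.
    """
    observed = percent_agreement(coder_a, coder_b)
    pooled = Counter(coder_a) + Counter(coder_b)
    total = 2 * len(coder_a)
    expected = sum((count / total) ** 2 for count in pooled.values())
    if expected == 1:
        raise ValueError("pi is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical decisions by two coders on the same ten items.
coder_a = ["support", "support", "oppose", "oppose", "support",
           "uncommitted", "oppose", "support", "support", "oppose"]
coder_b = ["support", "support", "oppose", "support", "support",
           "uncommitted", "oppose", "support", "oppose", "oppose"]

print(percent_agreement(coder_a, coder_b))    # 0.8
print(round(scotts_pi(coder_a, coder_b), 3))  # 0.655
```

In this made-up case the coders agree on 8 of 10 items, but the chance-corrected figure is noticeably lower because most items fall into only two categories.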


                               What constitutes acceptable reliability is best decided case by case,
                               although analysts generally consider nothing lower than 80 to 90 per-
                               cent agreement as acceptable. Low reliability estimates do not reveal
                               whether the fault lies with the categories or with the coders. During the
                               pretest, therefore, it is important for the analyst to identify major
                               sources of discrepant coding and to learn the reasons for them. If the
                               coders are assumed to be competent, low reliability estimates indicate
                               that they are being asked to make finer discriminations than is possible
                               with their training and understanding of the categories.

                               One way to resolve this problem is to contrast data known to have been
                               coded reliably with the data that have not. This tells the analyst
                               whether errors are concentrated in a few categories or cut across all
                               categories. If the latter, the analyst should seriously reconsider the
                               entire design, including the decision to use content analysis. If only a
                               few areas are causing problems, then revising these categories (or the
                               instructions) may solve the problem. (Fox, 1969, pp. 670-72)
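
One way to see whether errors are concentrated in a few categories is to tabulate disagreements by category, as in the Python sketch below. The measure used here (the share of items involving a category on which the coders differed) is our own simple assumption rather than a procedure taken from Fox, and the pretest data are hypothetical.

```python
from collections import Counter


def disagreement_rates(coder_a, coder_b):
    """For each category, the share of items involving it that were coded differently.

    An item "involves" a category if either coder assigned that category to it.
    """
    involved = Counter()
    disagreed = Counter()
    for a, b in zip(coder_a, coder_b):
        for category in {a, b}:
            involved[category] += 1
            if a != b:
                disagreed[category] += 1
    return {c: disagreed[c] / involved[c] for c in involved}


# Hypothetical pretest decisions on eight items.
coder_a = ["support", "support", "oppose", "oppose", "other", "other", "other", "support"]
coder_b = ["support", "support", "oppose", "oppose", "support", "oppose", "other", "support"]

rates = disagreement_rates(coder_a, coder_b)
for category, rate in sorted(rates.items()):
    print(category, round(rate, 2))
```

In this made-up pretest the "other" category has by far the highest rate, which would point to revising that category or its instructions rather than reconsidering the whole design.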

Analyzing and                  The main objective of content analysis is to analyze information whose
Interpreting the               format has been transformed into one that is useful. This constitutes
                               step 6 and involves
                               • summarizing the coded data,
                               • discovering patterns and relationships within the data,
                               • testing hypotheses about the patterns and relationships, and
                               • relating the results to data obtained from other methods or situations
                                 in order to assess the validity of the analysis.

                               Neither these tasks nor the analytical techniques for accomplishing
                               them are unique to content analysis. Depending on the coding design, an
                               analyst can use a variety of statistical methods.

Summarizing Data and           The most common means of summarizing data is by looking at frequen-
Examining Their Patterns       cies among them. Absolute frequency might be the number of times
                               statements or issues are found in the sample; a relative frequency might
                               be represented by a percentage of the sample size. Analysts can compare
                               one category’s frequency to the average frequency for all categories, or
                               they can note changes in frequencies over time.
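
The frequency summaries just described can be sketched in Python as follows. The coded statements are hypothetical; the sketch computes absolute frequencies, relative frequencies, and each category's difference from the average frequency.

```python
from collections import Counter

# Hypothetical coded statements (one category label per recording unit).
coded = ["support", "oppose", "support", "uncommitted", "support", "oppose"]

absolute = Counter(coded)                               # times each category occurs
total = sum(absolute.values())
relative = {c: n / total for c, n in absolute.items()}  # share of the sample
average = total / len(absolute)                         # mean frequency per category

for category, n in absolute.most_common():
    print(f"{category}: {n} ({relative[category]:.0%}), "
          f"{n - average:+.1f} versus the category average")
```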


Figure 2.9: Issues Addressed by HUD’s Evaluation Units


                                          Source: U.S. General Accounting Office, HUD's Evaluation System--An Assessment, PAD-78-44
                                          (Washington, D.C.: 1978), p. 7.

                                           In the assessment of HUD's evaluation system, for example, after the GAO
                                           analysts had categorized the issues addressed in 38 evaluation reports
                                           from two offices, they summarized the number of studies discussing
                                           each issue. They used absolute frequencies, and we show their grand
                                           total in figure 2.9. Within this summary, the analysts reported that 20 of
                                           the 38 documents they reviewed were not directed toward any major
                                           housing and urban development issue and that 16 issues were not
                                           addressed at all. (GAO, 1978, p. 22)

                                           Another way of analyzing content analysis data is to examine relations
                                           among variables by cross-tabulating the co-occurrence of variables. Fig-
                                           ure 2.9, for example, shows the relationship between the issues
                                           addressed in various reports and the evaluation units that produced the
                                           reports. From this information, the GAO analysts identified little duplica-
                                           tion in the way the two offices addressed the issues.
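
A cross-tabulation of this kind can be sketched in Python as follows. The office names, issue labels, and counts are hypothetical and are not the HUD data; the sketch counts how often each unit addressed each issue and lists the issues addressed by more than one unit.

```python
from collections import Counter

# Hypothetical coded reports: (evaluation unit, issue addressed).
reports = [
    ("Office A", "housing assistance"),
    ("Office A", "housing assistance"),
    ("Office A", "community development"),
    ("Office B", "mortgage insurance"),
    ("Office B", "community development"),
]

cells = Counter(reports)                       # co-occurrence counts
units = sorted({unit for unit, _ in reports})
issues = sorted({issue for _, issue in reports})

print(f"{'Issue':<24}" + "".join(f"{u:>10}" for u in units))
for issue in issues:
    print(f"{issue:<24}" + "".join(f"{cells[(u, issue)]:>10}" for u in units))

# Issues addressed by more than one unit suggest possible duplication.
shared = [i for i in issues if sum(cells[(u, i)] > 0 for u in units) > 1]
print("Addressed by both offices:", shared)
```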

                     Cross-tabulations need not be limited to two or three variables. Mul-
                     tivariate techniques can be used to analyze complex structures. (Reyn-
                     olds, 1977) Other techniques for discovering patterns and relationships
                     in data include contingency analysis, clustering, and factor analysis;
                     Krippendorff discusses these and others. (Krippendorff, 1980, pp. 109-

Assessing Validity   Whatever the technique used, a final and important task is to assess the
                     validity of the results by relating them to other data that are known to
                     be reasonably valid. Validity is the extent to which an instrument meas-
                     ures what it is intended to measure. Reliability and adequate sampling
                     are necessary but not sufficient conditions for validating inferences
                     made through content analysis. In addition, analysts have to corrobo-
                     rate the results of content analysis with other data or by other proce-
                     dures that are known to be valid indicators of the phenomena they are

                     An example of validity assessment is provided in Ramallo’s analysis of
                     volunteers’ written reports of their experiences in Crossroads Africa, a
                     Peace Corps program. (Ramallo, 1966) He hypothesized that content
                     analysis of reports could distinguish successful volunteers from unsuc-
                     cessful ones, assuming that the unsuccessful volunteers would exhibit
                     greater alienation from their experiences. Ramallo compared his results
                     with supervisors’ ratings for the same volunteers and found a high cor-
                     relation between the two, concluding that his own analysis had pro-
                     duced a valid measure of success.
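
A comparison of this kind can be sketched in Python. The paper reports only that the correlation was high, so the use of the Pearson coefficient here is an assumption, and the scores and ratings below are hypothetical values, not Ramallo's figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / sqrt(var_x * var_y)


# Hypothetical data for six volunteers (not Ramallo's figures).
alienation_scores = [2, 5, 1, 7, 4, 6]    # from content analysis of reports
supervisor_ratings = [8, 5, 9, 2, 6, 4]   # higher means more successful

r = pearson_r(alienation_scores, supervisor_ratings)
print(round(r, 2))
```

A strongly negative coefficient in this made-up case would be consistent with the hypothesis, since higher alienation scores go with lower success ratings.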

                     Other equally appropriate measures could have been used to validate
                     Ramallo’s findings. Surveying the Africans with whom the volunteers
                     had worked is one. Measuring increases in food production or decreases
                     in infant mortality for each volunteer’s assigned village are others. The
                     use of plentiful and generally acceptable corroborating measures
                     reduces the risk of producing misleading evaluation findings.

Writing the Report   As in writing any GAO report, analysts should explain the scope and
                     nature of their work to indicate to their readers what they covered and
                     what the frame of reference is for their findings. (GAO, July 1988, chapter
                     12.8) Readers should be given a clear idea of what was done, why it
                     was done, and why the results provide a sound basis for conclusions and
                                         recommendations. Figure 2.10 outlines the record of information that
                                         analysts should maintain when they use content analysis.

Figure 2.10: Minimum Documentation for
a Content Analysis Study
                                         1. The study’s objectives, which governed the choice of data, methods,
                                         and study design.

                                         2. A justification of the choice of data, methods, and design.

                                         3. A description of the procedures (so that the research can be repli-
                                         cated), including descriptions of the

                                         • sampling plans,
                                         • units of analysis,
                                         • coding instructions,
                                         • results of reliability tests,
                                         • procedures for data handling and analysis, and
                                         • efforts at validating part or all of the procedure.

                                             4. The findings and their statistical significance.

                                             Content analysis results should be firm enough to withstand critical
                                             scrutiny. The information represented in the items mentioned in figure
                                             2.10 may be included in the main body of the report or in appendixes, or
                                             it may remain only in the workpapers.

                                             In any case, it should be documented well enough to enable critical
                                             readers to estimate how much they can rely on the reported results.

Summary                                      Content analysis is a set of procedures for transforming nonstructured,
                                             written material into a format for analysis. In this chapter, we have
                                             described those procedures. They are summarized as follows:

•   deciding to use content analysis based on a project’s objectives, the
    material that is available, and the kinds of comparison that are required;
•   determining what material should be included in content analysis, which
    may involve sampling;
•   selecting context units and recording units;
•   developing coding categories, quantification levels, and coding
•   pretesting the categories and then coding the material either manually
    or by computer;
•   checking reliability during pretests and throughout the coding;
•   analyzing and interpreting the coded data; and
•   assessing the validity of the findings.

Chapter 3

Why Should GAO Analysts Use
Content Analysis?

                          In this chapter, we conclude our discussion by presenting some reasons
                          both for using and not using content analysis. We discuss some advan-
                          tages and disadvantages of content analysis and give brief hypothetical
                          cases of potential application in GAO's work.

What Content              All researchers who want to analyze written material systematically
Analysis Can Do           should consider content analysis. It is a means of extracting insights
                          from already existing data sources. Therefore, it is potentially applica-
                          ble to at least part of almost every project.

It Can Provide            Content analysis of existing written or otherwise recorded material
Unobtrusive Measures      yields unobtrusive and nonreactive measures. One problem with some
                          experimental methods, as with surveys, is that interactions between
                          analysts and their subjects can cause the subjects to react to the situa-
                          tion rather than in their more “natural” manner, and this may introduce
                          bias into the results. Additionally, survey questions that are considered
                          inappropriate because they invade a respondent’s privacy may have to
                          be eliminated from analysis. Content analysis of existing documents
                          avoids both problems.

It Can Cope With Large    Large volumes of written material can be analyzed with the help of con-
Volumes of Written        tent analysis because explicit coding instructions, precise categories, and
Material                  extensive reliability checks make it possible to use any number of
                          trained individuals to code the material. Furthermore, it allows two or
                          more sets of coders to work on the same kind of data in different loca-
                          tions, such as at headquarters and in regional offices.

It Helps Analysts Learn   Content analysis can help analysts learn more about the programs they
About the Substantive     are investigating and their issues. This benefit results from two charac-
Area                      teristics: content analysis is systematic in nature, and its task of devis-
                          ing reliable and useful categories is rigorous.

It Can Validate Other     In chapter 2, we discussed how to validate content analysis findings by
Methods                   corroborating them with findings from other methods. Validation can
                          also move in the opposite direction. That is, findings from content analy-
                          sis can be used to test the validity of findings from other measures, such
                          as survey data and econometric proxies. Webb and others have
                          described how investigators can use “multiple operations” to increase
                          confidence in their findings. (Webb et al., 1981)

Pitfalls in Using             We have explained some of the many reasons for using content analysis,
Content Analysis              but analysts planning to undertake content analysis should also be
                              aware of some pitfalls that await them. The ready availability of rele-
                              vant material may tempt analysts into aimless and expensive “fishing
                              expeditions” motivated by the hope of turning up something interesting.
                              Quantifying documentary information may produce important and
                              interesting data, but failing to resist the temptation to count things for
                              the sake of counting is likely to produce precise but meaningless or trivial

It Can Be Costly              Content analysis is relatively costly and time consuming. Interviewing
                              users of content analysis and reviewing the literature on the method
                              reveal three potential contributions to prohibitive cost.

                              1. Formulating categories that can be reliably coded is problematic,
                              repetitive, and time consuming. The time it takes to structure and
                              pretest categories may range from a few days to two or three months.

                              2. Staff have to train coders if they intend to analyze more data than
                              they can handle themselves. Preparing a coding manual and training
                              and supervising the coders can add a significant length of time to a pro-
                              ject. Content analysis can be especially expensive in regard to time
                              expended if the categorization scheme requires subtle coding decisions.

                              3. Coding substantial amounts of written material takes a great deal of
                              staff time if the recording unit is small (for example, when it is words or
                              themes), and even more time when the context unit is large (for exam-
                              ple, when it is lengthy reports). Since coding must be systematic, it may
                              also be tedious and arduous. Using a computer trades the coding prob-
                              lem for that of computerizing the text or preparing a dictionary, which
                              can also be time consuming and therefore expensive.

It Can Pose Reliability and   Reliability and validity are interdependent concepts. Generally, trade-
Validity Problems             offs have to be made between them because precisely defined categories
                              can produce results that are highly reliable and statistically significant
                              but that lack practical significance. The need for objective and replicable
                              results may force analysts to forgo coding what they are interested in
                              and to code instead what can be done mechanically, thus threatening
                              validity. Redefining categories to increase their reliability can lead to a
                              loss of relevance (that is, a loss of validity) and, therefore, of useful-
                              ness. Because of this dilemma, validity has to be assessed after catego-
                              ries have been developed.

Potential Applications        terms of three factors: a project’s objectives, the material to be
in Program Evaluation         analyzed, and the kinds of analysis required. We give brief cases of
                              hypothetical application that focus on three program evaluation objectives,
                              showing how content analysis could be used to study them.

Identifying Program Goals     One objective of a program evaluation might be to identify the pro-
                              gram’s goals. To do this, an analyst might gather written or tape-
                              recorded information on the program’s legislative history from its
                              authorizing legislation and congressional committee reports, from pro-
                              gram policy documents, and from transcripts of interviews with agency
                              officials. With content analysis, the analyst’s review of this material
                              could be made objective and systematic. Besides providing analysts with
                              a structured format for identifying the program’s goals, this technique
                              can facilitate determination of whether those goals are congruent with
                              legislative intent because it allows, for example, comparison of agency
                              documents with congressional committee reports.

Describing Program            A program evaluation might have as an objective a description of the
Activities                    program’s activities. To achieve this objective, an analyst could develop
                              case studies, attend agency meetings, or interview program managers.
                              Information gathered in these ways would then be documented in staff
                              workpapers. These, in turn, can be examined by means of content
                              analysis. From such analysis, concise, objective summaries of the
                              material can be produced, or more complex analyses can be designed.
                              An example would be an analysis of trends in program activities
                              across time. The targeting
                              of program activities could also be examined with content analysis.
                              Recipients of program services could be interviewed and transcripts
                              could be made of their responses, after which their eligibility for receiv-
                              ing services could be examined by comparing information obtained from
                              the interviews with established eligibility criteria.
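
                              The eligibility comparison described above can be illustrated with a
                              minimal sketch. It assumes that interview transcripts have already
                              been coded into simple records. The criteria, thresholds, field names,
                              and recipient data below are all hypothetical and are not drawn from
                              any actual program.

```python
# Hypothetical eligibility criteria; a real program's rules would come
# from its authorizing legislation or regulations.
MIN_AGE = 60
MAX_ANNUAL_INCOME = 12000

# Hypothetical records coded from interview transcripts.
recipients = [
    {"id": "R1", "age": 67, "annual_income": 9500},
    {"id": "R2", "age": 58, "annual_income": 8000},
    {"id": "R3", "age": 72, "annual_income": 15000},
    {"id": "R4", "age": 61, "annual_income": 11000},
]


def failed_criteria(record):
    """List the criteria a coded record fails (empty list means eligible)."""
    failures = []
    if record["age"] < MIN_AGE:
        failures.append("age")
    if record["annual_income"] > MAX_ANNUAL_INCOME:
        failures.append("income")
    return failures


# Map each recipient to the criteria he or she fails.
results = {r["id"]: failed_criteria(r) for r in recipients}

# Share of interviewed recipients who meet every criterion.
eligible_share = sum(1 for f in results.values() if not f) / len(results)

for recipient_id, failures in results.items():
    status = "eligible" if not failures else "fails: " + ", ".join(failures)
    print(recipient_id, status)
```

                              Recording which criterion each recipient fails, rather than a single
                              yes-or-no result, keeps the coding decisions open to review.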

Determining Program   A program evaluation might have as an objective ascertaining the pro-
Results               gram’s results. In this situation, analysts might gather information by
                      studying earlier evaluation reports or by surveying program partici-
                      pants. In surveys, open-ended questions could be appropriate for
                      gaining information about issues, perceptions, or attitudes that cannot
                      otherwise be identified. Analysts who do not want to impose their own
                      concepts on survey respondents may, therefore, be unable to formulate
                      appropriate closed questions. Using content analysis on open-ended sur-
                      vey data, such analysts can examine trends in program outcomes across
                      time and compare them to changes in program activities. Alternatively,
                      they could examine trends across groups of program participants distin-
                      guished by geographical location, age, income, and the like.
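
                      As a rough illustration of examining trends in coded open-ended
                      responses, the sketch below tallies the share of responses assigned
                      to each category in each survey year. The category labels, years,
                      and data are hypothetical. The same function could be applied to
                      groups of participants by substituting a group label, such as a
                      region, for the year.

```python
from collections import Counter, defaultdict

# Hypothetical coded data: (survey year, category assigned by a coder).
coded_responses = [
    (1986, "found employment"),
    (1986, "no change"),
    (1986, "no change"),
    (1987, "found employment"),
    (1987, "found employment"),
    (1987, "no change"),
    (1988, "found employment"),
    (1988, "found employment"),
    (1988, "found employment"),
    (1988, "no change"),
]


def category_shares(records):
    """Return {period: {category: share of that period's responses}}."""
    counts = defaultdict(Counter)
    for period, category in records:
        counts[period][category] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {cat: n / total for cat, n in counter.items()}
    return shares


shares = category_shares(coded_responses)
for year in sorted(shares):
    # Share of each year's responses coded as "found employment".
    print(year, round(shares[year].get("found employment", 0.0), 2))
```

                      Using shares rather than raw counts allows years or groups with
                      different numbers of respondents to be compared.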

Conclusion            We hope we have given readers of this paper a realistic sense of both the
                      advantages and disadvantages of content analysis. The method does
                      have limitations. Without clear objectives, content analysis can produce
                      very precise information that is, however, meaningless. The method can
                      be costly in that formulating categories that can be reliably coded, pre-
                      paring coding instructions, and training and supervising coders can all
                      be time consuming. Additionally, complex coding schemes, which usu-
                      ally yield the most interesting findings, may produce the least reliable
                      results because they entail a substantial element of coder judgment.
                      Content analysis, therefore, requires rigorous reliability and validity
                      checks if its results are to withstand critical scrutiny. Moreover, the
                      results also depend on the quality of information contained in the docu-
                      ments being analyzed. If these are not reliable or valid, even the most
                      rigorous content analysis will have limited value.
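
                      One possible form of reliability check is to have two coders code
                      the same material independently and then measure their agreement.
                      The sketch below computes simple percent agreement and Cohen's
                      kappa, a widely used statistic that adjusts agreement for chance.
                      Kappa is offered here as an assumption about a suitable measure,
                      not as the measure this paper prescribes. The category labels and
                      the coders' decisions are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of units to which both coders assigned the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("coders must code the same, nonempty set of units")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: probability both coders pick the same category
    # if each assigned categories at his or her own observed rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same eight passages.
coder_a = ["goal", "goal", "activity", "result",
           "activity", "goal", "result", "activity"]
coder_b = ["goal", "activity", "activity", "result",
           "activity", "goal", "result", "goal"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(round(agreement, 2), round(kappa, 2))
```

                      Percent agreement alone can overstate reliability when a few
                      categories dominate, which is why a chance-corrected figure is
                      shown alongside it.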

                      Nonetheless, content analysis is potentially applicable to at least part of
                      almost all projects. Content analysis can be used at any stage of a pro-
                      ject, but it is particularly useful at the beginning to help analysts learn
                      about the project’s substantive area. It is an excellent method for gath-
                      ering retrospective information about a program from existing data
                      sources. It does not require the collection of new data, and this means
                      that it saves time and money. The possibilities for application we have
                      discussed in this chapter are not exhaustive; rather, we have intended to
                      show the method’s versatility. The number and kind of areas in which
                      content analysis can be applied and the questions it can help answer are
                      limited primarily by its user’s ingenuity and skill in structuring reliable
                      and valid category formats.

