
Decennial Census: Methods for Collecting and Reporting Hispanic Subgroup Data Need Refinement

Published by the Government Accountability Office on 2003-01-17.


               United States General Accounting Office

GAO            Report to Congressional Requesters




January 2003
               DECENNIAL CENSUS
               Methods for Collecting
               and Reporting
               Hispanic Subgroup
               Data Need Refinement




GAO-03-228
Highlights of GAO-03-228, a report to Congressional Requesters

To help boost response rates of both the general and Hispanic populations, the
U.S. Census Bureau (Bureau) redesigned the 2000 questionnaire, in part by
deleting a list of examples of Hispanic subgroups from the question on Hispanic
origin. While more Hispanics were counted in 2000 compared to 1990, the counts
for Dominicans and other Hispanic subgroups were lower than expected. Concerned
that this was caused by the deletion of Hispanic subgroup examples,
congressional requesters asked us to investigate the research and management
activities behind the changes.

In both the 1990 and 2000 censuses, Hispanics could identify themselves as
Mexican, Puerto Rican, Cuban, or other Hispanic. Respondents checking off
this latter category could write in a specific subgroup such as “Salvadoran.”
The “other” category in the 1990 Census included examples of subgroups to
clarify the question. For the 2000 Census, the Bureau removed the subgroup
examples as part of a broader effort to simplify the questionnaire and help
improve response rates. The Bureau removed unnecessary words and added
blank space to shorten the questionnaire and make it more readable.

Although the Bureau conducted a number of tests on the sequencing and
wording of the race and ethnicity questions, and sought input from several
expert panels, no Bureau tests were designed specifically to measure the
impact of the questionnaire changes on the quality of Hispanic subgroup data.
According to Bureau officials, because federal laws and guidelines require
data on Hispanics but not Hispanic subgroups, the Bureau targeted its
resources on research aimed at improving the overall count of Hispanics.
Bureau evaluations conducted after the census indicated that deleting the
subgroup examples might have confused some respondents and produced
less-than-accurate subgroup data. A key factor behind the Bureau’s release of
the questionable subgroup data was its lack of adequate guidelines governing
the quality needed before making data publicly available. As part of its
planning for the 2010 Census, the Bureau intends to conduct further research
on the Hispanic origin question, including a field test in parts of New York
City. However, until research on a new version of the question is finalized,
Bureau officials said that other census surveys will continue to use the 2000
Census format of the Hispanic origin question.

GAO recommends that the Bureau
  •   implement its plans to conduct further research on the Hispanic question,
      taking steps to properly test the impact of any changes on the quality of
      data on Hispanic subgroups and Hispanics overall, and
  •   develop agencywide protocols that provide guidelines for Bureau decisions
      on the level of quality needed to release data to the public, how to
      characterize any limitations in the data, and when it is acceptable to
      delay or suppress the data.

The Bureau agreed with our recommendations, but took exception to our findings
concerning the adequacy of its data quality guidelines.

[Photograph: Enumerator Administers Census Questionnaire. Source: U.S. Census Bureau.]

www.gao.gov/cgi-bin/getrpt?GAO-03-228.

To view the full report, including the scope and methodology, click on the link
above. For more information, contact Patricia A. Dalton at (202) 512-6806.

Contents

Letter
    Results in Brief
    Background
    Objectives, Scope, and Methodology
    Efforts to Simplify Questionnaire Led Bureau to Delete List of Example Hispanic Subgroups
    The Bureau Plans to Conduct Targeted Research on Hispanic Subgroups in the Future
    Conclusions
    Recommendations for Executive Action
    Agency Comments and Our Evaluation

Appendix
    Appendix I: Comments from the Department of Commerce

Related GAO Products

Figures
    Figure 1: Evolution of the Hispanic Question from the 1970 Census to the 2000 Census
    Figure 2: The Bureau Simplified the 2000 Census Questionnaire
    Figure 3: The 2000-Style Questionnaire Produced Lower Subgroup Counts than Those from a Test Using the 1990-Style Questionnaire




                            This is a work of the U.S. Government and is not subject to copyright protection in the
                            United States. It may be reproduced and distributed in its entirety without further
                            permission from GAO. It may contain copyrighted graphics, images or other materials.
                            Permission from the copyright holder may be necessary should you wish to reproduce
                            copyrighted materials separately from GAO’s product.




                           Page i                                                       GAO-03-228 Decennial Census
United States General Accounting Office
Washington, D.C. 20548



                                    January 17, 2003




                                    The Honorable Danny K. Davis
                                    Ranking Minority Member
                                    Subcommittee on Civil Service,
                                      Census and Agency Organization
                                    Committee on Government Reform
                                    House of Representatives

                                    The Honorable Wm. Lacy Clay
                                    The Honorable Charles A. Gonzalez
                                    The Honorable Carolyn B. Maloney
                                    House of Representatives

                                    Collecting data on race and ethnicity is among the federal government’s
                                    most complex and controversial data collection efforts. The decennial
                                    census has collected these data in various forms beginning with the very
                                    first national headcount in 1790. Since the 1960s, race and ethnicity data
                                    have been used to monitor and enforce compliance with a number of civil
                                    rights laws, including those governing equality in employment, voting,
                                    housing, mortgage lending, health care services, and education. Over time,
                                    in response to changing federal mandates, demographics, and its own
                                    operational requirements, the U.S. Census Bureau (Bureau) has changed
                                    the format and sequence of the race and ethnicity questions. The Bureau
                                    made one such change for the 2000 Census when, in an effort to improve
                                    the count of Hispanics and simplify the questionnaire, it redesigned the
                                    question on Hispanic origin and dropped a list of examples of Hispanic
                                    subgroups.

                                    As soon as the Hispanic and Hispanic subgroup data from the 2000 Census
                                    were released in May 2001, questions were raised about the counts for
                                    specific Hispanic subgroups. For example, the reported count of
                                    Dominican Hispanics was significantly lower than the counts reported in
                                    other Bureau surveys. Concerned that the lower-than-expected Hispanic
                                    subgroup counts were the result of dropping the list of example write-in
                                    Hispanic subgroups from the 2000 questionnaire, you asked us to
                                    investigate the research and management activities behind this change. As
                                    agreed with your offices, we reviewed (1) the decision-making process
                                    behind the Bureau’s removal of the example subgroups, (2) the research
                                    the Bureau conducted to aid in that decision, and (3) the Bureau’s future
                                    plans for collecting Hispanic subgroup data.




                   This report parallels our recent study addressing congressional concerns
                   about how the Bureau reported data on people counted at emergency and
                   transitional shelters, a segment of the population that includes, among
                   others, the homeless.1 Both reports are part of our ongoing series on
                   lessons learned from the 2000 Census that can help inform the planning
                   effort for 2010. (See the Related GAO Products section for the reports
                   issued to date.)



Results in Brief   The Bureau removed examples of Hispanic subgroups from the census
                   question on Hispanic origin as part of an effort to make the questionnaire
                   more “respondent-friendly.” The Bureau’s evaluations of the 1990 Census
                   indicated that deleting unnecessary words and adding more white space,
                   among other changes, could help improve response rates. The Bureau also
                   modified the wording and format of the Hispanic question in order to
                   improve Hispanic participation in the census.

                   Throughout the 1990s, the Bureau conducted a number of tests to
                   determine the impact that these and other changes had on the overall count
                   of Hispanics. However, because Office of Management and Budget
                   standards governing the collection of race and ethnic data do not require
                   data on Hispanic subgroups, the Bureau did not specifically design any
                   tests to determine the likely effect of the changes on the quality of Hispanic
                   subgroup data.

                   Although the Bureau did not test the likely impact of questionnaire changes
                   on the Hispanic subgroup data, it released subgroup counts along with the
                   overall Hispanic data in May 2001. Immediately following the release of
                   these data, local government officials and representatives of Hispanic
                   subgroups raised questions about the accuracy of specific subgroup
                   counts. Bureau evaluations conducted following the census suggest that
                   dropping the examples of Hispanic subgroups confused some respondents
                   and produced less-than-accurate subgroup data. For example, in one
                   experiment, the Bureau mailed a 1990-style questionnaire (which included
                   subgroup examples) to a sample of individuals as part of the 2000 Census.
                   The Bureau found that 93 percent of Hispanics given the 1990-style form
                   reported a specific subgroup, compared to 81 percent of Hispanics given
                   the 2000-style form. Thus, while the Bureau reported what respondents
                   marked on their questionnaires, because of respondents’ confusion over
                   the wording of the question, the subgroup data could be misleading.

                   1 U.S. General Accounting Office, Decennial Census: Methods for Collecting and
                   Reporting Data on the Homeless and Others without Conventional Housing Need
                   Refinement, GAO-03-227 (Washington, D.C.: Jan. 17, 2003).

The Bureau has made improving the quality of the Hispanic question a
focus for the 2010 Census and intends to test questionnaire changes aimed
at improving the quality of its overall count of Hispanics and its counts of
Hispanic subgroups. In 2003, the Bureau is to begin testing the Hispanic
question, and as part of a field test in 2004, the Bureau plans to administer
the questionnaire in parts of the New York City borough of Queens. Any
changes to the census questionnaire will also affect other Bureau surveys,
such as the proposed American Community Survey (ACS), which the
Bureau designed in part to replace the census long-form questionnaire.
Bureau officials said that the ACS will continue to use the 2000 Census
Hispanic question until research and testing on a new version is complete.

A key factor behind the Bureau’s release of apparently less-than-accurate
Hispanic subgroup data appears to be a lack of adequate guidelines
governing decisions on quality considerations that should be addressed
before making data publicly available. Had such guidelines been in place
prior to releasing the Hispanic subgroup data, they could have prompted
the Bureau to apply more rigorous quality checks on the accuracy of the
Hispanic subgroup data; provided a basis for either releasing, delaying, or
suppressing the data; and informed decisions on how to describe any of
their limitations.

The lack of data quality guidelines resulted in similar difficulties when the
Bureau initially decided not to release data on the homeless and others
without conventional housing. In our companion report, we recommended
that the Secretary of Commerce ensure that the Bureau develop
agencywide guidelines governing the level of quality needed to release data
to the public, when and how to characterize any limitations, and when it is
acceptable to suppress data. Because these incidents, if repeated, could
erode public confidence in the data, it will be important for the Bureau to
implement these recommendations. Additionally, with respect to the
Hispanic subgroup data, we are recommending that the Bureau take steps
to properly test the impact that any changes to the Hispanic origin question
have on the quality of Hispanic data, and on the quality of Hispanic
subgroup data in particular.

The Secretary of Commerce forwarded written comments from the U.S.
Census Bureau on a draft of this report (see app. I). The Bureau agreed
with our conclusions and recommendations and is taking steps to
implement them, but took exception to our findings concerning the
adequacy of its data quality guidelines.



Background   While the decennial census has long collected data on race and ethnicity,2 a
             specific question on Hispanic origin was first added to the 1970 Census in
             response to the 1965 Voting Rights Act, which required the data to ensure
             equality in voting.3 Today, antidiscrimination provisions in a number of
             statutes require census data on race and Hispanic origin in order to monitor
             and enforce equal access to housing, education, employment, and other
             areas. The Office of Management and Budget (OMB), through its Federal
             Statistical Policy Directive No. 15, sets the standards governing federal
             agencies’ collection and reporting of race and ethnicity data.

             At least seven cabinet-level government departments, the Federal Reserve,
             every state government, and a number of public and private organizations
             use Hispanic data. Although not required by federal legislation or OMB
             standards, Hispanic subgroup data are also used for many of these same
             purposes. In addition, subgroup data are especially important to
             communities with rapidly growing and diverse Hispanic populations.

             Collecting data on race and ethnicity has been a persistent challenge for the
             Bureau. Race and ethnicity are subjective characteristics, which makes
             measurement difficult. Moreover, the Bureau has found that some
             Hispanics equate their ethnicity—Hispanic—with race, and thus find it
             difficult to classify themselves by the standard race categories that include,
             for example, white, black, and Asian.

             The Bureau’s preparations for the 2000 Census included an extensive
             research and testing program to improve the Hispanic count. In 1990, the
             Bureau estimated that it did not enumerate 5 percent of the Hispanic
             population. Further, the ethnicity question, which was posed to all
             respondents, appeared to confuse both Hispanics and non-Hispanics. For
             example, many non-Hispanics, thinking the question only pertained to
             Hispanics, did not answer the question. Overall, 10 percent of respondents
             failed to answer the 1990 Hispanic question, the highest of any short form
             item in 1990. As a result, the Bureau made improving the Hispanic count a
             major priority for the 2000 Census.

             2 The Bureau, in accordance with Office of Management and Budget Federal Statistical
             Policy Directive 15, Race and Ethnic Standards for Federal Statistics and Administrative
             Reporting, collects data on two ethnicities: Hispanic origin and not of Hispanic origin. We
             use the same definition in this report. Additionally, the standards call for self-reporting of
             race and ethnicity rather than identification based on scientific or anthropological
             standards. The standards also cover reporting on race and ethnicity in administrative
             reports and for civil rights monitoring. They also specify that the data are not to be used for
             determining program eligibility.

             3 42 U.S.C. 1973aa-1a.



Objectives, Scope, and Methodology

                         Our objectives were to review (1) the Bureau’s decision-making process
                         that led to its dropping the list of subgroup examples from the Hispanic
                         question on the 2000 Census form, (2) the research conducted by the
                         Bureau to aid in this decision, and (3) the Bureau’s future plans for
                         collecting Hispanic subgroup data.

                         To address each of these objectives, we interviewed key Bureau officials
                         and examined Bureau, OMB, and other documents, including planning
                         materials and internal memos. To obtain a local perspective of how
                         municipal governments and community leaders use Hispanic subgroup
                         data, we met with data users in New York City, including representatives of
                         the New York City Department of Planning and the Dominican and Puerto Rican
                         communities. We also attended a meeting of the Dominican American
                         National Round Table, a Dominican American advocacy group that
                         discussed issues relating to the 2000 Census count of Dominican Hispanics.
                         We also attended meetings of the Census Advisory Committee on Race and
                         Ethnicity that addressed the issue of the quality of the Hispanic subgroup
                         data.

                         Finally, to examine the research behind the Bureau’s decision to remove
                         the example subgroups from the 2000 questionnaire, we reviewed the
                         results of the Bureau’s National Content Survey, Targeted Race and
                         Ethnicity Test, and other research conducted throughout the 1990s in
                         preparation for the 2000 Census. Additionally, we reviewed information
                         from the Bureau’s meetings with its Advisory Committee on the Decennial
                         Census and its Advisory Committee on Race and Ethnicity. We also
                         examined relevant materials from OMB’s Interagency Committee for the
                         Review of the Racial and Ethnic Standards.

                         To review the Bureau’s future plans for collecting Hispanic subgroup data,
                         we attended meetings of the National Academy of Sciences Panel on Future
                         Census Methods, the Decennial Census Advisory Committee, and the
                         Census Advisory Committee on Race and Ethnicity. We also discussed
                         these plans with Bureau officials.

                        Our audit work was conducted in New York City and Washington, D.C., and
                        at the Bureau’s headquarters in Suitland, Maryland, from January through
                        September 2002. Our work was done in accordance with generally
                        accepted government auditing standards.

                        We requested comments on a draft of this report from the Secretary of
                        Commerce. On November 27, 2002, the Secretary forwarded the U.S.
                        Census Bureau’s written comments on the draft. The comments are
                        reprinted in appendix I. We address these comments at the end of this
                        report.



Efforts to Simplify Questionnaire Led Bureau to Delete List of Example Hispanic Subgroups

                        Collecting accurate ethnic data has challenged the Bureau for over 30
                        years. Since the 1970 Census, when the Bureau first included a question on
                        Hispanic origin, every census has had comparatively high Hispanic
                        undercounts that reduced the quality of the data. As a result, the Bureau
                        has modified the Hispanic question on every census since then as part of a
                        continuing effort to improve the Hispanic count. (See fig. 1.) In addition, a
                        Spanish language version of the census form has been available upon
                        request since 1980.




Figure 1: Evolution of the Hispanic Question from the 1970 Census to the 2000 Census

[Figure: images of the Hispanic question as it appeared on each census form, annotated with the changes listed below.]

  1970   First time data were collected.
  1980   "Spanish/Hispanic" added to question.
         "No" moved to front of list.
         "Central or South American" removed.
  1990   Dropped the word "descent."
         Example write-in groups listed; respondents allowed to provide
         a write-in response for "other Spanish/Hispanic."
  2000   "Latino" added.
         Dropped the word "origin."
         Location of instructions to write in subgroups moved.
         Examples of write-in other Hispanic subgroups were removed.

Source: U.S. Census Bureau and GAO analysis.




For the 2000 Census, Hispanics could identify themselves as Mexican,
Puerto Rican, Cuban, or “other Spanish/Hispanic/Latino.” Respondents
who checked off this last category could write in a specific subgroup such
as “Salvadoran.” Although this approach was similar to that used for the
1990 Census, as shown in figure 1, the “other” category in the 1990 Census
included examples of other Hispanic subgroups. The Bureau deleted these
examples as one of several changes to the Hispanic question for the 2000
Census. Other changes included (1) adding the word “Latino” to the
designation Spanish/Hispanic, (2) dropping the word “origin” from the
question, and (3) moving the location of instructions on writing in an
unlisted subgroup. According to Bureau officials, these latter three
changes were made to improve the Hispanic count.

The Bureau removed the subgroup examples as part of a broader effort to
simplify the questionnaire and thus help reverse the downward trend in
mail response rates that had been occurring since 1970. Indeed,
evaluations of the 1990 Census indicated that the overall design of the form
was confusing to many and contributed to lower response rates,
particularly among some hard-to-enumerate groups such as Hispanics. In
redesigning the questionnaire, the Bureau added as much white space as
possible, and removed unnecessary words to make the questionnaire
shorter and more readable. As shown in figure 2, the 2000 questionnaire
appears more “respondent-friendly” compared to the 1990 questionnaire.




Figure 2: The Bureau Simplified the 2000 Census Questionnaire

[Figure: side-by-side images of the 1990 and 2000 questionnaires, with numbered callouts keyed to the comparison below.]

       1990 Questionnaire                           2000 Questionnaire
  1    Multiple people on each page                 Each household member on separate page
  2    Race question before Hispanic question       Hispanic question before race question
  3    Respondents fill in bubbles to mark age      Respondents write in age, saving space
  4    Space used to list many Hispanic subgroups   Fewer Hispanic subgroups listed, saving space

Source: U.S. Census Bureau and GAO analysis.




The Bureau initially proposed removing the example write-in subgroups
during 1990 through 1992. A first version of the questionnaire without the
example subgroups was used in the 1992 National Census Test. However,
as discussed in the next section, testing continued from 1992 to 1996 to
ensure that removing the write-in example groups did not harm the overall
count of Hispanics. From 1995 to 1997, after testing showed that removal
of the write-in example groups would not harm the overall Hispanic count,
the Bureau finalized its decision to remove the example subgroups.

Although federal law and OMB standards4 only require information on
whether an individual is Hispanic, Bureau officials told us they collect
subgroup data to help improve the overall Hispanic count. According to
the Bureau, many Hispanics do not view themselves as Hispanic, but
identify instead with their country of origin or with a particular Hispanic
subgroup. State and local governments, academic institutions, community
organizations, and marketing firms, among other organizations, also use
Hispanic subgroup data for a variety of purposes. For example, officials in
the New York City Department of Planning told us that they need accurate
information on the number and distribution of Hispanic subgroups in
planning the delivery of numerous city services.

According to a Bureau official, no data are available on the precise impact
the questionnaire redesign had on overall response rates in part because it
was made in conjunction with other efforts to improve the response rate,
such as a more aggressive outreach and promotion campaign. However,
the initial mail response rate was 64 percent, 3 percentage points higher
than the Bureau’s expectations, and comparable to the corresponding 1990
mail response rate.




4
 Public Law 94-311 requires the collection of data on “Americans of Spanish origin or
descent.” OMB Federal Statistical Policy Directive 15 states that collection of data on
Hispanic subgroups is optional, as long as the collection of these data does not harm efforts
to collect accurate data on the number of Hispanics.




                           Moreover, evaluations conducted since the 2000 Census by the Bureau
                           indicate that the Bureau obtained a more complete count of Hispanics in
                           the 2000 Census than it did in 1990. For example, Bureau data show that
                           the 2000 Census missed an estimated 2.85 percent of the Hispanic
                           population compared to an estimated 4.99 percent in 1990—a 43 percent
                           reduction of the undercount.5 The Bureau credits the improvement in part
                           to the changes it made to the questionnaire. However, as discussed in the
                           next section, removing the examples of Hispanic subgroups may have
                           reduced the completeness of data on individual segments of the Hispanic
                           population.
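
The 43 percent figure above is a relative reduction, not a difference in percentage points. As a minimal sketch of that arithmetic, using only the two undercount estimates quoted in the paragraph (the function and variable names are illustrative assumptions, not Bureau terminology):

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# Estimated net Hispanic undercount, in percent, as quoted in the report.
undercount_1990 = 4.99
undercount_2000 = 2.85

# Difference in percentage points versus relative (percent) reduction.
point_drop = undercount_1990 - undercount_2000
reduction = relative_reduction(undercount_1990, undercount_2000)

print(f"Drop in percentage points: {point_drop:.2f}")
print(f"Relative reduction: {reduction:.1f} percent")
```

The relative reduction works out to roughly 42.9 percent, which rounds to the 43 percent reported.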



No Bureau Tests Were Designed Specifically to Measure the Impact of
Questionnaire Changes on Hispanic Subgroup Data

                           Bureau guidance requires that any changes to the census form first be
                           thoroughly tested. For example, according to Bureau officials, before
                           changing a question, the Bureau must first conduct research studies,
                           cognitive tests, and field tests to determine how best to sequence and word
                           the question, and to see if the proposed changes are likely to achieve the
                           desired results. Additionally, the census questionnaire is to be reviewed by
                           a variety of census advisory groups, OMB, and Congress before it is
                           finalized.

                           Nevertheless, while the Bureau conducted a number of tests of the
                           sequencing and wording of the race and ethnicity questions, according to
                           Bureau officials, it did not specifically design any tests to determine the
                           impact of the changes on the quality of Hispanic subgroup data.6 Because
                           OMB standards do not require data on Hispanic subgroups, Bureau officials
                           said that the Bureau targeted its resources on testing and research aimed at
                           improving the overall count of Hispanics.




                           5
                            These figures represent the net Hispanic undercount, which is the difference between the
                           estimated Hispanic population per the Bureau’s Accuracy and Coverage Evaluation Survey
                           and the census count.
                           6
                            The Census Bureau did look at the impact of changes on Hispanic subgroups. However, the
                           sample size in the test was not large enough to detect statistically significant differences for
                           the Hispanic subgroups that constitute the “Other Spanish/Hispanic/Latino” population.
                           Additionally, the test was not designed to detect the impact of each change to the question
                           separately.




Throughout the 1990s, in revising the race and ethnicity questions, the
Bureau sought input from several expert panels, including the Interagency
Committee formed by OMB7 and the Census Advisory Committee on Racial
and Ethnic Populations, one of several panels with which the Bureau
consulted to help it plan the 2000 Census. In addition, the Bureau
conducted several tests of the questionnaire to assess respondents’
understanding of the questions and their ability to complete them properly.
They included the

• 1992 National Census Test, which field tested potential questions for the
  2000 Census questionnaire;

• 1996 National Content Survey, which examined a number of issues to
  improve race and ethnic reporting; and

• 1996 Race and Ethnic Targeted Test, which tested alternative formats
  for asking race and ethnic questions.

In addition, the Bureau analyzed the results of Hispanic data from the 1990
Census (which led to its conclusions about the undercount), but did not
conduct any specific evaluations of the quality of the 1990 Hispanic
subgroup data. The consultation, research, and testing played a key role in
the Bureau’s decisions to place the ethnicity question before the race
question and make several other changes discussed earlier in this report.

The test results also indicated that the example subgroups could produce
conflicting results. On the one hand, the Bureau found that providing the
example subgroups could help prevent respondents’ confusion over how to
describe their ethnicity. On the other hand, the Bureau found that
removing the example subgroups could help reduce the bias caused by the
example effect, which occurs when a respondent erroneously selects a
response because it is provided in the questionnaire.

Although the Bureau conducted a dress rehearsal for the 2000 Census in
1998 in order to test its overall design, the dress rehearsal did not identify
any problems with the Hispanic subgroup question. According to Bureau
officials, this could have been because none of the three test sites (the city
of Sacramento, California; Menominee County, Wisconsin, including the
Menominee American Indian Reservation; and the city of Columbia, South
Carolina, and its 11 surrounding counties) had a large and diverse enough
Hispanic population for the problems to become evident.


7
 A group of more than 30 agencies that represent the many and diverse federal needs for
data on race and ethnicity, including statutory requirements for such data.



Questions Raised about the Quality of Reported Hispanic Subgroup Data

                             In May 2001, the Bureau released data on Hispanics and Hispanic
                             subgroups as part of its first release summarizing the results of the 2000
                             Census, called the SF-1 file. The Bureau also published The Hispanic
                             Population, a 2000 Census brief that provided an overview of the size and
                             distribution of the Hispanic population in 2000 and highlighted changes in
                             the population since the 1990 census. For the first time, the Bureau
                             released data on Hispanic subgroups as a part of its release of the full-count
                             SF-1 data even though it had not fully tested the impact of questionnaire
                             changes on the subgroup data and provided little discussion of the
                             potential limitations of the data.

                             Following the initial release of the Hispanic data, local government officials
                             and Hispanic advocacy groups raised questions about the accuracy of the
                             counts of Hispanic subgroups listed as examples on the 1990 census form,
                             but not the 2000 form. The 2000 Census showed lower counts of several
                             Hispanic subgroups than analysts had expected based on their own
                             estimates using a variety of information sources such as vital statistics,
                             immigration statistics, population surveys, and other data. In New York
                             City, local government officials and representatives of Hispanic subgroups
                             who partnered with the Bureau to improve the enumeration of Hispanics
                             told us that they were particularly concerned about low subgroup counts in
                             their communities in part because they needed accurate numbers to plan
                             and deliver specialized services to particular subgroups. Moreover, they
                             said that because “official census numbers” are often considered definitive,
                             problems with the released Hispanic subgroup numbers could lead to
                             faulty decision making by data users.




Questionnaire Modifications May Have Led to Problems with Hispanic
Subgroup Data

                              Since the release of the 2000 Census Hispanic data, the Bureau has
                              conducted evaluations of the data that provided more information on how
                              removing the subgroup examples may have affected the quality of Hispanic
                              subgroup data. One key evaluation was the Alternative Questionnaire
                              Experiment, in which the Bureau sent out 1990-style census forms to a
                              sample of individuals as part of the 2000 Census. As shown in figure 3, the
                              Bureau’s research indicates that the 1990-style form elicited more reports
                              of specific Hispanic subgroups than the 2000-style questionnaire.8 Indeed,
                              93 percent of Hispanics given the 1990-style form reported a specific
                              subgroup, compared to 81 percent of Hispanics given the 2000-style form.
                              Moreover, virtually every subgroup reported in the 2000-style form
                              composed a smaller percentage of the overall Hispanic count than in the
                              1990-style form. Thus, while the Bureau reported what respondents checked
                              off on their questionnaires, because of respondents’ confusion over the
                              wording of the question, the 2000 subgroup data could be misleading.

                              Figure 3 also suggests that one possible reason for this might be that many
                              respondents did not understand what they were supposed to write in, as
                              many more people on the 2000-style form wrote in “Hispanic,” “Spanish,” or
                              “Latino” (as opposed to a specific subgroup) compared to the 1990-style
                              questionnaire. Additionally, a higher percentage of the respondents did not
                              provide codeable (useable) responses.

                              Moreover, based on its analysis of the Census 2000 Supplementary
                              Survey—an operational test for collecting long-form-type data based on a
                              nationwide sample of 700,000 households—the Bureau estimated that
                              there were about 150,000 more Dominican Hispanics than were counted in
                              the 2000 Census. Some attribute the discrepancy to the fact that many
                              respondents to the supplementary survey provided their answers by
                              telephone, where enumerators were able to help them better understand
                              the question on Hispanic subgroups.




                              8
                               This study was conducted in English only. Because a sizable number of Hispanics only
                              speak Spanish, the results of this study cannot be generalized to the Hispanic population at
                              large.




Figure 3: The 2000-Style Questionnaire Produced Lower Subgroup Counts than Those from a Test Using the 1990-Style
Questionnaire

Percentage

Reported Hispanic subgroup                     Census 2000 questionnaire   1990-style questionnaire
Argentinian                                                         0.24                       0.32
Colombian                                                           1.34                       1.89
Dominican                                                           2.59                       2.76
Nicaraguan                                                          0.52                       0.57
Salvadoran                                                          1.39                       2.28
Spaniard                                                            0.32                       3.33
Other specific                                                      4.20                       8.68
Wrote in “Hispanic,” “Latino,” or “Spanish”                        11.90                       1.90
Uncodeable                                                          7.25                       5.03

Source: U.S. Census Bureau and GAO analysis.
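
The figure 3 values can be cross-checked against the 93 percent and 81 percent figures cited in the text. The sketch below assumes, as a simplification that the report does not state outright, that the "wrote in Hispanic, Latino, or Spanish" and "uncodeable" bars together account for all responses that did not name a specific subgroup. The dictionary and function names are illustrative only.

```python
# Percent of Hispanic responses that did not name a specific subgroup,
# transcribed from figure 3 as (Census 2000 form, 1990-style form).
# Assumption: these two categories cover all non-specific responses.
NONSPECIFIC = {
    "Wrote in Hispanic, Latino, or Spanish": (11.90, 1.90),
    "Uncodeable": (7.25, 5.03),
}


def specific_share(form_index: int) -> float:
    """Share of responses naming a specific subgroup (0 = 2000, 1 = 1990-style)."""
    nonspecific = sum(pair[form_index] for pair in NONSPECIFIC.values())
    return 100.0 - nonspecific


share_2000 = specific_share(0)
share_1990 = specific_share(1)

print(f"2000-style form: {share_2000:.2f} percent reported a specific subgroup")
print(f"1990-style form: {share_1990:.2f} percent reported a specific subgroup")
```

Under that assumption the shares come to about 80.85 and 93.07 percent, which round to the 81 and 93 percent reported for the two forms.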




The Bureau Plans to Conduct Targeted Research on Hispanic Subgroups in the
Future

                           Because of concerns relating to the 2000 Census counts of Hispanic
                           subgroups, Bureau officials said that they plan to focus testing and
                           research on these questions in preparation for the 2010 Census. In
                           particular, they stated that the Bureau would examine the likely impact of
                           including Hispanic subgroup examples in the question again, as well as
                           other aspects of the question that caused problems for some respondents.
                           Before deciding on a new version of the Hispanic question, the Bureau
                           must finish evaluating the results of the 2000 Census, conduct a number of
                           cognitive tests, and field-test proposed changes to the question. The
                           Bureau plans to begin testing the Hispanic question in 2003 and, as part of a
                           field test in 2004, to administer the questionnaire in parts of Queens, New
                           York, which the Bureau selected for its racial and ethnic diversity. The
                           Bureau intends to complete its testing and decide on changes to the
                           Hispanic question from 2006 through 2008.

                           Any changes to the Hispanic question are relevant not only for the 2010
                           Census, but also for other Bureau questionnaires, such as the proposed
                           ACS.9 Bureau officials told us that they expect that the ACS will continue
                           to use the 2000 Census Hispanic question until research and testing on a
                           new version is complete.



The Bureau Lacks Clearly Written, Transparent Guidelines for Releasing Data

                           While continued research could help the Bureau collect better-quality
                           Hispanic subgroup data, it will also be important for the Bureau to address
                           what led it to release data that could mislead users. A key factor in this
                           regard is that the Bureau lacks adequate guidelines for making decisions
                           about how data quality considerations affect the release of data to the
                           public. Had such guidelines been in place prior to releasing the Hispanic
                           subgroup data, they could have (1) prompted the Bureau to apply more
                           rigorous quality checks on the Hispanic subgroup data, (2) provided a basis
                           for releasing, delaying, or suppressing the data, and (3) informed
                           decisions on how to describe any limitations to data released.

                           This is not the first time that the lack of Bureau-wide guidelines on the level
                           of quality needed for census results to be released to the public has created
                           difficulties for the Bureau and data users. As we noted in our companion
                           report10 on the Bureau’s methods for collecting and reporting data on the
                           homeless and others without conventional housing, one cause of the
                           Bureau’s shifting position on reporting those data and the resulting public
                           confusion appears to be its lack of documented, clear, transparent, and
                           consistently applied guidelines on the level of quality needed to release
                           data to the public. With the Hispanic subgroup data, the Bureau released
                           the information as planned before it could properly assess its quality,
                           identify problems, and report its limitations. More rigorous guidelines
                           could help ensure that decisions about the quality of all census data the
                           Bureau releases are more consistent and better understood by the public.



                           9
                            The ACS is designed to provide annual data for areas with populations of 65,000 or more
                           and multiyear averages for smaller geographic areas. The ACS is also intended to replace
                           the long-form Census questionnaire.
                           10
                                GAO-03-227.




              In 2000, the Bureau initiated a program aimed at documenting Bureau-wide
              protocols designed to ensure the quality of data it collected and released.
              Because this effort is still in its early stages, we could not assess it.
              However, Bureau officials believe that the program is a significant first step
              in addressing the Bureau’s lack of data quality guidelines. As the Bureau
              develops its protocols further, it will be important that they be well
              documented, transparent, clearly defined, consistently applied, and
              properly communicated to the public.



Conclusions

              Throughout the 1990s, the Bureau went to great lengths to improve
              response rates to the 2000 Census in general, and participation of Hispanics
              in particular. Although the unique contributions of the individual
              components of the Bureau’s efforts cannot be determined, the mail
              response rate was similar to the 1990 level, and the Bureau’s preliminary
              data suggest that the 2000 Census count of Hispanics was an improvement
              over the 1990 count. However, the counts of Hispanic subgroups do not
              appear to have been improved and, in fact, there is concern that some of
              these subgroup counts may be less accurate than the 1990 counts.
              Moreover, the Bureau’s experience in simplifying the questionnaire in part
              by removing the examples of the Hispanic subgroups shows the challenge
              the Bureau faces in trying to improve one component of the census count
              without adversely and unintentionally affecting other aspects of the census
              count. In light of these findings, it will be important for the Bureau to
              continue with its planned research on how best to enumerate Hispanic
              subgroups.

              The Bureau’s release of Hispanic subgroup numbers raised questions about
              the quality of the reported data and the Bureau’s decision to report these
              data as a part of its release of the SF-1 data. Although the specific
              questions about the Hispanic subgroup data differed from those identified
              in our review of the Bureau’s efforts to collect and report data on the
              homeless and others without conventional housing, a common cause of
              both sets of problems was the Bureau’s lack of agencywide guidelines for
              its decisions on the level of quality needed to release data to the public. As
              we recommended in our report on homeless counts, the Bureau needs to
              develop well-documented guidelines that spell out how to characterize any
              limitations in the data, and when it is acceptable to suppress these data.
              The Bureau should also ensure that these guidelines are documented,
              transparent, clearly defined, consistently applied, and properly
              communicated to the public.




Recommendations for Executive Action

                      To ensure that the 2010 Census will provide public data users with more
                      accurate information on specific Hispanic subgroups, we recommend that
                      the Secretary of Commerce ensure that the Director of the U.S. Census
                      Bureau implements Bureau plans to research the Hispanic question, taking
                      steps to properly test the impact of the wording, format, and sequencing on
                      the completeness and accuracy of the data on Hispanic subgroups and
                      Hispanics overall. In addition, as we also recommended in our companion
                      report on the homeless and others without conventional housing, we
                      recommend that the Bureau develop agencywide guidelines governing the
                      level of quality needed to release data to the public, when and how to
                      characterize any limitations, and when it is acceptable to delay or suppress
                      data.



Agency Comments and Our Evaluation

                      The Secretary of Commerce forwarded written comments from the U.S.
                      Census Bureau on a draft of this report (see app. I). The Bureau agreed
                      with our conclusions and recommendations and, as indicated in the letter,
                      is taking steps to implement them. However, it expressed several general
                      concerns about our findings. The Bureau’s principal concerns and our
                      response are presented below. The Bureau also suggested minor wording
                      changes to provide additional context and clarification. We accepted the
                      Bureau’s suggestions and made changes to the text as appropriate.

                      The Bureau took exception to our findings concerning the adequacy of its
                      data quality guidelines, noting that it “conducted the review of the data on
                      the Hispanic origin population using standard review techniques for
                      reasonableness and quality.” We do not question the Bureau’s commitment
                      to presenting quality data. Rather, our point is that the Bureau needs to
                      translate its commitment to quality into well-documented, transparent,
                      clearly defined guidelines to provide a basis for consistent decision making
                      on the level of quality needed to release data to the public, and on when
                      and how to characterize any limitations. During our review, Bureau
                      officials, including the Associate Director for Methodology and Standards,
                      told us that the Bureau had few written guidelines, standards, or
                      procedures related to the quality of data released to the public.

                      A second general concern expressed by the Bureau dealt with our
                      characterization of problems with the Hispanic subgroup counts. The
                      Bureau said that the data met an acceptable level of quality because they
                      accurately reflect what people reported and therefore cannot be
                      characterized as erroneous. We agree with the Bureau on this specific
point. However, we take a broader view of data quality. Specifically, we
believe that questions about the accuracy of the Hispanic subgroup data
must also take into account problems that the respondents had in
understanding the meaning of the question. The Bureau challenged our
assertion that the wording of the question “confused” some respondents,
preferring to say that some respondents may have “interpreted” the
question wording, instructions, and examples differently than expected.
We agree with the Bureau that additional research will be required to
understand the extent of this problem. Nevertheless, we believe there is
sufficient evidence from the Bureau’s subsequent research and from
analysis of trends in the data to support our concerns about the accuracy of
Hispanic example subgroup counts in the 2000 Census.


As agreed with your office, unless you publicly announce its contents
earlier, we plan no further distribution of this report until 30 days from its
issue date. At that time, we will send copies of this report to the Chairman
of the House Committee on Government Reform, the Secretary of
Commerce, and the Director of the U.S. Census Bureau. Copies will be
made available to others on request. This report will also be available at no
charge on GAO’s home page at http://www.gao.gov.

Please contact me at (202) 512-6806 or by e-mail at daltonp@gao.gov if you
have any questions. Other key contributors to this report were Robert
Goldenkoff, Christopher Miller, Elizabeth Powell, Timothy Wexler, Ty
Mitchell, Benjamin Crawford, James Whitcomb, Robert Parker, and
Michael Volpe.




Patricia A. Dalton
Director
Strategic Issues




Appendix I

Comments from the Department of Commerce
Related GAO Products


             Decennial Census: Methods for Reporting and Collecting Data on the
             Homeless and Others without Conventional Housing Need Refinement.
             GAO-03-227. Washington, D.C.: January 17, 2003.

             2000 Census: Refinements to Full Count Review Program Could Improve
             Future Data Quality. GAO-02-562. Washington, D.C.: July 3, 2002.

             2000 Census: Coverage Evaluation Matching Implemented As Planned,
             but Census Bureau Should Evaluate Lessons Learned. GAO-02-297.
             Washington, D.C.: March 14, 2002.

             2000 Census: Best Practices and Lessons Learned for a More Cost-
             Effective Nonresponse Follow-Up. GAO-02-196. Washington, D.C.:
             February 11, 2002.

             2000 Census: Coverage Evaluation Interviewing Overcame Challenges,
             but Further Research Needed. GAO-02-26. Washington, D.C.: December 31,
             2001.

             2000 Census: Analysis of Fiscal Year 2000 Budget and Internal Control
             Weaknesses at the U.S. Census Bureau. GAO-02-30. Washington, D.C.:
             December 28, 2001.

             2000 Census: Significant Increase in Cost Per Housing Unit Compared
             to 1990 Census. GAO-02-31. Washington, D.C.: December 11, 2001.

             2000 Census: Better Productivity Data Needed for Future Planning and
             Budgeting. GAO-02-4. Washington, D.C.: October 4, 2001.

             2000 Census: Review of Partnership Program Highlights Best Practices
             for Future Operations. GAO-01-579. Washington, D.C.: August 20, 2001.

             Decennial Censuses: Historical Data on Enumerator Productivity Are
             Limited. GAO-01-208R. Washington, D.C.: January 5, 2001.

             2000 Census: Information on Short- and Long-Form Response Rates.
             GAO/GGD-00-127R. Washington, D.C.: June 7, 2000.




GAO’s Mission            The General Accounting Office, the investigative arm of Congress, exists to
                         support Congress in meeting its constitutional responsibilities and to help improve
                         the performance and accountability of the federal government for the American
                         people. GAO examines the use of public funds; evaluates federal programs and
                         policies; and provides analyses, recommendations, and other assistance to help
                         Congress make informed oversight, policy, and funding decisions. GAO’s
                         commitment to good government is reflected in its core values of accountability,
                         integrity, and reliability.


Obtaining Copies of      The fastest and easiest way to obtain copies of GAO documents at no cost is
GAO Reports and          through the Internet. GAO’s Web site (www.gao.gov) contains abstracts and
Testimony                full-text files of current reports and testimony and an expanding archive of older
                         products. The Web site features a search engine to help you locate documents
                         using key words and phrases. You can print these documents in their entirety,
                         including charts and other graphics.

                         Each day, GAO issues a list of newly released reports, testimony, and
                         correspondence. GAO posts this list, known as “Today’s Reports,” on its Web site
                         daily. The list contains links to the full-text document files. To have GAO e-mail this
                         list to you every afternoon, go to www.gao.gov and select “Subscribe to daily
                         E-mail alert for newly released products” under the GAO Reports heading.


Order by Mail or Phone   The first copy of each printed report is free. Additional copies are $2 each. A check
                         or money order should be made out to the Superintendent of Documents. GAO
                         also accepts VISA and Mastercard. Orders for 100 or more copies mailed to a single
                         address are discounted 25 percent. Orders should be sent to:
                         U.S. General Accounting Office
                         441 G Street NW, Room LM
                         Washington, D.C. 20548
                         To order by Phone:     Voice: (202) 512-6000
                                                TDD: (202) 512-2537
                                                Fax: (202) 512-6061


To Report Fraud,         Contact:
Waste, and Abuse in      Web site: www.gao.gov/fraudnet/fraudnet.htm
Federal Programs         E-mail: fraudnet@gao.gov
                         Automated answering system: (800) 424-5454 or (202) 512-7470



Public Affairs           Jeff Nelligan, managing director, NelliganJ@gao.gov (202) 512-4800
                         U.S. General Accounting Office, 441 G Street NW, Room 7149
                         Washington, D.C. 20548