United States General Accounting Office
GAO Report to Congressional Requesters
January 2003

DECENNIAL CENSUS
Methods for Collecting and Reporting Hispanic Subgroup Data Need Refinement
GAO-03-228

Highlights of GAO-03-228, a report to Congressional Requesters

To help boost response rates of both the general and Hispanic populations, the U.S. Census Bureau (Bureau) redesigned the 2000 questionnaire, in part by deleting a list of examples of Hispanic subgroups from the question on Hispanic origin. While more Hispanics were counted in 2000 compared to 1990, the counts for Dominicans and other Hispanic subgroups were lower than expected. Concerned that this was caused by the deletion of Hispanic subgroup examples, congressional requesters asked us to investigate the research and management activities behind the changes.

GAO recommends that the Bureau
• implement its plans to conduct further research on the Hispanic question, taking steps to properly test the impact of any changes on the quality of data on Hispanic subgroups and Hispanics overall, and
• develop agencywide protocols that provide guidelines for Bureau decisions on the level of quality needed to release data to the public, how to characterize any limitations in the data, and when it is acceptable to delay or suppress the data.

The Bureau agreed with our recommendations, but took exception to our findings concerning the adequacy of its data quality guidelines.

In both the 1990 and 2000 censuses, Hispanics could identify themselves as Mexican, Puerto Rican, Cuban, or other Hispanic. Respondents checking off this latter category could write in a specific subgroup such as “Salvadoran.” The “other” category in the 1990 Census included examples of subgroups to clarify the question. For the 2000 Census, the Bureau removed the subgroup examples as part of a broader effort to simplify the questionnaire and help improve response rates. The Bureau removed unnecessary words and added more blank space to shorten the questionnaire and make it more readable.

Although the Bureau conducted a number of tests on the sequencing and wording of the race and ethnicity questions, and sought input from several expert panels, no Bureau tests were designed specifically to measure the impact of the questionnaire changes on the quality of Hispanic subgroup data. According to Bureau officials, because federal laws and guidelines require data on Hispanics but not Hispanic subgroups, the Bureau targeted its resources on research aimed at improving the overall count of Hispanics.

Bureau evaluations conducted after the census indicated that deleting the subgroup examples might have confused some respondents and produced less-than-accurate subgroup data. A key factor behind the Bureau’s release of the questionable subgroup data was its lack of adequate guidelines governing the quality needed before making data publicly available. As part of its planning for the 2010 Census, the Bureau intends to conduct further research on the Hispanic origin question, including a field test in parts of New York City. However, until research on a new version of the question is finalized, Bureau officials said that other census surveys will continue to use the 2000 Census format of the Hispanic origin question.

[Photograph: Enumerator Administers Census Questionnaire. Source: U.S. Census Bureau.]

The full report, including the scope and methodology, is available at www.gao.gov/cgi-bin/getrpt?GAO-03-228. For more information, contact Patricia A. Dalton at (202) 512-6806.
Contents

Letter
  Results in Brief
  Background
  Objectives, Scope, and Methodology
  Efforts to Simplify Questionnaire Led Bureau to Delete List of Example Hispanic Subgroups
  The Bureau Plans to Conduct Targeted Research on Hispanic Subgroups in the Future
  Conclusions
  Recommendations for Executive Action
  Agency Comments and Our Evaluation
Appendix
  Appendix I: Comments from the Department of Commerce
Related GAO Products
Figures
  Figure 1: Evolution of the Hispanic Question from the 1970 Census to the 2000 Census
  Figure 2: The Bureau Simplified the 2000 Census Questionnaire
  Figure 3: The 2000-Style Questionnaire Produced Lower Subgroup Counts than Those from a Test Using the 1990-Style Questionnaire

This is a work of the U.S. Government and is not subject to copyright protection in the United States. It may be reproduced and distributed in its entirety without further permission from GAO. It may contain copyrighted graphics, images or other materials. Permission from the copyright holder may be necessary should you wish to reproduce copyrighted materials separately from GAO’s product.

United States General Accounting Office
Washington, D.C. 20548

January 17, 2003

The Honorable Danny K. Davis
Ranking Minority Member
Subcommittee on Civil Service, Census and Agency Organization
Committee on Government Reform
House of Representatives

The Honorable Wm. Lacy Clay
The Honorable Charles A. Gonzalez
The Honorable Carolyn B. Maloney
House of Representatives

Collecting data on race and ethnicity is among the federal government’s most complex and controversial data collection efforts. The decennial census has collected these data in various forms beginning with the very first national headcount in 1790.
Since the 1960s, race and ethnicity data have been used to monitor and enforce compliance with a number of civil rights laws, including those governing equality in employment, voting, housing, mortgage lending, health care services, and education. Over time, in response to changing federal mandates, demographics, and its own operational requirements, the U.S. Census Bureau (Bureau) has changed the format and sequence of the race and ethnicity questions. The Bureau made one such change for the 2000 Census when, in an effort to improve the count of Hispanics and simplify the questionnaire, it redesigned the question on Hispanic origin and dropped a list of examples of Hispanic subgroups.

As soon as the Hispanic and Hispanic subgroup data from the 2000 Census were released in May 2001, questions were raised about the counts for specific Hispanic subgroups. For example, the reported count of Dominican Hispanics was significantly lower than the counts reported in other Bureau surveys.

Concerned that the lower-than-expected Hispanic subgroup counts were the result of dropping the list of example write-in Hispanic subgroups from the 2000 questionnaire, you asked us to investigate the research and management activities behind this change. As agreed with your offices, we reviewed (1) the decision-making process behind the Bureau’s removal of the example subgroups, (2) the research the Bureau conducted to aid in that decision, and (3) the Bureau’s future plans for collecting Hispanic subgroup data.

This report parallels our recent study addressing congressional concerns about how the Bureau reported data on people counted at emergency and transitional shelters, a segment of the population that includes, among others, the homeless.1 Both reports are part of our ongoing series on lessons learned from the 2000 Census that can help inform the planning effort for 2010. (See the Related GAO Products section for the reports issued to date.)
1 U.S. General Accounting Office, Decennial Census: Methods for Collecting and Reporting Data on the Homeless and Others without Conventional Housing Need Refinement, GAO-03-227 (Washington, D.C.: Jan. 17, 2003).

Results in Brief

The Bureau removed examples of Hispanic subgroups from the census question on Hispanic origin as part of an effort to make the questionnaire more “respondent-friendly.” The Bureau’s evaluations of the 1990 Census indicated that deleting unnecessary words and adding more white space, among other changes, could help improve response rates. The Bureau also modified the wording and format of the Hispanic question in order to improve Hispanic participation in the census. Throughout the 1990s, the Bureau conducted a number of tests to determine the impact that these and other changes had on the overall count of Hispanics. However, because Office of Management and Budget standards governing the collection of race and ethnic data do not require data on Hispanic subgroups, the Bureau did not specifically design any tests to determine the likely effect of the changes on the quality of Hispanic subgroup data.

Although the Bureau did not test the likely impact of questionnaire changes on the Hispanic subgroup data, it released subgroup counts along with the overall Hispanic data in May 2001. Immediately following the release of these data, local government officials and representatives of Hispanic subgroups raised questions about the accuracy of specific subgroup counts. Bureau evaluations conducted following the census suggest that dropping the examples of Hispanic subgroups confused some respondents and produced less-than-accurate subgroup data. For example, in one experiment, the Bureau mailed a 1990-style questionnaire (which included subgroup examples) to a sample of individuals as part of the 2000 Census. The Bureau found that 93 percent of Hispanics given the 1990-style form reported a specific subgroup, compared to 81 percent of Hispanics given the 2000-style form. Thus, while the Bureau reported what respondents marked on their questionnaires, because of respondents’ confusion over the wording of the question, the subgroup data could be misleading.

The Bureau has made improving the quality of the Hispanic question a focus for the 2010 Census and intends to test questionnaire changes aimed at improving the quality of its overall count of Hispanics and its counts of Hispanic subgroups. In 2003, the Bureau is to begin testing the Hispanic question, and as part of a field test in 2004, the Bureau plans to administer the questionnaire in parts of the New York City borough of Queens. Any changes to the census questionnaire will also affect other Bureau surveys, such as the proposed American Community Survey (ACS), which the Bureau designed in part to replace the census long-form questionnaire. Bureau officials said that the ACS will continue to use the 2000 Census Hispanic question until research and testing on a new version is complete.

A key factor behind the Bureau’s release of apparently less-than-accurate Hispanic subgroup data appears to be a lack of adequate guidelines governing decisions on quality considerations that should be addressed before making data publicly available. Had such guidelines been in place prior to releasing the Hispanic subgroup data, they could have prompted the Bureau to apply more rigorous quality checks on the accuracy of the Hispanic subgroup data; provided a basis for either releasing, delaying, or suppressing the data; and informed decisions on how to describe any of their limitations. The lack of data quality guidelines resulted in similar difficulties when the Bureau initially decided not to release data on the homeless and others without conventional housing.
In our companion report, we recommended that the Secretary of Commerce ensure that the Bureau develop agencywide guidelines governing the level of quality needed to release data to the public, when and how to characterize any limitations, and when it is acceptable to suppress data. Because these incidents, if repeated, could erode public confidence in the data, it will be important for the Bureau to implement these recommendations. Additionally, with respect to the Hispanic subgroup data, we are recommending that the Bureau take steps to properly test the impact that any changes to the Hispanic origin question have on the quality of Hispanic data, and on the quality of Hispanic subgroup data in particular.

The Secretary of Commerce forwarded written comments from the U.S. Census Bureau on a draft of this report (see app. I). The Bureau agreed with our conclusions and recommendations and is taking steps to implement them, but took exception to our findings concerning the adequacy of its data quality guidelines.

Background

While the decennial census has long collected data on race and ethnicity,2 a specific question on Hispanic origin was first added to the 1970 Census in response to the 1965 Voting Rights Act, which required the data to ensure equality in voting.3 Today, antidiscrimination provisions in a number of statutes require census data on race and Hispanic origin in order to monitor and enforce equal access to housing, education, employment, and other areas. The Office of Management and Budget (OMB), through its Federal Statistical Policy Directive No. 15, sets the standards governing federal agencies’ collection and reporting of race and ethnicity data. At least seven cabinet-level government departments, the Federal Reserve, every state government, and a number of public and private organizations use Hispanic data.
Although not required by federal legislation or OMB standards, Hispanic subgroup data are also used for many of these same purposes. In addition, subgroup data are especially important to communities with rapidly growing and diverse Hispanic populations.

2 The Bureau, in accordance with Office of Management and Budget Federal Statistical Policy Directive 15, Race and Ethnic Standards for Federal Statistics and Administrative Reporting, collects data on two ethnicities: Hispanic origin and not of Hispanic origin. We use the same definition in this report. Additionally, the standards call for self-reporting of race and ethnicity rather than identification based on scientific or anthropological standards. The standards also cover reporting on race and ethnicity in administrative reports and for civil rights monitoring. They also specify that the data are not to be used for determining program eligibility.

3 42 U.S.C. 1973aa-1a.

Collecting data on race and ethnicity has been a persistent challenge for the Bureau. Race and ethnicity are subjective characteristics, which makes measurement difficult. Moreover, the Bureau has found that some Hispanics equate their ethnicity (Hispanic) with race, and thus find it difficult to classify themselves by the standard race categories that include, for example, white, black, and Asian.

The Bureau’s preparations for the 2000 Census included an extensive research and testing program to improve the Hispanic count. In 1990, the Bureau estimated that it did not enumerate 5 percent of the Hispanic population. Further, the ethnicity question, which was posed to all respondents, appeared to confuse both Hispanics and non-Hispanics. For example, many non-Hispanics, thinking the question only pertained to Hispanics, did not answer the question. Overall, 10 percent of respondents failed to answer the 1990 Hispanic question, the highest nonresponse rate of any short form item in 1990.
As a result, the Bureau made improving the Hispanic count a major priority for the 2000 Census.

Objectives, Scope, and Methodology

Our objectives were to review (1) the Bureau’s decision-making process that led to its dropping the list of subgroup examples from the Hispanic question on the 2000 Census form, (2) the research conducted by the Bureau to aid in this decision, and (3) the Bureau’s future plans for collecting Hispanic subgroup data. To address each of these objectives, we interviewed key Bureau officials and examined Bureau, OMB, and other documents, including planning materials and internal memos.

To obtain a local perspective of how municipal governments and community leaders use Hispanic subgroup data, we met with data users in New York City, including representatives of the New York Department of Planning and the Dominican and Puerto Rican communities. We also attended a meeting of the Dominican American National Round Table, a Dominican American advocacy group that discussed issues relating to the 2000 Census count of Dominican Hispanics. We also attended meetings of the Census Advisory Committee on Race and Ethnicity that addressed the issue of the quality of the Hispanic subgroup data.

Finally, to examine the research behind the Bureau’s decision to remove the example subgroups from the 2000 questionnaire, we reviewed the results of the Bureau’s National Content Survey, Targeted Race and Ethnicity Test, and other research conducted throughout the 1990s in preparation for the 2000 Census. Additionally, we reviewed information from the Bureau’s meetings with its Advisory Committee on the Decennial Census and its Advisory Committee on Race and Ethnicity. We also examined relevant materials from OMB’s Interagency Committee for the Review of the Racial and Ethnic Standards.
To review the Bureau’s future plans for collecting Hispanic subgroup data, we attended meetings of the National Academy of Sciences Panel on Future Census Methods, the Decennial Census Advisory Committee, and the Census Advisory Committee on Race and Ethnicity. We also discussed these plans with Bureau officials.

Our audit work was conducted in New York City and Washington, D.C., and at the Bureau’s headquarters in Suitland, Maryland, from January through September 2002. Our work was done in accordance with generally accepted government auditing standards.

We requested comments on a draft of this report from the Secretary of Commerce. On November 27, 2002, the Secretary forwarded the U.S. Census Bureau’s written comments on the draft. The comments are reprinted in appendix I. We address these comments at the end of this report.

Efforts to Simplify Questionnaire Led Bureau to Delete List of Example Hispanic Subgroups

Collecting accurate ethnic data has challenged the Bureau for over 30 years. Since the 1970 Census, when the Bureau first included a question on Hispanic origin, every census has had comparatively high Hispanic undercounts that reduced the quality of the data. As a result, the Bureau has modified the Hispanic question on every census since then as part of a continuing effort to improve the Hispanic count. (See fig. 1.) In addition, a Spanish language version of the census form has been available upon request since 1980.

Figure 1: Evolution of the Hispanic Question from the 1970 Census to the 2000 Census
[Figure: timeline of the Hispanic origin question on the 1970, 1980, 1990, and 2000 census forms, annotated with the following changes in chronological order: first time data were collected; “Spanish/Hispanic” added to question; “No” moved to front of list; “Central or South American” removed; the word “descent” dropped; example write-in groups listed and respondents allowed to provide a write-in response for “other Spanish/Hispanic”; “Latino” added; the word “origin” dropped; location of instructions to write in subgroups moved; examples of write-in other Hispanic subgroups removed.]
Source: U.S. Census Bureau and GAO analysis.

For the 2000 Census, Hispanics could identify themselves as Mexican, Puerto Rican, Cuban, or “other Spanish/Hispanic/Latino.” Respondents who checked off this last category could write in a specific subgroup such as “Salvadoran.” Although this approach was similar to that used for the 1990 Census, as shown in figure 1, the “other” category in the 1990 Census included examples of other Hispanic subgroups. The Bureau deleted these examples as one of several changes to the Hispanic question for the 2000 Census. Other changes included (1) adding the word “Latino” to the designation Spanish/Hispanic, (2) dropping the word “origin” from the question, and (3) moving the location of instructions on writing in an unlisted subgroup. According to Bureau officials, these latter three changes were made to improve the Hispanic count.

The Bureau removed the subgroup examples as part of a broader effort to simplify the questionnaire and thus help reverse the downward trend in mail response rates that had been occurring since 1970. Indeed, evaluations of the 1990 Census indicated that the overall design of the form was confusing to many and contributed to lower response rates, particularly among some hard-to-enumerate groups such as Hispanics. In redesigning the questionnaire, the Bureau added as much white space as possible, and removed unnecessary words to make the questionnaire shorter and more readable. As shown in figure 2, the 2000 questionnaire appears more “respondent-friendly” compared to the 1990 questionnaire.
Figure 2: The Bureau Simplified the 2000 Census Questionnaire
[Figure: images of the 1990 and 2000 questionnaires with four numbered callouts, summarized below.]
1. 1990 questionnaire: Multiple people on each page. 2000 questionnaire: Each household member on separate page.
2. 1990 questionnaire: Race question before Hispanic question. 2000 questionnaire: Hispanic question before race question.
3. 1990 questionnaire: Respondents fill in bubbles to mark age. 2000 questionnaire: Respondents write in age, saving space.
4. 1990 questionnaire: Space used to list many Hispanic subgroups. 2000 questionnaire: Fewer Hispanic subgroups listed, saving space.
Source: U.S. Census Bureau and GAO analysis.

The Bureau initially proposed removing the example write-in subgroups during 1990 through 1992. A first version of the questionnaire without the example subgroups was used in the 1992 National Census Test. However, as discussed in the next section, testing continued from 1992 to 1996 to ensure that removing the write-in example groups did not harm the overall count of Hispanics. From 1995 to 1997, after testing showed that removal of the write-in example groups would not harm the overall Hispanic count, the Bureau finalized its decision to remove the example subgroups.

Although federal law and OMB standards4 only require information on whether an individual is Hispanic, Bureau officials told us they collect subgroup data to help improve the overall Hispanic count. According to the Bureau, many Hispanics do not view themselves as Hispanic, but identify instead with their country of origin or with a particular Hispanic subgroup.

State and local governments, academic institutions, community organizations, and marketing firms, among other organizations, also use Hispanic subgroup data for a variety of purposes. For example, officials in the New York City Department of Planning told us that they need accurate information on the number and distribution of Hispanic subgroups in planning the delivery of numerous city services.
4 Public Law 94-311 requires the collection of data on “Americans of Spanish origin or descent.” OMB Federal Statistical Policy Directive 15 states that collection of data on Hispanic subgroups is optional, as long as the collection of these data does not harm efforts to collect accurate data on the number of Hispanics.

According to a Bureau official, no data are available on the precise impact the questionnaire redesign had on overall response rates, in part because it was made in conjunction with other efforts to improve the response rate, such as a more aggressive outreach and promotion campaign. However, the initial mail response rate was 64 percent, 3 percentage points higher than the Bureau’s expectations, and comparable to the 1990 mail response rate.

Moreover, evaluations conducted since the 2000 Census by the Bureau indicate that the Bureau obtained a more complete count of Hispanics in the 2000 Census than it did in 1990. For example, Bureau data show that the 2000 Census missed an estimated 2.85 percent of the Hispanic population compared to an estimated 4.99 percent in 1990, a 43 percent reduction of the undercount.5 The Bureau credits the improvement in part to the changes it made to the questionnaire. However, as discussed in the next section, removing the examples of Hispanic subgroups may have reduced the completeness of data on individual segments of the Hispanic population.

No Bureau Tests Were Designed Specifically to Measure the Impact of Questionnaire Changes on Hispanic Subgroup Data

Bureau guidance requires that any changes to the census form must first be thoroughly tested. For example, according to Bureau officials, before changing a question, the Bureau must first conduct research studies, cognitive tests, and field tests to determine how best to sequence and word the question, and to see if the proposed changes are likely to achieve the desired results.
Additionally, the census questionnaire is to be reviewed by a variety of census advisory groups, OMB, and Congress before it is finalized.

Nevertheless, while the Bureau conducted a number of tests of the sequencing and wording of the race and ethnicity questions, according to Bureau officials, it did not specifically design any tests to determine the impact of the changes on the quality of Hispanic subgroup data.6 Because OMB standards do not require data on Hispanic subgroups, Bureau officials said that the Bureau targeted its resources on testing and research aimed at improving the overall count of Hispanics.

5 These figures represent the net Hispanic undercount, which is the difference between the estimated Hispanic population per the Bureau’s Accuracy and Coverage Evaluation Survey and the census count.

6 The Census Bureau did look at the impact of changes on Hispanic subgroups. However, the sample size in the test was not large enough to detect statistically significant differences for the Hispanic subgroups that constitute the “Other Spanish/Hispanic/Latino” population. Additionally, the test was not designed to detect the impact of each change to the question separately.

Throughout the 1990s, in revising the race and ethnicity questions, the Bureau sought input from several expert panels, including the Interagency Committee formed by OMB7 and the Census Advisory Committee on Racial and Ethnic Populations, one of several panels with which the Bureau consulted to help it plan the 2000 Census. In addition, the Bureau conducted several tests of the questionnaire to assess respondents’ understanding of the questions and their ability to complete them properly.
They included the
• 1992 National Census Test, which field tested potential questions for the 2000 Census questionnaire;
• 1996 National Content Survey, which examined a number of issues to improve race and ethnic reporting; and
• 1996 Race and Ethnic Targeted Test, which tested alternative formats for asking race and ethnic questions.

In addition, the Bureau analyzed the results of Hispanic data from the 1990 Census (which led to its conclusions about the undercount), but did not conduct any specific evaluations of the quality of the 1990 Hispanic subgroup data.

The consultation, research, and testing played a key role in the Bureau’s decisions to place the ethnicity question before the race question and make several other changes discussed earlier in this report. The test results also indicated that the example subgroups could produce conflicting results. On the one hand, the Bureau found that providing the example subgroups could help prevent respondents’ confusion over how to describe their ethnicity. On the other hand, the Bureau found that removing the example subgroups could help reduce the bias caused by the example effect, which occurs when a respondent erroneously selects a response because it is provided in the questionnaire.

Although the Bureau conducted a dress rehearsal for the 2000 Census in 1998 in order to test its overall design, the dress rehearsal did not identify any problems with the Hispanic subgroup question. According to Bureau officials, this could have been because none of the three test sites (the city of Sacramento, California; Menominee County, Wisconsin, including the Menominee American Indian Reservation; and the city of Columbia, South Carolina, and its 11 surrounding counties) had a large and diverse enough Hispanic population for the problems to become evident.

7 A group of more than 30 agencies that represent the many and diverse federal needs for data on race and ethnicity, including statutory requirements for such data.

Questions Raised about the Quality of Reported Hispanic Subgroup Data

In May 2001, the Bureau released data on Hispanics and Hispanic subgroups as part of its first release summarizing the results of the 2000 Census, called the SF-1 file. The Bureau also published The Hispanic Population, a 2000 Census brief that provided an overview of the size and distribution of the Hispanic population in 2000 and highlighted changes in the population since the 1990 census. For the first time, the Bureau released data on Hispanic subgroups as a part of its release of the full count SF-1 data even though it had not fully tested the impact of questionnaire changes on the subgroup data and provided little discussion of the potential limitations of the data.

Following the initial release of the Hispanic data, local government officials and Hispanic advocacy groups raised questions about the accuracy of the counts of Hispanic subgroups listed as examples on the 1990 census form, but not the 2000 form. The 2000 Census showed lower counts of several Hispanic subgroups than analysts had expected based on their own estimates using a variety of information sources such as vital statistics, immigration statistics, population surveys, and other data.

In New York City, local government officials and representatives of Hispanic subgroups who partnered with the Bureau to improve the enumeration of Hispanics told us that they were particularly concerned about low subgroup counts in their communities in part because they needed accurate numbers to plan and deliver specialized services to particular subgroups.
Moreover, they said that because “official census numbers” are often considered definitive, problems with the released Hispanic subgroup numbers could lead to faulty decision making by data users.

Questionnaire Modifications May Have Led to Problems with Hispanic Subgroup Data

Since the release of the 2000 Census Hispanic data, the Bureau has conducted evaluations of the data that provided more information on how removing the subgroup examples may have affected the quality of Hispanic subgroup data. One key evaluation was the Alternative Questionnaire Experiment, in which the Bureau sent out 1990-style census forms to a sample of individuals as part of the 2000 Census. As shown in figure 3, the Bureau’s research indicates that the 1990-style form elicited more reports of specific Hispanic subgroups than the 2000-style questionnaire.8 Indeed, 93 percent of Hispanics given the 1990-style form reported a specific subgroup, compared to 81 percent of Hispanics given the 2000-style form. Moreover, virtually every subgroup reported in the 2000-style form composed a smaller percentage of the overall Hispanic count than it did in the 1990-style form. Thus, while the Bureau reported what respondents checked off on their questionnaires, because of respondents’ confusion over the wording of the question, the 2000 subgroup data could be misleading.

Figure 3 also suggests that one possible reason for this might be that many respondents did not understand what they were supposed to write in, as many more people on the 2000-style form wrote in “Hispanic,” “Spanish,” or “Latino” (as opposed to a specific subgroup) compared to the 1990-style questionnaire. Additionally, a higher percentage of the respondents did not provide codeable (useable) responses.
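The difference between the two forms cited above can be stated either in percentage points or as a relative decline. The short Python sketch below is illustrative only and is not part of the Bureau’s or GAO’s analysis; the function name compare_shares is hypothetical, and the only inputs are the 93 percent and 81 percent figures reported for the Alternative Questionnaire Experiment.

```python
def compare_shares(share_a: float, share_b: float) -> tuple[float, float]:
    """Return (percentage-point gap, relative decline in percent) going from share_a to share_b."""
    gap = share_a - share_b            # absolute gap, in percentage points
    relative = gap / share_a * 100     # gap expressed as a percent of share_a
    return gap, relative


# Shares of Hispanics reporting a specific subgroup, as cited in the text:
# 93 percent on the 1990-style form, 81 percent on the 2000-style form.
gap, relative = compare_shares(93.0, 81.0)
print(f"{gap:.0f} percentage points; {relative:.1f} percent relative decline")
```

Under these inputs the gap is 12 percentage points, or roughly a 13 percent relative decline in specific-subgroup reporting. Neither figure says anything about statistical significance, which would require the sample sizes of the experiment.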
Moreover, based on its analysis of the Census 2000 Supplementary Survey (an operational test for collecting long-form-type data based on a nationwide sample of 700,000 households), the Bureau estimated that there were about 150,000 more Dominican Hispanics than were counted in the 2000 Census. Some attribute the discrepancy to the fact that many respondents to the supplementary survey provided their answers by telephone, where enumerators were able to help them better understand the question on Hispanic subgroups.

8 This study was conducted in English only. Because a sizable number of Hispanics only speak Spanish, the results of this study cannot be generalized to the Hispanic population at large.

Figure 3: The 2000-Style Questionnaire Produced Lower Subgroup Counts than Those from a Test Using the 1990-Style Questionnaire
[Figure: bar chart. Vertical axis: percentage (0 to 12). Horizontal axis: reported Hispanic subgroups (Dominican; Colombian; Salvadoran; Nicaraguan; Argentinian; Spaniard; other specific; uncodeable; wrote in “Hispanic,” “Spanish,” or “Latino”). Series: Census 2000 questionnaire and 1990-style questionnaire.]
Source: U.S. Census Bureau and GAO analysis.

The Bureau Plans to Conduct Targeted Research on Hispanic Subgroups in the Future

Because of concerns relating to the 2000 Census counts of Hispanic subgroups, Bureau officials said that they plan to focus testing and research on these questions in preparation for the 2010 Census. In particular, they stated that the Bureau would examine the likely impact of including Hispanic subgroup examples in the question again, as well as other aspects of the question that caused problems for some respondents. Before deciding on a new version of the Hispanic question, the Bureau must finish evaluating the results of the 2000 Census, conduct a number of cognitive tests, and field-test proposed changes to the question.
The Bureau plans to begin testing the Hispanic question in 2003 and, as part of a field test in 2004, to administer the questionnaire in parts of Queens, New York, which the Bureau selected for its racial and ethnic diversity. The Bureau intends to complete its testing and decide on changes to the Hispanic question from 2006 through 2008. Any changes to the Hispanic question are relevant not only for the 2010 Census, but also for other Bureau questionnaires, such as the proposed ACS.9 Bureau officials told us that they expect that the ACS will continue to use the 2000 Census Hispanic question until research and testing on a new version is complete.

The Bureau Lacks Clearly Written, Transparent Guidelines for Releasing Data

While continued research could help the Bureau collect better-quality Hispanic subgroup data, it will also be important for the Bureau to address what led it to release data that could mislead users. A key factor in this regard is that the Bureau lacks adequate guidelines for making decisions about how data quality considerations affect the release of data to the public. Had such guidelines been in place prior to releasing the Hispanic subgroup data, they could have (1) prompted the Bureau to apply more rigorous quality checks on the Hispanic subgroup data, (2) provided a basis for either releasing, delaying, or suppressing the data, and (3) informed decisions on how to describe any limitations to data released.

This is not the first time that the lack of Bureau-wide guidelines on the level of quality needed for census results to be released to the public has created difficulties for the Bureau and data users.
As we noted in our companion report10 on the Bureau’s methods for collecting and reporting data on the homeless and others without conventional housing, one cause of the Bureau’s shifting position on reporting those data and the resulting public confusion appears to be its lack of documented, clear, transparent, and consistently applied guidelines on the level of quality needed to release data to the public. With the Hispanic subgroup data, the Bureau released the information as planned before it could properly assess its quality, identify problems, and report its limitations. More rigorous guidelines could help ensure that decisions about the quality of all census data the Bureau releases are more consistent and better understood by the public.

9 The ACS is designed to provide annual data for areas with populations of 65,000 or more and multiyear averages for smaller geographic areas. The ACS is also intended to replace the long-form Census questionnaire.

10 GAO-03-227.

In 2000, the Bureau initiated a program aimed at documenting Bureau-wide protocols designed to ensure the quality of data it collected and released. Because this effort is still in its early stages, we could not assess it. However, Bureau officials believe that the program is a significant first step in addressing the Bureau’s lack of data quality guidelines. As the Bureau develops its protocols further, it will be important that they be well documented, transparent, clearly defined, consistently applied, and properly communicated to the public.

Conclusions

Throughout the 1990s, the Bureau went to great lengths to improve response rates to the 2000 Census in general, and participation of Hispanics in particular.
Although the unique contributions of the individual components of the Bureau’s efforts cannot be determined, the mail response rate was similar to the 1990 level, and the Bureau’s preliminary data suggest that the 2000 Census count of Hispanics was an improvement over the 1990 count. However, the counts of Hispanic subgroups do not appear to have been improved and, in fact, there is concern that some of these subgroup counts may be less accurate than the 1990 counts. Moreover, the Bureau’s experience in simplifying the questionnaire in part by removing the examples of the Hispanic subgroups shows the challenge the Bureau faces in trying to improve one component of the census count without adversely and unintentionally affecting other aspects of the census count. In light of these findings, it will be important for the Bureau to continue with its planned research on how best to enumerate Hispanic subgroups.

The Bureau’s release of Hispanic subgroup numbers raised questions about the quality of the reported data and the Bureau’s decision to report these data as a part of its release of the SF-1 data. Although the specific questions about the Hispanic subgroup data differed from those identified in our review of the Bureau’s efforts to collect and report data on the homeless and others without conventional housing, a common cause of both sets of problems was the Bureau’s lack of agencywide guidelines for its decisions on the level of quality needed to release data to the public. As we recommended in our report on homeless counts, the Bureau needs to develop well-documented guidelines that spell out how to characterize any limitations in the data, and when it is acceptable to suppress these data. The Bureau should also ensure that these guidelines are documented, transparent, clearly defined, consistently applied, and properly communicated to the public.

Recommendations for Executive Action

To ensure that the 2010 Census will provide public data users with more accurate information on specific Hispanic subgroups, we recommend that the Secretary of Commerce ensure that the Director of the U.S. Census Bureau implements Bureau plans to research the Hispanic question, taking steps to properly test the impact of the wording, format, and sequencing on the completeness and accuracy of the data on Hispanic subgroups and Hispanics overall.

In addition, as we also recommended in our companion report on the homeless and others without conventional housing, we recommend that the Bureau develop agencywide guidelines governing the level of quality needed to release data to the public, when and how to characterize any limitations, and when it is acceptable to delay or suppress data.

Agency Comments and Our Evaluation

The Secretary of Commerce forwarded written comments from the U.S. Census Bureau on a draft of this report (see app. I). The Bureau agreed with our conclusions and recommendations and, as indicated in the letter, is taking steps to implement them. However, it expressed several general concerns about our findings. The Bureau’s principal concerns and our response are presented below. The Bureau also suggested minor wording changes to provide additional context and clarification. We accepted the Bureau’s suggestions and made changes to the text as appropriate.

The Bureau took exception to our findings concerning the adequacy of its data quality guidelines, noting that it “conducted the review of the data on the Hispanic origin population using standard review techniques for reasonableness and quality.” We do not question the Bureau’s commitment to presenting quality data.
Rather, our point is that the Bureau needs to translate its commitment to quality into well documented, transparent, clearly defined guidelines to provide a basis for consistent decision making on the level of quality needed to release data to the public, and on when and how to characterize any limitations. During our review, Bureau officials, including the Associate Director for Methodology and Standards, told us that the Bureau had few written guidelines, standards, or procedures related to the quality of data released to the public.

A second general concern expressed by the Bureau dealt with our characterization of problems with the Hispanic subgroup counts. The Bureau said that the data met an acceptable level of quality because they accurately reflect what people reported and therefore cannot be characterized as erroneous. We agree with the Bureau on this specific point. However, we take a broader view of data quality. Specifically, we believe that questions about the accuracy of the Hispanic subgroup data must also take into account problems that the respondents had in understanding the meaning of the question.

The Bureau challenged our assertion that the wording of the question “confused” some respondents, preferring to say that some respondents may have “interpreted” the question wording, instructions, and examples differently than expected. We agree with the Bureau that additional research will be required to understand the extent of this problem. Nevertheless, we believe there is sufficient evidence from the Bureau’s subsequent research and from analysis of trends in the data to support our concerns about the accuracy of Hispanic example subgroup counts in the 2000 Census.

As agreed with your office, unless you publicly announce its contents earlier, we plan no further distribution of this report until 30 days from its issue date.
At that time, we will send copies of this report to the Chairman of the House Committee on Government Reform, the Secretary of Commerce, and the Director of the U.S. Census Bureau. Copies will be made available to others on request. This report will also be available at no charge on GAO’s home page at http://www.gao.gov.

Please contact me on (202) 512-6806 or by E-mail at email@example.com if you have any questions. Other key contributors to this report were Robert Goldenkoff, Christopher Miller, Elizabeth Powell, Timothy Wexler, Ty Mitchell, Benjamin Crawford, James Whitcomb, Robert Parker, and Michael Volpe.

Patricia A. Dalton
Director
Strategic Issues

Appendix I
Comments from the Department of Commerce

Related GAO Products

Decennial Census: Methods for Reporting and Collecting Data on the Homeless and Others without Conventional Housing Need Refinement. GAO-03-227. Washington, D.C.: January 17, 2003.

2000 Census: Refinements to Full Count Review Program Could Improve Future Data Quality. GAO-02-562. Washington, D.C.: July 3, 2002.

2000 Census: Coverage Evaluation Matching Implemented As Planned, but Census Bureau Should Evaluate Lessons Learned. GAO-02-297. Washington, D.C.: March 14, 2002.

2000 Census: Best Practices and Lessons Learned for a More Cost-Effective Nonresponse Follow-Up. GAO-02-196. Washington, D.C.: February 11, 2002.

2000 Census: Coverage Evaluation Interviewing Overcame Challenges, but Further Research Needed. GAO-02-26. Washington, D.C.: December 31, 2001.

2000 Census: Analysis of Fiscal Year 2000 Budget and Internal Control Weaknesses at the U.S. Census Bureau. GAO-02-30. Washington, D.C.: December 28, 2001.

2000 Census: Significant Increase in Cost Per Housing Unit Compared to 1990 Census. GAO-02-31. Washington, D.C.: December 11, 2001.

2000 Census: Better Productivity Data Needed for Future Planning and Budgeting. GAO-02-4. Washington, D.C.: October 4, 2001.

2000 Census: Review of Partnership Program Highlights Best Practices for Future Operations. GAO-01-579. Washington, D.C.: August 20, 2001.

Decennial Censuses: Historical Data on Enumerator Productivity Are Limited. GAO-01-208R. Washington, D.C.: January 5, 2001.

2000 Census: Information on Short- and Long-Form Response Rates. GAO/GGD-00-127R. Washington, D.C.: June 7, 2000.

(450103)

GAO’s Mission

The General Accounting Office, the investigative arm of Congress, exists to support Congress in meeting its constitutional responsibilities and to help improve the performance and accountability of the federal government for the American people. GAO examines the use of public funds; evaluates federal programs and policies; and provides analyses, recommendations, and other assistance to help Congress make informed oversight, policy, and funding decisions. GAO’s commitment to good government is reflected in its core values of accountability, integrity, and reliability.

Obtaining Copies of GAO Reports and Testimony

The fastest and easiest way to obtain copies of GAO documents at no cost is through the Internet. GAO’s Web site (www.gao.gov) contains abstracts and full-text files of current reports and testimony and an expanding archive of older products. The Web site features a search engine to help you locate documents using key words and phrases. You can print these documents in their entirety, including charts and other graphics.

Each day, GAO issues a list of newly released reports, testimony, and correspondence.
GAO posts this list, known as “Today’s Reports,” on its Web site daily. The list contains links to the full-text document files. To have GAO e-mail this list to you every afternoon, go to www.gao.gov and select “Subscribe to daily E-mail alert for newly released products” under the GAO Reports heading.

Order by Mail or Phone

The first copy of each printed report is free. Additional copies are $2 each. A check or money order should be made out to the Superintendent of Documents. GAO also accepts VISA and Mastercard. Orders for 100 or more copies mailed to a single address are discounted 25 percent. Orders should be sent to:

U.S. General Accounting Office
441 G Street NW, Room LM
Washington, D.C. 20548

To order by Phone:
Voice: (202) 512-6000
TDD: (202) 512-2537
Fax: (202) 512-6061

To Report Fraud, Waste, and Abuse in Federal Programs

Contact:
Web site: www.gao.gov/fraudnet/fraudnet.htm
E-mail: firstname.lastname@example.org
Automated answering system: (800) 424-5454 or (202) 512-7470

Public Affairs

Jeff Nelligan, managing director, NelliganJ@gao.gov (202) 512-4800
U.S. General Accounting Office, 441 G Street NW, Room 7149
Washington, D.C. 20548
Decennial Census: Methods for Collecting and Reporting Hispanic Subgroup Data Need Refinement
Published by the Government Accountability Office on 2003-01-17.