
Federal Statistical System: Agencies Can Make Greater Use of Existing Data, but Continued Progress Is Needed on Access and Quality Issues

Published by the Government Accountability Office on 2012-02-24.


                United States Government Accountability Office

GAO             Report to the Chairman, Subcommittee on Federal Financial
                Management, Government Information, Federal Services, and
                International Security, Committee on Homeland Security and
                Governmental Affairs, U.S. Senate

February 2012

                FEDERAL STATISTICAL SYSTEM
                Agencies Can Make Greater Use of Existing Data, but
                Continued Progress Is Needed on Access and Quality Issues




GAO-12-54
Highlights of GAO-12-54, a report to the
Chairman, Subcommittee on Federal Financial
Management, Government Information,
Federal Services, and International Security,
Committee on Homeland Security and
Governmental Affairs, U.S. Senate

Why GAO Did This Study

As demand for more and better information increases, rising costs and other
challenges require that the federal statistical system identify efficiencies.
To explore opportunities to improve cost-effectiveness, GAO was asked to
(1) review how the Office of Management and Budget (OMB) and agencies improve
information collections, (2) evaluate opportunities and constraints for
agencies to use administrative data (information collected as part of the
administration of a program or held by private companies) with surveys, and
(3) assess the benefits and constraints of surveys making greater use of the
Census Bureau’s American Community Survey (ACS) data and resources. GAO
focused on collections administered to households and individuals, analyzed
statutory and agency documents, did five case studies of surveys, reviewed
documentation of representative samples of active surveys, and interviewed
agency officials and experts.

What GAO Found

The Office of Management and Budget (OMB), agencies, and interagency
statistical committees have distinct roles in identifying opportunities to
improve federal information collection efforts. OMB exercises several
authorities that promote the system’s efficiency, including overseeing and
approving agency information collections. The website Reginfo.gov provides the
public with information, such as cost and burden, on collections that OMB
reviews, though GAO’s review identified some discrepancies in selected items.
OMB periodically issues guidance to agencies on complying with federal
requirements for information collections, but this guidance generally does not
prescribe specific actions to take. GAO’s analysis of agencies’ documentation
of active surveys indicated that 77 percent included detailed descriptions of
efforts to identify duplication, while those that did not tended to be for
collections that are unlikely to duplicate existing information; and 75
percent reported actions beyond those required by statute to solicit external
input. OMB, through enhanced guidance, could promote additional awareness of
options agencies can take to identify duplication and solicit input.
Interagency committees, which primarily draw members from the 13 agencies that
have statistics as their primary focus, are particularly important in helping
ensure collaboration. The committees have numerous projects underway aimed at
addressing key challenges facing the statistical system. However, mechanisms
for disseminating information about their work are not comprehensive or
up-to-date. Though member agencies are the most-likely customers of the
committees’ products, making information about committee work and priorities
more accessible could benefit other agencies, academics, and the general
public. It could also benefit committee members by providing a central
repository for information.

Administrative data have greater potential to supplement rather than replace
survey data. Agencies currently combine the two data sources in four key ways
to cost-effectively increase efficiency and quality. Specifically, agencies
use administrative data to: (1) link to survey data to create new data
products; (2) supplement surveys’ sample frames; (3) compare to survey data to
improve accuracy and design of surveys; and (4) combine with survey data to
create, or model, estimates. However, expanding the use of administrative data
faces key constraints related to the access and quality of the data. While
agencies and committees are taking steps to address these constraints and
facilitate the process through which agencies work together to share data,
individual tools may not be sufficient. A more-comprehensive framework for use
by all agencies involved in data-sharing decisions that includes key questions
to consider when evaluating potential use of administrative data could make
the decision process more consistent and transparent.

ACS, an ongoing monthly survey that provides information about the nation’s
communities, offers agencies important opportunities to increase the
efficiency and reduce the costs of their surveys, but its current design
limits the extent to which agencies can utilize some of these opportunities.
Uses that do not affect ACS design or the survey’s respondents, such as using
ACS estimates to inform survey design or evaluate other surveys’ results, have
widespread potential. However, more-intensive uses, such as adding content or
supplemental surveys to the ACS, currently have limited potential.

What GAO Recommends

GAO recommends that OMB take several actions to improve the broader efficiency
of the federal statistical system, including implementing additional
quality-control procedures for selected website data, enhancing awareness of
ways to meet information collection requirements, better disseminating
information on interagency committees, and developing comprehensive guidance
for agencies to use when considering data sharing. OMB generally agreed with
all of GAO’s recommendations.

View GAO-12-54. For more information, contact Robert Goldenkoff at
(202) 512-2757 or goldenkoffr@gao.gov or Ronald S. Fecso at (202) 512-7791 or
fecsor@gao.gov.

Contents


Letter
               Background
               OMB and Agencies Take a Number of Steps to Ensure Efficient
                 Information Collections, Though Opportunities Exist for
                 Refinements
               Administrative Data Could Help Improve Federal Surveys, but
                 Continued Progress Is Needed on Access and Quality Issues
               Prospects for Enhanced Use of the ACS with Other Surveys Are
                 Mixed
               Conclusions
               Recommendations for Executive Action
               Agency Comments and Our Evaluation

Appendix I     Scope and Methodology

Appendix II    Description of Case-Study Surveys

Appendix III   Selected Statutes Related to Information Collection

Appendix IV    Printable Interactive Graphic

Appendix V     Comments from the Department of Commerce

Appendix VI    GAO Contacts and Staff Acknowledgments

Tables
               Table 1: Overview of Interagency Statistical Committees
               Table 2: Key Characteristics of the ACS
               Table 3: Number of Collections, by Stratum
               Table 4: Actions Taken to Address Constraints That Hamper
                        Greater Use of Administrative Data



               Page i                                      GAO-12-54 Federal Statistical System
Figures
          Figure 1: The Thirteen Principal Statistical Agencies and Their
                   Parent Organizations
          Figure 2: Most Information Collections from Households and
                   Individuals Have Relatively Modest Costs
          Figure 3: Actions Taken to Address Constraints That Hamper
                   Greater Use of Administrative Data


          Abbreviations

          ACS              American Community Survey
          BLS              Bureau of Labor Statistics
          CE Surveys       Consumer Expenditure Surveys
          CED              Consumer Expenditure Diary Survey
          CEQ              Consumer Expenditure Quarterly Interview Survey
          CIPSEA           Confidential Information Protection and Statistical Efficiency
                           Act
          ERS              Economic Research Service
          FCSM             Federal Committee on Statistical Methodology
          ICSP             Interagency Council on Statistical Policy
          NCHS             National Center for Health Statistics
          NCSES            National Center for Science and Engineering Statistics
          NHANES           National Health and Nutrition Examination Survey
          NHIS             National Health Interview Survey
          NSCG             National Survey of College Graduates
          OIRA             Office of Information and Regulatory Affairs
          OMB              Office of Management and Budget
          PRA              Paperwork Reduction Act
          ROCIS            Regulatory Information Service Center and OIRA
                           Consolidated Information System
          SCOPE            Statistical Community of Practice and Engagement
          SIPP             Survey of Income and Program Participation



          This is a work of the U.S. government and is not subject to copyright protection in the
          United States. The published product may be reproduced and distributed in its entirety
          without further permission from GAO. However, because this work may contain
          copyrighted images or other material, permission from the copyright holder may be
          necessary if you wish to reproduce this material separately.




United States Government Accountability Office
Washington, DC 20548




                                   February 24, 2012

                                   The Honorable Thomas R. Carper
                                   Chairman
                                   Subcommittee on Federal Financial Management, Government
                                    Information, Federal Services, and International Security
                                   Committee on Homeland Security and Governmental Affairs
                                   United States Senate

                                   Dear Mr. Chairman:

                                   Information is a critical strategic asset, and all levels of government, as
                                   well as businesses and private citizens, depend on relevant, accurate,
                                   and timely social, demographic, financial, and other federally funded
                                   data-collection efforts to inform their planning and other decisions. Collectively,
                                   this information plays a vital role in measuring the health and well-being
                                   of the nation, informing private-sector investment, allocating federal
                                   funding, and measuring the outcomes of government programs.

                                   However, the federal statistical system, including (1) agencies that collect
                                   and analyze data, and (2) the Office of Management and Budget (OMB),
                                   which oversees the system, faces several challenges. Key among them is
                                   that the demand for information is increasing, especially as organizations
                                   look for ways to operate more cost-effectively, while the cost of collecting
                                   data is growing and response rates to surveys—both government and
                                   private-sector—are declining, driven in part by concerns over privacy and
                                   confidentiality. In the face of these challenges, it will be important for
                                   federal statistical agencies to identify opportunities to increase their
                                   efficiency, while maintaining or improving data quality and minimizing
                                   respondent burden and respecting privacy and confidentiality concerns.
                                   Greater use of administrative data, which includes information collected
                                   as part of the execution of government programs as well as information
                                   held by private companies, has been proposed as one approach to
                                   enhance efficiency and quality. 1 Another potential approach is making
                                   greater use of the American Community Survey (ACS), a monthly survey
                                   that replaced the census long form and provides annual data on
                                   communities’ demographic, social, economic, and housing conditions.

                                   1
                                    Examples of administrative data include Social Security Administration records, state
                                   unemployment records, medical records, and store loyalty-card data.

At your request, this report (1) reviews the ways in which OMB and
agencies identify opportunities for improvement and increased efficiency;
(2) evaluates opportunities and constraints for the statistical agencies to
use administrative data in conjunction with selected surveys; and (3)
assesses the benefits and constraints of selected surveys making greater
use of ACS data and resources.

To achieve our objectives, we focused our review on statistical
information collections administered to households and individuals, as
opposed to businesses or other entities, and subject to the Paperwork
Reduction Act (PRA), which requires OMB approval of certain federal
data collections. 2 Specifically, we performed case studies of five federal
surveys: the Consumer Expenditure Surveys, sponsored by the Bureau of
Labor Statistics (BLS); the National Health and Nutrition Examination
Survey and the National Health Interview Survey, both sponsored by the
National Center for Health Statistics (NCHS); the National Survey of
College Graduates, sponsored by the National Center for Science and
Engineering Statistics (NCSES), part of the National Science Foundation;
and the Survey of Income and Program Participation, sponsored by the
Census Bureau. We selected these surveys based on several factors,
such as their size and cost and whether they use or have the potential to
use administrative data or ACS data. We focused our selection on large
surveys, in terms of both cost and number of respondents, because
potential cost savings and efficiency gains are likely greatest for them.

Additionally, to address all three objectives, we examined related statutes
and regulations, applicable OMB guidance, documentation of the ACS
and our case study surveys, papers and reports, and our own prior work. 3
To gain an understanding of the information collections in our scope, we
reviewed publicly available data from Reginfo.gov, a government website
with information on agency requests for OMB approval of information
collections. 4 We analyzed the subject matter of all of the collections in our
scope and, for a representative sample of 106 surveys, analyzed
agencies’ reported efforts to identify duplication and consult with persons
outside of the agency. We interviewed experts on the federal statistical
system and officials at OMB and the four agencies that administer the
case-study surveys to learn about coordination among agencies, efforts
agencies take to identify improvement, and experts’ and officials’
perspectives on current and potential uses of administrative data and
ACS. We also interviewed and discussed these topics with officials at the
Department of Agriculture’s Economic Research Service (ERS), which is
a member of several interagency statistical committees and the lead
agency for the Statistical Community of Practice and Engagement. 5 In
evaluating OMB, agency, and interagency actions to improve efficiency,
we used as criteria the requirements of the PRA and practices identified
in our prior work on agency collaboration. 6

2
The PRA is codified at 44 U.S.C. §§ 3501-3521.
3
 For examples of our prior work, see: GAO, Federal Information Collection: A
Reexamination of the Portfolio of Major Federal Household Surveys Is Needed,
GAO-07-62 (Washington, D.C.: Nov. 15, 2006); American Community Survey: Key
Unresolved Issues, GAO-05-82 (Washington, D.C.: Oct. 8, 2004).

For the purposes of this review, we assessed the reliability of the data
from the Reginfo.gov website and determined that they were reliable for
some of our purposes but not others. Specifically, we reviewed related
documentation, conducted interviews with OMB officials, and compared
selected data elements from the Reginfo.gov website to supporting
documents. We determined that the data were sufficiently reliable for
purposes of identifying the collections within our scope and obtaining
information on the collections’ subject matter and actions taken by
agencies to identify duplication and solicit input. As described later in this
report, data provided on the website were not sufficiently reliable for the
purpose of assessing collections’ annual cost to the federal government
and annual respondent burden hours. Appendix I includes additional
information on our scope and methodology. Appendix II contains more
detailed descriptions of our case-study surveys.

We conducted this performance audit from December 2010 to February
2012 in accordance with generally accepted government auditing
standards. Those standards require that we plan and perform the audit to


4
    The website is maintained by OMB and the General Services Administration.
5
 The Statistical Community of Practice and Engagement is an interagency committee that
focuses on providing a collaborative community for agencies that focus on statistics.
6
 GAO, Results-Oriented Government: Practices That Can Help Enhance and Sustain
Collaboration among Federal Agencies, GAO-06-15 (Washington, D.C.: Oct. 21, 2005).




Page 3                                                GAO-12-54 Federal Statistical System
             obtain sufficient, appropriate evidence to provide a reasonable basis for
             our findings and conclusions based on our audit objectives. We believe
             that the evidence obtained provides a reasonable basis for our findings
             and conclusions based on our audit objectives.


Background

             In contrast to many other countries, the United States does not have a
             primary statistical agency. 7 Instead, the statistical system is
             decentralized, with statistical agencies generally located in different
             government departments. This structure keeps statistical work within
             close proximity to the various cabinet-level departments that use the
             information. There are 13 federal agencies, referred to as the principal
             statistical agencies, which have statistical activities as their core mission.
             These agencies conduct much of the government’s statistical work,
             though there are more than 80 additional federal agencies that carry out
             some statistical work in conjunction with their primary missions. The 13
             principal statistical agencies are all attached to a cabinet-level department
             or an independent agency that reports to the president. As shown in
             figure 1, they are located at different levels within their respective
             departments and agencies.

             7
              Examples of countries that have centralized statistical agencies include Australia,
             Canada, and Sweden.

Figure 1: The Thirteen Principal Statistical Agencies and Their Parent Organizations




                                         Note: The 13 principal statistical agencies’ names are displayed in boxes within the figure.




For fiscal year 2011, $6.83 billion was requested for statistical work,
which includes the collections in our scope as well as work that focuses
on entities other than households and individuals, such as businesses
and farms. 8 This amount is about 0.2 percent of that year’s total federal
budget request. Much of this work is concentrated in the 13 principal
statistical agencies, which account for approximately 40 percent of
requested funding. The budget request for the Census Bureau is among
the highest of the principal statistical agencies. Excluding funding related
to the decennial census, the fiscal year 2011 budget request for the
Census Bureau was $558 million. 9 In addition to conducting its own
statistical activities, the Census Bureau also performs statistical work for
other agencies on a reimbursable basis.
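
The budget figures in the paragraph above can be related to one another with a little arithmetic. The following Python sketch is illustrative only: the constant and function names are hypothetical, the inputs are the rounded figures quoted above, and the derived amounts are approximations that do not appear in the report itself.

```python
# Rounded figures quoted in the paragraph above (fiscal year 2011 budget request).
TOTAL_STATISTICAL_REQUEST = 6.83e9  # dollars requested for all statistical work
PRINCIPAL_AGENCY_SHARE = 0.40       # approximate share of the 13 principal agencies
CENSUS_NON_DECENNIAL = 558e6        # Census Bureau request, excluding the decennial census


def principal_agency_request(total: float, share: float) -> float:
    """Approximate dollars requested by the 13 principal statistical agencies."""
    return total * share


def share_of(part: float, whole: float) -> float:
    """Return part as a fraction of whole."""
    return part / whole


# Derived values (assumptions layered on rounded inputs, not reported figures).
principal = principal_agency_request(TOTAL_STATISTICAL_REQUEST, PRINCIPAL_AGENCY_SHARE)
census_share = share_of(CENSUS_NON_DECENNIAL, TOTAL_STATISTICAL_REQUEST)

print(f"Principal agencies: about ${principal / 1e9:.1f} billion")
print(f"Census Bureau (non-decennial) share of statistical funding: about {census_share:.0%}")
```

Because the 40 percent share is itself approximate, the derived dollar amount should be read as an order-of-magnitude check rather than a budget figure.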

Most of the collections in our scope have relatively modest annual costs.
In a sample of 112 information collections that fell within our scope and
were active as of September 22, 2011, the majority cost less than
$500,000 annually, and fewer than one in five cost more than $1 million
annually. 10 A few large, broad-based collections are more expensive, such as
the Current Population Survey and the National Health Interview Survey, both of
which cost tens of millions of dollars each year, and the ACS, which costs
over $200 million annually (see fig. 2).




8
 The amount requested for statistical work includes requested funding for work done by
federal agencies that have annual budgets of $500,000 or more for statistical work. OMB
presented this information in Statistical Programs of the United States Government, Fiscal
Year 2011, an annual report that it prepares on statistical program funding. This was the
most up-to-date budget information available at the time of our review.
9
 When decennial census costs are included, the fiscal year 2011 budget request for the
Census Bureau was $1.3 billion.
10
  The sample of 112 collections was designed to be representative of the population of
555 collections in our scope that were active as of September 22, 2011.




Figure 2: Most Information Collections from Households and Individuals Have
Relatively Modest Costs




Various statutes and guidance from OMB and other entities establish
standards for quality and privacy that apply to the federal statistical
system. One of the most significant statutes is the PRA, which designates
OMB as the coordinating body of the federal statistical system. The PRA
establishes requirements that agencies must meet in order to administer
information collections and that OMB must meet in overseeing the system,
including that it issue guidance to agencies. Other entities also provide
guidance to agencies that conduct statistical work. 11 In addition, use of
information must be balanced with protection of privacy and
confidentiality. Statutes such as the Confidential Information Protection
and Statistical Efficiency Act (CIPSEA) apply to the federal statistical
system and focus on ensuring the privacy and confidentiality of
respondents’ information. 12 Agency-specific statutes also protect the
privacy and confidentiality of data collected by those agencies. For
example, Title 13 of the U.S. Code authorizes the Census Bureau to
request and collect information from individuals but also guarantees the
confidentiality of these data and establishes penalties for unlawfully
disclosing this information. Additional statutes are described in more
detail in appendix III.

11
  For example, the Committee on National Statistics publishes “Principles and Practices
for a Federal Statistical Agency” every 4 years in order to provide a current edition to
newly appointed cabinet secretaries at the beginning of each presidential administration.
This report outlines basic principles that statistical agencies should adhere to in order to
carry out their missions effectively, as well as practices designed to help implement them.



OMB and Agencies
Take a Number of
Steps to Ensure
Efficient Information
Collections, Though
Opportunities Exist
for Refinements
OMB Uses Its Oversight Authority to Improve Efficiency

                         Under PRA, OMB, through its Office of Information and Regulatory Affairs
                         (OIRA), has responsibility and broad authority to improve the efficiency
                         and effectiveness of federal information resources. 13 In this regard, OMB
                         is charged with the oversight and coordination of federal agencies’
                         statistical activities. Specifically, these oversight functions are carried out
                         by OIRA’s Statistical and Science Policy Branch, headed by the Chief
                         Statistician, which includes five staff members who work closely on these
                         oversight and coordination activities with approximately 25 other OIRA
                         desk officers. OMB exercises four key authorities that contribute to the
                         efficiency of the federal statistical system:

                         •     Oversight and approval of information collections: OMB generally
                               must approve information collections that are to be administered to 10
                               or more people. 14 OMB staff review agencies’ information collection
                               requests to determine whether proposed collections meet PRA
                               standards by assessing such factors as whether they are necessary
                               for the mission of the agency and do not unnecessarily duplicate
                               existing information. This review also enables OMB to identify
                               opportunities for improvement. For example, according to OMB and
                               agency officials, if OMB determines that it is necessary to ask
                               similar questions in multiple collections, it works to ensure that
                               agencies ask them in a consistent manner, when appropriate.

                         12
                             44 U.S.C. § 3501 note.
                         13
                             44 U.S.C. § 3504.
                         14
                           Under PRA the term “person” includes, among others, individuals, partnerships,
                         associations, corporations, and state and local governments.

•    Standard-setting and guidance to agencies: OMB is responsible
     for developing and implementing governmentwide policies, principles,
     standards, and guidelines related to statistical issues, such as
     procedures and methods for collecting data and disseminating
     information. Specifically, OMB issues directives, guidance, and
     memorandums, and provides additional information through
     information sessions and presentations, to guide federal data
     collection and promote the quality and efficiency of information
     collections. For example, OMB published “Questions and Answers
     When Designing Surveys for Information Collections,” a set of 81
     questions and answers on the OMB review process for agency
     information collection requests required by PRA. 15 OMB also issued
     Standards and Guidelines for Statistical Surveys, which outlines 20
     standards and related guidelines for the design and methodology of
     statistical surveys. Finally, OMB issues memorandums focusing on
     various topics, with recent ones clarifying the guidance for complying
     with PRA and encouraging agencies to coordinate efforts to share
     data.
•    Budget development and reporting: Although agency budgets are
     initiated within agencies, OMB is responsible, under PRA, for ensuring
     that agency budget proposals are consistent with systemwide
     priorities for maintaining and improving the quality of federal statistics.
     In addition to the budgets themselves, OMB reports information to the
     public and Congress about the identification of key priorities through
     key documents. OMB annually reports on the paperwork burden
     federal collections impose on the public in the Information Collection
     Budget of the United States Government. In addition, OMB annually
     describes statistical program funding and proposed program changes
     for statistical activities in the Statistical Programs of the United States
     Government.



15
  OMB, “Questions and Answers When Designing Surveys for Information Collections,”
(January 2006).




                          •    Other statistical-policy coordination activities: The Chief
                               Statistician and staff in OMB’s Statistical and Science Policy Branch
                               participate in both formal and informal coordination activities with
                               agencies. OMB’s role and participation in the formal interagency
                               committees are discussed later in this report. In general, OMB maintains
                               regular contact with staff at principal statistical agencies. Additionally,
                               OMB encourages agencies that are designing information collections
                               to collaborate with principal statistical agencies because they can help
                               improve survey design and methodology. For example, according to
                               OMB and Census Bureau officials, OMB encouraged the Corporation
                               for National and Community Service to work with the Census Bureau
                               and BLS to sponsor a supplement to the Current Population Survey
                               rather than a stand-alone survey. OMB indicated that sponsoring this
                               supplement likely resulted in cost savings, improved data quality, and
                               greater utility. 16


The Reliability of Information in OMB’s Information Collections Database Needs to Be Improved

                          One tool that OMB uses to facilitate its oversight and coordination
                          functions under PRA is an internal system called the Regulatory
                          Information Service Center and OIRA Consolidated Information System
                          (ROCIS), which contains information on all active collections and those
                          pending OMB approval. Agencies use the system to submit information
                          collection requests. This system also facilitates OIRA’s review of the
                          requests and underlies the information provided on the public website
                          Reginfo.gov. Agency submissions to OMB typically include a copy of the
                          data-collection instrument (e.g., a survey) and supporting documentation
                          that, in a standardized form, provides information on the collection, such
                          as the estimated annual burden hours and cost to the federal
                          government. Further, under PRA, agencies must certify that the collection
                          satisfies the act’s standards, for example that the collection avoids
                          unnecessary duplication. Making this information transparent and easily
                          accessible to other agencies facilitates coordination and can potentially
                          help agencies avoid duplication and identify opportunities for
                          improvement. Furthermore, OMB uses the information contained in its
                          internal system to track reviews of information collections, and to compile



                          16
                            Similar coordination could be fruitful for other surveys as well. For example, we have
                          noted concerns about the surveys the Department of Labor uses for Davis-Bacon Act
                          wage determination and have recommended that the department seek help from an
                          independent statistical organization to ensure survey methods are sound and in
                          accordance with best practices. See GAO, Davis-Bacon Act: Methodological Changes
                          Needed to Improve Wage Survey, GAO-11-152 (Washington, D.C.: March 22, 2011).




                          quantitative data for the Information Collection Budget of the United
                          States Government.

                          Despite the benefits of this electronic system, our review identified some
                          discrepancies between the Reginfo.gov website’s data and the underlying
                          documentation for certain key variables. Specifically, we reviewed a
                          systematic random sample of 56 of the 555 collections in our scope and
                          checked the reported information for annual cost to the federal
                          government and annual burden hours. For 11 of the 56 information
                          collections, the information on cost or burden, or both, did not match
                          between the two sources. In cases where annual cost did not match, the
                          differences ranged from $1,000 to $19.3 million. In cases where annual
                          burden hours did not match, the differences ranged from 30 to almost
                          500,000 hours. OMB confirmed that the information in the external
                          Reginfo.gov system is the same as in its internal ROCIS system. As a
                          result, these discrepancies raise questions about the confidence that
                          users can have in both the internal and external databases and may
                          affect OMB’s ability to track information collection requests. OMB officials
                          told us that responsibility for ensuring data reliability is shared between
                          OMB and agencies. The Regulatory Information Service Center has
                          issued detailed guidance to agencies on how to upload information into
                          ROCIS, and the system has a function that allows agencies to check the
                          completeness of data for individual information collections to ensure that
                          no required data are missing. Entering this information is not always
                          straightforward, however, and some interpretation of the underlying
                          documentation may be required. The discrepancies that we identified
                          indicate that additional actions, such as edit checks, review by an
                          informed staff member, or increased clarification in supporting documents,
                          are necessary to ensure the reliability of Reginfo.gov and ROCIS data.
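
To illustrate the kind of edit check described above, the following Python sketch draws a systematic random sample and flags collections whose reported cost or burden hours differ from the supporting documentation. It is a minimal illustration only, not a description of GAO's or OMB's actual procedures. The CollectionRecord type, its field names, and both function names are hypothetical, and nothing here reflects how ROCIS stores or exposes its data.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionRecord:
    """Hypothetical record; the field names are illustrative, not ROCIS fields."""
    control_number: str
    annual_cost_dollars: int
    annual_burden_hours: int


# Fields compared between the database and the supporting documentation.
CHECKED_FIELDS = ("annual_cost_dollars", "annual_burden_hours")


def systematic_sample(items, sample_size, rng):
    """Select every k-th item after a random start, where k = len(items) / sample_size."""
    if not 0 < sample_size <= len(items):
        raise ValueError("sample_size must be between 1 and len(items)")
    step = len(items) / sample_size
    start = rng.random() * step
    last = len(items) - 1
    # min() guards against floating-point rounding at the upper end of the list.
    return [items[min(int(start + i * step), last)] for i in range(sample_size)]


def find_discrepancies(database, documentation):
    """Map each control number to the checked fields that differ between sources."""
    mismatches = {}
    for control_number, db_record in database.items():
        doc_record = documentation.get(control_number)
        if doc_record is None:
            continue  # Nothing to compare against; a fuller check might report this.
        differing = [
            name
            for name in CHECKED_FIELDS
            if getattr(db_record, name) != getattr(doc_record, name)
        ]
        if differing:
            mismatches[control_number] = differing
    return mismatches


if __name__ == "__main__":
    # Toy data: 555 hypothetical control numbers, sampled down to 56.
    population = [f"C-{i:04d}" for i in range(555)]
    sampled = systematic_sample(population, 56, random.Random(2012))
    print(len(sampled), "collections sampled")
```

Applied to a sample like the one in this review, a check of this kind would flag 11 of the 56 sampled collections, or about 20 percent, as having a mismatch in cost, burden, or both.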


Agencies Identify Duplication and Solicit Input to Enhance Efficiency

                          Our analysis indicated that agencies addressed PRA standards related to
                          duplication and public comment in their information collection requests to
                          OMB, and in many cases went beyond the actions specifically described
                          in PRA and related OMB guidance.

                          The elements of PRA most directly related to our review were identifying
                          duplication and soliciting external input on proposed collections. To
                          analyze agencies’ actions in these two areas, we reviewed a
                          generalizable sample of supporting statements from 106 active statistical




information collections administered to households and individuals. 17
Each of the supporting statements we reviewed addressed those PRA
standards, as required, and in many cases included detailed descriptions,
the content of which we analyzed in order to identify the range of actions
that agencies took.

Although agencies must address how their proposed collections meet
PRA standards, the act and OMB guidance do not prescribe many
specific actions that agencies need to take in addressing these standards.
Regarding duplication, PRA does not dictate how agencies should
address the standard. Regarding external input, PRA does require that
agencies at a minimum provide notice in the Federal Register to allow the
public to comment on proposed collections as well as consult with
members of the public and affected agencies. OMB guidance expands
somewhat on ways that agencies can address these standards,
particularly in the case of surveys using statistical methods. However, just
as with PRA, much is left to the discretion of agencies and little is
specifically required. For example, OMB guidance states that agencies
should review existing studies and consult with survey methodologists
and data users. 18

Identifying potential duplication: Our analysis showed that agencies
took various steps to comply with the PRA requirement that information
collections do not unnecessarily duplicate an available information
source. 19 Specifically, based on our analysis, we estimated the following
for the universe of collections in our scope:

•     77 percent included detailed explanations of the actions taken to
      identify potential duplication. 20 Those supporting statements that did



17
  Information collections’ supporting statements contain a narrative section through which
agencies describe their efforts to identify potential duplication. Our review focused on
collections that were active as of May 17, 2011.
18
  These directions are provided in OMB’s “Questions and Answers When Designing
Surveys for Information Collections,” and OMB Circular No. A-130.
19
  OMB defines unnecessary duplication as information similar to or corresponding to
information that could serve the agency’s purposes and need and is already accessible to
the agency. (OMB, The Paperwork Reduction Act of 1995: Implementing Guidance for
OMB Review of Agency Information Collection, draft [Aug. 16, 1999]).
20
    The 95 percent confidence interval for this estimate is (68, 85).




      not include detailed explanations were generally for information
      collections that had unique scopes or other characteristics that made
      them unlikely to duplicate existing information.
•     57 percent reported reviewing other surveys when looking for
      duplication. 21 For example, the National Cancer Institute identified
      seven other surveys that collected information similar to that of a
      Current Population Survey supplement on tobacco use and explained
      why the data from these surveys could not replace those collected
      through the supplement.
•     46 percent indicated that the agency considered administrative data
      as a potential source of data. 22
•     About a quarter indicated that they consulted with other entities, such
      as agencies, and a similar number reported that they conducted
      literature searches. 23
•     In addition, for six of the information collections in our sample,
      agencies sponsored a collection in the form of a supplement to the
      Current Population Survey rather than creating a stand-alone survey,
      thus piggybacking onto another survey vehicle to potentially avoid
      duplication.
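
The confidence intervals reported in the footnotes reflect the sampling error in estimates drawn from the 106 sampled supporting statements. As a rough illustration, the Python sketch below computes a Wilson score interval for a proportion. It assumes a simple random sample and ignores any weighting or finite population correction, so it is not presented as the method used for the estimates in this report and will not necessarily reproduce the reported intervals. The function name wilson_interval is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(proportion, sample_size, confidence=0.95):
    """Wilson score interval for a proportion, assuming simple random sampling."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Two-sided critical value, about 1.96 for 95 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    denominator = 1 + z**2 / sample_size
    center = (proportion + z**2 / (2 * sample_size)) / denominator
    half_width = (
        z
        * sqrt(
            proportion * (1 - proportion) / sample_size
            + z**2 / (4 * sample_size**2)
        )
        / denominator
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # 77 percent of a sample of 106, as in the first estimate above.
    low, high = wilson_interval(0.77, 106)
    print(f"{low:.1%} to {high:.1%}")
```

The interval narrows as the sample grows, which is why estimates from a sample of about 100 collections carry margins of roughly 10 percentage points in either direction.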
Despite these steps, the collection of similar data in different surveys is
unavoidable for methodological reasons. In some cases, agencies need
to ask the same or similar questions because different surveys target
different populations. For example, both the National Survey of College Graduates and
the Current Population Survey ask about respondents’ college degree
and occupation, but the National Survey of College Graduates targets
individuals in the United States who have bachelor’s degrees or higher in
science or engineering, while the Current Population Survey targets a
nationally representative sample of U.S. civilians aged 16 and older.
Furthermore, according to agency officials, assessing relationships
among survey variables may require asking the same or similar questions
in different surveys. For example, it is common for surveys to ask for
respondents’ ages in order to analyze how responses to other questions
vary according to this variable. In addition, asking the same question
among surveys allows agencies to compare survey estimates and


21
    The 95 percent confidence interval for this estimate is (46, 68).
22
    The 95 percent confidence interval for this estimate is (32, 59).
23
  Twenty-four percent of collections indicated that agencies consulted with another entity,
and 25.5 percent reported that they conducted literature searches. The 95 percent
confidence intervals for these estimates are (15, 35) and (16, 37), respectively.




evaluate surveys’ data quality. In order to facilitate comparisons among
surveys, OMB encourages asking consistent questions, when possible,
about certain characteristics such as race and ethnicity. Further, when
considered from an individual’s perspective, duplication of survey
questions is relatively rare. This is because in a given year a very small
percentage of households are selected to participate in a single collection
within our scope. 24 The likelihood that a household would be selected for
participation in more than one collection, and thus that household members
would be asked the same question more than once, is considerably lower.
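
The arithmetic behind this point can be sketched in a few lines of Python. The example assumes that surveys select households independently of one another, which is a simplification, and it pairs the 2.5 percent ACS rate cited in this report with a second sampling rate that is purely hypothetical. The function name is also illustrative.

```python
def joint_selection_probability(sampling_rates):
    """Probability that one household is selected by every survey listed,
    assuming each survey draws its sample independently of the others."""
    probability = 1.0
    for rate in sampling_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each sampling rate must be between 0 and 1")
        probability *= rate
    return probability


if __name__ == "__main__":
    acs_rate = 0.025           # share of households in the ACS each year (from this report)
    other_survey_rate = 0.001  # hypothetical rate for a second, smaller survey
    both = joint_selection_probability([acs_rate, other_survey_rate])
    print(f"Chance of selection for both surveys: {both:.6%}")
```

Under these assumptions the chance of being selected for both surveys is the product of the two rates, which is far smaller than either rate alone.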

Soliciting input and feedback on information collections: Most
agencies in our scope took steps to seek outside input beyond those
prescribed by OMB guidance and the PRA. On the basis of our analysis,
we estimated the following for the universe of collections in our scope:

•     75 percent indicated that the agency obtained external
      feedback in addition to publishing notices in the Federal Register. 25
•     57 percent indicated that agencies consulted with experts. 26 For
      example, the sponsor of the National Survey of Women Veterans, a
      survey on the health-care needs, experiences, and preferences of
      women veterans, consulted with individuals representing a variety of
      research and clinical backgrounds, such as public health, social
      welfare, and psychology.
•     Agencies less frequently reported consulting with other agencies,
      contractors or subcontractors, or interagency or advisory
      committees. 27 In addition, agencies reported soliciting feedback
      directly from former and potential survey respondents and data users
      and customers. 28 They also reported conducting literature searches




24
  For example, the ACS is the largest survey in our scope and is administered annually to
2.5 percent of households.
25
    The 95 percent confidence interval for this estimate is (65, 83).
26
    The 95 percent confidence interval for this estimate is (46, 67).
27
  Thirty-nine percent of collections indicated that agencies consulted with other agencies,
32 percent reported contacting contractors or subcontractors, and 15 percent described
meeting with interagency or advisory committees. The 95 percent confidence intervals for
these estimates are (30, 49), (22, 41), and (8, 25), respectively.
28
  For example, we estimate that 15 percent reported soliciting input from data users and
customers. The 95 percent confidence interval for this estimate is (8, 25).




     and sponsoring or participating in workshops, panels, or other
     events. 29
Agencies in our sample reported making changes in response to input,
potentially resulting in improvements to their information collections. For
example, in response to recommendations made by the Committee on
National Statistics on a Current Population Survey supplement about food
security, ERS reported entering into an agreement with Iowa State
University to study food security measurement issues. This collaborative
project is exploring alternatives to an aspect of the supplement’s current
design, which could result in alternatives to methods used to estimate
food security prevalence and potentially improve measurement precision
and reliability. The U.S. Geological Survey also reported incorporating
changes in response to feedback on its Landsat Survey. 30

Agencies’ actions to find duplication and solicit input that we identified in
our review, as well as others that OMB may identify, could be useful for
OMB to share with other agencies that sponsor information collections.
Offering more-detailed guidance in a single document that outlines
different actions agencies can take to identify duplication and solicit input
would help ensure that agencies are aware of the various options. It
would also allow them to easily access and reference this information.
OMB could include this information in one of its periodic memorandums
related to compliance with the PRA. We previously reported on the
importance of establishing ways to operate across agency boundaries,
and promoting these actions is one way OMB can do this. 31 Also, just as
OMB’s guidance to agencies in complying with the Information Quality Act
gives agencies flexibility to determine the most appropriate actions, it is
important that any new guidance continue to give agencies discretion in


29
  Factors that facilitate interaction among agencies and between agencies and others in
the statistical community include agency staff’s professional involvement in committee
work, movement by some staff to other agencies during their careers, and training
opportunities. For example, survey methodologists work together on various interagency
subcommittees. In addition, their professional development includes attending local and
other conferences at which papers are presented describing uses and activities related to
surveys in other agencies. These opportunities for cross-agency professional knowledge
transfer facilitate collaboration and the identification of opportunities for efficiency.
30
  The Landsat Survey collects information from professional users of satellite imagery to
better understand the uses and applications of moderate-resolution satellite imagery as
well as information about the users.
31
 GAO-06-15.




                            the number and types of actions they take to identify duplication and
                            solicit input. This is because the most appropriate actions will vary based
                            on the characteristics of the collection.


Interagency Committees Facilitate Collaboration, but Better Communication Could Increase Effectiveness

                            Interagency statistical committees offer opportunities for broader
                            collaboration to increase the efficiency of the federal statistical system.
                            Three key committees are the Interagency Council on Statistical Policy
                            (ICSP), the Federal Committee on Statistical Methodology (FCSM), and
                            the Statistical Community of Practice and Engagement (SCOPE), all of
                            which are either chaired or sponsored by OMB. 32 Importantly, the
                            activities of the interagency committees are consistent with key
                            collaborative practices we identified in our previous work. 33 For example,
                            each of these committees has defined roles and responsibilities, and the
                            committees serve as a vehicle for the agencies to operate across agency
                            boundaries. Specifically, ICSP serves an advisory function to the Chief
                            Statistician and focuses on broader issues related to the federal statistical
                            system. In addition, ICSP provides overarching guidance to FCSM and
                            SCOPE. FCSM investigates statistical practices and methodologies used
                            in federal statistical programs, while SCOPE focuses on cross-agency
                            activities of data management and dissemination. Table 1 provides an
                            overview of these committees.




                            32
                              Outside of these interagency committees, there are nonfederal organizations, such as
                            the Committee on National Statistics and the Council of Professional Associations on
                            Federal Statistics, which serve as resources to identify opportunities for improving federal
                            statistics.
                            33
                              GAO-06-15.




Table 1: Overview of Interagency Statistical Committees

Interagency Council on Statistical Policy (ICSP)
  Date established: 1989 (a)
  Membership: The heads of the principal statistical agencies, plus the
    statistical unit at the Environmental Protection Agency.
  Mission:
    •   Coordinate statistical work, particularly when activities and issues
        cut across agencies.
    •   Exchange information about agency programs and activities.
    •   Provide advice and counsel to OMB on statistical matters.
  Description of selected projects:
    •   Identifying the highest-priority statistical program improvements.
    •   Developing views on improving implementation of the PRA.
    •   Providing direction to FCSM’s subcommittees on privacy and
        administrative data.

Federal Committee on Statistical Methodology (FCSM)
  Date established: 1975
  Membership: About 20 members appointed by the Chief Statistician based on
    technical expertise and history of innovative contributions to the
    federal statistical system.
  Mission:
    •   Communicate and disseminate information on statistical practice
        among all federal statistical agencies.
    •   Recommend the introduction of new methodologies in federal
        statistical programs to improve data quality.
    •   Provide a mechanism for statisticians in different federal agencies
        to meet and exchange ideas.
  Description of selected projects:
    •   Discussing disclosure limitation methods.
    •   Investigating nonresponse issues related to selected surveys.
    •   Clarifying legal issues of confidentiality and informed consent.
    •   Examining issues related to the quality of administrative data.

Statistical Community of Practice and Engagement (SCOPE)
  Date established: 2009
  Membership: Appointed representatives from the principal statistical
    agencies, plus the statistical unit at the Environmental Protection
    Agency.
  Mission:
    •   Provide a collaborative community for statistical agencies to
        produce relevant, accurate, timely, cost-effective data and
        insightful research disseminated through shared state-of-the-art
        best practices to support data-driven decisions.
  Description of selected projects:
    •   Surveying tools used by statistical agencies to comply with
        standards for access to electronic and information technology
        procured by agencies, and recommending the best tools for use.
    •   Developing protocols for pilot testing of a secure cloud
        environment for storing data and recommended software.

Source: GAO analysis of OMB and agency data.
(a) ICSP was established in 1989 and codified in the 1995 reauthorization of the PRA.


                                              The committees study statistical issues and methods through
                                              subcommittees and working groups, most of which rely on volunteers
                                              from member agencies who take on these responsibilities in addition to
                                              their current job duties. The work of the subcommittees and working
                                              groups has been useful to other agencies. For example, an FCSM
                                              subcommittee produced a checklist that, according to OMB, is used
                                              around the world to determine whether a public-use data product
                                              sufficiently protects the confidentiality of individuals’ data.

                                              The interagency committees use various methods to disseminate
                                              information on their activities and products, but they do not do so in a



timely or comprehensive manner. The committees’ work is summarized in
OMB’s annual report Statistical Programs of the United States
Government, but the report does not always communicate key
information about it. For example, the fiscal year 2011 report states that
one of ICSP’s activities over the past year was identifying the highest-
priority statistical-program improvements, but does not provide
information about all of these improvements. 34 In addition, interagency
committees present information about their work at statistical seminars.
For example, according to OMB officials, FCSM has presented work at
the biennial FCSM Statistical Policy Seminars. Additionally, agency
officials noted that members of interagency statistical committees utilize a
limited-access web-based system to facilitate information sharing.
Information about FCSM’s work is also posted on the committee’s
website or the FedStats website. 35 Neither ICSP nor SCOPE has a
dedicated website, though OMB believes that such websites are not necessary or
appropriate because the work of these groups is deliberative. While the
FCSM website offers the potential to effectively disseminate information,
it is not comprehensive or timely. For example, it provides links to the
sites of various interagency and advisory committees, including three
FCSM permanent working groups, but does not have pages for any of the
active FCSM subcommittees. 36 Moreover, the websites do not appear to
be regularly updated with new products produced by the committees that
could be useful for other agencies. For example, the subcommittee on the
statistical uses of administrative data published a paper in April 2009
highlighting examples of successful data-sharing projects using
administrative data for statistical purposes, but this product is not yet
available on the FCSM website.

Providing more-comprehensive and timely information on interagency
activities could offer benefits. As identified in our previous work,
developing mechanisms to monitor and report on results is a necessary


34
  OMB staff noted that the specific program improvements are reflected in the President’s
budget.
35
  FedStats is a website that provides access to statistical information produced by the
federal government. In addition, it includes all federal agencies listed in Statistical
Programs of the United States Government that report a certain level of expenditures in
statistical activities.
36
  FCSM has active subcommittees looking at statistical uses of administrative data and
privacy issues. In addition, FCSM has permanent working groups that discuss specific
topic areas, such as nonresponse to household surveys.




                           element of a collaborative relationship. 37 In this case, better reporting of
                           committee activities and products could offer benefits to those who are
                           not involved in committee activities, as well as committee members.
                           Membership in the committees is made up almost exclusively of
                           representatives from the 13 principal statistical agencies, so most
                           agencies are not directly involved in committee activities. It makes sense
                           that agencies that have statistics as their primary focus are the most-
                           heavily involved, but those agencies for which statistics is a supporting
                           function to their primary mission, and possibly academics and the broader
                           public, could benefit from greater access to information and products
                           related to the committees’ work and priorities. More easily accessible
                           information would also benefit member agencies, as it would offer a
                           centralized place to maintain committee work and communicate priorities.
                           Much work goes into developing the committees’ products, and making
                           them easily accessible maximizes their value.



Administrative Data
Could Help Improve
Federal Surveys, but
Continued Progress Is
Needed on Access
and Quality Issues

Administrative Data Have Greater Potential to Supplement, Rather than Replace, Federal Surveys

                           Administrative data, typically collected to administer a program or
                           business, are a growing source of information on individuals and
                           households. For example, the Social Security Administration collects data
                           on the earnings of U.S. workers from employers and the Internal Revenue
                           Service to calculate the amount of benefits for retired workers, spouses,
                           children, and other beneficiaries, while businesses obtain data on, for
                           example, the items and amounts of purchases when customers use credit
                           cards and store loyalty cards. According to the Census Bureau, the
                           amount of administrative data held by private companies exceeds the
                           amount held by the government. Researchers recently estimated that the



                           37
                            GAO-06-15.




amount of digital data in existence, which includes some types of
administrative data such as retail customer databases, more than doubles
every 2 years. 38 Administrative data have been identified as an important
resource for the future of the statistical system, as some of these publicly
and privately held data may be analyzed or reported with survey data to
yield greater value. Furthermore, the increasing capacity to store and
process administrative data has facilitated this potential use.

For decades, agencies have been working to expand the use of
administrative data in conjunction with data collected from surveys, but
certain characteristics of administrative data make it difficult to use them
to replace surveys or sections of surveys administered to households and
individuals. There is interest in exploring how administrative data may be
used to improve data quality, hold down costs, and reduce respondent
burden. For example, as part of the redesign of the Consumer
Expenditure Surveys, BLS is investigating the potential for replacing
some portions of the survey with external sources of expenditure data to
reduce respondent burden and potentially improve data quality. However,
agencies we contacted have not replaced surveys or sections of surveys
administered to households and individuals with administrative data
because such data (1) are often not representative of a survey’s population of
interest; (2) may not correspond to information collected through survey
questions; (3) are vulnerable to program cancellation or changes; and (4)
may take a long time to obtain, which delays use and in some cases
could cause agencies to miss required reporting dates.

Administrative data currently show greater promise for supplementing
federal surveys. Indeed, the agencies we contacted identified four major
opportunities to enhance surveys with administrative data in order to
create efficiencies and enhance data quality. 39 Current uses of
administrative data include the following:




38
  John Gantz and David Reinsel, “Extracting Value from Chaos” (Framingham, Mass.:
IDC Go-to-Market Services, June 2011).
39
   For the purposes of our report, we focused on the use of administrative data with
surveys administered to households and individuals. However, agencies such as the
Census Bureau and BLS also use administrative data with business surveys to produce
business statistics. For example, by combining administrative and survey data, the
Census Bureau produces an annual series on employment by county, and BLS produces
its quarterly series of statistics on gross job gains and losses.




•   Creating new data products: Agencies link survey data and
    administrative data to create new, more robust, statistical data
    products, which increases efficiency in two key ways. First, according
    to OMB and agency officials, agencies can use these new data
    products to evaluate and potentially improve federal policies and
    programs, especially those related to the source of the administrative
    data, without adding to respondent burden. Second, combining
    administrative data with survey data can increase efficiency by
    enhancing previously collected survey data. For example, the National
    Center for Health Statistics’s (NCHS) record-linkage program links
    survey data from various health-related surveys to different
    administrative datasets to create new data products for studying
    factors that influence health-related outcomes, such as disability,
    health care, and mortality.
•   Supplementing surveys’ sample frames: Using administrative data
    to supplement surveys’ sample frames—the sources from which a
    survey’s sample is drawn—can create efficiencies, reduce costs, and
    enhance the quality of surveys. For example, the National Household
    Food Acquisition and Purchase Survey uses administrative data from
    the Supplemental Nutrition Assistance Program to develop a sample
    frame of participating households to potentially include in the survey.
    ERS officials said that using these data to help develop the survey’s
    sample frame costs less than the alternative of screening a broader
    group of respondents to determine if they are participating. In addition,
    agencies can use administrative data to augment sample frames in
    areas where the sample is not large enough to fully support a survey.
    For example, the Census Bureau’s pilot project studying the potential
    to use ACS data as a sample frame for the National Immunization
    Survey used commercial data to supplement ACS data in a county
    that had a limited ACS sample.
•   Comparing data to improve survey accuracy and design: By
    comparing survey data to similar administrative datasets and
    identifying reasons for any discrepancies that may exist, agencies can
    improve the quality of survey data. For example, researchers
    identified opportunities for improving surveys’ designs and
    methodologies after agencies found that surveys of enrollment in
    health-insurance programs provided lower estimates than those
    compiled from administrative data. Agencies can also improve the
    efficiency of their surveys by using administrative data as part of
    nonresponse follow-up activities.
•   Modeling estimates: Agencies combine administrative data and
    survey data to create, or model, estimates that are designed to be
    more accurate than estimates based on survey data alone. The main
    benefit of modeling is that it provides the ability to produce estimates
    for smaller geographic areas than is possible using a survey alone.
    For example, the Census Bureau conducts the Small Area Income
    and Poverty Estimates Program to provide updated data on poverty
    and income, which are used to administer federal programs and
    allocate federal funds to local areas. The Census Bureau combines
    survey data from the ACS with population estimates and
    administrative data and has found that this approach produces
    consistent and reliable data more reflective of current conditions than
    data produced only by existing surveys.
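
The modeling approach described in the last bullet can be illustrated with a
simplified composite (shrinkage) estimator. This is a minimal sketch and not
the Census Bureau's actual Small Area Income and Poverty Estimates
methodology. The function name, the weighting rule, and all numbers are
hypothetical. It assumes a direct survey estimate with a known sampling
variance and a prediction derived from administrative data with an assumed
model variance.

```python
def composite_estimate(direct, direct_var, model_pred, model_var):
    """Blend a direct survey estimate with a model-based prediction.

    The weight on the direct estimate grows as its sampling variance
    shrinks relative to the model variance (a standard shrinkage form).
    """
    if direct_var < 0 or model_var < 0 or direct_var + model_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    weight = model_var / (model_var + direct_var)
    estimate = weight * direct + (1 - weight) * model_pred
    # Variance of the blend under the usual shrinkage approximation
    # (ignores uncertainty in the estimated model parameters).
    variance = weight * direct_var
    return estimate, variance, weight


# Hypothetical county: small survey sample, so the direct estimate is noisy.
est, var, w = composite_estimate(
    direct=0.20,        # survey-based poverty rate (invented)
    direct_var=0.0016,  # sampling variance, i.e. standard error 0.04
    model_pred=0.15,    # prediction from administrative data (invented)
    model_var=0.0004,   # assumed model variance
)
print(f"weight on survey = {w:.2f}, blended rate = {est:.3f}")
```

In this sketch the noisy survey estimate receives a small weight, so the
blended rate sits closer to the administrative-data prediction. With a larger
survey sample (a smaller `direct_var`), the weight would shift back toward
the survey.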

Agencies Are Addressing Issues That Hamper Use of Administrative Data, but
Additional Actions Could Facilitate Progress

Despite the benefits of using administrative data to supplement federal
surveys, agencies face five key constraints related to data access and
quality:

•   Statutory restrictions on data sharing: Federal and state statutes
    sometimes prohibit or limit sharing of data for statistical purposes. In
    cases where specified authorized uses do not include statistical use,
    nothing short of a statutory change can overcome the constraint. In
    other cases, statutes limit sharing to purposes related to program
    administration. For example, the 2008 Farm Bill restricts access to
    data on participants in certain nutrition-assistance programs to uses
    for the “administration or enforcement” of the programs. 40 Similarly,
    the Higher Education Act of 1965, as amended, restricts federal
    student aid data to purposes related to the “application, award, and
    administration of aid.” 41 However, agencies holding such restricted
    data can differ on whether statistical uses are related to program
    administration. The Census Bureau successfully negotiated access to
    the nutrition assistance data because it could demonstrate that the
    linked data would help the federal sponsor and state agencies
    develop better measures of outcomes, such as poverty, inequality,
    and the receipt of government transfers. Conversely, the Census
    Bureau was unable to gain access to the federal student aid data for
    statistical uses because the Department of Education did not consider
    that any of the planned uses related to the program’s administration.



                            40
                              7 U.S.C. § 2020(e)(8)(A)(i). The Department of Agriculture administers the program at
                            the federal level through the Food and Nutrition Service, while state agencies administer
                            the program at the state and local levels, including determination of eligibility and
                            allotments.
                            41
                                20 U.S.C. § 1090(a)(3)(E).




•    Consent: Individuals’ consent to allow their administrative and survey
     data to be linked affects uses of administrative data for statistical
     purposes. Seeking consent derives from a core concept of personal
     privacy: the notion that each individual should have the ability to
     control personal information about himself or herself. 42 Moreover,
     there can be issues regarding the privacy and confidentiality of data
     collected for one purpose and used for another, and agencies use
     different practices, wording, and level of detail to meet consent
     requirements, according to OMB officials. At the time administrative
     data are collected, an agency can inform individuals that their data
     may be used for statistical purposes, but, according to ERS officials,
     agencies collecting administrative data often do not consider possible
     future statistical uses and therefore may not provide such notice.
     Obtaining consent after data have been collected can be time-
     consuming and costly. In addition, an agency can ask survey
     respondents for permission to link their survey data with certain
     administrative data. Some respondents may not consent, which can
     substantially limit the number of respondents eligible for linkage and
     as a result potentially affect the quality of the linked data. 43
•    Costs and infrastructure: Because the primary cost of collecting
     administrative data has already been incurred, using these data can,
     in some cases, be more efficient and less costly than new survey
     efforts. However, there still are costs to using administrative data for
     statistical purposes, including up-front and ongoing investments to
     purchase and maintain hardware and software to link data and protect
     their confidentiality. Agencies identified various factors that can affect
     costs. These include but are not limited to negotiations with the
     agency holding the data, the quality of the administrative data, and the
     ease with which they can be linked to other data. BLS officials said
     that in some cases the costs of using administrative data with survey
     data may outweigh any savings and that evaluation of administrative
     data options always requires careful consideration of a wide range of
     quality and cost issues, including the costs of specialized personnel
     and infrastructure. According to an FCSM study that profiled
     examples of successful statistical uses of administrative data,


42
  GAO, Record Linkage and Privacy: Issues in Creating New Federal Research and
Statistical Information, GAO-01-126SP (Washington, D.C.: April 2001).
43
  Agency officials noted that, if the survey respondents who consent differ from those who
do not consent, analysis of the linked files may lead to misleading or biased results. Also,
the reduced sample size from an analysis using data for those who consent may increase
confidence intervals for calculated estimates.




    agencies wanting to share data also may not have the necessary
    staff, policies, or procedures. For example, negotiating data-sharing
    agreements may require significant time. Moreover, many key
    administrative datasets are held by states, further complicating the
    data-sharing process because agencies have to negotiate under
    different policies and procedures as well as work with numerous staff
    across states.
•   Documentation of datasets: OMB and agency officials said that
    agencies holding administrative data do not uniformly document
    information about their datasets in a way that is always useful or
    efficient for use outside of the agency. This lack of documentation of
    datasets makes evaluating their potential for statistical uses
    challenging. For example, definitions of key variables of research
    interest or information about how frequently the agency updates the
    data may not be available. ERS officials also noted that private
    companies typically do not disclose detailed information about the
    sources of their data, making it difficult to assess their quality. As a
    result, agencies interested in using these data for statistical purposes
    may have to spend additional time and resources to understand the
    content and structure of the datasets.
•   Quality of data: Agency officials and experts identified reasons why
    the quality of administrative data can vary, which can affect their
    potential use with survey data. Specifically, different agencies may
    use different systems, definitions, and time frames when collecting
    administrative data. For example, states may collect and evaluate the
    quality of data in different ways, making it complicated to aggregate
    the data across states as well as to compare state-level data. In
    addition, several factors can influence the accuracy of data reported in
    administrative data. For example, agencies that collect data for the
    purpose of program administration may be concerned with the
    accuracy of only the variables used for such purpose. Moreover,
    reporting incentives may influence data quality. For example,
    individuals may underreport income on tax forms, and program
    agencies may pay less attention to the accuracy of information
    collected from applicants when it does not affect their participation in a
    program.
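
The consent constraint described above, together with the sample-size effect
noted in footnote 43, can be illustrated with a short sketch. The records,
field names, and values below are invented for illustration. The
normal-approximation interval is a simplifying assumption that ignores survey
weights and design effects.

```python
import math
import statistics

# Hypothetical survey records: id, reported income, and linkage consent.
survey = [
    {"id": 1, "income": 32000, "consent": True},
    {"id": 2, "income": 41000, "consent": False},
    {"id": 3, "income": 28000, "consent": True},
    {"id": 4, "income": 55000, "consent": True},
    {"id": 5, "income": 47000, "consent": False},
    {"id": 6, "income": 36000, "consent": True},
]
# Hypothetical administrative records keyed by the same id.
admin = {1: {"benefit": 1200}, 3: {"benefit": 1800}, 4: {"benefit": 0},
         5: {"benefit": 300}, 6: {"benefit": 900}}


def link_with_consent(survey_rows, admin_rows):
    """Link only respondents who consented and who have an admin match."""
    return [
        {**row, **admin_rows[row["id"]]}
        for row in survey_rows
        if row["consent"] and row["id"] in admin_rows
    ]


def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% interval half-width."""
    mean = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width


linked = link_with_consent(survey, admin)
full_mean, full_hw = mean_with_ci([r["income"] for r in survey])
link_mean, link_hw = mean_with_ci([r["income"] for r in linked])
print(f"all respondents:   n={len(survey)}, mean={full_mean:.0f} +/- {full_hw:.0f}")
print(f"linked consenters: n={len(linked)}, mean={link_mean:.0f} +/- {link_hw:.0f}")
```

In this invented data the two non-consenting respondents report higher
incomes than the rest, so the linked file has both a wider interval and a
lower mean than the full survey. This is the pattern agency officials
described: a smaller, possibly unrepresentative set of records available for
analysis.
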
Agencies and interagency committees have been taking numerous
actions to address these constraints. For example, ERS, in collaboration
with the Census Bureau, NCHS, and OMB, is undertaking a pilot project to
address data quality concerns with state-level administrative data, and
FCSM is working on a project to clarify legal requirements for informed
consent (see fig. 3).




Figure 3: Actions Taken to Address Constraints That Hamper Greater Use of
Administrative Data

[Interactive graphic not reproduced. It shows five constraints to statistical
uses of administrative data: statutory restrictions on data sharing, consent,
costs and infrastructure, data documentation, and quality of data. The one
view captured in this rendition follows.]

Constraint: Access
Statutory restrictions on data sharing

Description: Statutes may not authorize statistical uses for data collected
by a program.

Examples of actions taken:
•   OMB has issued guidance encouraging greater sharing of data while
    protecting privacy; and
•   FCSM produced a document highlighting lessons learned when negotiating
    data-sharing agreements.

Source: GAO analysis of OMB and principal statistical agency data.

A text version of this graphic is available in appendix IV.

One theme that cuts across many of these efforts, and where additional
short-term actions could accelerate progress, is identifying ways to
facilitate the process of deciding whether to share data among agencies.
FCSM published a paper describing successful data-sharing
arrangements between various federal and state agencies. One of the
four core elements of success that FCSM identified in these
arrangements was mutual interest, in that each participant—in particular
the agency providing the data—evaluates a proposed data-sharing
agreement from its own perspective. 44 On the one hand, agencies may
share data because the linked data can benefit program administration,
as noted earlier. On the other hand, OMB and agency officials noted that
agencies may decide against sharing because perceived disadvantages,
such as policy concerns and potential identification of weaknesses in
program administration, outweigh the possible benefits. In such a case,
an individual agency’s interests may be at odds with the broader
efficiency of the whole federal statistical system. As illustrated in figure 3,
FCSM and agencies are developing tools to approach these decisions in
a more standardized way, such as developing checklists for evaluating
the quality of administrative data and a template for executing data-
sharing agreements. However, these individual tools focus on particular
aspects of data sharing—for example, the checklist focuses on data
quality. Separately, they may not be sufficient for agencies to efficiently
identify potential datasets with the greatest potential for mutual benefit
and address all factors involved in the decision-making process.

The benefits of having more-comprehensive centralized guidance could
include greater consistency, clarification, and efficiency. A more-
comprehensive standardized framework that ties together existing tools
with additional resources in order to cover major aspects of the data-
sharing process could bring consistency to the decision-making process.
Similar to the checklist that FCSM is developing for agencies to use in
evaluating data quality, the framework could include a template outlining
a list of key questions for all agencies involved in the proposed data
sharing, including federal and state agencies that hold data, to address
issues such as: (1) the steps to take to ensure data reliability; (2) any
statutory limitations on planned uses of the data (including confidentiality
protections); (3) whether consent has already been obtained for additional



44
  The three other core elements of success in these arrangements were (1) vision and
support by agency leadership, (2) narrow but flexible goals, and (3) infrastructure.




use of the data, or how it will be obtained; and (4) methods to fully
account for the costs associated with obtaining and using the data. To be
comprehensive, such guidance would not need to be voluminous, but it
should identify each of these major aspects of data sharing, provide
advice to agencies, and reference any tools available to assist agencies
during the process. It should also be kept up-to-date, reflecting changes
in legislation or other factors that affect data sharing, as well as any new
tools that are developed. Although such a framework may not lead to
sharing in all cases, the framework could better ensure that agencies
weigh the related benefits and costs in a more balanced, consistent, and
transparent fashion. Such guidance could also clarify ways that agencies
could resolve disagreements over data sharing. It could also improve
efficiency, given that agency officials we spoke with cited examples in
which it took multiple years to reach a resolution on data sharing, by
helping agencies evaluate available data and determine those that have
the greatest potential for mutual benefit.

While agencies can take steps to address some constraints on sharing
data, in other cases only policy actions on the part of the executive
branch or Congress can lift barriers. One of the primary examples of such
action is Congress’s enactment of CIPSEA in 2002, which authorized the
Census Bureau to share selected business data with BLS and the Bureau
of Economic Analysis for statistical purposes. However, CIPSEA is limited
because the Census Bureau’s business data are based in large part on
tax data, and as a result the tax code would need to be amended for the
Census Bureau to also share these data with other statistical agencies.
There have been proposals to amend the tax code to further expand the
scope and coverage of CIPSEA, but action has not yet been taken by
Congress. 45




45
  As discussed in our recent report, Taxpayer Privacy: A Guide for Screening and
Assessing Proposals to Disclose Confidential Tax Information to Specific Parties for
Specific Purposes (GAO-12-231SP), Internal Revenue Code Section 6103 provides that
federal tax information is to be kept confidential and used to administer federal tax laws
except as otherwise specifically authorized by law.




Prospects for
Enhanced Use of the
ACS with Other
Surveys Are Mixed

The ACS Provides Unique Coverage of the Nation’s Population

The Census Bureau’s full implementation of the ACS in 2005 was a major
change to the statistical system. The survey is unique among surveys of
households and individuals because of its size: the monthly surveys add up
to an annual sample of 3.54 million addresses. The ACS provides annual
estimates of social and economic characteristics for all areas of the
country and is a primary source of information on small areas, such as
towns and tribal lands, down to the neighborhood level. The ACS covers a
broad range of topics, such as housing, education, and employment. The
information provided by the ACS was previously only available once a decade
from the decennial census long form, which the ACS replaced. Users of ACS
information include all levels of government, the private and nonprofit
sectors, and researchers. According to the Census Bureau, ACS estimates are
currently used to help allocate more than $400 billion in federal funding
annually. Table 2 lists some of the key characteristics of the ACS.




Table 2: Key Characteristics of the ACS

Characteristic                      ACS
Response requirements               Responses are required by law
Frequency of administration         Administered on a monthly basis
Frequency of data products          Annual
Reference point for data products   Period estimates: the period over which data are
                                    cumulated is determined by the population of the
                                    geographic area for which the estimate applies.
                                    Estimates for places with populations of more than
                                    65,000 represent a 1-year period; places with
                                    populations of 20,000 to 65,000 represent 3-year
                                    periods; and places with populations smaller than
                                    20,000 represent 5-year periods.
Number of questions                 48 potential questions per person, plus 21 per
                                    housing unit [a]
Respondent burden                   38 minutes per respondent [b]
Sample size                         3.54 million addresses per year
Key uses                            Directing government funding, informing government
                                    and private-sector decision making, and research

Source: Census Bureau.

[a] Although there are 48 individual questions on the ACS, several questions only
apply to respondents with certain characteristics, so respondents likely do not
answer every question. For example, only ACS respondents who are female and age
15 to 50 are asked to answer a question about whether they have given birth in
the past year.

[b] The Census Bureau estimates that the respondent burden is 38 minutes for the
questionnaire it administers to households. Its estimates for other interviews,
such as group quarters, are different.
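
The reference-period rule in table 2 can be written as a small lookup. This
is a sketch of the thresholds exactly as the table states them. The function
name is hypothetical, and the treatment of a population of exactly 65,000
follows the table's wording ("more than 65,000" for 1-year estimates).

```python
def acs_estimate_period_years(population):
    """Return the reference period, in years, per the thresholds in table 2."""
    if population < 0:
        raise ValueError("population must be non-negative")
    if population > 65_000:
        return 1
    if population >= 20_000:
        return 3
    return 5


# Hypothetical place populations, chosen to exercise each threshold.
for place_population in (250_000, 65_000, 20_000, 4_500):
    years = acs_estimate_period_years(place_population)
    print(f"population {place_population:>7,}: {years}-year estimate")
```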


Agencies Use the ACS to Inform the Design of Other Surveys and Analyze
Their Results

Several of the ACS’s characteristics add to its appeal for use with other
surveys, including that it produces annual estimates on a broad range of
topics at finer geographic levels than other surveys. Agencies and others
identified five areas of opportunity in which surveys can make use of ACS
data and resources. Two of these areas, which generally rely on publicly
available ACS estimates and do not require changes to the survey’s design
or methodology, have the greatest potential for widespread use. The Census
Bureau has provided users with various resources to guide their use of ACS
estimates. These include a guide to comparing estimates, handbooks directed
to specific types of users, training presentations, and a tutorial. The two
areas with the most potential for use are as follows:

•   Evaluating and supplementing other surveys’ results: Survey
    administrators and data users can use ACS estimates to
    evaluate information collected by other surveys. For example, survey
    administrators can use ACS information to evaluate the quality of


    responses to other surveys that include some questions that are the
    same as or similar to ACS questions. Additionally, data users and
    survey administrators can use ACS data to supplement information
    collected by other surveys. For example, a recent report based on
    analysis of ACS data describes how median earnings vary by the field
    in which people obtain their bachelor’s degrees. 46 Such information
    can complement results from other surveys. In this case, NCSES also
    produces information on earnings by degree type, based on
    information in its Scientists and Engineers Statistical Data System
    database, which contains data on people with a science or
    engineering degree and those who work in related fields. NCSES
    information collection is less frequent than ACS estimates and
    pertains to a more-narrowly defined population, but allows more
    detailed analysis of issues such as how people use their college
    degrees at work. Together, these two sources of information offer
    more-timely and more-detailed information than a single source.
•   Designing other surveys: There is also widespread potential to use
    ACS data to more-efficiently design other surveys. Because many of
    the topics included in the ACS are covered in more detail by other
    surveys or relate to other surveys’ target populations, survey
    administrators can use ACS estimates at different demographic or
    geographic levels to stay up-to-date on changes that may affect their
    surveys. These estimates can also be used when designing and
    selecting a survey’s sample. Census Bureau officials told us that,
    when designing a survey, survey administrators can use the data to
    guide the selection of a survey’s sample so that it better represents
    individuals or households with certain characteristics. For example,
    the Survey of Income and Program Participation can use ACS
    estimates at different demographic or geographic levels to identify and
    more-efficiently sample geographic areas with disproportionately large
    numbers of low-income households because this is a population of
    interest for the survey. Because these data are available for small
    geographic areas, agencies can use the data when samples include
    more-local geographic levels.




46
  Anthony P. Carnevale, Jeff Strohl, and Michelle Melton, What’s It Worth? The
Economic Value of College Majors, Georgetown University Center on Education and the
Workforce (Washington, D.C.: May 24, 2011).




Uses of ACS That Require Design and Methodology Changes Have Limited Potential

                           Agencies and others identified three uses of ACS data and resources
                           that, while offering potential benefits to other surveys, face such
                           constraints that more widespread use is likely not possible under current
                           ACS design. These uses are more intensive than the ones described
                           above, in that they affect the survey’s design and methodology or
                           respondent burden, or both. Because the ACS has a large sample size
                           and a complex methodology, there are logistical challenges involved in
                           changing its design and methodology. Additionally, any changes that
                           affect the survey’s respondent burden also have limited potential, as there
                           are already concerns about the burden that the ACS places on
                           respondents. Uses with more-limited potential are as follows:

                           •   Adding or modifying ACS content: Adding a question to the ACS or
                               modifying existing questions can improve the efficiency of other
                               surveys, though doing so involves trade-offs with factors such as
                               respondent burden. This use of ACS could provide information that
                               would inform the design of other surveys or facilitate the use of ACS
                               data for another survey’s sample frame. For example, NCSES worked
                               with the Census Bureau to add a question to the ACS about the field
                               in which respondents earned their bachelor’s degrees in order to
                               identify respondents who are in the target population for the National
                               Survey of College Graduates. Despite the potential benefits of adding
                               or modifying ACS content, adding a question to the ACS would
                               increase respondent burden and have operational impacts, as it
                               requires the Census Bureau to change the questionnaire design and
                               processing and editing systems. If these actions result in additional
                               pages for the questionnaire, it could affect costs and the response
                               rate. Modifying questions poses an additional challenge because ACS
                               estimates reflect multiple years of data, and a change in a question
                               may affect the Census Bureau’s ability to cumulate data.
                           •   Adding supplements to the ACS: Another possible use of the ACS
                               by other surveys is adding supplements to the ACS, though this use
                               faces several obstacles. While the ACS currently does not include
                               supplements, doing so could enable surveys to leverage the
                               resources of the ACS. Other surveys, such as the Current Population
                               Survey, allow other agencies or entities to sponsor supplemental
                               surveys that are added on to the survey’s core set of questions.
                               According to officials at BLS, which sponsors the Current Population
                               Survey, in their experience it costs less to add a supplement to an
                               existing survey than to conduct a separate stand-alone survey.
                               Additionally, the agencies sponsoring the supplements gain the
                               benefit of the experience of BLS or Census staff, or both, in designing
                               and implementing surveys. Although the Current Population Survey
                               successfully incorporates supplements, the ACS is different in several


     key ways, and adding supplements to the ACS would involve
     significant challenges. For example, the ACS is mandatory, meaning
     that responses are required by law. Assuming a supplement to the
     ACS would be voluntary, Census Bureau officials told us that they
     would have to determine how to distinguish between the mandatory
     and voluntary sections, which would create complexity. Additionally,
     the Census Bureau processes ACS data on a yearly basis and does
     not have a process in place for producing estimates from a single
     month’s data, which would be a challenge if the supplement was
     administered along with only 1 month’s ACS mailout. Finally, including
     supplements raises concerns about respondent burden and
     respondent fatigue. 47 BLS officials noted the potential of matrix
     sampling, in which a set of additional questions, as in a supplement, is
     added to a month of collection (or all months) but differs from a
     supplement in that it is only administered to a subset of the survey
     sample in a given collection period. This option could reduce burden
     and increase efficiency; however, such an option involves logistical
     considerations in administering the survey and processing the data,
     and adds complexity for analysts using the data for research.
•    Creating sample frames: Using ACS data to develop sample frames
     for follow-on surveys has been identified as a potential use of ACS
     data, but several factors limit this use. 48 This involves using ACS data
     to identify ACS respondents with certain characteristics for potential
     inclusion in a follow-on survey and requires the approval of the
     Census Bureau and OMB. 49 At present only NCSES uses ACS data
     for this purpose. Agency officials told us that using ACS data to create
     a sample frame, as opposed to census long-form data, which they
     used previously, has improved the agency’s coverage of its target




47
  Respondent fatigue occurs when respondents become tired of being surveyed and
become more prone to refusal or the quality of their responses deteriorates.
48
  A follow-on survey is one that is sent to ACS respondents after they have completed the
ACS. Census Bureau policy prohibits sending an ACS respondent a follow-on survey
within 6 months of his or her ACS interview.
49
  Using ACS for sample frame development is more intensive than using ACS estimates
to inform the design of a survey’s sample frame, which does not involve contacting ACS
respondents again. In determining whether a survey can use ACS data for its sample
frame, the Census Bureau’s and OMB’s policy is to give priority to surveys that meet
certain criteria, including those that could substantially reduce costs by doing so and those
that produce estimates for populations that would otherwise have prohibitively expensive
screening costs.




    population and has reduced costs and respondent burden. 50 Another
    benefit of this use is expanded analysis, as agencies, under
    appropriate Title 13 restrictions, can analyze respondents’ answers to
    the ACS along with responses to the follow-on surveys, and can
    analyze the characteristics (from ACS data) of those who do and do
    not respond to the follow-on survey to determine if they have different
    characteristics, which might cause bias in the survey. Surveys such
    as the National Survey of College Graduates that focus on
    populations that are costly to identify are likely to realize higher gains
    in efficiency from using ACS data for this purpose. Despite these
    benefits, opportunities for other surveys to use ACS data for this
    purpose are limited. ACS’s sample size, although large compared to
    most surveys, can be too small for another survey to use for a
    sampling frame. This is especially an issue if a survey targets a rare
    population or targets members similar to those of surveys already
    drawing from the ACS for their frame, because there would be too
    much chance of drawing individuals into both follow-on surveys, and
    current policy does not allow for that. Census Bureau policy states
    that, when agencies conduct follow-on surveys, they may not contact
    any member of a household that has already responded to the ACS
    and also had a member selected for a follow-on survey. With certain
    households in the ACS excluded from potential selection, it becomes
    more difficult for other surveys to draw samples because the data no
    longer reflect respondents with certain characteristics.
In the long run, more-intensive uses of ACS data and resources may
require difficult decisions and entail trade-offs with factors such as cost
and respondent burden. Further, they risk affecting ACS response rates
and overall data quality. However, redesign of the scope and
methodology of ACS might overcome some of these constraints. After the
release of the survey’s first 5-year data products in 2010, the Census
Bureau and others began evaluating the survey and exploring options for
increased uses. In addition to its own evaluation of the ACS, at the
Census Bureau’s request the National Academy of Sciences is organizing
workshops with data users to assess the survey. Also, OMB, in
cooperation with the Census Bureau, created an ACS subcommittee of



50
  The census long form did not include a question that asked for the field in which
respondents received their bachelor’s degree. NCSES used long-form data to identify
respondents who had characteristics that made them likely to be in the survey’s target
population, but it had to screen a larger sample in order to identify those who in fact
belonged to the target population.




              the ICSP with the goal of investigating trade-offs of options such as
              adding questions to the ACS and rotating questions in and out of the
              survey. If the Census Bureau changes the survey’s design or
              methodology, these changes may become more feasible.


Conclusions

              To ensure the provision of high-quality, timely statistical data for public-
              and private-sector users, OMB and the agencies that make up the federal
              statistical system must continue to identify opportunities for efficiency in
              federal surveys of households and individuals. Most of the surveys and
              other information collections in our scope have relatively modest costs,
              but challenges such as declining survey response rates will strain
              available resources unless agencies find more-effective and less-costly
              ways to collect and analyze the needed information, while maintaining
              critical protections of respondents’ privacy and confidentiality. In the long
              term, addressing the key challenges and constraints that agencies have
              identified will necessitate broader public debates and policy decisions
              about balancing trade-offs among competing values, such as quality,
              cost, timeliness, privacy, and confidentiality. In the short term, our review
              indicated that two promising avenues to sustain the progress that OMB
              and agencies are making include (1) facilitating collaboration and
              coordination among agencies and (2) combining existing data from both
              survey and administrative sources.

              The federal statistical system already exhibits many collaborative traits
              and practices, in particular through projects sponsored by OMB and
              interagency committees that facilitate coordination and the development
              of new policies and tools. However, additional efforts could help enhance
              the effectiveness of these efforts. Going forward, it will be important for
              OMB to supplement existing guidance to clarify the range of options
              available to address PRA standards. Supplementing the guidance could
              increase agencies’ awareness of these options, in particular those that
              were cited less frequently. At the same time, interagency committees
              could do more to improve accessibility and timeliness of their work
              products. Doing so could maximize the usefulness of committees’ work.
              Additionally, OMB’s ability to oversee and coordinate information
              collections across the government would benefit from additional steps to
              ensure the reliability of data on collections’ costs and burdens. Doing so
              would also benefit users of the information, whether they access it
              through the website or through OMB reports.

              Agencies identified multiple ways that combining survey and
              administrative data can improve the efficiency and quality of their work,


                      and they are already pursuing such opportunities. Importantly, they have
                      demonstrated that using existing datasets to supplement each other can
                      add value for all agencies involved in data sharing. But agencies also
                      face serious constraints to expanded uses of existing data. One of the
                      more-significant barriers is the complexity of the process through which
                      they make decisions about sharing data. Though agencies and
                      interagency committees are working to create tools to facilitate parts of
                      this process, more-comprehensive and centralized guidance for agencies
                      to follow when negotiating and making decisions regarding data-sharing
                      opportunities could help facilitate the process.

                      A standard protocol or framework could accelerate progress in this area
                      by helping agencies to (1) evaluate the growing array of administrative
                      data to identify those datasets that have the greatest potential for mutual
                      benefit of the participating agencies, and (2) consider a common set of
                      criteria and key questions when weighing the pros and cons of sharing
                      data. A key benefit would be to encourage agencies to consider, in a
                      uniform manner, all relevant aspects of these decisions, such as whether
                      or not proposed uses would be consistent with applicable law, maintain
                      confidentiality protections, be cost-effective, and serve to increase the
                      broader efficiency of the federal statistical system.

Recommendations for Executive Action

                      In order to maintain progress in maximizing the efficiency of existing data
                      sources, we recommend that the Director of OMB, in consultation with the
                      Chief Statistician, work with the ICSP to take the following four actions:

                      To improve the broader efficiency of the federal statistical system and
                      improve communication among agencies and others,

                      •   when OMB next updates guidance on agency survey and statistical
                          information collection and dissemination methods, include additional
                          details on actions agencies can take to meet requirements to identify
                          duplication, to consult with persons outside of the agency, and
                          to address other requirements as appropriate; and
                      •   create new methods or enhance existing methods to improve the
                          dissemination of information and resources produced by interagency
                          statistical committees. For example, such enhancements could
                          include increasing the timeliness and availability of information on
                          websites to better capture the full range of products and identify
                          committee priorities.
                      To increase the reliability of the information presented on the Reginfo.gov
                      website and in OMB’s internal system,



                     •   implement quality-control procedures designed to identify and remedy
                         any differences between cost and burden information provided on the
                         website and in the related supporting statement documentation that
                         underlies this information.
                     To accelerate progress in sharing administrative data for statistical
                     purposes, where appropriate,

                     •   develop comprehensive guidance for both statistical agencies and
                         agencies that hold administrative data to use when evaluating and
                         negotiating data sharing. Such guidance should include key questions
                         focused on issues such as statutory authority, confidentiality, cost,
                         and usefulness in order to ensure agencies consider all relevant
                         factors and the broader interest of the federal government.

Agency Comments and Our Evaluation

                     We provided a draft of this report to the Secretaries of Commerce and
                     Health and Human Services, the Director of OMB, the Commissioner of
                     BLS, the Administrator of ERS, and the Director of the National Science
                     Foundation for their review and comment. We received written comments
                     on the draft report from the Secretary of Commerce that are reprinted in
                     appendix V. We also received comments from OMB staff that are
                     summarized below. The Department of Health and Human Services,
                     BLS, National Science Foundation, OMB, and agencies on the ICSP also
                     provided technical comments and suggestions that we incorporated as
                     appropriate.

                     Commerce stated that our observations illuminate future opportunities for
                     using administrative records within the federal statistical system to
                     increase efficiency and better meet informational needs and that our
                     suggested actions would enhance the ability of statistical agencies to
                     realize these opportunities. Regarding our recommendation on standard
                     protocols and procedures to facilitate data sharing, the department noted
                     that policies and other initiatives can also play a role in achieving
                     cooperation. Finally, the department noted that our report’s
                     acknowledgement of related concerns about the quality of administrative
                     data, and the level of support and resources necessary to maintain a
                     statistical and administrative data infrastructure, underscores the
                     importance of our recommendations.

                     OMB generally agreed with our recommendations and said that the
                     agency hopes to pursue these in the future. More specifically,

                         •     OMB agreed that it is worth considering good practices for
                               reducing duplication. As we suggested, OMB indicated that when


          its survey guidance is next updated it will include additional details
          and examples of actions agencies can take to identify duplication
          and consult with persons outside the agency.
    •     OMB said that it shared our concerns about timely and easily
          accessible dissemination of information resources produced by
          interagency statistical committees, and that our recommendation
          underscores the need for addressing this issue.
    •     With respect to our recommendation that OMB implement quality-
          control procedures designed to identify and remedy any
          differences between cost and burden information provided on
          Reginfo.gov and in the related supporting statement
          documentation that underlies this information, OMB noted that
          PRA requires OMB to weigh the burdens imposed on the public by
          information collections against the legitimate needs of the federal
          agencies. OMB said that this requires a careful assessment of the
          estimates of paperwork burden that agencies provide to OMB as
          part of their information collection requests and, further, that these
          estimates are subject to public scrutiny and comment in Federal
          Register notices, in the PRA statements provided on information
          collections, and on Reginfo.gov. OMB pointed out that, because
          the burden estimates provided on Reginfo.gov and in the
          underlying supporting statements are all made public,
          discrepancies such as those found by us are public as well. OMB
          said that it will investigate and address any such discrepancies
          that are brought to its attention by GAO or any member of the
          public.
    •     Finally, OMB concurred that administrative records can be a
          valuable supplement to, though usually not a replacement for,
          household surveys. OMB believes that our recommendation to
          develop comprehensive guidance for statistical and administrative
          agencies to use when evaluating and negotiating data-sharing
          agreements would be constructive, but cautioned that this involves
          a very complex set of issues and said it will take some time to
          develop such guidance.


As agreed with your office, unless you publicly announce the contents of
this report earlier, we plan no further distribution until 30 days from the
report date. At that time, we will send copies to the Commissioner of the
Bureau of Labor Statistics (BLS), the Director of the U.S. Census Bureau,
the Administrator of the Economic Research Service (ERS), the Secretary
of Health and Human Services, the Director of the National Science



Foundation, the Director of OMB, the Secretary of Commerce, and the
Under Secretary of Economic Affairs. In addition, the report will be
available at no charge on the GAO website at http://www.gao.gov.

If you or your staff have any questions concerning this report, please
contact Robert Goldenkoff at (202) 512-2757 or goldenkoffr@gao.gov, or
Ronald S. Fecso at (202) 512-7791 or fecsor@gao.gov. Contact points
for our Offices of Congressional Relations and Public Affairs may be
found on the last page of this report. Key contributors are listed in
appendix VI.

Sincerely yours,




Robert Goldenkoff
Director, Strategic Issues




Ronald S. Fecso
Chief Statistician




Appendix I: Scope and Methodology




             The objectives of this report were to (1) review the ways in which the
             Office of Management and Budget (OMB) and agencies identify
             opportunities for improvement and increased efficiency of selected
             information collections; (2) evaluate opportunities and constraints for the
             statistical agencies to use administrative data in conjunction with selected
             surveys; and (3) evaluate ways in which American Community Survey
             (ACS) data and resources can be used in selected surveys, and the
             associated benefits and constraints.

             To achieve our objectives, we focused on statistical information
             collections administered to households and individuals and subject to the
             Paperwork Reduction Act (PRA), which requires OMB approval of certain
             federal data collections. Although in many cases the information and
             views provided by agencies during our review and our general findings
             may also apply to statistical information collections outside of our scope,
             such as those administered to businesses, all of the specific collections
             and surveys we reviewed were administered to households and
             individuals. The majority of the collections within our scope include a
             survey, though some also include other methods of information collection
             such as focus groups. To examine the issues related to our objectives,
             we performed case studies of five federal surveys: the Consumer
             Expenditure Surveys, sponsored by the Bureau of Labor Statistics; the
             National Health and Nutrition Examination Survey and the National Health
             Interview Survey, both sponsored by the National Center for Health
             Statistics, part of the Centers for Disease Control and Prevention; the
             National Survey of College Graduates, sponsored by the National Center
             for Science and Engineering Statistics, part of the National Science
             Foundation; and the Survey of Income and Program Participation,
             sponsored by the Census Bureau. We selected these surveys based on
             several factors, such as their size and cost and whether they use or have
             the potential to use administrative data or ACS data.

             For the first objective, to review the ways in which OMB and agencies
             identify opportunities for improvement and increased efficiency of
             selected statistical information collections, we examined the PRA, OMB
             guidance to agencies, and prior GAO work on the federal statistical
             system. 1 We interviewed officials at OMB and the four agencies that



             1
              GAO, Federal Information Collection: A Reexamination of the Portfolio of Major Federal
             Household Surveys Is Needed, GAO-07-62 (Washington, D.C.: Nov. 15, 2006).








administer the case-study surveys to learn about coordination among
agencies, efforts agencies take to identify improvement, and OMB’s role.
We also interviewed officials at the Department of Agriculture’s Economic
Research Service, which is a member of several interagency statistical
committees and the lead agency for the Statistical Community of Practice
and Engagement. In addition, we interviewed experts on the federal
statistical system to learn about their perspectives on the efficiency of the
federal statistical system and agency and OMB coordination. In
evaluating OMB, agency, and interagency actions, we used as criteria the
requirements of the PRA and practices identified in prior GAO work on
agency collaboration. 2

To address the second objective, to evaluate opportunities and
constraints for agencies to use administrative data in conjunction with
selected surveys, we reviewed statutes that govern the sharing and use
of administrative data, documentation from case-study surveys, and
various papers and reports. We interviewed officials at OMB and experts
in the field of federal statistics to learn about their perspectives on the
current and potential uses of administrative data. We also interviewed
officials at the Economic Research Service and the agencies that sponsor
the case-study surveys to learn about ways in which their surveys use or
could potentially use administrative data. For this objective and the third,
we used OMB guidance, relevant statutes, and prior GAO work as criteria
in our evaluation.

For the third objective, to evaluate the ways in which ACS data and
resources can be used in selected surveys, we reviewed Census Bureau
documentation, National Science Foundation reports, prior GAO work,
and reports issued by the Committee on National Statistics. We
interviewed officials at the Census Bureau, which sponsors the ACS, and
at OMB to learn about their perspectives on potential uses of the survey
and its data. We also interviewed officials at the Economic Research
Service and the agencies that administer the case-study surveys to learn
about ways in which their surveys use or could potentially use ACS data
and resources, and experts in the field of federal statistics to learn about
their assessment of the uses and potential uses of the ACS.




2
 GAO, Results-Oriented Government: Practices That Can Help Enhance and Sustain
Collaboration among Federal Agencies, GAO-06-15 (Washington, D.C.: Oct. 21, 2005).








To gain a broader perspective on the information collections in our scope
and to inform our work across all three objectives, we obtained and
analyzed publicly-available data from Reginfo.gov, a government website
that provides access to information on agency requests for OMB approval
of information collections. We used the website’s search feature to
download all of the collections that were classified as (1) active, meaning
that they are currently approved by OMB for use by agencies; (2)
employing statistical methods; and (3) directed to households and
individuals. We downloaded data on all information collections that met
these criteria from Reginfo.gov on two dates, May 17, 2011, and
September 22, 2011.

We performed more in-depth analyses of the 507 information collections
in our May 17, 2011, download. First, we reviewed the supporting
statements for each of these collections, and on the basis of information
in these documents classified them according to the subject matter on
which they focus. 3 Next, we grouped the collections into categories,
based on information on the sponsoring agency in Reginfo.gov and the
supporting statements. Depending on the sponsoring agency, we put the
collections into one of four categories: (1) those that are sponsored by
one of the 13 principal statistical agencies; (2) those that are sponsored
by another agency that shares a parent agency with one of the 13
principal statistical agencies (for example, agencies in the Department of
Health and Human Services would fall into this category because it is the
parent agency of the National Center for Health Statistics); (3) those that
are not a principal statistical agency and do not share a parent agency
with one; and (4) unknown, for those whose sponsoring agency we could
not determine based on the available information. We also used the
information in the supporting statements to determine if the collections
included a survey component and found that 481 of the 507 did.

We divided the 481 collections that included a survey component into
three strata that reflect the type of sponsoring agency. We could not
determine the agency type for 7 of the 481 collections, so we dropped
these records, leaving a population of 474 statistical information
collections. The number of collections by stratum is shown in table 3. In


3
 Agencies include supporting statements with each request for approval of an information
collection. These statements must follow a prescribed format and include specified
information such as the circumstances that make the collection necessary and how, by
whom, and for what purpose the information will be used.








order to estimate the prevalence of certain characteristics in this
population—for example, the percentage of information collections for
which the sponsoring agency reported steps taken to identify potential
duplication—we drew a stratified sample of 106 collections. Within each
stratum, we estimated the sample size required to yield a 95 percent
confidence interval of plus or minus 14 percent around such an estimate.
For the overall population of 474, the approximate precision for an
estimated percentage of 50 percent is plus or minus 8.4 percent, at the 95
percent level of confidence.

Table 3: Number of Collections, by Stratum

 Stratum                                                     Stratum population   Sample size
 Principal statistical agency                                                60            27
 Nonstatistical agency                                                      139            37
 Nonstatistical agency that shares a parent agency with a
 statistical agency                                                         275            42
 Total                                                                      474           106
Source: GAO analysis of OMB data.
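
The precision figures reported above can be approximately reproduced with the standard margin-of-error formula for a proportion under simple random sampling with a finite population correction. The report does not state the formula that was used, so the sketch below is an assumption: the function name `margin_of_error`, the estimated percentage of 50 percent, and the multiplier of 1.96 are illustrative choices, and the overall figure treats the 106 sampled collections as a single simple random sample of the 474.

```python
import math


def margin_of_error(population, sample, p=0.5, z=1.96):
    """Half-width of an approximate 95 percent confidence interval for a
    proportion, assuming simple random sampling without replacement."""
    # Finite population correction: shrinks the variance when the sample
    # is a large share of the population.
    fpc = (population - sample) / (population - 1)
    return z * math.sqrt(p * (1 - p) / sample * fpc)


# Stratum populations and sample sizes taken from table 3.
strata = {
    "Principal statistical agency": (60, 27),
    "Nonstatistical agency": (139, 37),
    "Nonstatistical agency sharing a parent agency": (275, 42),
}

for name, (population, sample) in strata.items():
    print(f"{name}: +/- {margin_of_error(population, sample):.1%}")

# Assumption: the overall figure ignores stratification.
overall = margin_of_error(474, 106)
print(f"Overall: +/- {overall:.1%}")
```

Under these assumptions, each stratum's value is close to 14 percent and the overall value is close to 8.4 percent.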



We reviewed the supporting statements of each of the information
collections in our sample of 106, focusing on agencies’ reported efforts to
identify duplication and to consult with persons outside the agency to
obtain their views. Because agencies follow a standard format in
preparing supporting statements, we focused our analysis on the sections
of the supporting statements in which OMB instructs agencies to include
this information (sections 4 and 8, respectively, of section A of the
supporting statement). To review agencies’ reported actions, we used a
data-collection instrument that contained a series of “yes-no” questions
about the types of efforts reported. For example, we reviewed whether
agencies had reported considering administrative data as a potential
source of duplication, and whether agencies reported that they had
consulted with other agencies when describing consultations outside of
the agency. We did not evaluate whether agencies actually took the
actions they reported taking. Estimates produced from the sample of the
collections are subject to sampling error. We express our confidence in
the precision of our results as a 95 percent confidence interval. This is the
interval that would contain the actual population value for 95 percent of
the samples we could have drawn. As a result, we are 95 percent
confident that each of the confidence intervals in this report will include
the true values in the study population.







We took several steps to evaluate the reliability of the data we accessed
through the Reginfo.gov website. We interviewed OMB officials and
reviewed documentation of the Reginfo.gov website and the Regulatory
Information Service Center and OIRA Consolidated Information System, 4
which is the system that agencies use to track information collection
requests and that underlies information provided on Reginfo.gov. As part
of our review of the subject matter of the collections in the May 17, 2011,
download, we confirmed that the collections were within our scope. We
also used the information in the September 22, 2011, download to
evaluate the reliability of data on collections’ cost and annual burden. To
do this, we drew a systematic random sample of 56 (approximately 10
percent) of the 555 collections in the download.
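
A systematic random sample of this kind can be sketched as follows. The report does not describe the exact selection procedure, so the fractional sampling interval, the random start, the seed, and the function name `systematic_sample` are all assumptions made for illustration.

```python
import random


def systematic_sample(population_size, sample_size, rng=random):
    """Return 0-based positions selected by systematic sampling with a
    random start and a fractional interval."""
    step = population_size / sample_size      # about 9.9 for 56 of 555
    start = rng.random() * step               # random start in [0, step)
    # Take every step-th position; the min() guards against rounding
    # pushing the last position past the end of the list.
    return [min(int(start + i * step), population_size - 1)
            for i in range(sample_size)]


# Hypothetical seed so the example is repeatable.
positions = systematic_sample(555, 56, random.Random(2011))
print(len(positions), f"{len(positions) / 555:.1%}")
```

Because the interval is larger than 1, every selected position is distinct, and the sample is spread evenly across the ordered list of collections.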

We found a number of inconsistencies between the cost and burden
information available on the website and that provided in supporting
statement documentation. According to an official at OMB, the two
sources should match, but the supporting statement documentation is
more accurate than that on the website. On the basis of our assessment,
we determined that the information from the website was not sufficiently
reliable for the purpose of describing the annual cost or annual burden to
respondents of the collections in our scope. However, through this review
and the other steps we took, we found that the other information provided
on the Reginfo.gov site was sufficiently reliable for our other intended
purposes of identifying the collections within our scope and obtaining
information on their subject matter and reported actions taken to identify
unnecessary duplication and solicit input from outside persons and
entities.

Because the cost information on the Reginfo.gov website was not
sufficiently reliable, we used cost information from the supporting
statements of the collections in our sample to provide background
information on the costs of the collections in our scope. In addition to
using information from the supporting statements in our initial sample of
56 collections, we drew another systematic random sample of 56
additional collections from the September download. In total, we obtained
cost information from the supporting statements of 112 (approximately 20
percent) of the 555 collections in our scope that were active as of our
September 22, 2011, download.



4
OIRA is the Office of Information and Regulatory Affairs within OMB.








We conducted this performance audit from December 2010 until February
2012 in accordance with generally accepted government auditing
standards. Those standards require that we plan and perform the audit
to obtain sufficient, appropriate evidence to provide a reasonable basis
for our findings and conclusions based on our audit objectives. We
believe that the evidence obtained provides a reasonable basis for our
findings and conclusions based on our audit objectives.




Appendix II: Description of Case-Study Surveys

Consumer Expenditure (CE) Surveys: Quarterly Interview Survey (CEQ) and Diary Survey (CED)

                     Purpose: To collect information on expenditures and households’
                     characteristics

                     Sponsoring agency: Bureau of Labor Statistics (BLS)

                     Annual sample size (estimated): CEQ: 14,725 households; 1 CED: 12,075
                     households 2

                     Annual cost to the federal government (estimated): $41.8 million 3

                     Annual burden hours (estimated): CEQ: 36,033 hours; 4 CED: 33,721
                     hours 5

                     Target population: Nationally-representative sample of the U.S.
                     population 6

                     Uses of data: According to BLS documentation, the most important use of
                     the CE Surveys is to provide expenditure data for updating the Consumer


                     1
                      BLS estimates that 8,825 of the 14,725 households surveyed per quarter will complete
                     the interviews. As a result, over the course of a year, BLS estimates that there will be
                     35,300 completed interviews.
                     2
                      BLS estimates that 7,050 of the 12,075 households that receive the CED will complete
                     the interview and diaries. Because each household completes two weekly diaries, BLS
                     estimates that households will complete 14,100 diaries per year.
                     3
                      This amount reflects the approximate fiscal year 2010 cost of collecting, processing,
                     reviewing, and publishing data collected through the CE Surveys. Survey costs vary
                     somewhat from year to year.
                     4
                      BLS estimates that respondents will take an average of 60 minutes to complete one
                     interview survey of the CEQ. Since the CEQ is administered to the same sample of
                     households four times in a year, the annual burden for respondents who complete all four
                     surveys is roughly 4 hours. In addition, a certain number of respondents who complete the
                     interview surveys are reinterviewed, a process that adds 10 minutes to these selected
                     respondents’ burden times.
                     5
                       BLS estimates that respondents will take approximately 105 minutes to complete one
                     diary survey of the CED. In addition to the diary survey, respondents complete three
                     interviews, each of which takes 25 minutes. Lastly, a certain number of respondents are
                     reinterviewed, a process that adds 10 minutes to these selected respondents’ burden
                     times.
                     6
                       The CE Surveys are limited to the U.S. civilian, noninstitutionalized population, and as a
                     result exclude certain segments of the population, such as active-duty military members
                     living on bases and prisoners.








                      Price Index, the most widely used measure of inflation. 7 In addition,
                      government agencies, private companies, policymakers, and researchers
                      use data from the CE Surveys in a variety of ways. For example, the
                      Department of Defense uses data from the CE Surveys to update cost-of-
                      living adjustments for military families. Congressional committees also
                      use the data to inform decision making, such as the potential effect of
                      increases in the minimum wage.
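
The completed-interview and diary counts in footnotes 1 and 2 for the CE Surveys follow from simple multiplication, which the short sketch below checks. The variable names are illustrative, and the completion shares it prints are derived here from the footnote figures rather than stated in the report.

```python
# Figures from footnotes 1 and 2 of the CE Surveys description.
ceq_sampled_households = 14_725
ceq_completing_households = 8_825      # per quarter
ced_sampled_households = 12_075
ced_completing_households = 7_050

# The CEQ is fielded four times a year; each CED household keeps two diaries.
ceq_completed_interviews = ceq_completing_households * 4
ced_completed_diaries = ced_completing_households * 2

# Derived shares (an inference from the footnotes, not a reported figure).
ceq_completion_share = ceq_completing_households / ceq_sampled_households
ced_completion_share = ced_completing_households / ced_sampled_households

print(ceq_completed_interviews, ced_completed_diaries)
print(f"{ceq_completion_share:.0%}", f"{ced_completion_share:.0%}")
```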


National Health and Nutrition Examination Survey (NHANES)

                      Purpose: To assess the health and nutritional status of adults and
                      children in the United States

                      Sponsoring agency: National Center for Health Statistics (NCHS),
                      Centers for Disease Control and Prevention

                      Annual sample size (estimated): 5,180 individuals 8

                      Annual cost to the federal government (estimated): $37.8 million 9

                      Annual burden hours (estimated): 49,626 hours 10




                      7
                       The Consumer Price Index produces monthly data on changes in the prices paid by
                      urban consumers for a representative basket of goods and services.
                      8
                       Although a larger pool of respondents participates in a screener survey, NCHS estimates
                      that 5,180 respondents participate in the screener, household interview, and physical
                      examination.
                      9
                       NCHS estimates that the annual cost to the federal government of NHANES for fiscal
                      year 2010 was $37.8 million, including both direct and reimbursable funding provided by
                      other agencies for NCHS statistical services. Survey costs vary somewhat from year to
                      year.
                      10
                        NCHS estimates that the total annual burden for the NHANES is 37,626 hours, including
                      screening, household interviews, physical examinations, and any follow-up interviews. In
                      addition, tests of procedures and special studies account for an additional 12,000 hours,
                      for a total annual burden of 49,626 hours. NCHS estimates that respondents who
                      participate in all aspects of the NHANES, including the screener survey, household
                      interview, and physical examination, can expect a burden of 6.7 hours. In addition to those
                      who complete all aspects of the NHANES, some respondents may only participate in the
                      screener survey and be screened out of the sample, while other respondents may
                      participate in the screener survey and the household interview but not the physical
                      examination. NCHS includes all respondents at these varying levels of participation in its
                      calculation of the annual burden hours.








                   Target population: Nationally-representative sample of individuals of all
                   ages 11

                   Uses of data: According to NCHS documentation, a variety of users,
                   including federal agencies, research organizations, universities, health-
                   care providers, and educators, use NHANES data. For example, the Food
                   and Drug Administration uses NHANES data to determine whether
                   changes are needed to federal regulations. In addition, use of NHANES
                   data informs key decision making. For example, according to NCHS
                   documentation, NHANES data on lead levels in blood were instrumental
                   in developing the policy to eliminate lead from gasoline and in food and
                   soft drink cans. As part of its broader data-linkage program, NCHS links
                   NHANES survey data to multiple administrative datasets, such as the
                   National Death Index (a centralized index of state death record
                   information) and Medicare and Medicaid claims from the Centers for
                   Medicare and Medicaid Services. The National Death Index linkages give
                   researchers an opportunity to analyze mortality differences among
                   subgroups defined using the survey information. Similarly, the Medicare
                   and Medicaid claim linkages provide an opportunity to examine health
                   conditions, utilization, and costs among subgroups defined using the
                   survey information. Additionally, according to NCHS officials, NCHS is
                   currently conducting a pilot study to link NHANES data on participants
                   from Texas to administrative data on food assistance.


National Health Interview Survey (NHIS)

                   Purpose: To monitor the health of the U.S. population

                   Sponsoring agency: National Center for Health Statistics (NCHS),
                   Centers for Disease Control and Prevention

                   Annual sample size (estimated): 35,000 households 12




                   11
                      The NHANES is limited to the U.S. civilian, noninstitutionalized population, and as a
                   result excludes certain segments of the population, such as active-duty military members
                   living on bases and prisoners.
                   12
                     NCHS estimates that the annual sample size in 2011 is 35,000 households, and that
                   87,500 individuals will participate in the survey. NCHS plans to increase the sample size
                   in the future.








                     Annual cost to the federal government (estimated): $32.2 million 13

                     Annual burden hours (estimated): 34,977 hours 14

                     Target population: Nationally-representative sample of households,
                     collecting data on all members of each household 15

                     Uses of data: According to NCHS documentation, government agencies,
                     policymakers, researchers, and academics use NHIS data for a variety of
                     purposes, such as identifying health problems and evaluating health
                     programs. For example, policymakers used NHIS data to shape the
                     Centers for Disease Control and Prevention’s cervical-cancer screening
                     policy. In addition, other agencies can use the NHIS as a sample frame
                     for their surveys. Lastly, as part of its broader data-linkage program,
                     NCHS links NHIS survey data to multiple administrative datasets,
                     including those it uses for linkages with NHANES data, such as the
                     National Death Index and Medicaid and Medicare claims.


National Survey of College Graduates (NSCG)

                     Purpose: To provide information on the U.S. stock of scientists and
                     engineers

                     Sponsoring agency: National Center for Science and Engineering
                     Statistics, National Science Foundation

                     Sample size per survey administration (estimated): 100,000 individuals



                     13
                       NCHS estimates that the annual cost to the federal government of NHIS for fiscal year
                     2010 was $32.2 million, including both direct and reimbursable funding provided by other
                     agencies for NCHS statistical services. Survey costs vary somewhat from year to year.
                     14
                       NCHS estimates that the total annual burden of the NHIS was 34,977 hours for 2010
                     and 2011. NCHS estimates that a single respondent who completes all portions of the
                     NHIS for a household can expect a time burden of one hour. Some respondents who
                     complete all portions of the NHIS are asked to take a short reinterview survey, a process
                     that adds 5 minutes to these selected respondents’ burden times. In addition to those who
                     complete all portions of the NHIS, some respondents may only participate in a screener
                     survey and be screened out of the sample, which NCHS estimates takes 5 minutes per
                     respondent. NCHS includes all respondents at these varying levels of participation in its
                     calculation of the annual burden hours.
                     15
                       The NHIS is limited to the U.S. civilian, noninstitutionalized population, and as a result
                     excludes certain segments of the population, such as active-duty military members living
                     on bases and prisoners.








                        Cost to the federal government (estimated): $13.3 million 16

                        Burden hours per administration (estimated): 34,792 hours 17

                        Target population: Individuals in the United States who have a bachelor’s
                        degree in science, engineering, or health, and those who have a degree
                        in another field but work in a science, engineering, or health occupation.

                        Uses of data: According to National Science Foundation documentation,
                        information from the NSCG is used by researchers and policymakers.
                        Government agencies use the data to assess available scientific and
                        engineering resources and inform the development of related policies.
                        Additionally, educational institutions use NSCG data to inform the
                        establishment and modification of curricula, and businesses use the data
                        to develop recruitment and compensation policies.
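
The NSCG burden estimate can be checked against the assumptions given in footnote 17 (83,500 respondents at 25 minutes each). The sketch below is only a worked check with illustrative variable names; the response share it prints is derived here from the 100,000 sample size and is not a figure stated in the report.

```python
# Assumptions stated in the NSCG burden footnote.
expected_respondents = 83_500
minutes_per_response = 25

# Convert total respondent minutes to hours.
burden_hours = expected_respondents * minutes_per_response / 60
print(round(burden_hours))

# Share of the 100,000 sampled individuals expected to respond (derived).
sample_size = 100_000
response_share = expected_respondents / sample_size
print(f"{response_share:.1%}")
```

Rounded to whole hours, the result matches the 34,792 burden hours per administration reported for the survey.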


Survey of Income and Program Participation (SIPP)

                        Purpose: To provide information about status and principal determinants
                        of individuals’ and households’ income and participation in government
                        programs such as Social Security and Medicaid

                        Sponsoring agency: Census Bureau

                        Annual sample size (estimated): 45,000 households 18

                        Annual cost to the federal government (estimated): $50 million 19




                        16
                          Survey costs vary somewhat from year to year. $13.3 million is the expected cost for the
                        survey in 2012.
                        17
                          This estimate is based on the assumption that 83,500 individuals will respond to the
                        survey and each respondent will take 25 minutes to complete it.
                        18
                          The Census Bureau estimates that of the 65,300 households in its sample,
                        approximately 52,900 are occupied at the time of interview and approximately 45,000
                        households are interviewed. It estimates that each interview yields 2.1 individual
                        interviews, for a total of 94,500 individual interviews per survey administration. The
                        Census Bureau administered the SIPP three times to the same households in fiscal year
                        2011, and estimates that each administration generated 94,500 interviews, for a total of
                        283,500 in the fiscal year.
                        19
                          The Census Bureau estimates that the production cost for all parts of the SIPP in fiscal
                        year 2011 is $50.1 million. Survey costs vary somewhat from year to year.








Annual burden hours (estimated): 143,303 hours 20

Target population: Nationally-representative sample of households. 21 All
household members 15 years old or over are interviewed for the survey.

Uses of data: According to the Census Bureau, SIPP data are used by
agencies such as the Department of Health and Human Services and the
Department of Agriculture, as well as economic policymakers, Congress,
and state and local governments, to plan and evaluate government
social-welfare and transfer-payment programs.




20
  The bureau estimates the total burden to respondents in fiscal year 2011 as 143,303
hours, which includes the time it takes respondents to fill out the core and topical module
sections, as well as the reinterview of selected respondents. This estimate is based on the
assumption that most respondents take 30 minutes to complete one administration of the
survey. Since the SIPP was administered to the same sample of households three times
in fiscal year 2011, the annual burden for most respondents was 90 minutes.
21
  The SIPP is limited to the U.S. civilian, noninstitutionalized population, and as a result
excludes certain segments of the population, such as active-duty military members living
on bases and prisoners.
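
The SIPP interview counts in footnote 18 and the per-respondent burden in footnote 20 can be checked with the arithmetic below. The variable names are illustrative. The "baseline" figure assumes every individual interview takes exactly 30 minutes, which is a simplification; the report does not itemize how the remaining hours divide among reinterviews and longer interviews.

```python
# Figures from footnotes 18 and 20 of the SIPP description.
interviewed_households = 45_000
individual_interviews_per_household = 2.1
administrations_in_fy2011 = 3
minutes_per_interview = 30
reported_total_burden_hours = 143_303

interviews_per_administration = round(
    interviewed_households * individual_interviews_per_household
)
interviews_per_year = interviews_per_administration * administrations_in_fy2011

# Annual burden for a respondent interviewed in all three administrations.
minutes_per_respondent = minutes_per_interview * administrations_in_fy2011

# Simplified baseline: every interview takes exactly 30 minutes.
baseline_hours = interviews_per_year * minutes_per_interview / 60
remaining_hours = reported_total_burden_hours - baseline_hours

print(interviews_per_administration, interviews_per_year)
print(minutes_per_respondent, baseline_hours, remaining_hours)
```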




Appendix III: Selected Statutes Related to Information Collection

                  Selected statutes that regulate the collection and dissemination of
                  information include the following.


Governmentwide Statutes

                  •   The Information Quality Act of 2000 requires, among other things, that
                      the Office of Management and Budget (OMB) develop and issue
                      guidelines that provide policy and procedural guidance for federal
                      agencies for ensuring and maximizing the quality of the information
                      they disseminate. These guidelines include steps designed to assure
                      objectivity and utility of disseminated information. See 44 U.S.C. §
                      3504(d)(1); OMB guidelines are at
                      http://www.whitehouse.gov/omb/info_quality_iqg_oct2002/.
                  •   The Privacy Act of 1974, as amended, and the privacy provisions of
                      the E-Government Act of 2002 specify requirements for the protection
                      of personal privacy by federal agencies. The Privacy Act places
                      limitations on agencies’ collection, disclosure, and use of personal
                      information maintained in systems of records. See 5 U.S.C. §§ 552a
                      and 552a note. The E-Government Act requires agencies to conduct
                      privacy impact assessments that analyze how personal information is
                      collected, stored, shared, and managed in a federal system. See 44
                      U.S.C. § 3501 note.
                  •   The Confidential Information Protection and Statistical Efficiency Act
                      (CIPSEA) of 2002 focuses on confidentiality protection and data
                      sharing. It requires that information acquired by an agency under a
                      pledge of confidentiality and for exclusively statistical purposes be
                      used by the agency only for such purposes and not be disclosed in
                      identifiable form for any other use, except with the informed consent
                      of the respondent. It also authorizes identifiable business records to
                      be shared for statistical purposes among the Bureau of Economic
                      Analysis, Bureau of Labor Statistics, and the Census Bureau. See 44
                      U.S.C. § 3501 note.

Agency-Specific Statutes

                  •   Agency-specific statutes also guide federal data collection and use.
                      For example, the Census Bureau conducts the census and census-related
                      surveys such as the American Community Survey under Title
                      13 of the U.S. Code, which gives the Census Bureau the authority to
                      request and collect information from individuals but also guarantees
                      the confidentiality of these data and establishes penalties for
                      unlawfully disclosing this information. Unless specifically authorized,
                      these provisions preclude the Census Bureau from sharing identifiable
                      census information with other agencies. See 13 U.S.C. § 9. Title 15 of
                      the U.S. Code permits the Secretary of Commerce to conduct studies
                      on behalf of other agencies and organizations. Identifiable data from







    surveys conducted under Title 15 authority are subject to the
    sponsoring agency’s legislation and confidentiality requirements. See
    15 U.S.C. § 176a. Statutes and regulations specific to other agencies
    also affect collection and sharing of data.
•   Section 6103 of the Internal Revenue Code provides that federal tax
    information is confidential and may not be disclosed except as
    specifically authorized by law.
•   Section 308(d) of the Public Health Service Act requires that
    identifiable information obtained by the National Center for Health
    Statistics be used only for the purpose for which it was collected
    unless consent is obtained for another purpose, and it prohibits the
    release of identifiable information without consent.
•   Other legislation such as the Family Educational Rights and Privacy
    Act, which protects the privacy of student education records, can
    affect federal data-collection efforts. See 20 U.S.C. § 1232g.




Appendix IV: Printable Interactive Graphic




                                           This table reproduces the information in the interactive figure 3 earlier in
                                           this report.

Table 4: Actions Taken to Address Constraints That Hamper Greater Use of Administrative Data

Constraint
type       Constraint               Description                                     Examples of actions taken
Access     Statutory restrictions   Statutes may not authorize                      •     OMB [the Office of Management and Budget] has issued
           on data sharing          statistical uses for data collected                   guidance encouraging greater sharing of data while
                                    by a program.                                         protecting privacy; and
                                                                                    •     FCSM [the Federal Committee on Statistical
                                                                                          Methodology] produced a document, highlighting
                                                                                          lessons learned when negotiating data-sharing
                                                                                          agreements.
           Consent                  Agencies may not have consent                   •     When requesting consent to link respondents’ survey
                                    from respondents to use their                         data with administrative data, NCHS [the National
                                    administrative data for statistical                   Center for Health Statistics] has moved from asking for
                                    purposes, or agencies may                             respondents’ full Social Security numbers to only asking
                                    interpret the legal requirements                      for part of their Social Security numbers (used to assure
                                    for consent differently.                              link quality), which has improved consent rates; and
                                                                                    •     FCSM is working on a project to clarify legal
                                                                                          requirements for informed consent and to examine the
                                                                                          current practices agencies typically use to obtain
                                                                                          consent.
           Costs and                Agencies may not have the staff,                •     FCSM is completing a template for agencies to use
           infrastructure           policies, procedures, and systems                     when negotiating data-sharing agreements;
                                    in place to share or use                        •     FCSM produced a document highlighting lessons
                                    administrative data for statistical                   learned when negotiating data-sharing agreements; and
                                    purposes.
                                                                                    •     The Census Bureau is using government and
                                                                                          commercial administrative data to simulate the 2010
                                                                                          Census results, as well as comparing the quality of the
                                                                                          Census Bureau’s process for linking data from NCHS
                                                                                          surveys with administrative data to NCHS’s current
                                                                                          record-linkage process.
Quality    Data documentation       Agencies may not uniformly                      •     FCSM is investigating the potential for using a checklist
                                    document information about                            to evaluate the quality of datasets, part of which focuses
                                    administrative datasets.                              on documentation in order to assess potential for
                                                                                          statistical uses.
           Quality of data          The quality of administrative data              •     The Census Bureau is investigating the quality of
                                    varies.                                               administrative data held by private companies;
                                                                                    •     FCSM and the Census Bureau are investigating the
                                                                                          quality of administrative data and the potential for using
                                                                                          a checklist to assess the quality of datasets; and
                                                                                    •     ERS [the Economic Research Service] (in collaboration
                                                                                          with the Census Bureau and NCHS) is undertaking a
                                                                                          pilot project to address data-quality concerns with
                                                                                          state-level administrative data.
                                           Source: GAO analysis of OMB and principal statistical agency data.

                                           Note: Data are from related documentation and interviews with officials at OMB and selected principal
                                           statistical agencies.




Appendix V: Comments from the Department of Commerce




Appendix VI: GAO Contacts and Staff Acknowledgments

GAO Contacts      Ronald S. Fecso, (202) 512-7791 or fecsor@gao.gov
                  Robert Goldenkoff, (202) 512-2757 or goldenkoffr@gao.gov


Staff             In addition to the individuals named above, Tim Bober (Assistant
Acknowledgments   Director), Carl Barden, Russell Burnett, Robert Gebhart, Jill Lacey,
                  Andrea Levine, Jessica Nierenberg, Susan Offutt, Kathleen Padulchick,
                  Tind Shepper Ryen, and Jared Sippel made key contributions to this
                  report.




(450880)
GAO’s Mission         The Government Accountability Office, the audit, evaluation, and
                      investigative arm of Congress, exists to support Congress in meeting its
                      constitutional responsibilities and to help improve the performance and
                      accountability of the federal government for the American people. GAO
                      examines the use of public funds; evaluates federal programs and
                      policies; and provides analyses, recommendations, and other assistance
                      to help Congress make informed oversight, policy, and funding decisions.
                      GAO’s commitment to good government is reflected in its core values of
                      accountability, integrity, and reliability.

Obtaining Copies of   The fastest and easiest way to obtain copies of GAO documents at no
GAO Reports and       cost is through GAO’s website (www.gao.gov). Each weekday afternoon,
Testimony             GAO posts on its website newly released reports, testimony, and
                      correspondence. To have GAO e-mail you a list of newly posted products,
                      go to www.gao.gov and select “E-mail Updates.”

Order by Phone        The price of each GAO publication reflects GAO’s actual cost of
                      production and distribution and depends on the number of pages in the
                      publication and whether the publication is printed in color or black and
                      white. Pricing and ordering information is posted on GAO’s website,
                      http://www.gao.gov/ordering.htm.
                      Place orders by calling (202) 512-6000, toll free (866) 801-7077, or
                      TDD (202) 512-2537.
                      Orders may be paid for using American Express, Discover Card,
                      MasterCard, Visa, check, or money order. Call for additional information.
Connect with GAO      Connect with GAO on Facebook, Flickr, Twitter, and YouTube.
                      Subscribe to our RSS Feeds or E-mail Updates. Listen to our Podcasts.
                      Visit GAO on the web at www.gao.gov.
To Report Fraud,      Contact:
Waste, and Abuse in   Website: www.gao.gov/fraudnet/fraudnet.htm
Federal Programs      E-mail: fraudnet@gao.gov
                      Automated answering system: (800) 424-5454 or (202) 512-7470

Congressional         Katherine Siggerud, Managing Director, siggerudk@gao.gov,
Relations             (202) 512-4400, U.S. Government Accountability Office, 441 G Street NW,
                      Room 7125, Washington, DC 20548

Public Affairs        Chuck Young, Managing Director, youngc1@gao.gov, (202) 512-4800,
                      U.S. Government Accountability Office, 441 G Street NW, Room 7149,
                      Washington, DC 20548



