United States General Accounting Office
GAO Report to Congressional Requesters
May 2003

TITLE I
Characteristics of Tests Will Influence Expenses; Information Sharing May Help States Realize Efficiencies

GAO-03-389

Highlights of GAO-03-389, a report to Congressional Requesters

The No Child Left Behind Act of 2001 (NCLBA) reauthorized the $10 billion Title I program, which seeks to improve the educational achievement of 12.5 million students at risk. In passing the legislation, Congress increased the frequency with which states are to measure student achievement in mathematics and reading and added science as another subject. Congress also authorized funding to support state efforts to develop and implement tests for this purpose.

Congress mandated that GAO study the costs of implementing the required tests. This report describes characteristics of states’ Title I tests, provides estimates of what states may spend to implement the required tests, and identifies factors that explain variation in expenses.

The majority of states administer statewide tests and customize questions to measure student learning against their state standards. These states differ along other characteristics, however, including the types of questions on their tests and how they are scored, the extent to which actual test questions are released to the public following the tests, and the number of new tests they need to develop to comply with the NCLBA.

GAO provides three estimates of total expenditures between fiscal year 2002 and 2008, based on different assumptions about the types of test questions states may choose to implement and how they are scored. The method by which tests are scored largely explains the differences in GAO’s estimates.

If all states use tests with multiple-choice questions, which are machine scored, GAO estimates that the total state expenditures will be about $1.9 billion. If all states use tests with a mixture of multiple-choice questions and a limited number of open-ended questions that require students to write their response, such as an essay, which are hand scored, GAO estimates spending to be about $5.3 billion. GAO estimates that spending will be about $3.9 billion, if states keep the mix of question types states reported to GAO. In general, hand scoring is more expensive and time and labor intensive than machine scoring. Benchmark funding for assessments as specified in NCLBA will cover a larger percentage of estimated expenditures for tests comprised of multiple-choice questions and a smaller percentage of estimated expenditures for tests comprised of a mixture of multiple-choice and open-ended questions. Several states are exploring ways to reduce assessment expenses, but information on their experiences is not broadly shared among states.

Given that significant expenses may be associated with testing, GAO is recommending that Education facilitate the sharing of information on states’ experiences in attempting to reduce expenses. Education agreed with GAO’s recommendation but raised concerns about GAO’s methodology for estimating expenditures.

[Figure: bar chart of estimates, dollars in billions. Multiple-choice: 1.9; current question type: 3.9; multiple-choice and open-ended: 5.3; benchmark appropriations: 2.7. Source: GAO analysis.]

www.gao.gov/cgi-bin/getrpt?GAO-03-389. To view the full report, including the scope and methodology, click on the link above. For more information, contact Marnie S. Shaul at (202) 512-7215 or firstname.lastname@example.org.
Contents

Letter
  Results in Brief
  Background
  States Generally Report Administering Statewide Assessments Developed to Measure Their State Standards, but Differ Along Other Characteristics
  Estimates of Spending Driven Largely by Scoring Expenditures
  Conclusions
  Recommendation
  Agency Comments

Appendix I Objectives, Scope, and Methodology
Appendix II Accountability and Assessment Requirements under the 1994 and 2001 Reauthorizations of Title I
Appendix III Number of Tests States Reported They Need to Develop or Augment to Comply with NCLBA (as of March 2003)
Appendix IV Estimates of Assessment Expenditures NCLBA Required, but Not in Place at the Time of Our Survey, FY 2002-08
Appendix V State Development and Nondevelopment Estimates
Appendix VI Fiscal Years 2002-08 Estimated Expenditures for Each Question Type
Appendix VII Comments from the Department of Education
Appendix VIII GAO Contacts and Staff Acknowledgments
  GAO Contacts
  Staff Acknowledgments

Tables
Table 1: Number of Assessments and Subject Areas Required by the 1994 and 2001 ESEA Reauthorizations
Table 2: Assessment Minimum Amounts under NCLBA
Table 3: The Number of Tests States Reported Needing to Develop or Augment Varies
Table 4: Estimated Expenditures by States for Title I Assessments, Fiscal Years 2002-08
Table 5: Estimated Total Expenditures for Test Development Are Lower Than for Test Administration, Scoring, and Reporting
Table 6: Total Estimated Expenditures by States for Title I Assessments, Fiscal Years 2002-08
Table 7: States Selected for Study
Table 8: Examples of Assessment Expenditures
Table 9: Average Annual Expenditures for the 7 States (adjusted to 2003 dollars)
Table 10: Estimated Expenditures to Implement Title I Assessments in a Given Year
Table 11: Estimates of Expenditures for the Assessments Required by NCLBA That Were Not in Place at the Time of Our Survey, Fiscal Years 2002-08
Table 12: Estimates by State, Development, and Nondevelopment Expenditures
Table 13: Estimated Expenditures for Each Question Type, Fiscal Years 2002-08

Figures
Figure 1: The Majority of States Report They Currently Use Statewide Tests and Plan to Continue to Do So
Figure 2: The Majority of States Reported That They Currently Use and Plan to Develop New Tests That Are Customized to Measure Their State’s Standards
Figure 3: The Majority of States Reported They Use a Combination of Multiple-choice and Open-ended Questions on Their Tests, but Many States Are Uncertain about Question Type on Future Tests
Figure 4: States Split in Decision to Release Test Questions to the Public Following Tests
Figure 5: Estimated Scoring Expenditures Per Assessment Taken for Selected States, Fiscal Year 2002
Figure 6: Various Factors Are Likely to Affect What States Spend on Title I Assessments
Figure 7: Total Expenditures Likely to Be Lower in First Few Years and Benchmark Funding in NCLBA Estimated to Cover Most of Expenditures in First Few Years

Abbreviations
ESEA    Elementary and Secondary Education Act
LEA     local educational agency
NASBE   National Association of State Boards of Education
NCLBA   No Child Left Behind Act

This is a work of the U.S. Government and is not subject to copyright protection in the United States. It may be reproduced and distributed in its entirety without further permission from GAO. It may contain copyrighted graphics, images or other materials. Permission from the copyright holder may be necessary should you wish to reproduce copyrighted materials separately from GAO’s product.

United States General Accounting Office
Washington, DC 20548

May 8, 2003

The Honorable Judd Gregg
Chairman, Committee on Health, Education, Labor, and Pensions
United States Senate

The Honorable Edward M. Kennedy
Ranking Minority Member, Committee on Health, Education, Labor, and Pensions
United States Senate

The Honorable John A. Boehner
Chairman, Committee on Education and the Workforce
House of Representatives

The Honorable George Miller
Ranking Minority Member, Committee on Education and the Workforce
House of Representatives

Title I, the largest source of federal funding for primary and secondary education, provided states $10.3 billion in fiscal year 2002 to improve the educational achievement of 12.5 million students at risk. In passing the No Child Left Behind Act of 2001 (NCLBA), Congress increased funding for Title I and placed additional requirements on states and schools for improving student performance. To provide an additional basis for making judgments about student progress, NCLBA increased the frequency with which states are to assess students in mathematics and reading and added science as another subject. Under NCLBA, states can choose to administer statewide, local, or a combination of state and local assessments, but these assessments must measure states’ content standards for learning. If a state fails to fulfill NCLBA requirements, the Department of Education (Education) can withhold federal funds designated for state administration until the requirements have been fulfilled. To support states in developing and implementing their assessments, Congress authorized specific funding to be allocated to the states between fiscal year 2002 and 2007.

NCLBA requires that states test all students annually in grades 3 through 8 in mathematics and reading or language arts and at least once in one of the high school grades by the 2005-06 school year. It also requires that states test students in science at least once in elementary, middle, and high school by 2007-08. Some states have already developed assessments in many of the required subjects and grades.
In the conference report accompanying passage of the NCLBA, Congress mandated that we do a study of the anticipated aggregate cost to states, between fiscal year 2002 and 2008, for developing and administering the mathematics, reading or language arts, and science assessments required under section 1111(b) of the act. As agreed with your offices, this report (1) describes characteristics of states’ Title I assessments and (2) provides estimates of what states may spend to implement the required assessments between fiscal year 2002 and 2008 and identifies factors that explain variation in expenses.1

1 NCLBA authorizes funding through fiscal year 2007 for assessments. However, consistent with the mandate for this study, we examined expenditures between fiscal years 2002 and 2008, enabling us to more fully capture expenditures associated with the science assessments, which are required to be administered in school year 2007-08.

To determine the characteristics of states’ Title I assessments, we collected information through a survey sent to the 50 states, the District of Columbia, and Puerto Rico; all 52 responded to our survey. We also reviewed published studies detailing the characteristics of states’ assessments. To estimate projected expenditures all states are expected to incur, we reviewed 7 states’ expenditures. All 7 had implemented the 6 assessments required by the 1994 Elementary and Secondary Education Act (ESEA) reauthorization and were testing students in many of the additional subjects and grades required by NCLBA. The 7 states were Colorado, Delaware, Maine, Massachusetts, North Carolina, Texas, and Virginia. To estimate projected expenditure ranges for all states, we used expenditures from these 7 states coupled with key information gathered through a survey completed by each state’s assessment director. We estimated projected state expenditures for test development, administration, scoring, and reporting results for both assessments that states need and assessments that states currently have in place. Our methodology for estimating expenditures was reviewed by several internal and external experts and their suggestions have been incorporated as appropriate. Education officials were also briefed on our methodology and raised no substantial concerns. As agreed with your offices, we did not determine expenditures for alternate assessments for students with disabilities nor expenditures for English language proficiency testing. In addition, we did not determine the expenditures local school districts may incur with respect to these assessments.

To determine what factors account for variation in projected expenditures, we reviewed the 7 states’ expenditures, noting the test characteristics that were associated with specific types and levels of expenditure. We supplemented our examination of state expenditures with interviews of test publishers and contractors and state assessment officials in these states regarding the factors that account for price and expenditure variation. The expenditure data that we received were not audited. Actual expenditures may vary from projected amounts, particularly when events or circumstances are different from those assumed. All estimates are reported in nominal dollars unless otherwise noted. We conducted our work in accordance with generally accepted government auditing standards between April 2002 and March 2003. (See app. I for more details about our scope and methodology.)

Results in Brief

The majority of states share two characteristics: they administer statewide assessments rather than individual local assessments and use customized questions to measure the content taught in the state schools rather than questions from commercially available tests. However, states differ in many other respects.
For example, some states use assessments that consist of multiple-choice questions, while other states include a mixture of multiple-choice questions and a limited number of questions that require students to write their response, such as an essay. Many states that use questions that require students to write their response believe that such questions enable them to more effectively measure certain skills, such as writing. However, others believe that multiple-choice questions also allow them to assess such skills. In addition, some states make actual test questions available to the public after testing but differ with respect to the percentage of test questions they publicly release and, consequently, the number of questions they will need to replace. States also vary in the number of new tests they reported needing to develop to comply with the NCLBA, which ranged from 0 to 17.

We provide three estimates ($1.9, $3.9, and $5.3 billion) of total spending by states between fiscal year 2002 and 2008, with the method by which assessments are scored largely explaining the differences in our estimates. These estimates are based on expenditures associated with new assessments as well as existing assessments. The $1.9 billion estimate is based on the assumption that all states will use multiple-choice questions, which are machine scored. The $3.9 billion estimate is based on the assumption that all states keep the mix of question types (whether multiple-choice or a combination of multiple-choice and open-ended) that they reported to us. The $5.3 billion estimate is based on the assumption that all states will use a combination of multiple-choice questions and questions that require students to write their response, such as an essay, which are hand scored. Several states are exploring ways to reduce assessment expenses. This information could be beneficial to others; however, it is currently not being broadly shared.
Given that significant expenses may be associated with testing, we are recommending that Education facilitate the sharing of information on states’ experiences as they attempt to reduce expenses. Education agreed with our recommendation, but raised concerns about our methodology for estimating expenditures.

Background

Enacted as part of President Johnson’s War on Poverty, the original Title I program was created in 1965, but the 1994 and, most recently, the 2001 reauthorizations of ESEA mandated fundamental changes to Title I. The 1994 ESEA reauthorization required states to develop state standards and assessments to ensure that students served by Title I were held to the same standards of achievement as other students. Some states had already implemented assessments prior to 1994, but they tended to be norm referenced: a student’s performance was compared to the performance of all students nationally. The 1994 ESEA reauthorization required assessments that were criterion referenced: students’ performance was to be judged against the state standards for what children should know and be able to do.2

2 A norm referenced test evaluates an individual’s performance in relation to the performance of a large sample of others, usually selected to represent all students nationally in the same grade or age range. Criterion referenced tests are assessments that measure the mastery of specific skills or subject content and focus on the performance of an individual as measured against a standard or criterion rather than the performance of others taking the test.

In passing the NCLBA, Congress built on the 1994 requirements by, among other things, increasing the number of grades and subject areas in which states were required to assess students, as shown in table 1. NCLBA requires annual testing of students in third through eighth grades, in mathematics and reading or language arts. It also requires mathematics and reading or language arts testing in one of the high school grades (10-12). States must also assess students in science at least once in elementary (3-5), middle (6-9), and high school (10-12). NCLBA gives the states until the 2005-06 school year to administer the additional mathematics and reading or language arts assessments and until the 2007-08 school year to administer the science assessments (see app. II for a summary of Title I assessment requirements).

Table 1: Number of Assessments and Subject Areas Required by the 1994 and 2001 ESEA Reauthorizations

                             Number of required assessments
Subject                      1994 ESEA reauthorization    2001 ESEA reauthorization
Reading or language arts     3                            7
Mathematics                  3                            7
Science                      0                            3
Total                        6                            17

Source: P.L. No. 103-382 (1994) and P.L. No. 107-110 (2001).

Unlike the 1994 ESEA reauthorization, NCLBA does not generally permit Education to allow states additional time to implement these assessments beyond the stated time frames.3 Under the 1994 ESEA reauthorization, Congress allowed states to phase in the 1994 ESEA assessment requirements over time, giving states until the beginning of the 2000-01 school year to fully implement them with the possibility of limited time extensions. In April 2002, we reported that the majority of states were not in compliance with the Title I accountability and assessment provisions required by the 1994 law.4

Every state applying for Title I funds must agree to implement the changes described in the 2001 act, including those related to the additional assessments.
In addition to the regular Title I state grant, NCLBA authorizes additional funding to states for these assessments between fiscal year 2002 and 2007.5 These funds are to be allocated each year to states, with each state receiving $3 million, regardless of its size, plus an amount authorized based on its share of the nation’s school age population. States must use the funds to pay the cost of developing the additional state standards and assessments. If a state has already developed the required standards and assessments, it may use these funds to, among other things, develop challenging state academic content and student academic achievement standards in subject areas other than those required under Title I and to ensure the validity and reliability of state assessments.

3 The Secretary of Education may provide states 1 additional year if the state demonstrates that exceptional or uncontrollable circumstances, such as a natural disaster or a precipitous and unforeseen decline in the financial resources of the state, prevented full implementation of the academic assessments by the deadlines.

4 U.S. General Accounting Office, Title I: Education Needs to Monitor States’ Scoring of Assessments, GAO-02-393 (Washington, D.C.: Apr. 1, 2002).

5 According to Education, there are also other sources of funding in NCLBA that states may draw upon for assessment related expenses.

NCLBA authorized $490 million for fiscal year 2002 for state assessments and such funds as may be necessary through fiscal year 2007. If in any year Congress appropriates less than the amounts shown in table 2, states may defer or suspend testing, but they are still required to develop the assessments. In fiscal year 2002, states received $387 million for assessments.
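The allocation rule described above (a flat $3 million per state plus an amount tied to each state’s share of the school age population) can be sketched in a few lines of Python. This is an illustrative sketch only, not GAO’s or Education’s calculation: the function name, the assumption that everything left after the flat amounts is divided strictly in proportion to population share, the use of 52 grantees (the 50 states, the District of Columbia, and Puerto Rico, as counted in the survey), and the 2 percent sample share are all hypothetical.

```python
FLAT_AMOUNT = 3_000_000  # per-state base amount stated in the report


def assessment_allocation(total_appropriation, num_states, population_share):
    """Hypothetical allocation: flat base plus a proportional share of the remainder.

    Assumes (not confirmed by the report) that whatever remains after the flat
    amounts is divided strictly by each state's share of school age population.
    """
    remainder = total_appropriation - FLAT_AMOUNT * num_states
    if remainder < 0:
        raise ValueError("appropriation is too small to fund the flat amounts")
    return FLAT_AMOUNT + remainder * population_share


# Fiscal year 2002: $387 million, assumed here to be spread over 52 grantees.
# The 2 percent population share is a made-up example value.
example = assessment_allocation(387_000_000, 52, 0.02)
print(f"${example:,.0f}")
```

Under these assumptions, the flat amounts alone account for $156 million of the fiscal year 2002 total, so the population-based portion is what differentiates large states from small ones.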
Table 2: Assessment Minimum Amounts under NCLBA

Fiscal year    Appropriation benchmark
2002           $370,000,000
2003            380,000,000
2004            390,000,000
2005            400,000,000
2006            400,000,000
2007            400,000,000
Total          $2.34 billion

Source: P.L. No. 107-110 (2001).

Other organizations have provided cost estimates of implementing the required assessments. The National Association of State Boards of Education (NASBE) estimated that states would spend between $2.7 billion and $7 billion to implement the required assessments. AccountabilityWorks estimated that states would spend about $2.1 billion.6

6 NASBE and AccountabilityWorks made different assumptions regarding what costs would vary with the number of students tested and which would be invariant costs. For example, NASBE assumed that development costs would vary by the number of students taking the test and AccountabilityWorks did not. Additionally, AccountabilityWorks reports having verified its assumptions with officials from two states, while the authors of the NASBE study do not report having verified their assumptions with state officials.

States can choose to use statewide assessments, local assessments, or both to comply with NCLBA. States can also choose to develop their own test questions or augment commercially available tests with questions so that they measure what students are actually taught in school. However, NCLBA does not permit states to use commercially available tests that have not been augmented.

NCLBA provides Education a varied role with respect to these assessments. Education is responsible for determining whether or not states’ assessments comply with Title I requirements. States submit evidence to Education showing that their systems for assessing students and holding schools accountable meet Title I requirements, and Education contracts with individuals who have expertise in assessments and Title I to review this evidence.
The experts provide Education with a report on the status of each state regarding the degree to which a state’s system for assessing students meets the requirements and, therefore, warrants approval. Under NCLBA, Education can withhold federal funds provided for state administration until Education determines that the state has fulfilled those requirements.7 Education’s role also includes reporting to Congress on states’ progress in developing and implementing academic assessments, and providing states, at the state’s request, with technical assistance in meeting the academic assessment requirements. It also includes disseminating information to states on best practices.

States Generally Report Administering Statewide Assessments Developed to Measure Their State Standards, but Differ Along Other Characteristics

The majority of states report using statewide assessments developed to measure student learning against the content they are taught in the states’ schools, but their assessments differ in many other ways. For example, some states use assessments that include multiple-choice questions, while others include a mixture of multiple-choice questions and questions that require students to write their answer by composing an essay or showing how they calculated a math answer. In addition, some states make actual test questions available to the public but differ with respect to the percentage of test questions they publicly release. Nearly all states provide accommodations for students with disabilities and some states report offering their assessments in languages other than English. States also vary in the number of new tests they will need to develop to comply with the NCLBA.

7 This amount is generally 1 percent of the amount that states receive under Title I or $400,000, whichever is greater.
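The "whichever is greater" rule in footnote 7 is simple arithmetic, and a short Python sketch makes the $400,000 floor concrete. The function name and the sample grant amounts are hypothetical, and because the footnote says the amount is "generally" set this way, the sketch should be read as an approximation of the rule rather than a statement of how Education computes it.

```python
ADMIN_FLOOR = 400_000  # minimum stated in footnote 7, in dollars


def state_administration_amount(title_i_grant):
    """Approximate footnote 7: the greater of 1 percent of the grant or $400,000."""
    one_percent = title_i_grant / 100
    return max(one_percent, ADMIN_FLOOR)


# Hypothetical grant sizes, chosen only to show both branches of the rule.
for grant in (20_000_000, 100_000_000):
    amount = state_administration_amount(grant)
    print(f"Title I grant ${grant:,}: up to ${amount:,.0f} could be withheld")
```

Under this reading, the floor applies to any state whose Title I grant is below $40 million, since 1 percent of a smaller grant falls short of $400,000.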
The Majority of States Use Statewide Tests That They Report Are Written to Their State Standards

Forty-six states currently administer statewide tests to students and 44 plan to continue using statewide tests for future tests NCLBA requires them to add.8 (See fig. 1.) Only 4 states (Idaho, Kansas, Pennsylvania, and Nebraska) currently use a combination of state and local assessments and only Iowa currently uses all local assessments.

8 The District of Columbia and Puerto Rico are included in our state totals.

Figure 1: The Majority of States Report They Currently Use Statewide Tests and Plan to Continue to Do So

[Figure: two pie charts, current and future, with categories statewide, local, combination, and don’t know/missing. Statewide tests account for 90% (46) of states currently and 85% (44) in the future.]

Source: GAO survey.
Note: Percentages do not add to 100 because of rounding.

The majority of states (31) report that all of the tests they currently use consist of questions customized, that is, developed specifically to assess student progress against their state’s standards for learning for every grade and subject tested. (See fig. 2.) Many of the remaining states are using different types of tests for different grades and subjects. For example, some states are using customized tests for some grades and subjects and commercially available tests for other grades and subjects. Seven states reported using only commercially available tests in all the grades and subjects they tested. In the future, the majority of states (33) report that all of their tests will consist of customized questions for every subject and grade. Moreover, those states that currently use commercially available tests report plans to replace these tests with customized tests or augment commercially available tests with additional questions to measure what students are taught in schools, as required by NCLBA.
Figure 2: The Majority of States Reported That They Currently Use and Plan to Develop New Tests That Are Customized to Measure Their State’s Standards

[Figure: two pie charts by type of test, current and future, with categories customized test only, other, and don’t know/missing. Customized tests only account for 60% (31) of states currently and 63% (33) in the future.]

Source: GAO survey.
Note: Percentages do not add to 100 due to rounding. In the current period, “other” includes states that reported using commercially available tests for all grades and subjects tested that had not been augmented with additional questions to measure state standards. These states reported plans to augment these tests with additional questions or replace them with customized tests.

States Vary in Approach to Specific Accommodations

In developing their assessments, nearly all states (50) reported providing specific accommodations for students with disabilities.9 These often include Braille, large print, and audiotape versions of their assessments for visually impaired students, as well as additional time and oral administration. About a quarter of the states (12) report offering these assessments in languages other than English, typically Spanish. Both small and larger states scattered across the United States offer assessments in languages besides English. For example, small states such as Wyoming and Delaware and large states such as Texas and New York offer Spanish language versions of their assessments. New York and Minnesota offer their assessments in as many as four other languages besides English.10 While a quarter of the states currently translate or offer assessments in languages other than English, additional states may provide other accommodations for students with limited English proficiency, such as additional time to take the test, use of bilingual dictionaries, or versions of the test that limit use of idiomatic expressions.
States Are Using Different Types of Questions to Assess Students

Thirty-six states report they currently use a combination of multiple-choice and a limited number of open-ended questions for at least some of the assessments they give their students. (See fig. 3.) For example, in Florida, third grade students’ math skills are assessed using multiple-choice questions, while fifth grade students’ math skills are assessed using a combination of multiple-choice and open-ended questions. Twelve states reported having tests that consist entirely of multiple-choice questions. For example, all of Georgia’s and Virginia’s tests are multiple-choice. Almost half of the states reported that they had not made a decision about the ratio of multiple-choice to open-ended questions on future tests. Of the states that had made a decision, most reported plans to develop assessments using the same types of questions they currently use.

9 Two states reported that they did not provide accommodations for students with disabilities at the state level; however, accommodations may have been provided at the local school level.

10 New York offers its assessments in Spanish, Korean, Haitian Creole, and Russian and Minnesota offers its mathematics assessments in Spanish, Hmong, Somali, and Vietnamese.

Figure 3: The Majority of States Reported They Use a Combination of Multiple-choice and Open-ended Questions on Their Tests, but Many States Are Uncertain about Question Type on Future Tests

[Figure: two pie charts by question type, current and future, with categories mix of multiple-choice and written response, multiple-choice, don’t know, and missing. Currently, 69% (36) of states use a mix and 23% (12) use multiple-choice only; for future tests, 48% (25) don’t know.]

Source: GAO survey.

States choose to use a mixture of question types on their tests for varying reasons.
For example, some officials believe that open-ended questions, requiring both short and long student responses, more effectively measure certain skills such as writing or math computation than multiple-choice questions. Further, they believe that different question types will render a more complete measure of student knowledge and skills. In addition, state laws sometimes require test designers to use more than one type of question. In Maine, for example, state law requires that all state and local assessments employ multiple measures of student performance.

States Split as to Whether They Make Actual Test Questions Available to the Public Following Tests

Slightly over half of the states currently release actual test questions to the public, but differ in the percent of questions they release. (See fig. 4.) Texas, Massachusetts, Maine, and Ohio release their entire tests to the public following the tests, allowing parents and other interested parties to see every question their children were asked. Other states, such as New Jersey and Michigan, release only a portion of their tests. Moreover, even those states that do not release questions to the general public may release a portion of the questions to teachers, as does North Carolina, so that they can better understand areas where students are having the most difficulty, and improve instruction. States that release questions must typically replace them with new questions.

Figure 4: States Split in Decision to Release Test Questions to the Public Following Tests

[Figure: pie chart. Do release: 54% (28); do not release: 44% (23); don’t know/missing: 2% (1).]

Source: GAO survey.

Often, states periodically replenish their tests with new questions to improve test security. For example, states like Florida, Kentucky, Maryland, and South Carolina, which do not release test questions, replenish or replace questions periodically.
In addition to replenishing test items, many states use more than one version for each of their tests and do so for various reasons. For example, Virginia gives a different version of its test to students who may have been absent. Some states use multiple versions of their high school tests to allow those students who do not pass to take the test multiple times. Still other states, such as Massachusetts and Maine, use multiple versions to enable the field testing of future test questions.

States Vary in the Number of Additional Tests They Reported They Need to Develop or Augment

States differ in the number of additional tests they reported they need to meet NCLBA requirements. Some have all of the tests needed, while others will need to develop new tests or augment commercially available tests with additional questions to fulfill the new requirements for a total of 17 tests. (See table 3.) Appendix III has information on the number of tests each state needs to develop or augment to comply with NCLBA. The majority of states (32) report they will need to develop or augment 9 or fewer tests and the rest (20) will need to develop or augment 10 or more tests. Eight states (Alabama, New Mexico, Montana, South Dakota, Idaho, West Virginia, Wisconsin, and the District of Columbia) report that they need to develop or augment all 17 tests. Maryland is also replacing a large number of its tests (15); although its assessments were certified as compliant with the 1994 law, the tests did not provide scores for individual students. Although Education waived the requirement that Maryland’s tests provide student level data, Maryland is in the process of replacing them so that it can provide such data, enabling parents to know how well their children are performing on state tests.
Table 3: The Number of Tests States Reported Needing to Develop or Augment Varies

Range in number of tests states
need to comply with NCLBA          Number of states
None                                5
1-3                                 4
4-6                                 6
7-9                                17
10-12                              10
13 or more                         10
Source: GAO survey.

Most states reported plans to immediately begin developing the tests, which, according to many of the assessment directors we spoke with, typically take 2 to 3 years to develop. For example, most states reported that by 2003 they will have developed or will begin developing the reading and mathematics tests that must be administered by the 2005-06 school year. Similarly, most states reported that by 2005 they will have developed or will begin developing the science tests that must be administered by the 2007-08 school year.

To help them develop these tests, most states report using one or more outside contractors to help manage testing programs. Nearly all states report that developing, administering, scoring, and reporting will be a collaborative effort involving contractors and state and local education agencies. However, while states report that contractors and state education agencies will share the primary role in developing, scoring, and reporting new assessments, local education agencies will have the primary role in administering the assessments.

Estimates of Spending Driven Largely by Scoring Expenditures

We provide three estimates ($1.9, $3.9, and $5.3 billion) of total state spending between fiscal years 2002 and 2008 for test development, administration, scoring, and test reporting. These figures include estimated expenses for assessments states will need to add as well as continuing expenditures associated with assessments they currently have in place. The method of scoring largely explains the differences in the estimates. However, various other factors, such as the extent to which states release assessment questions to the public after testing and therefore need to replace them, also affect expenditures.
Between states, however, the number of students assessed will largely explain variation in expenditures. Moreover, because expenditures for test development are small in relation to test administration, scoring, and reporting (nondevelopment expenditures), we estimate that state expenditures may be lower in the first few years when states are developing their assessments and higher in subsequent years as states begin to administer and score them and report the results.

Different Estimates Primarily Reflect Differences in How Assessments Are Scored

We estimate that states may spend $1.9, $3.9, or $5.3 billion on Title I assessments between fiscal years 2002 and 2008, with scoring expenditures largely accounting for differences in our estimates. Table 4 shows total state expenditures for the 17 tests required by Title I. In appendix IV, we also provide separate estimates for expenses associated with the subset of the 17 assessments that states reported they did not have in place at the time of our survey but are newly required by NCLBA.

Table 4: Estimated Expenditures by States for Title I Assessments, Fiscal Years 2002-08

Question type                    Estimate       Questions and scoring methods used
Multiple-choice                  $1.9 billion   Estimate assumes that all states use machine-scored multiple-choice questions.
Current question type            $3.9 billion   Estimate assumes that states use the mix of question types reported in our survey.
Multiple-choice and open-ended   $5.3 billion   Estimate assumes that all states use both machine-scored multiple-choice questions and some hand-scored open-ended questions.
Source: GAO projections based on state assessment plans and characteristics and expenditure data gathered from 7 states.

The $1.9 billion estimate assumes that all states will use multiple-choice questions on their assessments. Multiple-choice questions can be scored by scanning machines, making them relatively inexpensive to score.
For instance, North Carolina, which uses multiple-choice questions on all of its assessments and machine scores them, spends approximately $0.60 to score each assessment.

The $3.9 billion estimate assumes that states will implement assessments with questions like the ones they currently use or plan to use based on state education agency officials’ responses to our survey. However, 25 states reported that they had not made final decisions about question type for future assessments. Thus, the types of questions states ultimately use may be different from the assessments they currently use or plan to use.

Finally, the $5.3 billion estimate assumes that all states will implement assessments with both multiple-choice and open-ended questions. Answers to open-ended questions, where students write out their responses, are typically read and scored by people rather than by machines, making them much more expensive to score than answers to multiple-choice questions. We found that states using open-ended questions had much higher scoring expenditures per student than states using multiple-choice questions, as evidenced in the states we visited, as shown in figure 5.[11]

[11] In Texas and Colorado, we were unable to separate scoring expenditures from other types of expenditures.

For example, Massachusetts, which uses many open-ended questions on its Title I assessments, spends about $7.00 to score each assessment. Scoring students’ answers to open-ended questions in Massachusetts involves selecting and training people to read and score the answers, assigning other people to supervise the readers, and providing a facility where the scoring can take place. In cases where graduation decisions depend in part on a student’s score on the assessment, the state requires that two or three individuals read and score the student’s answer.
By using more than one reader to score answers, officials ensure consistency between scorers and are able to resolve disagreements about how well the student performed.

Figure 5: Estimated Scoring Expenditures Per Assessment Taken for Selected States, Fiscal Year 2002
[Bar chart. Vertical axis: estimated scoring expenditures per assessment taken, 0 to 8. States shown: Massachusetts, Delaware, Maine, North Carolina, and Virginia. Legend: open-ended and multiple-choice; primarily multiple-choice.]
Source: GAO analysis of expenditure data provided by state education agencies.

We estimate that, for most states, much of the expense associated with assessments will be related to test scoring, administration, and reporting, not test development, which includes such expenses as question development and field testing.[12] (See table 5.) In Colorado, for example, test administration, scoring, and reporting expenditures account for 89 percent of total expenditures, while test development expenditures account for only 11 percent. (See app. V for our estimates of development and nondevelopment expenditures by state.)

[12] This may not be true for smaller states because they may have fewer assessments to administer, score, and report.

Table 5: Estimated Total Expenditures for Test Development Are Lower Than for Test Administration, Scoring, and Reporting
In millions

                                          Multiple-choice   Current question type   Multiple-choice and open-ended
Development                                          $668                    $706                             $724
Administration, scoring, and reporting              1,233                   3,237                            4,590
Total                                              $1,901                  $3,944                           $5,313
Source: GAO projections based on state assessment plans and characteristics and expenditure data gathered from 7 states.

Various Factors Are Likely to Affect Expenditures for Title I Assessments

While the scoring method explains a great deal of the variation in expenditures among states, other factors are likely to affect expenditures. These factors include the number of different test versions used, the extent to which the state releases assessment questions to the public after testing, fees for using copyrighted material, and factors unique to the state. (See fig. 6.) For example, states that use multiple test versions will have higher expenditures than those that have one. Massachusetts used 24 different test versions for many of its assessments and spent approximately $200,000 to develop each assessment. Texas used only 1 version for its assessments and spent approximately $60,000 per assessment. In addition, states that release test items to the public or require rapid reporting of student test scores are likely to have higher expenditures than states that do not. States that release items need to replace them with new ones to protect the integrity of the tests, and states that require rapid reporting need to assign additional staff to score the assessments within the specified time frame. States that customize their assessments may have higher expenditures than states that augment commercially available tests. Moreover, factors unique to the state may affect expenditures. Maine, which had one of the lowest assessment development expenses of all of the states we visited (about $22,000 per assessment), has a contract with a nonprofit testing company.

Between states, the number of students tested generally explains much of the variation in expenditures, particularly when question types are similar. States with large numbers of students tested will generally have higher expenditures than states with fewer students.
Figure 6: Various Factors Are Likely to Affect What States Spend on Title I Assessments
[Figure: likely effect on estimated expenditures, by factor]
• Use of open-ended questions
• Number of students taking assessments
• Customizing assessments to align with state standards
• Extent of public release of questions
• Number of different versions of the assessments
• Faster turnaround time for scoring
• Factors unique to the state
Source: State education agency official interviews.

Benchmark Amounts in NCLBA Will Cover Varying Portions of States’ Estimated Expenditures and Amount Covered Will Vary Primarily by Type of Test Questions States Use

Using the benchmark funding levels specified in NCLBA, we estimate that these amounts would cover varying portions of estimated expenditures. (See table 6.) In general, these benchmark amounts would cover a larger percentage of the estimated expenditures for states that choose to use multiple-choice tests. To illustrate, we estimated that Alabama would spend $30 million if it continued to use primarily multiple-choice questions, but $73 million if the state used assessments with both multiple-choice and open-ended questions. The specified amount would cover 151 percent of Alabama’s estimated expenditures if it chose to use all multiple-choice questions, but 62 percent if the state chose to use both multiple-choice and open-ended questions.
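The coverage percentages discussed here are a simple ratio: benchmark funding divided by estimated expenditures. The sketch below reproduces the national percentages from the Total row of table 6. The function and variable names are illustrative assumptions, not part of GAO's analysis. State rows recomputed this way can differ by a point or two from the published percentages (Alabama's $46 million over $30 million gives 153 rather than 151), presumably because the table's dollar amounts are rounded to the nearest million.

```python
# Totals from table 6, in millions of dollars, fiscal years 2002-08.
BENCHMARK_TOTAL = 2733
ESTIMATES = {
    "multiple-choice": 1901,
    "current question type": 3944,
    "multiple-choice and open-ended": 5313,
}


def coverage_percent(benchmark, estimate):
    """Benchmark funding as a whole-number percent of estimated spending."""
    return round(100 * benchmark / estimate)


# Coverage for each of the three scenarios.
coverage = {
    scenario: coverage_percent(BENCHMARK_TOTAL, estimate)
    for scenario, estimate in ESTIMATES.items()
}

for scenario, percent in coverage.items():
    print(f"{scenario}: {percent} percent")
```

A value above 100 means the benchmark exceeds the estimate, which is the case only for the all multiple-choice scenario.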
Table 6: Total Estimated Expenditures by States for Title I Assessments, Fiscal Years 2002-08

The first three columns are estimates in millions of dollars. Benchmark is the appropriation benchmark in millions of dollars.[a] The last three columns show the appropriation benchmark as a percent of estimated expenses. MC = multiple-choice; Current = current question type; MC+OE = multiple-choice and open-ended.

State                         MC   Current     MC+OE Benchmark      MC % Current %   MC+OE %
Alabama                       30        30        73        46       151       151        62
Alaska                        17        25        28        26       154       106        93
Arizona                       39       108       108        51       132        47        47
Arkansas                      23        42        53        37       158        88        70
California                   178       235       632       219       123        93        35
Colorado                      32        87        87        46       145        53        53
Connecticut                   28        68        68        41       147        59        59
Delaware                      14        24        24        26       183       106       106
District of Columbia          13        13        17        24       184       184       144
Florida                       83       211       281       102       123        48        36
Georgia                       54        54       174        67       124       124        39
Hawaii                        17        31        31        28       162        91        91
Idaho                         18        23        30        30       167       131        98
Illinois                      65       164       211        92       141        56        44
Indiana                       40       113       113        56       140        49        49
Iowa                          24        62        62        38       158        62        62
Kansas                        23        36        51        38       164       106        73
Kentucky                      28        62        71        43       155        70        61
Louisiana                     31        81        81        49       158        60        60
Maine                         18        33        33        29       159        86        86
Maryland                      35        91        91        51       146        56        56
Massachusetts                 38       109       109        55       144        50        50
Michigan                      57       177       177        80       140        45        45
Minnesota                     34        91        91        51       149        56        56
Mississippi                   25        63        63        39       154        61        61
Missouri                      36        99        99        54       150        54        54
Montana                       18        28        29        27       149        97        94
Nebraska                      18        34        34        32       177        93        93
Nevada                        21        26        45        33       152       125        72
New Hampshire                 17        32        32        29       168        92        92
New Jersey                    43       127       127        67       153        53        53
New Mexico                    21        39        41        33       155        84        81
New York                      83       276       276       121       146        44        44
North Carolina                49        49       152        65       132       132        43
North Dakota                  16        23        23        26       162       109       109
Ohio                          55       171       171        86       158        50        50
Oklahoma                      27        37        66        42       156       114        63
Oregon                        28        28        70        40       145       145        57
Pennsylvania                  58       162       181        87       150        54        48
Puerto Rico                   28        28        70        47       167       167        67
Rhode Island                  17        28        28        27       161        98        98
South Carolina                31        82        85        43       139        53        51
South Dakota                  18        18        27        26       145       145        97
Tennessee                     33        33        85        52       158       158        61
Texas                        126       232       441       147       116        63        33
Utah                          24        44        61        37       154        84        60
Vermont                       16        25        25        25       155       102       102
Virginia                      43        60       129        59       136        99        46
Washington                    41       118       118        55       135        47        47
West Virginia                 23        23        43        31       135       135        72
Wisconsin                     29        66        72        53       180        80        73
Wyoming                       15        21        21        25       171       119       119
Total                      1,901     3,944     5,313     2,733       144        69        51
Source: GAO analysis.
[a] Figures in these columns are based largely on benchmark funding levels in NCLBA. If Congress appropriates less than the benchmark amounts, states may defer test administration. For fiscal years 2002 and 2003, however, we used the actual appropriation. In addition, because we were mandated to estimate spending for fiscal year 2008, for purposes of this analysis, we assumed a fiscal year 2008 benchmark of $400 million, the same amount as for fiscal years 2005, 2006, and 2007. It should be noted, however, that Congress has not authorized funding past fiscal year 2007, when Title I would be reauthorized. Benchmarks by state were calculated based on the formula in NCLBA for allocating assessment funds to the states.

Total Expenditures Likely to Be Lower in the First Few Years, Increasing Over Time as States Begin to Administer, Score, and Report Additional Assessments

Estimated expenditures are likely to be lower in the first few years when tests are being developed and increase in later years when greater numbers of tests are administered, scored, and reported. As a result, the benchmark funding amounts in NCLBA would cover a larger percentage of estimated expenditures in the first few years. Under some circumstances, the funding benchmarks in NCLBA exceed estimated state expenditures. For example, as shown in figure 7, the fiscal year 2002 allocation would more than cover all of the estimated expenses if all states were to use multiple-choice questions or continue with the types of questions they currently use.
If all states were to choose to use a mixture of multiple-choice and open-ended questions, the most expensive option, fiscal year 2002 funding would cover 84 percent of states’ total expenditures. We estimate a similar pattern for fiscal year 2003. (See app. VI for fiscal year 2002 through 2008 estimated expenditures for each question type.) In fiscal years 2007 and 2008, benchmark funding would continue to cover all of the estimated expenditures if all states were to use all multiple-choice questions, about two-thirds of estimated expenditures if all states continued using their current mix of questions, and a little over 50 percent of estimated expenditures if all states were to use a mixture of question types, the most expensive option.

Figure 7: Total Expenditures Likely to Be Lower in First Few Years and Benchmark Funding in NCLBA Estimated to Cover Most of Expenditures in First Few Years
[Chart. Vertical axis: dollars in millions, 0 to 1,000. Horizontal axis: fiscal year, 2002 to 2008. Series: benchmark appropriations; multiple-choice; current question type; multiple-choice and open-ended.]
Source: GAO analysis.

Opportunities May Exist to Share Information on Efforts to Reduce Testing Expenditures

Some states are exploring ways to control expenses related to assessments, and their experiences may provide useful information to other states about the value of various methods for controlling expenditures. Recently, several states, in conjunction with testing industry representatives, met to discuss ways of reducing test expenditures. The group discussed a range of possible options, including computer-administered tests; commercially available tests that can be customized to states’ standards by adding questions; computerized scoring of written responses; and computer scanning of students’ written responses.
Information about individual states’ experiences as they attempt to reduce expenses could benefit other states. However, such information is currently not systematically shared.

Conclusions

The 1994 and 2001 ESEA reauthorizations raised student assessments to a new level of importance. These assessments are intended to help ensure that all students are meeting state standards. Congress has authorized funding to assist states in developing and implementing these assessments. We estimate that federal funding benchmarks in NCLBA will cover a larger percentage of expenses in the first few years when states are developing their assessments, with the covered percentage decreasing as states begin to administer, score, and report the full complement of assessments. Moreover, the choices states make about how they will assess students will influence expenditures. Some states are investigating ways to reduce the expenses, but currently information on states’ experiences in attempting to reduce expenses is not broadly shared. We believe states could benefit from information sharing.

Recommendation

Given the large federal investment in testing and the potential for reducing test expenditures, we recommend that Education use its existing mechanisms to facilitate the sharing of information on states’ experiences as they attempt to reduce expenses.

Agency Comments

The Department of Education provided written comments on a draft of this report, which we have summarized below and incorporated in the report as appropriate. (See app. VII for agency comments.) Education agreed with our recommendation, stating that it looks forward to continuing and enhancing its efforts to facilitate information sharing that might help states contain expenses. However, Education raised concerns about our methodology, noted the availability of additional federal resources under ESEA that might support states’ assessment efforts, and pointed out that not all state assessment costs are generated by NCLBA.
With regard to our estimates, we have confidence that our methodology is reasonable and provides results that fairly represent potential expenditures based on the best available information. Education’s comments focus on the uncertainties that are inherent in estimation of any kind (the necessity of assumptions, the possibility of events or trends not readily predicted, and other potential sources of error that are acknowledged in the report) without proposing an alternative methodology. Because of the uncertainty, we produced three estimates instead of one. In developing our approach, we solicited comments from experts in the area and incorporated their suggestions as appropriate. We also discussed our estimation procedures with Education staff, who raised no significant concerns.

Second, Education cites various other sources of funds that states might use to finance assessments. While other sources may be available, we focused primarily on the amounts specifically authorized for assessments in order to facilitate their comparison to estimated expenses and because they are the minimum amounts that Congress must appropriate to ensure that states continue to develop as well as implement the required assessments.

We are sending copies of this report to the Secretary of Education, relevant congressional committees, and other interested parties. Please contact me at (202) 512-7215 or Betty Ward-Zukerman at (202) 512-2732 if you or your staff have any questions about this report. In addition, the report will be available at no charge on GAO’s Web site at http://www.gao.gov. Other GAO contacts and staff acknowledgments are listed in appendix VIII.

Marnie S. Shaul, Director
Education, Workforce and Income Security Issues

Appendix I: Objectives, Scope, and Methodology

The objectives of this study were to provide information on the basic characteristics of Title I assessments, to estimate what states would likely spend on Title I assessments between fiscal years 2002 and 2008, and to identify factors that explain variation in estimated expenditures. To address the first objective, we collected information from a survey sent to the 50 states, the District of Columbia, and Puerto Rico, and reviewed documentation from state education agencies and from published studies detailing the characteristics of states’ assessments. To address the second objective, we collected detailed assessment expenditure information from 7 states, interviewed officials at state education agencies, discussed cost factors with assessment contractors, and estimated assessment expenditures under three different scenarios. The methods we used to address the objectives were reviewed by several external reviewers, and we incorporated their comments as appropriate. This appendix discusses the scope of the study, the survey, and the methods we used to estimate assessment expenditures.

Providing Information on the Basic Characteristics of Title I Assessments

We surveyed all 50 states, the District of Columbia, and Puerto Rico, all of which responded to our survey. We asked them to provide information about their Title I assessments, including the characteristics of current and planned assessments, the number and types of new tests they needed to develop to satisfy No Child Left Behind Act (NCLBA) requirements, when they planned to begin developing the new assessments, the types of questions on their assessments, and their use of contractors.
We also reviewed documentation from several states about their assessment programs and published studies detailing the characteristics of states’ assessments.

Estimating Assessment Expenditures and Explaining Variation in the Estimates

This study estimates likely expenditures on Title I assessments by states between fiscal years 2002 and 2008, and identifies factors that may explain variation in the estimates. It does not estimate expenditures for alternate assessments for students with disabilities, for English language proficiency testing, or expenditures incurred by school districts.[1] Instead, we estimated expenses states are expected to incur based on expenditure data obtained for this purpose from 7 states combined with data on these and other states’ assessment plans and characteristics obtained through a survey.[2] In the 7 states, we requested information and documentation on expenditures in a standard set of areas, met with state officials to discuss the information, and asked that they review our subsequent analysis of information regarding their state. The expenditure data that we received from the 7 states were not audited. Moreover, actual expenditures may vary from projected amounts, particularly when events or circumstances are different from those assumed, such as changes in the competitiveness of the market for student assessment or changes in assessment technology.

[1] The study also does not estimate the opportunity costs of assessments.

Selection of 7 States

We selected 7 states that had assessments in place in many of the grades and subjects required by the NCLBA from the 17 states with assessment systems that had been certified by Education as in compliance with requirements of the Improving America’s Schools Act of 1994 when we began our work. We included states with varying student enrollments, including 2 states with relatively small numbers of students.
The states we selected were Colorado, Delaware, Maine, Massachusetts, North Carolina, Texas, and Virginia. (See table 7 for information about the selected states.)

Table 7: States Selected for Study

                                                         Number of assessments
                  Date approved    Number of     Reading      Math         Science
State             by Education     students      (out of 7)   (out of 7)   (out of 3)   Total
Colorado          July 2001          724,508     7            5            1            13
Delaware          December 2000      114,676     7            7            3            17
Maine             February 2002      207,037     3            3            3             9
Massachusetts     January 2001       975,150     5            4            3            12
North Carolina    June 2001        1,293,638     7            7            0            14
Texas             March 2001       4,059,619     7            7            2            16
Virginia          January 2001     1,144,915     4            4            3            11
Source: U.S. Department of Education, National Center for Education Statistics, and state education agencies.

[2] Because our expenditure data were limited to 7 states, our estimates may be biased. For example, if the 7 states we selected had higher average development expenditures per ongoing assessment than the average state, then our estimate of development expenditures would be biased upwards.

Collection of Expenditure Information from 7 States

We collected detailed assessment expenditure information from officials in the 7 states. We obtained actual expenditures on contracts and state assessment office budget expenditures for fiscal year 2002 for all 7 states and for previous years in 4 states.[3] In site visits to the 7 states, we interviewed state education agency officials who explained various elements of their contracts with assessment publishing firms and the budget for the state’s assessment office. To the extent possible, we collected expenditure data, distinguishing expenditures for assessment development from expenditures for assessment administration, scoring, and reporting, because expenditures vary differently between these two expenditure categories.
Assessment development expenditures vary with the number of assessments while administration, scoring, and reporting expenditures vary with the number of students taking the assessments. (See table 8 for examples of expenditures.)

Table 8: Examples of Assessment Expenditures

Type of expenditure   Example of expenditure
Development           Question writing
                      Question review (e.g., for bias)
Administration        Printing and delivering assessment booklets
Scoring               Scanning completed booklets into scoring machines
Reporting             Producing individual score reports
Source: State education agencies.

Calculation of Averages for Development and for Administration, Scoring, and Reporting

Using annual assessment expenditures for all 7 states, the number of assessments developed and implemented, and the number of students who took the assessments, we calculated average expenditures for ongoing development (assessments past their second year of development) and average expenditures for administration, scoring, and reporting for each state. (See table 9.)

[3] We were unable to obtain information on personnel expenditures from 5 of the 7 states, and so we did not include personnel expenditures in our analysis. In the 2 states in which we obtained personnel expenditures, such expenditures were a relatively small part of the assessment budget.

Table 9: Average Annual Expenditures for the 7 States (adjusted to 2003 dollars)

                  Average development    Average expenditures for
                  expenditures (per      administration, scoring, and
State             ongoing assessment)    reporting (per assessment taken)   Question types used
Colorado          $72,889                $10.35                             Both multiple-choice and open-ended
Delaware          $66,592                $8.78                              Both multiple-choice and open-ended
Maine             $22,295                $9.96                              Both multiple-choice and open-ended
Massachusetts     $190,870               $12.45                             Both multiple-choice and open-ended
North Carolina    $104,181               $1.85                              Multiple-choice
Texas             $61,453                $4.72                              Multiple-choice
Virginia          $78,489                $1.80                              Multiple-choice
Source: GAO analysis of state education agency information.
Note: We were able to obtain data for more than 1 year for Colorado, Delaware, Maine, Massachusetts, and Texas. For these states, we adjusted their average expenditures to 2003 dollars and then averaged these adjusted expenditures across the years that data were collected. North Carolina did not distinguish Title I assessments from other assessments it offers.

Estimating States’ Likely Expenditures for 17 Title I Assessments

We provide three estimates of what all states are likely to spend on all of the required 17 assessments using the average development expenditure and average expenditures for administration, scoring, and reporting by question type (multiple-choice or multiple-choice with some open-ended questions). One estimate assumes that all states use only multiple-choice questions, the second assumes that states will use the types of questions state officials reported they use or planned to use, and the third assumes that all states will use both multiple-choice and a limited number of long and short open-ended questions. All estimates reflect states’ timing of their assessments (for example, that science assessments are generally planned to be developed and administered later than assessments for reading and mathematics).

To estimate what states would spend under the assumption that they use only multiple-choice questions, we took the mean of the average annual expenditures per assessment for North Carolina, Texas, and Virginia, states that use multiple-choice assessments. To compute an estimate that reflected the types of questions states used or planned to use, we used the appropriate averages. To illustrate, California reported 15 multiple-choice tests and 2 tests that include a combination of multiple-choice and open-ended questions. For the 15 multiple-choice tests, we used the mean from the multiple-choice states (North Carolina, Texas, and Virginia).
For the 2 multiple-choice and open-ended tests, we used the mean from the states that had both question types (Colorado, Delaware, Maine, and Massachusetts). To estimate what states would spend, assuming that all states use both multiple-choice and open-ended questions, we used the mean of the average annual expenditures for Colorado, Delaware, Maine, and Massachusetts, states that use both types of questions.

Estimating Development Expenditures

To estimate development expenditures, we obtained information from each state regarding the number of assessments it needed to develop, the year in which it planned to begin development of each new assessment, and the number of assessments it already had. For each assessment the state indicated it needed to develop, we estimated initial development expenditures beginning in the year the state said it would begin development and also for the following year because interviews with officials revealed that developing an entirely new assessment takes approximately 2 to 3 years. For the 7 states that provided data, we were typically not able to separate expenditures for new test development from expenditures for ongoing test development. Where such data were available, we determined that development expenses for new assessments were approximately three times those for ongoing assessments, and we used that approximation in our estimates. For each state each year, we multiplied the number of tests in initial development by three times the average ongoing development expenditure to reflect that initial development of assessments is more expensive than ongoing development.[4] We multiplied the number of ongoing tests by the average ongoing development expenditure. The sum of these two products provides a development expenditure for each state in each year, and summing over states and years provides a total development estimate.
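The development calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the group means are simple averages of the per-state development figures in table 9; the function names and the example state are hypothetical and not taken from GAO's analysis. A test counts as being in initial development in the year development begins and in the following year.

```python
# Average development expenditure per ongoing assessment, from table 9
# (2003 dollars).
DEVELOPMENT_COST = {
    "Colorado": 72889,
    "Delaware": 66592,
    "Maine": 22295,
    "Massachusetts": 190870,
    "North Carolina": 104181,
    "Texas": 61453,
    "Virginia": 78489,
}
MULTIPLE_CHOICE_STATES = ("North Carolina", "Texas", "Virginia")
COMBINATION_STATES = ("Colorado", "Delaware", "Maine", "Massachusetts")

# The report approximates a new assessment as three times an ongoing one.
NEW_TEST_MULTIPLIER = 3


def group_mean(states):
    """Mean of the per-assessment development averages for a group of states."""
    return sum(DEVELOPMENT_COST[s] for s in states) / len(states)


def development_estimate(new_tests, ongoing_tests, avg_ongoing_cost):
    """One state's development expenditure for one year."""
    new_cost = new_tests * NEW_TEST_MULTIPLIER * avg_ongoing_cost
    ongoing_cost = ongoing_tests * avg_ongoing_cost
    return new_cost + ongoing_cost


# Hypothetical state (not from the report): 4 tests in initial development
# and 13 ongoing tests, costed with the multiple-choice group mean.
example = development_estimate(4, 13, group_mean(MULTIPLE_CHOICE_STATES))
print(f"Multiple-choice mean: ${group_mean(MULTIPLE_CHOICE_STATES):,.0f}")
print(f"Combination mean: ${group_mean(COMBINATION_STATES):,.0f}")
print(f"Example state, one year: ${example:,.0f}")
```

Swapping in the combination group mean gives the same state's figure under the higher scenario; a state with a mix of test types would be costed test by test with the matching mean.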
We calculated three estimates as follows:
• using the expenditure information from states that use multiple-choice questions, we produced a lower estimate;
• using the information from the state survey on the types of tests they planned to develop (some indicated both open-ended/multiple-choice tests and some multiple-choice), we produced a middle estimate;[5] and
• using the expenditure information from the states that use open-ended and multiple-choice questions, we produced the higher estimate.

[4] We found estimates were not sensitive to changes in assumptions regarding development costs, partly because they proved to be a generally small portion of overall expenses.
[5] For states that reported that they did not know the kinds of questions they would use on future tests, we assumed that future tests would be the same as the ones they currently use. Where data were missing, we assumed that states would use assessments with both multiple-choice and open-ended questions, potentially biasing our estimates upward.

Estimating Administration, Scoring, and Reporting Expenditures

To produce an estimate for administration, scoring, and reporting, we used three variables: the average number of students in a grade; the number of administered assessments; and the average administration, scoring, and reporting expenditure per assessment taken. We calculated the average number of students in a grade in each year using data from the National Center for Education Statistics’ Common Core of Data for 2000-01 and their Projection of Education Statistics to 2011. We obtained data on the number of administered assessments from our state education agency survey. Data on average expenditures come from the states in which we collected detailed expenditure information.
For each state in each year, we multiplied the average number of students in a grade by the number of administered assessments and by the appropriate average assessment expenditure. Summing over states and years provided a total estimate for administration, scoring, and reporting. As above, we performed these calculations using the expenditure information from multiple-choice states to produce the lower estimate, using the information from the state survey and expenditure information from both combination and multiple-choice states to produce a middle estimate, and using the expenditure information from the combination states to produce the higher estimate.

Using the same basic methodology, we also estimated what states are likely to spend on the assessments that NCLBA requires but that states did not have in place at the time of our survey. Table 10 provides an overview of our approach to estimating states’ likely expenditures on Title I assessments.

Table 10: Estimated Expenditures to Implement Title I Assessments in a Given Year

A. Total estimated development expenditure for ongoing assessments = Number of ongoing assessments × Average development expenditure for each ongoing assessment
B. Total estimated development expenditure for new assessments = Number of new assessments × Three times the average development expenditure for each ongoing assessment
C. Total estimated expenditures for administration, scoring, and reporting (ongoing and new assessments) = Average number of students in each grade × Average administration, scoring, and reporting expenditure for each assessment taken, times the number of assessments administered, for each ongoing and new assessment
A + B + C = States’ estimated expenditures to implement Title I assessments

Source: GAO analysis.

We conducted our work in accordance with generally accepted government auditing standards between April 2002 and March 2003.
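The three components in table 10 can be combined in a short Python sketch. This is a minimal illustration under stated assumptions, not GAO's actual model: all function names, parameter names, and input values are hypothetical, and the sketch covers a single state in a single year.

```python
def administration_expenditure(avg_students_per_grade, tests_administered,
                               avg_cost_per_test_taken):
    """Component C: administration, scoring, and reporting."""
    return avg_students_per_grade * tests_administered * avg_cost_per_test_taken


def total_expenditure(ongoing_tests, new_tests, avg_ongoing_dev_cost,
                      avg_students_per_grade, tests_administered,
                      avg_cost_per_test_taken):
    """A + B + C for one state in one year, following table 10."""
    a = ongoing_tests * avg_ongoing_dev_cost          # ongoing development
    b = new_tests * 3 * avg_ongoing_dev_cost          # new development (3x)
    c = administration_expenditure(avg_students_per_grade,
                                   tests_administered,
                                   avg_cost_per_test_taken)
    return a + b + c


# Hypothetical inputs: 5 ongoing and 2 new tests, $100,000 average ongoing
# development expenditure, 50,000 students per grade, 7 tests administered,
# $10 per assessment taken.
print(total_expenditure(5, 2, 100_000, 50_000, 7, 10))
```

Swapping in a different average cost per assessment taken (for example, one drawn from multiple-choice states versus combination states) is how the lower, middle, and higher estimates would differ under this sketch.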
Appendix II: Accountability and Assessment Requirements under the 1994 and 2001 Reauthorizations of Title I

Developing standards for content and performance

Requirements for 1994: Develop challenging standards for what students should know in mathematics and reading or language arts. In addition, for each of these standards, states should develop performance standards representing three levels: partially proficient, proficient, and advanced. The standards must be the same for all children. If the state does not have standards for all children, it must develop standards for Title I children that incorporate the same skills, knowledge, and performance expected of other children.
Requirements for 2001: In addition, develop standards for science content by 2005-06. The same standards must be used for all children.

Implementing and administering assessments

Requirements for 1994: Develop and implement assessments aligned with the content and performance standards in at least mathematics and reading or language arts.
Requirements for 2001: Add assessments aligned with the content and performance standards in science by the 2007-08 school year. These science assessments must be administered at some time in each of the following grade ranges: grades 3 through 5, 6 through 9, and 10 through 12.

Requirements for 1994: Use the same assessment system to measure Title I students as the state uses to measure the performance of all other students. In the absence of a state system, a system that meets Title I requirements must be developed for use in all Title I schools.
Requirements for 2001: Use the same assessment system to measure Title I students as the state uses to measure the performance of all other students. If the state provides evidence to the Secretary that it lacks authority to adopt a statewide system, it may meet the Title I requirement by adopting an assessment system on a statewide basis and limiting its applicability to Title I students or by ensuring that the Title I local educational agency (LEA) adopts standards and aligned assessments.

Requirements for 1994: Include in the assessment system multiple measures of student performance, including measures that assess higher order thinking skills and understanding.
Requirements for 2001: Unchanged.

Requirements for 1994: Administer assessments for mathematics and reading in each of the following grade spans: grades 3 through 5, 6 through 9, and 10 through 12.
Requirements for 2001: Administer reading and mathematics tests annually in grades 3 through 8, starting in the 2005-06 school year (in addition to the assessments previously required sometime within grades 10 through 12). States do not have to administer mathematics and reading or language arts tests annually in grades 3 through 8 if Congress does not provide specified amounts of funds to do so, but states have to continue to work on the development of the standards and assessments for those grades. Have students in grades 4 and 8 take the National Assessment of Educational Progress examinations in reading and mathematics every other year beginning in 2002-03, as long as the federal government pays for it.

Requirements for 1994: Assess students with either or both criterion referenced assessments and assessments that yield national norms.
Requirements for 2001: Unchanged. However, if the state uses only assessments referenced against national norms at a particular grade, those assessments must be augmented with additional items as necessary to accurately measure the depth and breadth of the state’s academic content standards.

Requirements for 1994: Assess students with statewide, local, or a combination of state and local assessments.
Requirements for 2001: Unchanged. However, states that use all local or a combination of state and local assessments must ensure, among other things, that such assessments are aligned with the state’s academic content standards, are equivalent to one another, and enable aggregation to determine whether the state has made adequate yearly progress.

Requirements for 1994: Implement controls to ensure the quality of the data collected from the assessments.
Requirements for 2001: Unchanged.

Including students with limited English proficiency and with disabilities in assessments

Requirements for 1994: Assess students with disabilities and limited English proficiency according to standards for all other students. Provide reasonable adaptations and accommodations for students with disabilities or limited English proficiency to include testing in the language and form most likely to yield accurate and reliable information on what they know and can do.
Requirements for 2001: By 2002-03 annually assess the language proficiency of students with limited English proficiency. Students who have attended a U.S. school for 3 consecutive years must be tested in English unless an individual assessment by the district shows testing in a native language will be more reliable.

Reporting data

Requirements for 1994: Report assessment results according to the following: by state, local educational agency (LEA), school, gender, major racial and ethnic groups, English proficiency, migrant status, disability, and economic disadvantage.
Requirements for 2001: Unchanged.

Requirements for 1994: LEAs must produce for each Title I school a performance profile with disaggregated results and must publicize and disseminate these to teachers, parents, students, and the community. LEAs must also provide individual student reports, including test scores and other information on the attainment of student performance standards.
Requirements for 2001: Provide annual information on the test performance of individual students and other indicators included in the state accountability system by 2002-03. Make this annual information available to parents and the public and include data on teacher qualifications. Compare high- and low-poverty schools with respect to the percentage of classes taught by teachers who are “highly qualified,” as defined in the law, and conduct similar analyses for subgroups listed in previous law.

Measuring improvement

Requirements for 1994: Use performance standards to establish a benchmark for improvement referred to as “adequate yearly progress.” All LEAs and schools must meet the state’s adequate yearly progress standard, for example, having 90 percent of their students performing at the proficient level in mathematics. LEAs and schools must show continuous progress toward meeting the adequate yearly progress standard. The state defines the level of progress a school or LEA must show. Schools that do not make the required advancement toward the adequate yearly progress standard can face consequences, such as the replacement of the existing staff.
Requirements for 2001: In addition to showing gains in the academic achievement of the overall school population, schools and districts must show that the following subcategories of students have made gains in their academic achievement: pupils who are economically disadvantaged, have limited English proficiency, are disabled, or belong to a major racial or ethnic group. To demonstrate gains among these subcategories of students, school districts measure their progress against the state’s definition of adequate yearly progress. States have 12 years for all students to perform at the proficient level.

Consequences for not meeting the adequate yearly progress standard

Requirements for 1994: LEAs are required to identify for improvement any schools that fail to make adequate yearly progress for 2 consecutive years and provide technical assistance to help failing schools develop and implement required improvement plans. After a school has failed to meet the adequate yearly progress standard for 3 consecutive years, LEAs must take corrective action to improve the school.
Requirements for 2001: New requirements are more specific as to what actions an LEA must take to improve failing schools. Actions are defined for each year the school continues to fail leading up to the 5th year of failure, when a school may be restructured by changing to a charter school, replacing school staff, or state takeover of the school administration. The new law also provides that LEAs offer options to children in failing schools. Depending on the number of years a school has been designated for improvement, these options may include going to another public school with transportation paid by the LEA or using Title I funds to pay for supplemental help.

Source: Pub. L. No. 103-382 (1994) and Pub. L. No. 107-110 (2001).

Appendix III: Number of Tests States Reported They Need to Develop or Augment to Comply with NCLBA (as of March 2003)

State                 Number of tests needed
Alabama               17
Alaska                9
Arizona               9
Arkansas              9
California            5
Colorado              4
Connecticut           8
Delaware              0
District of Columbia  17
Florida               0
Georgia               0
Hawaii                9
Idaho                 17
Illinois              6
Indiana               9
Iowa                  0
Kansas                11
Kentucky              8
Louisiana             8
Maine                 8
Maryland              15
Massachusetts         6
Michigan              8
Minnesota             11
Mississippi           3
Missouri              8
Montana               17
Nebraska              11
Nevada                11
New Hampshire         9
New Jersey            10
New Mexico            17
New York              8
North Carolina        3
North Dakota          11
Ohio                  8
Oklahoma              8
Oregon                6
Pennsylvania          11
Puerto Rico           10
Rhode Island          11
South Carolina        3
South Dakota          17
Tennessee             15
Texas                 1
Utah                  0
Vermont               9
Virginia              6
Washington            8
West Virginia         17
Wisconsin             17
Wyoming               11

Source: GAO survey.
Appendix IV: Estimates of Assessment Expenditures NCLBA Required, but Not in Place at the Time of Our Survey, FY 2002-08

Table 11 provides estimates of assessment expenditures states may incur for grades and subjects they reported they would need to add to meet the additional assessment requirements under NCLBA. These estimates do not include any expenditures for continuing development or administration of assessments in grades and subjects already included in states’ reported assessment programs, unless states indicated plans to replace their existing assessments. Estimates reflect total expenditures between fiscal years 2002 and 2008 and are based on the assumptions we made regarding question types.

Table 11: Estimates of Expenditures for the Assessments Required by NCLBA That Were Not in Place at the Time of Our Survey, Fiscal Years 2002-08

Dollars in billions

Question type                   Estimate  Questions and scoring methods used
Multiple-choice                 $0.8      Estimate assumes that all states use machine-scored multiple-choice questions.
Current question type           $1.6      Estimate assumes that states use the mix of question types they reported in our survey.
Multiple-choice and open-ended  $2.0      Estimate assumes that all states use both machine-scored multiple-choice questions and some hand-scored open-ended questions.

Source: GAO.
Note: Projections based on state assessment plans and characteristics and expenditure data gathered from 7 states.

Appendix V: State Development and Nondevelopment Estimates

Table 12 provides test development and nondevelopment expenditures by state for fiscal years 2002-08. Test development estimates reflect expenditures associated with both new and existing tests.
Nondevelopment expenditures reflect expenditures associated with administration, scoring, and reporting of results for both new and existing assessments.

Table 12: Estimates by State, Development, and Nondevelopment Expenditures

Dollars in millions

                      Multiple-choice and open-ended  Current question type           Multiple-choice
State                 Development     Nondevelopment  Development     Nondevelopment  Development     Nondevelopment
Alabama               $16             $57             $15             $15             $15             $15
Alaska                15              14              15              10              13              4
Arizona               15              93              15              93              14              25
Arkansas              14              39              14              28              13              10
California            13              619             12              223             12              166
Colorado              13              74              13              74              12              20
Connecticut           14              54              14              54              13              15
Delaware              12              13              12              13              11              3
District of Columbia  13              4               12              1               12              1
Florida               12              269             12              200             11              72
Georgia               12              162             11              44              11              44
Hawaii                14              17              14              17              13              5
Idaho                 15              16              14              9               14              4
Illinois              13              198             13              151             12              53
Indiana               14              99              14              99              13              27
Iowa                  12              50              12              50              11              14
Kansas                14              37              13              23              13              10
Kentucky              14              58              14              48              13              16
Louisiana             14              67              14              67              13              18
Maine                 14              19              14              19              13              5
Maryland              16              75              16              75              15              20
Massachusetts         13              96              13              96              12              26
Michigan              14              163             14              163             13              44
Minnesota             15              76              15              76              14              20
Mississippi           13              51              13              51              12              14
Missouri              14              85              14              85              13              23
Montana               16              13              16              12              15              3
Nebraska              13              21              13              21              12              6
Nevada                14              31              13              13              13              8
New Hampshire         13              18              13              18              12              5
New Jersey            14              113             14              113             13              30
New Mexico            16              25              16              24              15              7
New York              14              262             14              262             13              70
North Carolina        13              139             12              37              12              37
North Dakota          14              9               14              9               13              2
Ohio                  13              158             13              158             12              42
Oklahoma              14              53              13              24              13              14
Oregon                14              57              13              15              13              15
Pennsylvania          15              166             15              147             14              45
Puerto Rico           14              56              13              15              13              15
Rhode Island          14              13              14              13              13              4
South Carolina        13              73              13              70              12              19
South Dakota          17              10              15              3               15              3
Tennessee             15              70              14              19              14              19
Texas                 12              429             11              221             11              115
Utah                  12              50              12              33              11              13
Vermont               15              10              15              10              14              3
Virginia              13              116             12              48              12              31
Washington            14              104             14              104             13              28
West Virginia         17              26              16              7               16              7
Wisconsin             15              57              15              51              14              15
Wyoming               14              7               14              7               13              2
Total                 $724            $4,590          $706            $3,237          $668            $1,233

Source: GAO estimates based on state assessment plans and characteristics and expenditure data gathered from 7 states.

Appendix VI: Fiscal Years 2002-08 Estimated Expenditures for Each Question Type

Table 13 provides estimates for each question type and the benchmark appropriations by fiscal year from 2002 through 2008. Each estimate reflects assumptions about the type of questions on the assessments. For example, the multiple-choice estimate assumes that all states will use assessments with only multiple-choice questions. These estimates also assume that states implement the assessment plans reported to us. The benchmark appropriation is based on actual appropriations in 2002 and 2003 and on the benchmark funding level in NCLBA for 2004-07. We assumed a benchmark of $400 million in 2008, the same as in 2005, 2006, and 2007.

Table 13: Estimated Expenditures for Each Question Type, Fiscal Years 2002-08

Fiscal year (dollars in millions)

Question type                   2002    2003    2004    2005    2006    2007    2008    Total
Multiple-choice                 $165    237     288     291     293     308     318     $1,901
Current question type           $324    442     572     615     633     665     692     $3,944
Multiple-choice and open-ended  $445    586     761     824     855     903     941     $5,313
Benchmark appropriation         $366    376     390     400     400     400     400     $2,733

Source: GAO estimates based on state assessment plans and characteristics and expenditure data gathered from 7 states.
Note: Fiscal years 2002 through 2008 sums may not equal the total because of rounding.
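To make the comparison between the benchmark appropriation and each estimate concrete, the totals in table 13 can be compared directly. The Python sketch below is illustrative only: the dollar totals are copied from table 13, while the variable and function names are hypothetical and the percentage calculation is a simple ratio rather than a figure published in the report.

```python
# Totals for fiscal years 2002-08, in millions of dollars, from table 13.
estimate_totals = {
    "Multiple-choice": 1901,
    "Current question type": 3944,
    "Multiple-choice and open-ended": 5313,
}
benchmark_total = 2733


def benchmark_coverage(benchmark, estimate):
    """Benchmark appropriation as a percentage of an estimated expenditure."""
    return 100 * benchmark / estimate


for question_type, total in estimate_totals.items():
    share = benchmark_coverage(benchmark_total, total)
    print(f"{question_type}: benchmark covers about {share:.0f} percent")
```

Under these totals, the benchmark exceeds the multiple-choice estimate and covers progressively smaller shares of the other two, which is consistent with the report's statement that benchmark funding covers a larger percentage of expenditures for multiple-choice tests.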
Appendix VII: Comments from the Department of Education

Appendix VIII: GAO Contacts and Staff Acknowledgments

GAO Contacts
Sherri Doughty (202) 512-7273
Jason Palmer (202) 512-3825

Staff Acknowledgments
In addition to those named above, Lindsay Bach, Cindy Decker, and Patrick DiBattista made important contributions to this report. Theresa Mechem provided assistance with graphics.

(130126)

GAO’s Mission
The General Accounting Office, the audit, evaluation and investigative arm of Congress, exists to support Congress in meeting its constitutional responsibilities and to help improve the performance and accountability of the federal government for the American people. GAO examines the use of public funds; evaluates federal programs and policies; and provides analyses, recommendations, and other assistance to help Congress make informed oversight, policy, and funding decisions. GAO’s commitment to good government is reflected in its core values of accountability, integrity, and reliability.

Obtaining Copies of GAO Reports and Testimony
The fastest and easiest way to obtain copies of GAO documents at no cost is through the Internet. GAO’s Web site (www.gao.gov) contains abstracts and full-text files of current reports and testimony and an expanding archive of older products. The Web site features a search engine to help you locate documents using key words and phrases. You can print these documents in their entirety, including charts and other graphics.

Each day, GAO issues a list of newly released reports, testimony, and correspondence. GAO posts this list, known as “Today’s Reports,” on its Web site daily. The list contains links to the full-text document files. To have GAO e-mail this list to you every afternoon, go to www.gao.gov and select “Subscribe to daily E-mail alert for newly released products” under the GAO Reports heading.

Order by Mail or Phone
The first copy of each printed report is free. Additional copies are $2 each. A check or money order should be made out to the Superintendent of Documents. GAO also accepts VISA and Mastercard. Orders for 100 or more copies mailed to a single address are discounted 25 percent. Orders should be sent to:
U.S. General Accounting Office
441 G Street NW, Room LM
Washington, D.C. 20548

To order by phone:
Voice: (202) 512-6000
TDD: (202) 512-2537
Fax: (202) 512-6061

To Report Fraud, Waste, and Abuse in Federal Programs
Contact:
Web site: www.gao.gov/fraudnet/fraudnet.htm
E-mail: email@example.com
Automated answering system: (800) 424-5454 or (202) 512-7470

Public Affairs
Jeff Nelligan, managing director, NelliganJ@gao.gov (202) 512-4800
U.S. General Accounting Office, 441 G Street NW, Room 7149
Washington, D.C. 20548
Title I: Characteristics of Tests Will Influence Expenses; Information Sharing May Help States Realize Efficiencies
Published by the Government Accountability Office on 2003-05-08.