United States General Accounting Office
GAO
Report to the Chairman, Subcommittee on Commerce, Consumer, and Monetary Affairs, Committee on Government Operations, House of Representatives

January 1990

TAX ADMINISTRATION
Monitoring the Accuracy and Administration of IRS’ 1989 Test Call Survey

United States General Accounting Office
Washington, D.C. 20548

General Government Division

B-234202

January 4, 1990

The Honorable Doug Barnard, Jr.
Chairman, Subcommittee on Commerce, Consumer, and Monetary Affairs
Committee on Government Operations
House of Representatives

Dear Mr. Chairman:

This report responds to your request that we evaluate the Internal Revenue Service’s (IRS) administration of its Integrated Test Call Survey System (ITCSS) during the 1989 tax filing season. ITCSS was designed to measure the quality of service IRS provides through its toll-free telephone system, a nationwide system in which IRS assistors answer taxpayers’ telephone inquiries. To accomplish this purpose, IRS designed a survey sample to produce statistical estimates of the accuracy of its telephone assistors in answering a set of 62 tax law test questions. These test questions were developed for tax law areas in which IRS determined that individual taxpayers commonly make inquiries when preparing their tax returns. IRS administered the test by placing anonymous calls to its telephone assistors and scoring their responses to the test questions.

You requested that we report to you on IRS’ administration of its 1989 test call survey and on the validity of the statistical estimates produced during the test. To respond to your request, we monitored and independently scored a statistically valid random sample of IRS test calls. As you know, we worked with IRS to develop ITCSS and mutually agreed in advance on what constituted a correct answer for each question.
This report evaluates the validity of IRS’ overall national accuracy rate. Appendix I provides selected ITCSS filing season results for IRS regions and call sites and by major tax law categories for individual taxpayers. This report updates and supplements the preliminary results of our work, which we reported in testimony before your Subcommittee on March 16, 1989. We did our work from January 1989 to August 1989 using generally accepted government auditing standards.

Results in Brief

IRS’ overall ITCSS results for the 1989 tax filing season showed that IRS telephone assistors responded correctly 62.8 percent of the time to the survey’s tax law test questions. On the basis of our monitoring of a statistically valid random sample of test calls, we agree with the overall telephone assistance accuracy rate IRS reported. Also, overall, IRS fairly administered its test call survey. With few exceptions, IRS test callers (Taxpayer Service employees responsible for making the test calls) asked tax law test questions in a fair manner and scored telephone assistors’ responses objectively and accurately.

Page 1 GAO/GGD-90-36 IRS’ Test Call Survey

The test question scoring criteria for correct assistor responses we used in our assessment are those on which we and IRS mutually agreed. During the filing season, however, IRS reported a higher ITCSS accuracy rate that was based on more liberal scoring criteria with which we did not agree. IRS’ deviation from the agreed-upon scoring criteria in reporting assistor accuracy was the subject of your March 16, 1989, hearing. In July 1989, the Assistant Commissioner (Taxpayer Service) said that for the 1990 filing season, IRS would only report accuracy rates that were based on scoring criteria mutually agreed upon by IRS and us.

Background

Because our tax laws are complicated, taxpayers often need assistance in understanding the tax laws and in preparing their tax returns.
The principal vehicle IRS’ Taxpayer Service Division uses to assist taxpayers is a toll-free telephone program. IRS has assisted taxpayers through this program for over two decades. Historically, IRS has considered telephone assistance to be the most efficient method of helping taxpayers. Accordingly, it has devoted substantial staff resources to telephone assistance and encourages taxpayers to use the telephone as a means of getting answers to their tax law questions.

During the 1989 tax filing season, IRS employed over 6,000 telephone assistors at 32 telephone sites. These assistors answered about 18.8 million taxpayer calls on individual and business tax law issues, procedural issues, and account-related matters. IRS’ telephone assistors are primarily composed of two groups: frontline and backup assistors. Frontline assistors initially take taxpayers’ calls and, if they are unable to answer the taxpayers’ questions, refer them to backup assistors, who usually have more experience and expertise.

Over the past 2 years, both Congress and the public have raised concerns over the quality of the responses assistors have provided to questions designed to test their tax law knowledge. For the 1988 filing season, we did our sixth survey of assistors’ tax law knowledge and reported that they provided correct responses to our questions 64 percent of the time and incorrect responses 36 percent of the time.1 Also during the 1988 filing season, IRS implemented ITCSS and found that assistors correctly responded to its test questions over 70 percent of the time. IRS expressed its belief that both our test results and its own indicated an unacceptably low rate of assistor accuracy.

For the 1989 filing season, the Subcommittee indicated interest in our working with IRS to develop a reliable testing system that we could monitor and that would avoid the need for IRS and us to do separate tests.
Accordingly, we worked with IRS to develop ITCSS and a monitoring system that would enable us to conclude whether or not IRS properly administered ITCSS and accurately reported on its assistors’ performance.

The centerpiece of ITCSS is a 62-question test covering what IRS identified as the seven major individual tax law categories in which taxpayers ask questions. As shown in table 1, these 7 tax law categories contained 32 subcategories of tax law.

1 Tax Administration: Accessibility, Timeliness, and Accuracy of IRS’ Telephone Assistance Program (GAO/GGD-89-30, Feb. 2, 1989).

Table 1: Individual Tax Law Categories and Subcategories Tested by IRS - 1989 IRS Test Call Survey

Filing Information
• Filing Requirement
• Estimated Tax

Dependents/Exemptions/Filing Status
• Dependents
• Personal Exemptions
• Filing Status - Head of Household
• Filing Status - Other

Individual Income
• Wages, Alimony, & Unemployment Compensation
• Interest & Dividend Income, Sch. B
• Taxable Refunds & Other Income
• Non-Taxable Income

Capital Gains & Losses
• Schedule D
• Sale/Exchange of Residence
• Other Gains/Losses

Pensions/Deferred Compensation
• Pension & Annuity Income
• All IRA Inquiries
• Other Retirement Plans
• Taxation of Social Security Benefits
• Lump Sum Distribution

Adjustments/Deductions
• Employee Business Expense
• Other Adjustments to Income
• Medical & Dental Deductions
• Tax Deductions
• Interest Deductions
• Miscellaneous Deductions
• Gifts to Charity

Tax Computation/Credits/Payments
• Standard Deduction
• Itemized vs Standard Deduction
• Child & Dependent Care Credit
• Self-Employment Tax
• Earned Income Credit
• Other Credits/Taxes/Payments
• Supplemental Medicare Premium

Source: Internal Revenue Service

We reached agreement on the 62 test questions that comprised the test and on two specific categories of correct responses: (1) correct and (2) correct and complete.
A correct answer was the minimal standard IRS expected its telephone assistors to meet, and we therefore focused our monitoring on determining whether ITCSS accurately measured assistors’ responses against that standard. Answers that exceeded this standard would be classified as correct and complete, but they would also be considered correct for monitoring and scoring purposes. It was agreed that all other answers would be scored as incorrect, meaning that the telephone assistor’s answer could lead taxpayers to a wrong result on their tax return. Appendix III provides examples of selected ITCSS test questions and the responses required for both categories of correct responses.

IRS administered its test by having test callers (1) place anonymous calls to telephone assistors located at 29 telephone sites within the continental United States and (2) score assistors’ responses to the test questions. During an 11-week period beginning February 6, 1989, eight test callers completed and scored 14,876 ITCSS test calls. Figure 1 shows the geographic distribution of these 29 call sites. For various technical and administrative reasons, IRS did not include three telephone sites in its test: Alaska, Hawaii, and Puerto Rico.

Figure 1: Locations of Toll-Free Telephone Sites IRS Surveyed - 1989 IRS Test Call Survey
[Map showing the 29 call sites: Seattle, WA; Portland, OR; Oakland, CA; El Monte, CA; Phoenix, AZ; Denver, CO; Dallas, TX; Houston, TX; Omaha, NE; Des Moines, IA; St. Paul, MN; Milwaukee, WI; Chicago, IL; St. Louis, MO; Indianapolis, IN; Detroit, MI; Cleveland, OH; Cincinnati, OH; Nashville, TN; Atlanta, GA; Jacksonville, FL; Richmond, VA; Baltimore, MD; Pittsburgh, PA; Philadelphia, PA; Newark, NJ; Brooklyn, NY; Buffalo, NY; Boston, MA.]
Source: Internal Revenue Service

Objectives, Scope, and Methodology

To monitor the validity of IRS’ overall test and to comment on its accuracy results, we listened in on and independently scored 577 randomly selected ITCSS test calls during an 8-week period of the tax filing season and compared our scores for those questions and answers to IRS’ scores. We based our scoring on the scoring criteria to which we had mutually agreed with IRS. Those criteria established specific acceptable combinations of required assistor probes for factual information and/or responses that needed to be present in the conversation for the response to be considered correct.

Our monitoring sample was randomly selected from IRS’ test call survey plan. Overall, our sample called for us to monitor 830 test calls, covering all test callers, time periods, and test questions. We calculated that this sample size would allow us to report our accuracy results for the period at the 95-percent level of confidence with a sampling error of plus or minus 2.6 percent. However, we were unable to monitor and score 244 test calls primarily because of (1) deviations from IRS’ calling schedule, (2) the inability of test callers to complete calls to the telephone sites, and (3) occasional problems with our monitoring equipment. In addition, we dropped nine calls from our sample because test callers deviated from the agreed-upon question scripts, thereby affecting the outcome of the call. Accordingly, the reduction in our sample size caused our sampling error to increase to plus or minus 4.4 percent at the 95-percent level of confidence.
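The sample accounting above can be sketched in a few lines of Python. The constants come from the text; the function names are hypothetical. The margin-of-error function is the textbook formula for a proportion under simple random sampling, which is an assumption made only for illustration: the report's own sampling errors come from IRS' weighted sample design, which this sketch does not model, so it is not expected to reproduce them. It only illustrates why losing calls widens the sampling error.

```python
import math

PLANNED_CALLS = 830    # calls the monitoring sample called for
NOT_MONITORED = 244    # schedule deviations, uncompleted calls, equipment problems
DROPPED_CALLS = 9      # test caller deviated from the agreed-upon script


def monitored_calls(planned, not_monitored, dropped):
    """Calls that remained in the monitoring sample and were scored."""
    return planned - not_monitored - dropped


def srs_margin_of_error(n, p=0.5, z=1.96):
    """Textbook 95-percent margin of error for a proportion.

    Assumes simple random sampling (illustrative only; not the
    report's weighted method).
    """
    return z * math.sqrt(p * (1 - p) / n)


scored = monitored_calls(PLANNED_CALLS, NOT_MONITORED, DROPPED_CALLS)
print("calls scored:", scored)
print(f"share of planned sample scored: {scored / PLANNED_CALLS:.1%}")
for n in (PLANNED_CALLS, scored):
    print(f"n={n}: +/- {100 * srs_margin_of_error(n):.1f} points (simple random sampling)")
```

Under any design, a smaller sample gives a wider margin, which is the direction of the change the text describes.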
We monitored how well IRS administered its test and discussed the development of and planning for the test with IRS officials in the Taxpayer Service and Statistics of Income Divisions and with the project manager of the contractor IRS selected to develop and implement ITCSS’ computerized scoring response program. Our objectives, scope, and methodology are discussed in additional detail in appendix II.

ITCSS Produced a Valid Indicator of Overall Assistor Performance

Our monitoring results showed that IRS telephone assistors correctly answered 391 of the 577 tax law test questions. For the same 577 test calls, IRS scored 377 of them as correct. Using the same method as IRS to statistically weight our scoring, our results show a 67.2-percent IRS telephone assistance accuracy rate compared to IRS’ accuracy rate for the monitored calls of 65.8 percent. The difference in these rates is not statistically significant and, therefore, the overall 62.8-percent accuracy rate IRS reported for all ITCSS calls can be relied upon as a valid indicator of assistors’ performance.

The variance in our and IRS’ scoring of test calls was due primarily to differences in interpretation as to the adequacy of assistors’ probes and responses. Probing is important because taxpayers who call with questions usually are not sufficiently familiar with the tax laws to know what information assistors need to answer their questions. Without knowing certain facts about a taxpayer’s situation or status, assistors cannot be certain that the response they give would actually apply to the taxpayer. Assistors, therefore, must elicit that information from the taxpayer or provide a conditional response.
Generally, assistor probes and responses clearly met or failed to meet the agreed-upon scoring criteria for a correct response. However, instances occurred where assistors’ probes and responses varied somewhat from predetermined acceptable probes and responses; therefore, judgments had to be made on whether the responses expressed were acceptable. On 60 monitored calls, or about 10 percent of our sample, we disagreed with IRS test callers as to whether a given probe or response fully met the scoring criteria. For 38 of the 60 calls, we scored the responses as correct and IRS scored them as incorrect. For the other 22 calls, we scored the assistors’ responses as incorrect, but IRS scored them as correct.

Figure 2 shows IRS’ overall telephone assistor accuracy rate for ITCSS during the 1989 tax filing season compared to IRS’ and our results for the sample of test calls we monitored.

Figure 2: IRS’ Overall Test Results Compared to GAO- and IRS-Monitored Test Call Results for the 1989 Tax Filing Season
[Bar chart. Vertical axis: Percent Correct; horizontal axis: Correct Answer Rate.]

Our scoring of assistors’ responses to the 577 monitored test calls was based on scoring criteria that we and IRS mutually devised. As discussed in our March 1989 testimony before the Subcommittee,2 IRS also reported a higher accuracy rate that was based on more liberal scoring criteria than those on which we had agreed. About 2 weeks after the start of the test, IRS determined that, for certain questions, assistors were providing answers that IRS believed were “not wrong” but failed to meet our agreed-upon standards for correct answers. IRS officials said that it would be unfair to imply to Congress or the public that assistors were providing wrong answers if that advice would not necessarily lead taxpayers to file inaccurate tax returns.
Thus, IRS devised another category of response, “right” answers, that failed to meet minimal standards but which it proposed to add to the “correct and complete” and “correct” categories in reporting accuracy statistics.

We did not agree to IRS’ revision of the scoring criteria. In our opinion, the responses IRS categorized as “right” were incomplete and potentially misleading and would increase the likelihood that taxpayers following such advice would make errors. For example, to defer the capital gains tax on the sale of a principal residence, a taxpayer must replace and occupy a new residence within a specified time period. IRS considered an answer as “right” if only the time period for replacement was provided. We considered the answer as incomplete and potentially misleading because of the tax consequences that would result if the taxpayer did not meet the occupancy requirement. In July 1989, the Assistant Commissioner (Taxpayer Service) said that for the 1990 filing season IRS would only report assistor accuracy rates that were based on mutually agreed-upon scoring criteria.

2 IRS’ Telephone Assistance Program (GAO/T-GGD-89-13, Mar. 16, 1989).

Conclusion

For the 1989 tax filing season, IRS fairly administered its test call survey, and we agree that its reported overall 62.8-percent assistor accuracy rate is reliable. For the sample of ITCSS test calls that we monitored, the difference in the accuracy rates for correct answers between our scoring and IRS’ scoring of those calls was not statistically significant. Thus, we believe that with periodic oversight the test call system administered by IRS can be used as the principal monitor of its assistors’ performance.

Agency Comments and Our Evaluation

In providing comments on this report, the Commissioner of the Internal Revenue Service said that he was not satisfied with the telephone assistance accuracy rate that IRS achieved in 1989 and that one of his major
goals is to improve this accuracy rate in 1990. The Commissioner agreed with our findings but recommended the deletion of the table that presents accuracy rates for each IRS call site. He believes that because the sample size of the data pertaining to each call site is smaller than national or regional sample sizes, the confidence interval associated with any call site accuracy range is too wide to be meaningful (see app. IV). As discussed on page 13, we agree that call site accuracy rate ranges are wider than national or regional accuracy rate ranges. However, the potential variance in the accuracy of call site data varies from plus or minus 4.95 percent to plus or minus 6.4 percent, a range we believe useful for comparisons of call site performance. To mitigate the Commissioner’s concerns and to permit reader perspective, we added to the call site data table the accuracy rate ranges for each call site.

As arranged with the Subcommittee, we are sending copies of this report to the Commissioner of Internal Revenue and other interested parties. We will make copies available to others upon request.

The major contributors to this report are listed in appendix V. Please contact me on 272-7904 if you or your staff have any questions concerning the report.

Sincerely yours,

Paul L. Posner
Associate Director, Tax Policy and Administration Issues

Contents

Letter  1

Appendix I: Integrated Test Call Survey System Results for the 1989 Tax Filing Season by Tax Law Category, Region, and Call Site  12
    ITCSS Accuracy Rate Reflects Overall Quality of IRS Telephone Service Provided to Taxpayers  12
    Analysis of Selected Data  13

Appendix II: Objectives, Scope, and Methodology  24

Appendix III: Selected ITCSS Test Questions and Required Responses  26
    Sample Question 1  26
    Sample Question 2  27

Appendix IV: Comments From the Internal Revenue Service  29

Appendix V: Major Contributors to This Report  30

Related GAO Products  31

Tables
    Table I.1: Estimated Regional Accuracy Rates and Accuracy Rate Ranges - 1989 IRS Test Call Survey Results  15
    Table I.2: Estimated National Accuracy Rates and Accuracy Rate Ranges by Tax Law Category - 1989 IRS Test Call Survey Results  17
    Table I.3: Estimated Regional Accuracy Rates and Accuracy Rate Ranges by Tax Law Category - 1989 IRS Test Call Survey Results  20
    Table I.4: Accuracy Rate Ranges for IRS Toll-Free Telephone Sites - 1989 IRS Test Call Survey Results  22

Figures
    Figure I.1: IRS’ National Level Accuracy  12
    Figure I.2: Distribution of 62 Tax Law Questions by Tax Law Category  14
    Figure I.3: Estimated Regional Accuracy Rates  15
    Figure I.4: Estimated National Accuracy Rates by Tax Law Category  16
    Figure I.5: Estimated Regional Accuracy Rates by the Tax Law Categories of Filing Information, Exemptions, Individual Income, and Capital Gains  18
    Figure I.6: Estimated Regional Accuracy Rates by the Tax Law Categories of Pensions, Adjustments to Income, and Tax Computation  19

Abbreviations
    IRS      Internal Revenue Service
    ITCSS    Integrated Test Call Survey System

Appendix I
Integrated Test Call Survey System Results for the 1989 Tax Filing Season by Tax Law Category, Region, and Call Site

ITCSS Accuracy Rate Reflects Overall Quality of IRS Telephone Service Provided to Taxpayers

The Integrated Test Call Survey System was developed by IRS to better measure the accuracy of IRS responses to taxpayer telephone inquiries. Accuracy measurement is important because IRS believes that the higher the telephone assistance accuracy rate, the better the quality of service it provides to the public.

The 1989 test call survey system was designed by IRS to place 1,488 test calls per week for 11 weeks to 29 of its 32 call sites throughout the United States. Each test call came from a group of 62 questions dealing with tax law issues pertaining to individuals.
All test questions were derived from tax law categories in which IRS believes most taxpayers ask questions. To be credited with a correct response, ITCSS implementation guidance directed that each IRS telephone assistor in the 29 toll-free telephone assistance call sites nationwide (1) obtain relevant facts from the taxpayer as necessary before attempting to give an answer and (2) ensure that an answer was tailored to satisfy the taxpayer’s needs.

Overall National Level ITCSS Accuracy Results

From February 6, 1989, to April 21, 1989, IRS National Office test callers completed and scored 14,876 test calls. Figure I.1 shows the national level accuracy results for these test calls.

Figure I.1: IRS’ National Level Accuracy
Number of test calls scored as correct: 9,342
Number of test calls scored as incorrect: 5,534

Analysis of Selected Data

The data presented in this section represent selected results obtained by IRS during its 1989 tax filing season test call survey sample. We should point out that our monitoring sample was designed to evaluate the validity of IRS’ overall national accuracy rate, not the accuracy of IRS’ statistical results at the tax law category, region, or call site levels. It should be expected that ITCSS results by categories, regions, and call sites have larger sampling errors than the overall ITCSS results because it is a common statistical property that a subsection of a sample has more variability than the whole sample.

Accuracy rates are estimated because they are drawn from a statistical sample of test calls. Each estimate has a range of precision, or confidence interval, associated with it. The size of this accuracy rate range varies by the size of the test call sample used to produce the confidence interval.
Therefore, the variability of accuracy rate ranges and estimated accuracy rates relating to the tax law category, regional, and call site data tables and figures that follow is the result of differing sample sizes associated with each level of data. For example, tax law category and regional data were based on larger sample sizes than the call site sample sizes and, therefore, produced narrower confidence intervals. The narrower the interval, the more precisely the estimated accuracy rate locates the actual accuracy rate. All data ranges shown in this section have been calculated to express the results at the 95-percent level of confidence.

Percentage of ITCSS Questions by Tax Law Category

The 62 tax law questions comprising IRS’ test call survey covered seven tax law categories in which IRS determined that taxpayers commonly made telephone inquiries. Figure I.2 shows the distribution of the test call questions across the seven tax law categories.

Figure I.2: Distribution of 62 Tax Law Questions by Tax Law Category
[Pie chart. Segments by number of questions:]
Filing Information       4 questions
Exemptions               7 questions
Individual Income        8 questions
Capital Gains            6 questions
Pensions                10 questions
Adjustments to Income   12 questions
Tax Computation         15 questions

Estimated Accuracy Rates for IRS Regions

Figure I.3 shows the estimated accuracy rates achieved by each IRS region, and table I.1 shows the specific accuracy rate range associated with each region’s estimate. These data indicate that the Central Region accuracy rate clearly exceeded both the North Atlantic and Mid-Atlantic Regions’ accuracy rates.
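One way to read the regional comparison is to check whether accuracy rate ranges overlap. The sketch below uses the ranges reported in table I.1 and lists the regions whose entire range falls below the Central Region's range. Treating non-overlapping ranges as a clear difference is an assumption made for illustration; the report does not state which test it applied, and the names in the code are hypothetical.

```python
# Accuracy rate ranges (percent) for each IRS region, as reported in table I.1.
REGION_RANGES = {
    "Central": (64.9, 70.6),
    "Mid-Atlantic": (58.6, 64.2),
    "Midwest": (61.3, 66.1),
    "North Atlantic": (55.7, 62.6),
    "Southeast": (59.7, 65.8),
    "Southwest": (59.8, 65.7),
    "Western": (59.8, 65.5),
}


def regions_clearly_below(reference, ranges):
    """Return regions whose whole range lies below the reference region's range."""
    reference_low, _ = ranges[reference]
    return sorted(
        name
        for name, (_, high) in ranges.items()
        if name != reference and high < reference_low
    )


print(regions_clearly_below("Central", REGION_RANGES))
```

Under this reading, only the Mid-Atlantic and North Atlantic ranges sit wholly below the Central range; the other four regions overlap it.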
Figure I.3: Estimated Regional Accuracy Rates
[Bar chart. Vertical axis: Estimated Percent Correct; horizontal axis: IRS Regions and National Accuracy Rates.]

Table I.1: Estimated Regional Accuracy Rates and Accuracy Rate Ranges - 1989 IRS Test Call Survey Results
Figures in percent

IRS region        Estimated accuracy rate    Accuracy rate range
Central           67.7                       64.9 - 70.6
Mid-Atlantic      61.4                       58.6 - 64.2
Midwest           63.7                       61.3 - 66.1
North Atlantic    59.2                       55.7 - 62.6
Southeast         62.7                       59.7 - 65.8
Southwest         62.7                       59.8 - 65.7
Western           62.6                       59.8 - 65.5

Source: Internal Revenue Service

Estimated National and Regional Accuracy Rates by Tax Law Category

Figure I.4 shows the estimated accuracy rates achieved by IRS telephone assistors within each tax law category, and table I.2 shows the accuracy rate range data associated with these estimates. These data illustrate that telephone assistors clearly had the most difficulty providing correct responses to questions dealing with capital gains.
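The size of the gap behind the capital gains observation can be measured from the ranges reported in table I.2. The sketch below computes the distance, in percentage points, between the top of the capital gains range and the lowest bottom of any other category's range. The function name is hypothetical, and the gap measure is an illustrative choice rather than a statistic the report uses.

```python
# Accuracy rate ranges (percent) by tax law category, as reported in table I.2.
CATEGORY_RANGES = {
    "Filing Information": (65.3, 71.3),
    "Exemptions": (63.6, 69.8),
    "Individual Income": (59.6, 65.8),
    "Capital Gains": (41.3, 48.5),
    "Pensions": (62.4, 68.2),
    "Adjustments to Income": (56.9, 62.7),
    "Tax Computation": (65.3, 70.1),
}


def gap_to_next_lowest(category, ranges):
    """Points between a category's upper bound and the lowest lower bound
    among all other categories (positive means no overlap with any of them)."""
    _, high = ranges[category]
    next_low = min(low for name, (low, _) in ranges.items() if name != category)
    return round(next_low - high, 1)


print(gap_to_next_lowest("Capital Gains", CATEGORY_RANGES))
```

A positive result means the capital gains range does not overlap any other category's range, even the nearest one (Adjustments to Income).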
Figure I.4: Estimated National Accuracy Rates by Tax Law Category
[Bar chart. Vertical axis: Estimated Percent Correct; horizontal axis: Tax Law Categories.]

Table I.2: Estimated National Accuracy Rates and Accuracy Rate Ranges by Tax Law Category - 1989 IRS Test Call Survey Results
Figures in percent

Tax law category         Estimated accuracy rate    Accuracy rate range
Filing Information       68.3                       65.3 - 71.3
Exemptions               66.7                       63.6 - 69.8
Individual Income        62.7                       59.6 - 65.8
Capital Gains            44.9                       41.3 - 48.5
Pensions                 65.3                       62.4 - 68.2
Adjustments to Income    59.8                       56.9 - 62.7
Tax Computation          67.7                       65.3 - 70.1

Source: Internal Revenue Service

Figures I.5 and I.6 show the estimated accuracy rates, and table I.3 shows the corresponding accuracy rate ranges for each IRS region in each tax law category. For purposes of comparison, we have included national accuracy rates and ranges for the same tax law categories.
Figure I.5: Estimated Regional Accuracy Rates by the Tax Law Categories of Filing Information, Exemptions, Individual Income, and Capital Gains
[Bar chart. Vertical axis: Estimated Percent Correct; horizontal axis: IRS Regions and National Accuracy Rates; series: Filing Information, Exemptions, Individual Income, Capital Gains.]

Figure I.6: Estimated Regional Accuracy Rates by the Tax Law Categories of Pensions, Adjustments to Income, and Tax Computation
[Bar chart. Vertical axis: Estimated Percent Correct; horizontal axis: IRS Regions and National Accuracy Rates; series: Pensions, Adjustments to Income, Tax Computation.]

Table I.3: Estimated Regional Accuracy Rates and Accuracy Rate Ranges by Tax Law Category - 1989 IRS Test Call Survey Results
Figures in percent

Tax law category / IRS region    Estimated accuracy rate    Accuracy rate range

Filing information
  Central                        74.1                       66.7 - 81.5
  Mid-Atlantic                   66.1                       58.6 - 73.6
  Midwest                        66.6                       60.0 - 73.2
  North Atlantic                 66.2                       57.1 - 75.3
  Southeast                      70.5                       62.6 - 78.4
  Southwest                      66.7                       58.7 - 74.7
  Western                        67.9                       60.3 - 75.5
  (National average)             (68.3)                     (65.3) - (71.3)

Exemptions
  Central                        72.7                       65.0 - 80.4
  Mid-Atlantic                   63.6                       55.8 - 71.4
  Midwest                        69.3                       62.7 - 75.9
  North Atlantic                 62.0                       52.4 - 71.6
  Southeast                      67.2                       58.9 - 75.5
  Southwest                      66.2                       58.0 - 74.4
  Western                        66.4                       58.6 - 74.2
  (National average)             (66.7)                     (63.6) - (69.8)

Individual income
  Central                        67.5                       59.6 - 75.4
  Mid-Atlantic                   59.5                       51.7 - 67.3
  Midwest                        63.9                       57.2 - 70.6
  North Atlantic                 59.7                       50.3 - 69.1
  Southeast                      63.1                       54.8 - 71.4
  Southwest                      62.9                       54.7 - 71.1
  Western                        62.2                       54.4 - 70.0
  (National average)             (62.7)                     (59.6) - (65.8)

Capital gains
  Central                        51.9                       42.5 - 61.3
  Mid-Atlantic                   44.9                       36.1 - 53.7
  Midwest                        43.7                       35.9 - 51.5
  North Atlantic                 40.8                       30.2 - 51.4
  Southeast                      41.5                       32.0 - 51.0
  Southwest                      48.0                       38.6 - 57.4
  Western                        44.3                       35.3 - 53.3
  (National average)             (44.9)                     (41.3) - (48.5)

Pensions
  Central                        69.1                       61.7 - 76.5
  Mid-Atlantic                   64.6                       57.4 - 71.8
  Midwest                        67.2                       60.9 - 73.5
  North Atlantic                 59.6                       50.6 - 68.6
  Southeast                      66.1                       58.3 - 73.9
  Southwest                      66.8                       59.2 - 74.4
  Western                        63.4                       56.0 - 70.8
  (National average)             (65.3)                     (62.4) - (68.2)

Adjustments to income
  Central                        65.8                       58.5 - 73.1
  Mid-Atlantic                   59.6                       52.5 - 66.7
  Midwest                        59.5                       53.2 - 65.8
  North Atlantic                 57.1                       48.4 - 65.8
  Southeast                      57.9                       50.1 - 65.7
  Southwest                      57.6                       50.0 - 65.2
  Western                        62.2                       55.0 - 69.4
  (National average)             (59.8)                     (56.9) - (62.7)

Tax computation
  Central                        69.9                       63.7 - 76.1
  Mid-Atlantic                   66.1                       60.1 - 72.1
  Midwest                        70.6                       65.5 - 75.7
  North Atlantic                 64.6                       57.3 - 71.9
  Southeast                      68.0                       61.6 - 74.4
  Southwest                      67.0                       60.7 - 73.3
  Western                        67.7                       61.7 - 73.7
  (National average)             (67.7)                     (65.3) - (70.1)

Source: Internal Revenue Service

Estimated Accuracy Rates for IRS Toll-Free Telephone Assistance Call Sites

Table I.4 below shows for the 1989 tax filing season the variations in accuracy rate ranges for the 29 telephone sites tested by IRS.

Table I.4: Accuracy Rate Ranges for IRS Toll-Free Telephone Sites - 1989 IRS Test Call Survey Results
Figures in percent

IRS telephone sites by region    Estimated accuracy rate    Accuracy rate range

Central Region
  Cincinnati                     65.6                       59.5 - 71.7
  Cleveland                      69.8                       63.8 - 75.7
  Detroit                        65.2                       59.1 - 71.3
  Indianapolis                   70.6                       65.7 - 75.6

Mid-Atlantic Region
  Baltimore                      64.1                       57.9 - 70.2
  Newark                         52.0                       45.6 - 58.4
  Philadelphia                   54.9                       48.5 - 61.3
  Pittsburgh                     72.6                       66.8 - 78.3
  Richmond                       61.1                       54.9 - 67.4

Midwest Region
  Chicago                        58.2                       52.8 - 63.5
  Des Moines                     69.7                       63.8 - 75.7
  Milwaukee                      65.0                       58.8 - 71.1
  Omaha                          73.0                       67.3 - 78.7
  St. Louis                      62.5                       56.2 - 68.7
  St. Paul                       69.1                       63.2 - 75.1

North Atlantic Region
  Boston                         67.7                       62.7 - 72.8
  Brooklyn                       52.1                       45.7 - 58.5
  Buffalo                        67.7                       61.7 - 73.7

Southeast Region
  Atlanta                        57.2                       51.9 - 62.6
  Jacksonville                   66.0                       60.9 - 71.2
  Nashville                      67.4                       62.3 - 72.5

Southwest Region
  Dallas                         59.6                       54.3 - 64.9
  Denver                         64.6                       58.4 - 70.7
  Houston                        65.5                       59.4 - 71.6
  Phoenix                        62.0                       55.8 - 68.2

Western Region
  El Monte                       56.0                       49.6 - 62.4
  Oakland                        65.1                       59.9 - 70.3
  Portland                       71.3                       65.5 - 77.1
  Seattle                        67.5                       62.5 - 72.6

Source: Internal Revenue Service

Appendix II
Objectives, Scope, and Methodology

Our objectives were to report on IRS’ administration of its 1989 test call survey and on the validity of the statistical estimates produced during this test. To evaluate how well IRS administered ITCSS, we interviewed Taxpayer Service Division officials, reviewed IRS planning documents and managerial records, and monitored a randomly selected sample of test calls.
We also interviewed officials and reviewed documents from IRS’ Statistics of Income Division and Mathematica Policy Research, Inc. (the contractor IRS selected to develop and implement the computerized response scoring program), for information pertaining to ITCSS’ design and implementation. Finally, IRS’ internal audit reviewed ITCSS procedures and results, and we interviewed the IRS auditors and reviewed their evaluation documentation. We did our work at the IRS National Office in Washington, D.C., from January 1989 to August 1989.

To evaluate the validity of the statistical estimates produced by ITCSS, we monitored and scored a statistically valid random sample of survey test calls and compared our scoring of those calls with documentation showing how IRS scored the same calls. IRS devised its test call survey sample plan to produce statistical estimates of the accuracy of its telephone assistors in answering scripted test questions involving tax law for individuals. ITCSS design methodology called for eight IRS test callers at the National Office to place a total of 16,368 randomly selected test calls over an 11-week period (Feb. 6, 1989, through Apr. 21, 1989) to 29 toll-free telephone assistance sites throughout the United States (see fig. 1). During the test period, IRS test callers actually completed and scored 14,876 test calls.

Each test caller was scheduled to make 186 test calls per week to various call sites and at various times specified in the test call sample. The test call sample assigned to each test caller was randomly selected from a pool of 62 tax law questions representing the seven major tax law categories in which IRS determined that individual taxpayers commonly ask questions. These seven tax law categories contained 32 subcategories of tax law, as shown in table 1. The ITCSS design methodology was developed to produce results that would have a sampling error of plus or minus 2 percent at the 95-percent level of confidence.
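The planned call volume described above can be checked with simple arithmetic. The short Python sketch below is illustrative only and is not part of IRS’ or our procedures; the variable and function names are assumptions introduced for the example, and the figures are those quoted in the text.

```python
# Figures quoted in the methodology text; names are illustrative assumptions.
TEST_CALLERS = 8                  # IRS test callers at the National Office
CALLS_PER_CALLER_PER_WEEK = 186   # scheduled test calls per caller per week
WEEKS = 11                        # Feb. 6, 1989, through Apr. 21, 1989
COMPLETED_CALLS = 14_876          # test calls actually completed and scored


def planned_calls(callers: int, calls_per_week: int, weeks: int) -> int:
    """Total number of test calls scheduled under the sample plan."""
    return callers * calls_per_week * weeks


planned = planned_calls(TEST_CALLERS, CALLS_PER_CALLER_PER_WEEK, WEEKS)
completion_rate = COMPLETED_CALLS / planned

print(planned)                    # 16368
print(f"{completion_rate:.1%}")   # 90.9%
```

The product of the three design figures reproduces the 16,368 scheduled calls, and the completed calls amount to roughly 91 percent of that plan.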
We began monitoring IRS’ test on Wednesday, February 22, 1989, about 2 weeks after IRS started its test call sample, and the first workday that telephone monitoring equipment supplied by IRS was operable. We continued our monitoring until Friday, April 14, 1989, a total of 38 test days. In order to comment on IRS’ accuracy results, we developed a sampling plan that called for us to listen to and score a randomly selected sample of 830 scheduled test calls that covered all test callers, daily time periods, and test questions. We calculated that this sample size would allow us to report our accuracy results for the period at the 95-percent level of confidence with a sampling error of plus or minus 2.5 percent.

To accomplish our monitoring, we developed monitoring records that incorporated the scripted test questions, probes, and responses used by IRS’ test callers. We used an individual monitoring record to document the scoring of each test call and to note any test caller deviations from the scripted test calls or assistor variations from acceptable probes. At the end of each day, we provided IRS with a listing of the calls we monitored, and IRS later provided us with documentation showing the ITCSS test callers’ scoring of the same test calls. We compared our scoring with IRS’ scoring for each monitored test call and documented the results.

We evaluated test caller deviations from the scripts to determine whether they could have had a material effect on the assistors’ responses. We determined that nine deviations were material (e.g., inappropriate information provided by the test caller either led to or preempted an assistor’s response), and we deleted those calls from our monitoring sample.
In addition to the nine calls deleted because of test caller script deviations, we were unable to monitor and score 244 test calls primarily due to (1) deviations from IRS’ calling schedule because of test caller absences and IRS staff meetings; (2) IRS’ inability to complete test calls as scheduled due to heavy call volumes at the sites called; and (3) occasional monitoring equipment problems, which impaired our ability to clearly hear the assistors’ responses. However, anticipating such problems, we purposely oversampled to accommodate lost calls. Although we oversampled, the number of lost calls exceeded our estimates and caused our sampling error to increase. Accordingly, the 677 test calls we monitored and scored are a statistically valid sample size that allows us to report our results with 95-percent confidence that our sampling error is no greater than plus or minus 4.4 percent.

Appendix III
Selected ITCSS Test Questions and Required Responses

This appendix presents two ITCSS test questions that were used in the 1989 test call survey. IRS and we agreed that these questions would not be used in the 1990 survey and, thus, we believe that they will provide readers of this report with concrete examples of the types of questions that comprise the test call survey.

To score ITCSS test questions, IRS and we agreed on the specific responses that would be categorized as (1) correct or (2) correct and complete. A correct answer was the minimal standard IRS expected its telephone assistors to meet. Answers that exceeded this standard would be classified as correct and complete. It was further agreed that all other answers would be scored as incorrect, meaning that the telephone assistor’s answer could lead taxpayers to a wrong result on their tax return.
IRS’ reported 62.8-percent national accuracy rate for the 1989 tax filing season and our monitoring of how well IRS administered its test call survey were based on the agreed-upon scoring criteria for “correct” responses. Answers that met the correct and complete standard were also considered as correct for monitoring and scoring purposes.

For 48 of the 62 ITCSS test questions, scoring criteria required that assistors probe callers to obtain information that would be needed to answer their questions with a correct response. Of the 48 questions that required assistors to probe, 29 questions required 1 probe, 16 questions required 2 probes, and 3 questions required 3 probes. Probing is important because taxpayers who call with questions usually are not familiar with the tax laws and frequently do not know what information assistors need to answer their questions correctly. Without knowing certain facts about a taxpayer’s situation or status, assistors cannot be certain that the response they give would actually apply to the taxpayer. Assistors, therefore, must elicit that information from the taxpayer or provide a conditional response.

The two sample questions that follow illustrate test questions that require no probing and questions that require multiple probing. To assist the caller in judging whether assistors covered the required probes and gave the correct responses, the required probing and response points were enumerated individually.

Sample Question 1

Tax law category: Capital gains and losses.

Subcategory: Sale/Exchange of residence.

Question: My husband (wife) and I have been working for a major corporation in Germany and have decided to sell our home in the United States. We were told that we only have 2 years in which to replace the property. Doing that will be a real burden on us since we’ll still be out of the United States.
Is there a way around that 2-year requirement for replacement? We are not eligible for the one-time exclusion for people 55 or older.

Background:
• Caller and spouse have been overseas for 6 months.
• Caller and spouse have not rented their U.S. home.
• Caller and spouse will be abroad about 3 years.
• Caller’s tax home is outside of United States.

Probing points: None.

Response points:
R1: The replacement for your main home is extended to 4 years from the date of sale of your old home.
R2: You must occupy the new home within the 4-year period.
R3: Refer to Publication 523, Tax Information on Selling Your Home.

Scoring:
Correct: R1 and R2.
Correct and complete: R1 and R2 and R3.

Sample Question 2

Tax law category: Individual income.

Subcategory: Wages, alimony, and unemployment compensation.

Question: My father was unemployed part of last year. He only made $3,600 before he went on unemployment. Does he have to file a return?

Background:
• Father received $1,600 in unemployment compensation from the state and he made no contributions to the plan.
• Father received no interest or other income.
• Father is 61 and not blind.
• Father is single with no dependents.
• Father cannot be claimed as a dependent on caller’s (or anyone else’s) return.

Probing points:
P1: How much unemployment compensation did your father receive?
P2: How old is your father? Or: is your father 65 or older?
P3: What is your father’s filing status? Or: is your father married?

Response points:
R1: Yes, he must file a return.
R2: His unemployment benefits are taxable.
R3: His total income exceeds the threshold for filing; or his total income exceeds $4,950; or his total income exceeds his standard deduction and personal exemption.

Scoring:
Correct: P1 and P2 and P3 and R1 and (R2 or R3).
Correct and complete: P1 and P2 and P3 and R1 and R2 and R3.
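The scoring rules for the two sample questions can be expressed as simple logical tests. The Python sketch below is a hypothetical illustration of those rules, not the computerized response scoring program IRS actually used; the function names and the use of sets of point labels (such as "P1" or "R2") are assumptions made for the example.

```python
# Hypothetical sketch of the agreed-upon scoring rules for the two sample
# questions. `covered` is the set of probing (P) and response (R) points
# the assistor covered during the call.

def score_sample_question_1(covered):
    """Correct: R1 and R2. Correct and complete: R1 and R2 and R3."""
    covered = set(covered)
    if {"R1", "R2", "R3"} <= covered:
        return "correct and complete"
    if {"R1", "R2"} <= covered:
        return "correct"
    return "incorrect"


def score_sample_question_2(covered):
    """Correct: P1, P2, P3, and R1, plus R2 or R3.
    Correct and complete: all three probes and all three responses."""
    covered = set(covered)
    if not {"P1", "P2", "P3", "R1"} <= covered:
        return "incorrect"       # a required probe or R1 is missing
    if {"R2", "R3"} <= covered:
        return "correct and complete"
    if covered & {"R2", "R3"}:
        return "correct"         # exactly one of R2 or R3 was given
    return "incorrect"


print(score_sample_question_1({"R1", "R2"}))                    # correct
print(score_sample_question_2({"P1", "P2", "P3", "R1", "R3"}))  # correct
print(score_sample_question_2({"P1", "P3", "R1", "R2", "R3"}))  # incorrect
```

In the last call, the assistor gave every response point but skipped probe P2, so the answer is scored as incorrect: under the agreed-upon criteria, a response given without the required probes does not meet the minimal standard.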
Appendix IV
Comments From the Internal Revenue Service

DEPARTMENT OF THE TREASURY
INTERNAL REVENUE SERVICE
WASHINGTON, D.C. 20224

Mr. Richard L. Fogel
Assistant Comptroller General
United States General Accounting Office
Washington, DC 20548

Dear Mr. Fogel:

We have reviewed your recent draft report entitled, “Tax Administration: Monitoring the Accuracy and Administration of IRS’ 1989 Test Call Survey,” which was produced at the request of the Chairman, Subcommittee on Commerce, Consumer, and Monetary Affairs, House Committee on Government Operations.

We generally agree with the report’s findings which validate the design and our administration of the Integrated Test Call Survey System (ITCSS). However, we recommend deletion of the tables in Appendix I that present accuracy rates for each telephone site. Because of the smaller sample pertaining to each site, it is not possible to achieve statistical validity without presenting data in ranges that are too wide to be meaningful. For this reason, it has been IRS policy not to release individual call site data. We have no objection to publishing call site data at the end of the next filing season if we can work with GAO to assure statistically valid data.

We would also like to work with GAO to provide any data that is necessary to release the report for the 1990 filing season as soon as possible after the end of the filing season. We believe that earlier release of the report would avoid public confusion and the consequent increase in the volume of calls when the report is released at the beginning of the next filing season. It would also help us focus on remedial actions in planning for the next filing season.

The IRS is not satisfied with the accuracy rate that we achieved last year. One of my major goals is to improve the taxpayer service accuracy rate for 1990 and we are taking steps to achieve this improvement.
For example, a test site in Boston provides IRS telephone assistors with a computerized data system designed to ensure that taxpayers are asked all necessary questions and correct answers are provided by telephone assistors. Taxpayer Service staff throughout the country have been provided with written desk guides that use these techniques to teach assistors to fully and accurately respond to taxpayer inquiries. We have also used the test call data from the 1989 filing season to modify our training of telephone assistors to improve weak areas.

These and other actions lead us to believe that the 1990 filing season will see substantial improvements in our telephone tax assistance.

Best regards.

Appendix V
Major Contributors to This Report

General Government Division, Washington, D.C.
Robert P. Glick, Assignment Manager
Martin S. Morris, Tax Attorney
William F. Bley, Evaluator-in-Charge
Susan Ragland, Evaluator
Maria Z. Oliver, Evaluator

Program Evaluation and Methodology Division, Washington, D.C.
Harry M. Conley III, Statistician

Related GAO Products

Accessibility, Timeliness, and Accuracy of IRS’ Telephone Assistance Program (GAO/GGD-89-30, Feb. 2, 1989).

(268293)
Tax Administration: Monitoring the Accuracy and Administration of IRS' 1989 Test Call Survey
Published by the Government Accountability Office on 1990-01-04.