The Internal Revenue Service is collecting a lot more than taxes this year—it's also acquiring a huge volume of personal information on taxpayers' digital activities, from eBay auctions to Facebook posts and, for the first time ever, credit card and e-payment transaction records, as it expands its search for tax cheats to places it's never gone before.
The IRS, under heavy pressure to help Washington out of its budget quagmire by chasing down an estimated $300 billion in revenue lost to evasions and errors each year, will start using "robo-audits" of tax forms and third-party data that it hopes will help close this so-called "tax gap." But the agency reveals little about how it will employ its vast new network-scanning powers.
Tax lawyers and watchdogs are concerned that the sweeping changes are being implemented with little public discussion and without clear guidelines, and congressional staff sources say the IRS's use of "big data" will be a key issue when the next IRS chief comes to the Senate for approval. Acting commissioner Steven T. Miller replaced Douglas Shulman last November.
"It's well-known in the tax community, but not many people outside of it are aware of this big expansion of data and computer use," says Edward Zelinsky, a tax law expert and professor at Benjamin N. Cardozo School of Law and Yale Law School. "I am sure people will be concerned about the use of personal information on databases in government, and those concerns are well-taken. It's appropriate to watch it carefully. There should be safeguards." He adds that taxpayers should know that whatever people do and say electronically can and will be used against them in IRS enforcement.
IRS's big data tracking. Consumers are already familiar with Internet "cookies" that track their movements and send them targeted ads that follow them to different websites. The IRS has brought in private industry experts to employ similar digital tracking—but with the added advantage of access to Social Security numbers, health records, credit card transactions and many other privileged forms of information that marketers don't see.
"Private industry would be envious if they knew what our models are," boasted Dean Silverman, the agency's high-tech top gun who heads a group recruited from the private sector to update the IRS, in a comment reported in trade publications. The IRS did not respond to a request for an interview.
In trade presentations and public documents, the agency has said it will use a massively parallel computer system that can analyze data from different networks to find irregularities and suspicious activities.
Much of the work already has been automated to process and analyze electronic tax returns in current "robo-audits" that flag unusual behavior patterns. With IRS audit staff reduced by budget cuts this year, the agency will be forced to rely on computer-generated audits more than ever.
The agency declined to comment on how it will use its new technology. But agency officials have been outlining plans at industry conferences, working with IBM, EMC and other private-sector specialists. In presentations, officials have said they may use the big data for:
• Charting and analyzing social media such as Facebook
• Targeting audits by matching tax filings to social media or electronic payments
• Tracking individual Internet addresses and emailing patterns
• Sorting data in 32,000 categories of metadata and 1 million unique "attributes"
• Machine learning across "neural" networks
• Statistical and agent-based modeling
• Relationship analysis based on Social Security numbers and other personal identifiers
Officials have said much of the data will be used only for research. The agency's economic forecasts and data are a key part of Washington's budget infrastructure. Former commissioner Douglas Shulman said in an IRS statement that the technology will employ "billions of pieces of data" to target enforcement and to "detect and combat noncompliance."