Global e-Business Association
[ Article ]
The e-Business Studies - Vol. 17, No. 5, pp.107-116
ISSN: 1229-9936 (Print) 2466-1716 (Online)
Print publication date Oct 2016
Final publication date 30 Oct 2016
Received 25 Sep 2016 Revised 19 Oct 2016 Accepted 27 Oct 2016
DOI: https://doi.org/10.20462/tebs.2016.10.17.5.107

Big Data as a Solution to Shrinking the Shadow Economy

Myungki Nam* ; Sangwon Lee**
*Senior Researcher, K-ICT Big Data Center, National Information Society Agency nmkok@nia.or.kr
**Associate Professor, Division of Information and Electronic Commerce, Wonkwang University sangwonlee@wku.ac.kr
Korean title (translated): Measures for Bringing the Shadow Economy into the Open by Using Big Data

Abstract

This paper presents a theoretical solution based on Big Data for shrinking Korea’s shadow economy. At approximately 25.3% of GDP, Korea’s shadow economy is large relative to those of other OECD countries. The ratio of the self-employed (who are more prone to tax evasion) to the total labor force is also relatively high. Therefore, this paper focuses on constructing policies aimed at the self-employed. By analyzing case studies from Australia and the United States of America, we suggest three solutions for shrinking the shadow economy, each customized to Korea. First, we propose a small business benchmark system matched to Korea’s industrial and regional characteristics. Second, we provide a way to specify tax return patterns using data held by the National Tax Service in order to detect future tax evaders. Third, we suggest a plan for examining various unstructured information from social networking services and other Internet-related sources in order to detect illicit activities such as sharing information on tax evasion in online communities or encouraging customers to pay in cash by offering a discount.

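Abstract (Korean version, translated)

This paper presents a theoretical, Big Data-based approach to bringing Korea’s shadow economy into the open. Compared with other OECD countries, Korea has a relatively large shadow economy of about 25.3 percent of GDP. In addition, the share of the self-employed, who tend toward tax evasion, is relatively high compared with the total labor force. This paper therefore focuses on establishing policies targeted at the self-employed. By analyzing case studies from Australia and the United States, we present three measures suited to Korea’s circumstances. First, we build a benchmark system for the self-employed matched to the characteristics of Korea’s industries and regions. Second, we present a method for specifying tax-evasion-prevention patterns using data held by the National Tax Service in order to detect future tax evaders. Third, we propose a plan for examining various unstructured information from social networking services. Other Internet-related sources are also to be used to detect illicit activities such as sharing tax evasion information within online communities or encouraging payment in cash by offering discounts.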

Keywords:

Big Data, Shadow Economy, Tax Return Pattern, Self-employment, Unstructured Information

Keywords (Korean version, translated):

Big Data, Shadow Economy, Income Reporting Pattern, Self-employment, Unstructured Information

Ⅰ. Introduction

In South Korea, the Park Geun-hye government has proclaimed a “welfare without increased tax” policy and has stressed shrinking the shadow economy as one of its core tasks. It plans to secure welfare finances by shrinking Korea’s shadow economy without increasing citizens’ tax burden. The shadow economy is costly for society, which justifies the government’s effort to constrain it. It also causes many socioeconomic problems, such as hindering equality within the official economy. Moreover, because its shadow economy is larger than those of other advanced economies, Korea needs systematic measures to shrink it.

Efforts to this end, such as requiring real names in transactions, encouraging credit card use, and issuing cash receipts, have already been made by the Korean government and are not new proposals. However, with technology advancing in this information era and more sophisticated crimes emerging, innovative measures that differ from past efforts are required.

Compared to other OECD countries, Korea has a very high ratio of self-employed to total employed. Because the self-employed also conduct a high share of their transactions in cash, this ratio creates a higher potential for tax evasion, which in turn has an enormous impact on Korea’s shadow economy. Therefore, a measure that reflects the current social reality and incorporates the latest technology urgently needs to be developed. In this paper, we seek ways to prevent tax evasion by the self-employed by using Big Data for tax purposes and present a measure to shrink the shadow economy.


Ⅱ. Current Status of Korea’s Shadow Economy and Related Issues

According to Schneider et al., the more advanced a country is, the smaller its shadow economy is and the more that shadow economy is shrinking. However, Korea’s shadow economy was 24.7% of GDP in 2010 (25.3% on average over 2006 to 2010), which places it at the higher end among OECD countries (Schneider & Buehn, 2012).

Of the several reasons for Korea’s large shadow economy, the most important is the percentage of the country’s self-employed, which, at 28.8%, is notably high compared to that of major advanced economies such as the United States (7.0%), Japan (12.3%), and the United Kingdom (13.9%). As it is difficult to detect their actual incomes, the self-employed are highly likely to evade taxes. Korea’s high ratio of self-employed has also been cited as the most significant reason for the shadow economy in studies other than Schneider’s (Kim, 2013). Therefore, a concrete new measure that differs from past efforts is needed to shrink the shadow economy. To address this need, we present measures to shrink the shadow economy of the self-employed by using Big Data.


Ⅲ. Concept of Big Data and Case Studies: International Examples

1. Definition of Big Data

Big Data has yet to be defined concretely, but many institutions such as McKinsey, IDC, and Gartner define it as data beyond what existing systems can collect, store, manage, and analyze, together with the new technology that is evolving to draw value from such data effectively. The characteristics of Big Data are generally expressed as 3V or 4V: an exorbitant amount of data in terms of physical and conceptual scope (volume); data generated in real time at high speed (velocity); a variety of data such as digital images and pictures, web documents on social networking services (SNSs), and GPS signals of mobile phones (variety); and the new value created through them (value).

In the United States, the country that is most actively utilizing Big Data, the government announced a “Big Data R&D Initiative” in March 2012 and adopted a strategy of utilizing Big Data in six departments, including the Department of Defense and the Department of Energy. Korea’s National Informatization Strategies Council also announced a “Big Data Masterplan” in consultation with relevant ministries in November 2012, laying a foundation for the use of Big Data. According to the report of the global consulting group McKinsey (McKinsey, 2012), the US government is expected to create substantial value by utilizing Big Data (IGT, 2012), and cases in other countries where Big Data is being used for tax purposes offer Korea a new perspective.

<Table 1> Shadow Economy Ratio of Major OECD Nations as a Percent of GDP (Unit: %)

<Table 2> Relative Ratio of Variables Affecting the Shadow Economy of Major OECD Nations (Unit: %)

2. Cases Where Big Data Is Being Used for Tax Purposes

1) Australian Taxation Office (ATO): Small Business Benchmark project

The Australian Taxation Office (ATO) introduced the “Small Business Benchmark” (SBB) project in 2009 to prevent tax evasion through small businesses’ cash transactions and to discover their actual income. This project provides benchmarks, or business ratios, that allow small businesses to compare themselves with other businesses in a similar industry. The project characterizes businesses in more than 100 industries, using various data such as daily business records, cash deposits, and cash withdrawals. Data are shared among different government agencies, banks, realtors, and relevant industry authorities; thus, data can be corrected in the event of omission or wrong entry. Australia’s federal government allocated approximately ten million Australian dollars (about 10.08 billion KRW as of September 2013) to the SBB project, which indicates that the project is regarded as a major tax evasion prevention mechanism (IBM, 2012).

2) New York State’s Tax Bureau: Case Identification and Selection System

The “Case Identification and Selection System” (CISS), the tax evasion prevention system of New York State, analyzes income reports and reimbursement requests to detect potential tax evasion cases. The major strength of this system is that an automated program swiftly identifies and ranks potentially suspicious cases. Whereas New York State previously conducted tax audits based on random samples, since the introduction of CISS it audits only the cases identified by the program and has thus gained efficiency and accuracy. Because of CISS, New York State’s detection of tax-related crimes increased by more than 350% from 2003 to 2010 (ETAAC, 2013).
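The idea of identifying and ranking suspicious cases, rather than auditing random samples, can be illustrated with a minimal sketch. It is not the method used by CISS, whose internals are not described here. As an assumption for illustration, each return is scored by how far its refund-to-income ratio deviates from the median of its peers, measured in units of the median absolute deviation; the field names `id`, `reported_income`, and `refund_claimed` are hypothetical.

```python
from statistics import median


def rank_cases(returns):
    """Rank tax returns from most to least unusual.

    `returns` is a list of dicts with hypothetical keys `id`,
    `reported_income` (assumed positive) and `refund_claimed`.
    The score is a robust z-score of the refund-to-income ratio.
    """
    ratios = {r["id"]: r["refund_claimed"] / r["reported_income"] for r in returns}
    center = median(ratios.values())
    # Median absolute deviation; fall back to a tiny value to avoid division by zero.
    mad = median(abs(v - center) for v in ratios.values()) or 1e-9
    scores = {case_id: abs(v - center) / mad for case_id, v in ratios.items()}
    return sorted(scores, key=scores.get, reverse=True)


sample = [
    {"id": "A", "reported_income": 100, "refund_claimed": 5},
    {"id": "B", "reported_income": 100, "refund_claimed": 6},
    {"id": "C", "reported_income": 100, "refund_claimed": 40},
    {"id": "D", "reported_income": 100, "refund_claimed": 4},
]
ranking = rank_cases(sample)
```

In this toy data set, case C claims a refund far out of line with its peers and is therefore placed at the top of the audit list.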

3) US Internal Revenue Service (IRS): Return Review Program

The “Return Review Program” (RRP), the tax evasion prevention program of the IRS, was implemented in cooperation with SAS, a company specializing in Big Data, and identifies cheating patterns through various data analyses. In particular, its analysis of unstructured information and SNSs using SAS Text Miner technology is worth noting. Text mining technology focuses on analyzing unstructured information such as handwritten letters and call center logs and identifies patterns similar to cheating, thereby sorting out potential crimes. It then analyzes SNSs such as Facebook or Twitter to detect crime ring information on the Internet. It analyzes online communities, websites, and active links to detect, through relevant search words, crime rings such as agents for false income reports. Currently, RRP is applied to all individual taxpayers. Beyond this, the United States is in the process of constructing a more comprehensive system whose analysis will focus not only on individual taxpayers but also on tax agents who assist with tax evasion.


Ⅳ. Measures to Shrink the Shadow Economy of the Self-Employed by Using Big Data

In order to shrink the shadow economy of the self-employed, departing from the fragmented analysis of the past, we would like to present the following three measures based on Big Data. The first involves a shift from analyzing structured and limited information to building a comprehensive database that incorporates even unstructured information. The second involves a shift from observing entire businesses to certain individual businesses. The third calls for a shift from reactive problem solving to proactive prediction and prevention.

<Table 3> Using Big Data to Shrink the Shadow Economy of the Self-Employed

1. Setting Benchmarks for the Self-Employed per Industry and Region

The SBB project of Australia, which sets benchmarks based on large-scale data, has achieved an outstanding record in discovering the incomes of small businesses and preventing tax evasion through cash transactions. However, shortcomings have also been reported: geographical differences were not taken into consideration, and industries needed to be classified further into subcategories. Because the income of the self-employed is harder to discover than that of salaried workers, their tax evasion is also harder to prevent. We therefore believe that, when adopting the Australian SBB project, Korea should reflect geographical differences and set benchmarks for more finely subclassified industries.

Accordingly, we calculated a “ratio to sales” for different cost items such as cost of goods (COG), overhead, lease, and transportation, based on structured information from the National Tax Service (NTS) of Korea and on large-scale structured and unstructured information from non-governmental sources. We then calculated the volume of generated profits and set the standard range; exceeding that range would imply potential tax evasion. Moreover, because certain expense items such as house or office rentals vary with location, we set Korea-specific benchmarks for the self-employed that differentiate between large and small cities based on population size. Examples of major variables in the benchmarks set for confectioneries are given in Tables 4 and 5. The benchmarks of total expenditure shown in these tables are applicable only to the specified industries.
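The ratio-to-sales check described above can be sketched as follows. This is a minimal illustration rather than the authors’ implementation: the COG / Sales ranges are copied from Table 4 (sales in thousand won), while the function names, the treatment of a value exactly on a bracket boundary, and the three-way status label are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    low: float   # lower bound of the expected ratio
    high: float  # upper bound of the expected ratio


# COG / Sales ranges for confectioneries from Table 4, keyed by the
# upper limit of each sales bracket (thousand won).
COG_BENCHMARKS = [
    (100_000, Benchmark(0.31, 0.40)),       # sales up to 100,000
    (500_000, Benchmark(0.35, 0.41)),       # sales from 100,000 to 500,000
    (float("inf"), Benchmark(0.33, 0.40)),  # sales above 500,000
]


def benchmark_for(sales):
    """Return the benchmark of the first bracket whose limit covers `sales`."""
    for upper_limit, benchmark in COG_BENCHMARKS:
        if sales <= upper_limit:  # boundary handling is an assumption
            return benchmark
    raise ValueError("no bracket found")


def check_ratio(cost, sales):
    """Compute cost / sales and compare it with the benchmark range."""
    if sales <= 0:
        raise ValueError("sales must be positive")
    ratio = cost / sales
    benchmark = benchmark_for(sales)
    if ratio > benchmark.high:
        status = "above"   # the case the text treats as potential evasion
    elif ratio < benchmark.low:
        status = "below"
    else:
        status = "within"
    return ratio, status
```

For example, a confectionery reporting sales of 90,000 and cost of goods of 45,000 has a ratio of 0.5, which lies above the 31% to 40% range for its bracket and would therefore be marked for follow-up.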

For Korea, it is necessary to set benchmarks beginning with industries with higher tax evasion rates. Once benchmarks are set, they are updated annually through Big Data analysis reflecting business fluctuations and market situations. Benchmarks are also expanded every year to more industries. The self-employed obtain a negative score when they surpass their respective benchmark ranges, receiving a written notice as a first step when their total score reaches a certain level. The self-employed who receive a written notice are expected to explain why their reported income deviates from the standard benchmarks of their industry. If they fail to submit a satisfactory explanation, they will be subject to tax audit. This motivates the self-employed to voluntarily seek a tax audit by a tax expert; this kind of indirect tax audit reduces the burden on tax authorities.
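The escalation procedure in the preceding paragraph (score, written notice, explanation, audit) can be expressed as a small decision function. This is a sketch under stated assumptions: the paper does not specify how points are assigned or what the notice level is, so the one-point-per-deviation rule and `NOTICE_THRESHOLD = 3` are hypothetical values chosen only for illustration.

```python
NOTICE_THRESHOLD = 3  # hypothetical: the paper only says "a certain level"


def next_step(deviations, explanation_accepted=None):
    """Decide the follow-up action for one self-employed taxpayer.

    `deviations` holds one boolean per benchmark item, True when the
    reported value surpasses its benchmark range. `explanation_accepted`
    is None until the taxpayer has answered a written notice.
    """
    score = sum(1 for outside in deviations if outside)  # one penalty point each
    if score < NOTICE_THRESHOLD:
        return "no action"
    if explanation_accepted is None:
        return "written notice"
    return "closed" if explanation_accepted else "tax audit"
```

A taxpayer with three deviating items would first receive a written notice, and the case would move to a tax audit only if the explanation is judged unsatisfactory.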

<Table 4> Example of 2014 Benchmarks for the Self-Employed per Industry (Basic) – Confectioneries (Unit: 1000 Won)

<Table 5> Example of 2013 Benchmarks for the Self-Employed per Industry (Total Expenditure) – Confectioneries (Unit: 1000 Won)

The benchmarks for the self-employed in Korea, which will be posted on the NTS homepage, can be seen in a positive light in that they help to see the amount of tax payable and motivate the self-employed to voluntarily report their taxable income. If used in tandem with reported business owner trends, such as asset acquisition status including real estate vs. reported income, value added ratio as compared to the same industry, credit card sales ratio, tip-off on tax evasion, and accusation letter regarding credit card use, these benchmarks will be effective in detecting tax-related crimes.

2. Introduction of a System for Preventing Tax Evasion and Undue Reimbursement of the Self-Employed

By researching Big Data, one can discover abnormal signs from the comprehensive data, analyze past behavioral information through prediction modeling, and thus detect actions showing patterns similar to cheating. Accordingly, just as the US IRS detects patterns similar to cheating through text mining or prediction modeling and the New York State Tax Bureau identifies and investigates cases of potential tax evasion by analyzing income reports and reimbursement requests, Korea should also introduce a system to analyze behaviors of the self-employed, assess their patterns, and uncover their intention to evade tax. Furthermore, Korea should not only examine patterns similar to past tax evasion but also detect abnormal signs in advance by analyzing bank accounts, addresses, telephone numbers, and relationships between taxpayers and their family members and acquaintances. Korea also should put a system in place to monitor skillful tax evasion groups and rings and to examine the potential for tax-related crimes through analysis of information on SNSs.
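One way to picture the relationship analysis mentioned above is to link taxpayers who share a bank account, address, or telephone number and then examine the resulting groups. The sketch below uses a union-find structure for this purpose; it is an illustrative assumption, not a description of any existing NTS or IRS system, and the identifier strings are made up.

```python
from collections import defaultdict


def linked_groups(taxpayers):
    """Group taxpayers that are connected through shared identifiers.

    `taxpayers` maps a taxpayer id to a set of identifier strings
    (hypothetical examples: "acct:111", "addr:X", "tel:010-0000-0001").
    Returns the groups that contain more than one taxpayer.
    """
    parent = {t: t for t in taxpayers}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner = {}  # identifier -> first taxpayer seen with it
    for taxpayer, identifiers in taxpayers.items():
        for identifier in identifiers:
            if identifier in first_owner:
                union(first_owner[identifier], taxpayer)
            else:
                first_owner[identifier] = taxpayer

    groups = defaultdict(set)
    for taxpayer in taxpayers:
        groups[find(taxpayer)].add(taxpayer)
    return [members for members in groups.values() if len(members) > 1]


example = {
    "T1": {"acct:111", "tel:010-0000-0001"},
    "T2": {"acct:111"},
    "T3": {"addr:X"},
    "T4": {"addr:X", "tel:010-0000-0001"},
    "T5": {"acct:999"},
}
groups = linked_groups(example)
```

Here T1 through T4 end up in one group even though no single identifier is common to all four, which is the kind of indirect connection a ring of cooperating evaders might show.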

Recently, the NTS has been able to utilize the financial transaction information of the Korea Financial Intelligence Unit (FIU) in tax audits. The FIU information is meaningful because it provides huge datasets for Big Data analysis when tax evasion by the high-income self-employed is investigated. The Ministry of Strategy and Finance also announced that it would utilize FIU data to collect information on mainly cash-based industries, such as large-scale entertainment businesses and high-end housing rental businesses, and on industries prone to tax evasion. It also announced that it would utilize FIU data to strictly verify whether reported expenses are appropriate or falsely claimed. This indicates that FIU data will play a vital role in Big Data analysis in the future.

3. Prevention of New Types of Tax Evasion through Big Data Analysis

With information technology advancing further and smart devices becoming popular, the NTS should also utilize digital technology to discover and prevent tax evasion. If an SNS-based tax-evasion-detection system is constructed, it will be possible to analyze Big Data in real time, discover secretive tax evasion through cash transactions by the high-income self-employed, and conduct research on the ever-evolving methods of tax evasion.

For instance, by using SNSs such as Facebook and Twitter, blogs and cafes of portal sites such as Daum and Naver, and messenger services such as Kakaotalk and Line, clues to tax evasion by the self-employed can be found in real time. As consumers create their daily records digitally and large-scale data accumulate through the expanded use of mobile devices such as smartphones and tablet PCs and the popularization of social media, online real-time analysis of tax evasion by the self-employed will be a much more effective tax audit tool than past methods have been. For example, postings such as “dental implant to be discounted when paid in cash” or “plastic surgery to be discounted if paid in cash,” or online testimonials, can provide clues to self-employed business owners’ potential for tax evasion. In particular, as more consumers consult via Kakaotalk or SNSs, Big Data analysis of unstructured information such as texts or images will prove a powerful tool to shrink the shadow economy of the high-income self-employed.
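A very simple form of the posting analysis described above is to flag texts that mention both cash payment and a discount. The sketch below assumes English-language postings and two hypothetical keyword patterns; a real system for Korean-language sources would need language-specific text processing and far richer rules.

```python
import re

# Hypothetical keyword patterns chosen for illustration only.
CASH = re.compile(r"\bcash\b", re.IGNORECASE)
DISCOUNT = re.compile(r"\b(discount(ed|s)?|cheaper)\b", re.IGNORECASE)


def flag_posts(posts):
    """Return the postings that mention both cash and a discount."""
    return [p for p in posts if CASH.search(p) and DISCOUNT.search(p)]


posts = [
    "Dental implant to be discounted when paid in cash",
    "Plastic surgery to be discounted if paid in cash",
    "We accept cash and cards",
    "Seasonal discount on cakes",
]
flagged = flag_posts(posts)
```

Only the first two postings are flagged, since the other two mention cash or a discount but not both. Such matches are clues for further review, not evidence of evasion in themselves.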

Moreover, as the government focuses on shrinking the shadow economy, the high-income self-employed use various new ways and means to evade tax. In so doing, they will also try tax evasion methods that are widely practiced in other countries. Resorting to the old way of fixing problems after the fact rather than in advance will cause delays in tracing new tax evasion methods, making them difficult to detect. Alternatively, if tax authorities build up Big Data, including data from the Internet, to learn of new tax evasion methods used by the self-employed, including the high-income self-employed, and if they analyze the sharing of tax-evasion-related information among the self-employed in real time, they will be able to discover ever-evolving tax evasion methods more swiftly and employ more flexible tax policies.

In order to legalize the assessment standard for the self-employed by using Big Data and to shrink the shadow economy by preventing tax evasion, tax authorities need to advance efforts to secure more meaningful and accurate information. Data provided by general citizens can also prove important; therefore, it will be effective to give citizens a simple means of reporting, via readily accessible SNSs (such as Twitter or Facebook), business owners who evade taxes through means such as cash discounts or remittances to overseas bank accounts.

[Figure 1] Flow Chart of the Shrinking Shadow Economy of the Self-Employed Based on Big Data


Ⅴ. Conclusions

Successful utilization of Big Data requires resources (i.e., various data sets), technology (i.e., the ability to have and utilize Big Data platforms), and personnel (i.e., Big Data scientists or curators). In this regard: (1) Korea has advanced information technology. (2) It has transformed substantial documented information, including public information, into computerized data. (3) The increasing use of smart devices is enabling the generation and retention of enormous amounts of unstructured information. However, Korea’s Big Data industry is not yet fully developed compared to that of advanced countries; it lacks the required technology and personnel. Nevertheless, Big Data is likely to be used in more and more cases, not only in Korea’s public sector but also in the private sector.

If Korea refers to overseas cases where Big Data has been successfully utilized to secure more tax revenue and uses Big Data to solve problems related to the shadow economy, it could achieve many things that could not otherwise be implemented owing to insufficient personnel and enormous cost. The measures proposed in this paper, namely, setting benchmarks for the self-employed based on Big Data, preventing tax evasion through prediction modeling based on past data, and preventing new types of tax evasion through analysis of SNSs, will be useful in shrinking the shadow economy of the self-employed.

Of course, issues such as use of personal information or being watched by “big brother” associated with Big Data analysis must be addressed. It is therefore necessary to develop a national consensus as to how and to what extent Big Data can be used for the public interest. Furthermore, planning must be accompanied by legal measures wherever necessary.

References

  • Schneider, F., & A. Buehn, (2012), “Shadow economies in highly developed OECD countries: What are the driving forces?”, IZA DP, 6891.
  • Kim, M., (2013), “Solutions to Shrink the Shadow Economy”, Hyundai Research Institute Monthly Economic Review, p527.
  • McKinsey Global Institute, (2012), “Big Data: The next frontier for innovation, competition, and productivity”, June.
  • The Australian Government Inspector-General of Taxation (IGT), (2012), “Review into the ATO’s use of benchmarking to target the cash economy”, July.
  • IBM, (2011), “New York State Tax: How predictive modeling improves tax revenues and citizen equity”, Smarter Planet Leadership Series, IBM Corporation, March.
  • Electronic Tax Administration Advisory Committee (ETAAC), (2013), “Annual Report to Congress”, 3415, June.


<Table 1>

Shadow Economy Ratio of Major OECD Nations as a Percent of GDP (Unit: %)

Country        | 2006 | 2007 | 2008 | 2009 | 2010 | Average
Korea Rep.     | 25.9 | 25.8 | 25.6 | 24.5 | 24.7 | 25.3
Australia      | 13.7 | 13.7 | 13.2 | 13.5 | 13.4 | 13.5
United Kingdom | 12.3 | 12.4 | 12.1 | 12.9 | 12.0 | 12.3
United States  |  8.4 |  8.6 |  8.6 |  9.3 |  9.1 |  8.8
Average        | 19.6 | 19.3 | 19.2 | 18.3 | 18.3 | 18.9

Source: Friedrich Schneider and Andreas Buehn, “Shadow economies in highly developed OECD countries: What are the driving forces?” IZA DP No.6891, 2012.10.

<Table 2>

Relative Ratio of Variables Affecting the Shadow Economy of Major OECD Nations (Unit: %)

Country        | Personal Income Tax | Indirect Taxes | Tax Morale | Unemployment | Self-employment | GDP Growth | Business Freedom
Korea Rep.     |  5.7 | 27.3 |  3.4 |  9.8 | 44.3 | 1.4 |  8.0
Australia      | 21.3 | 25.4 |  7.4 | 15.8 | 19.3 | 0.9 |  9.9
United Kingdom | 18.2 | 30.8 |  8.1 | 14.3 | 18.0 | 0.6 |  9.9
United States  | 27.5 |  5.1 | 13.2 | 22.0 | 16.0 | 0.9 | 15.4
Average        | 13.1 | 29.4 |  9.5 | 16.9 | 22.2 | 0.9 |  8.1

Source: Friedrich Schneider and Andreas Buehn, “Shadow economies in highly developed OECD countries: What are the driving forces?” IZA DP No.6891, 2012.10.

<Table 3>

Using Big Data to Shrink the Shadow Economy of the Self-Employed

As Is | To Be | To Do
Structured, limited information | Structured, unstructured, and comprehensive data | Secure accurate and meaningful data; update/manage data constantly
Observe entire businesses | Observe certain individual businesses | Classify various businesses per industry; customized detection of tax evasion per classified industry
Reactive problem-solving | Proactive prediction & prevention | Design tax evasion prediction model; detect new tax evasion attempts

<Table 4>

Example of 2014 Benchmarks for the Self-Employed per Industry (Basic) – Confectioneries (Unit: 1000 Won)

Sales                                             | ~ 100,000 | 100,000 ~ 500,000 | 500,000 ~
COG / Sales                                       | 31% ~ 40% | 35% ~ 41%         | 33% ~ 40%
Average                                           | 36%       | 38%               | 36%
Total Expenditure / Sales (Big Cities)            | 76% ~ 86% | 84% ~ 91%         | 86% ~ 93%
Total Expenditure / Sales (Small/Medium Cities)   | 72% ~ 82% | 80% ~ 87%         | 82% ~ 89%
Average                                           | 81%       | 88%               | 89%

<Table 5>

Example of 2013 Benchmarks for the Self-Employed per Industry (Total Expenditure) – Confectioneries (Unit: 1000 Won)

Sales                                  | ~ 100,000 | 100,000 ~ 500,000 | 500,000 ~
Overhead / Sales                       | 31% ~ 40% | 35% ~ 41%         | 33% ~ 40%
Average                                | 36%       | 38%               | 36%
Rental / Sales (Big Cities)            | 76% ~ 86% | 84% ~ 91%         | 86% ~ 93%
Rental / Sales (Small/Medium Cities)   | 72% ~ 82% | 80% ~ 87%         | 82% ~ 89%
Transportation / Sales                 | 81%       | 88%               | 89%