Design Principles for Robust Fraud Detection: The Case of Stock Market Manipulations
We address the challenge of building an automated fraud detection system with robust classifiers that mitigate countermeasures from fraudsters in the field of information-based securities fraud. Our work involves developing design principles for robust fraud detection systems and presenting corresponding design features. We adopt an instrumentalist perspective that relies on theory-based linguistic features and ensemble learning concepts as justificatory knowledge for building robust classifiers. We perform a naive evaluation that assesses the classifiers’ performance in identifying suspicious stock recommendations, and a robustness evaluation with a simulation that demonstrates a response to fraudster countermeasures. The results indicate that the use of theory-based linguistic features and ensemble learning can significantly increase the robustness of classifiers and contribute to the effectiveness of robust fraud detection. We discuss implications for supervisory authorities, industry, and individual users.
Tax Gap: IRS Can Improve Efforts to Address Tax Evasion by Networks of Businesses and Related Entities
A letter report issued by the Government Accountability Office with an abstract that begins "A taxpayer can control a group of related entities--such as trusts, corporations, or partnerships--in a network. These networks can serve a variety of legitimate business purposes, but they also can be used in complex tax evasion schemes that are difficult for the Internal Revenue Service (IRS) to identify. GAO was asked to (1) describe what IRS knows about network tax evasion and how well IRS's traditional enforcement programs address it and (2) assess IRS's progress in addressing network tax evasion and opportunities, if any, for making further progress. To do this, GAO reviewed relevant documentation about IRS programs and interviewed appropriate officials about those programs and IRS's plans for addressing such tax evasion. GAO also interviewed relevant experts and agency officials in developing criteria needed to perform the assessment."
Relational Data Pre-Processing Techniques for Improved Securities Fraud Detection
Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events, and their interactions over time. Such datasets are rarely structured appropriately for knowledge discovery, and they often contain variables whose meanings change across different subsets of the data. We describe how these challenges were addressed in a collaborative analysis project undertaken by the University of Massachusetts Amherst and the National Association of Securities Dealers (NASD). We describe several methods for data pre-processing that we applied to transform a large, dynamic, and relational dataset describing nearly the entirety of the U.S. securities industry, and we show how these methods made the dataset suitable for learning statistical relational models. To better utilize social structure, we first applied known consolidation and link formation techniques to associate individuals with branch office locations. In addition, we developed an innovative technique to infer professional associations by exploiting dynamic employment histories. Finally, we applied normalization techniques to create a suitable class label that adjusts for spatial, temporal, and other heterogeneity within the data. We show how these pre-processing techniques combine to provide the necessary foundation for learning high-performing statistical models of fraudulent activity.
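The abstract above does not spell out how professional associations are inferred from employment histories. As a rough illustration only, one plausible reading is to link two individuals when their tenures at the same branch overlap in time. The sketch below assumes that reading; the record layout, the identifiers, the `infer_associations` helper, and the `min_days` threshold are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations

# Hypothetical employment records: (person_id, branch_id, start_date, end_date).
# These values are invented for illustration and are not from the NASD data.
EMPLOYMENT = [
    ("rep_a", "branch_1", date(2001, 1, 1), date(2003, 6, 30)),
    ("rep_b", "branch_1", date(2002, 3, 1), date(2004, 1, 15)),
    ("rep_c", "branch_1", date(2004, 2, 1), date(2005, 12, 31)),
    ("rep_a", "branch_2", date(2003, 7, 1), date(2006, 1, 1)),
    ("rep_c", "branch_2", date(2005, 6, 1), date(2007, 1, 1)),
]


def overlap_days(start_1, end_1, start_2, end_2):
    """Number of days two inclusive date ranges share (0 if they are disjoint)."""
    latest_start = max(start_1, start_2)
    earliest_end = min(end_1, end_2)
    return max(0, (earliest_end - latest_start).days + 1)


def infer_associations(records, min_days=1):
    """Link pairs of people whose stints at the same branch overlap in time.

    Returns a dict mapping a sorted (person, person) pair to the total number
    of overlapping days, summed over all branches. This is an assumed
    heuristic, not the technique described in the paper.
    """
    stints_by_branch = defaultdict(list)
    for person, branch, start, end in records:
        stints_by_branch[branch].append((person, start, end))

    links = defaultdict(int)
    for stints in stints_by_branch.values():
        for (p1, s1, e1), (p2, s2, e2) in combinations(stints, 2):
            if p1 == p2:
                continue  # skip repeated stints by the same person
            days = overlap_days(s1, e1, s2, e2)
            if days >= min_days:
                links[tuple(sorted((p1, p2)))] += days
    return dict(links)


links = infer_associations(EMPLOYMENT)
```

With the invented records, `rep_a` is linked to `rep_b` through `branch_1` and to `rep_c` through `branch_2`, while `rep_b` and `rep_c` are not linked because their tenures at `branch_1` never coincide. Weighting each link by overlap length is one design choice among many; a relational model could just as well use a binary link or a time-decayed weight.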