Coupled IGMM-GANs for improved generative adversarial anomaly detection
Detecting anomalies and outliers in data has a number of applications, including hazard sensing, fraud detection, and systems management. While generative adversarial networks (GANs) seem like a natural fit for addressing these challenges, we find that existing GAN-based anomaly detection algorithms perform poorly due to their inability to handle multimodal patterns. To address this, we introduce an infinite Gaussian mixture model coupled with (bidirectional) generative adversarial networks, IGMM-GAN, that facilitates multimodal anomaly detection. We illustrate our methodology and its improvement over existing GAN anomaly detection on the MNIST dataset.
Data-Driven Approaches to NBA Team Evaluation and Building
Gemstone Team PROCESS
In the National Basketball Association (NBA), it has historically been difficult to
build and sustain a team that can consistently compete for championships. Given
this challenge, we have developed a series of analyses to support NBA teams in
making data-driven decisions. Relying on a variety of datasets, we examined
several facets related to the construction of NBA rosters and their performance. In
our analysis of on-court performance, we have used clustering algorithms to
classify teams in terms of play style, and determined which play styles tend to
lead to success. In our analysis of roster construction and transactions, we have
investigated the relative value of draft picks and the impact of trades involving
draft picks, as well as the effect of roster continuity (i.e., maintaining the same
players across seasons) on team success. Additionally, we have developed a
model for predicting player contract values and performance versus contract
value, which will help teams in identifying the most cost-effective players to
acquire. Taken together, these analyses can inform any NBA team’s decisions
in its pursuit of success.
Coupled IGMM-GANs with Applications to Anomaly Detection in Human Mobility Data
Detecting anomalous activity in human mobility data has a number of applications, including road hazard sensing, telematics-based insurance, and fraud detection in taxi services and ride sharing. In this article, we address two challenges that arise in the study of anomalous human trajectories: (1) a lack of ground truth data on what defines an anomaly and (2) the dependence of existing methods on significant pre-processing and feature engineering. Although generative adversarial networks (GANs) seem like a natural fit for addressing these challenges, we find that existing GAN-based anomaly detection algorithms perform poorly due to their inability to handle multimodal patterns. For this purpose, we introduce an infinite Gaussian mixture model coupled with (bidirectional) GANs, called IGMM-GAN, that can generate synthetic, yet realistic, human mobility data and simultaneously facilitates multimodal anomaly detection. Through the estimation of a generative probability density on the space of human trajectories, we are able to generate realistic synthetic datasets that can be used to benchmark existing anomaly detection methods. The estimated multimodal density also allows for a natural definition of outlier that we use for detecting anomalous trajectories. We illustrate our methodology and its improvement over existing GAN anomaly detection on several human mobility datasets, along with MNIST.
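To make the point about multimodal patterns concrete, the following is a minimal sketch and not the IGMM-GAN itself. It assumes a one-dimensional density with hand-picked, hypothetical mixture parameters, and it assumes negative log-density as the outlier score; the abstract does not state the exact scoring rule. It compares a two-mode mixture with a single Gaussian of the same overall mean and variance, and shows that a point lying between the modes looks ordinary to the single Gaussian but unusual to the mixture.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_anomaly_score(x, components):
    """Negative log-density of x under a Gaussian mixture.

    components is a list of (weight, mean, std) tuples whose weights sum to 1.
    Higher scores mean x is less likely under the density (more anomalous).
    """
    density = sum(w * gaussian_pdf(x, m, s) for w, m, s in components)
    return -math.log(density)


# Hypothetical two-mode density; the numbers are illustrative only.
two_modes = [(0.5, -4.0, 1.0), (0.5, 4.0, 1.0)]

# Single Gaussian matching the mixture's mean (0) and variance (1 + 4**2 = 17).
one_mode = [(1.0, 0.0, math.sqrt(17.0))]

for x in (-4.0, 0.0, 4.0):
    print(
        f"x={x:+.1f}  mixture score={mixture_anomaly_score(x, two_modes):.2f}  "
        f"single-Gaussian score={mixture_anomaly_score(x, one_mode):.2f}"
    )
```

Under the single Gaussian, the point at 0 scores as the most typical of the three; under the mixture, it scores as the least typical. That reversal is the failure mode a unimodal model has on multimodal data.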
A Muffin-Theorem Generator
Consider the following FUN problem. Given m and s, you want to divide m muffins among s students so that everyone gets m/s muffins; however, you want to maximize the minimum piece so that nobody gets crumbs. Let f(m,s) be the size of the smallest piece in an optimal procedure.
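As an illustration of what a procedure is, here is a small sketch that checks one division of 5 muffins among 3 students using exact fractions. The function name `check_procedure` and the data layout are hypothetical choices made for this example. A valid procedure whose smallest piece is 5/12 shows that f(5,3) is at least 5/12; the code does not attempt to prove that this value is optimal.

```python
from fractions import Fraction as F


def check_procedure(m, s, students):
    """Validate a muffin division and return its smallest piece.

    students is a list with one entry per student; each entry is a list of
    (muffin_index, piece_size) pairs. Raises ValueError if a student does
    not receive exactly m/s or if a muffin's pieces do not sum to 1.
    """
    if len(students) != s:
        raise ValueError("wrong number of students")
    share = F(m, s)
    muffin_totals = [F(0)] * m
    for pieces in students:
        if sum(size for _, size in pieces) != share:
            raise ValueError("a student does not receive exactly m/s")
        for muffin, size in pieces:
            muffin_totals[muffin] += size
    if any(total != 1 for total in muffin_totals):
        raise ValueError("a muffin's pieces do not sum to 1")
    return min(size for pieces in students for _, size in pieces)


# Muffins 0-3 are each cut into 5/12 and 7/12; muffin 4 is cut in half.
procedure = [
    [(4, F(1, 2)), (0, F(7, 12)), (1, F(7, 12))],
    [(4, F(1, 2)), (2, F(7, 12)), (3, F(7, 12))],
    [(0, F(5, 12)), (1, F(5, 12)), (2, F(5, 12)), (3, F(5, 12))],
]

smallest = check_procedure(5, 3, procedure)
print(smallest)
```

Using `Fraction` rather than floats keeps the equality checks exact, which matters because the validity conditions are equalities.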
We study the case where ceil(2m/s)=3 because (1) many of our hardest open problems were of this form until we found this method, (2) we have used the technique to generate muffin-theorems, and (3) we conjecture this can be used to solve the general case. We give (1) an algorithm to find an upper bound for f(m,s) when ceil(2m/s)=3 (and some ways to speed up that algorithm if certain conjectures are true), (2) an algorithm that uses the information from (1) to try to find a lower bound on f(m,s) (a procedure) which matches the upper bound, (3) an algorithm that uses the information from (1) to generate muffin-theorems, and (4) an algorithm that we think works well in practice to find f(m,s) for any m,s.
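The condition ceil(2m/s)=3 is equivalent to 2 < 2m/s <= 3, that is, s < m <= 3s/2. The sketch below lists the (m,s) pairs in that case for small s using integer arithmetic only. The helper names and the range of s are hypothetical choices for this example, and the algorithms described in the abstract are not reproduced here.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers, without floating point."""
    return -(-a // b)


def in_studied_case(m, s):
    """True when ceil(2m/s) equals 3."""
    return ceil_div(2 * m, s) == 3


# Enumerate pairs with more muffins than students for a small range of s.
pairs = [
    (m, s)
    for s in range(2, 8)
    for m in range(s + 1, 2 * s + 1)
    if in_studied_case(m, s)
]

for m, s in pairs:
    print(f"m={m}, s={s}, 2m/s={2 * m}/{s}")
```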