49 research outputs found

    A Latent Source Model for Nonparametric Time Series Classification

    Full text link
    For classifying time series, a nearest-neighbor approach is widely used in practice with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to, e.g., topics become trends on Twitter only in a few distinct manners whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a "weighted majority voting" classification rule that can be approximated by a nearest-neighbor classifier. We establish nonasymptotic performance guarantees of both weighted majority voting and nearest-neighbor classification under our model accounting for how much of the time series we observe and the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearest-neighbor classification while observing less of the time series. We then use weighted majority to forecast which news topics on Twitter become trends, where we are able to detect such "trending topics" in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%.Comment: Advances in Neural Information Processing Systems (NIPS 2013

    Novel nonparametric method for classifying time series

    Get PDF
    Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (pages 67-68).In supervised classification, one attempts to learn a model of how objects map to labels by selecting the best model from some model space. The choice of model space encodes assumptions about the problem. We propose a setting for model specification and selection in supervised learning based on a latent source model. In this setting, we specify the model by a small collection of unknown latent sources and posit that there is a stochastic model relating latent sources and observations. With this setting in mind, we propose a nonparametric classification method that is entirely unaware of the structure of these latent sources. Instead, our method relies on the data as a proxy for the unknown latent sources. We perform classification by computing the conditional class probabilities for an observation based on our stochastic model. This approach has an appealing and natural interpretation - that an observation belongs to a certain class if it sufficiently resembles other examples of that class. We extend this approach to the problem of online time series classification. In the binary case, we derive an estimator for online signal detection and an associated implementation that is simple, efficient, and scalable. We demonstrate the merit of our approach by applying it to the task of detecting trending topics on Twitter. Using a small sample of Tweets, our method can detect trends before Twitter does 79% of the time, with a mean early advantage of 1.43 hours, while maintaining a 95% true positive rate and a 4% false positive rate. In addition, our method provides the flexibility to perform well under a variety of tradeoffs between types of error and relative detection time.by Stanislav Nikolov.M. Eng

    Probabilistic Model of Onset Detection Explains Paradoxes in Human Time Perception

    Get PDF
    A very basic computational model is proposed to explain two puzzling findings in the time perception literature. First, spontaneous motor actions are preceded by up to 1–2 s of preparatory activity (Kornhuber and Deecke, 1965). Yet, subjects are only consciously aware of about a quarter of a second of motor preparation (Libet et al., 1983). Why are they not aware of the early part of preparation? Second, psychophysical findings (Spence et al., 2001) support the principle of attention prior entry (Titchener, 1908), which states that attended stimuli are perceived faster than unattended stimuli. However, electrophysiological studies reported no or little corresponding temporal difference between the neural signals for attended and unattended stimuli (McDonald et al., 2005; Vibell et al., 2007). We suggest that the key to understanding these puzzling findings is to think of onset detection in probabilistic terms. The two apparently paradoxical phenomena are naturally predicted by our signal detection theoretic model

    An abnormally enlarged frontal sinus - a case of pneumosinus dilatans

    Get PDF
    During routine autopsy of a 62-y-old female cadaver, an unusually enlarged frontal sinus was observed. The sinus was abnormally over-developed in both width and height, as the sinus cavity spreads deeply into the frontal tubera. Numerous septa divided the sinus cavity. Because of the obvious dilation of the frontal sinus and the lack of localized bone destruction and hyperostosis, a rare condition called `pneumosinus dilatans` probably occurs in this interesting case

    Clinically applicable deep learning for diagnosis and referral in retinal disease

    Get PDF
    The volume and complexity of diagnostic imaging is increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting
    corecore