A Latent Source Model for Nonparametric Time Series Classification
For classifying time series, a nearest-neighbor approach is widely used in
practice with performance often competitive with or better than more elaborate
methods such as neural networks, decision trees, and support vector machines.
We develop theoretical justification for the effectiveness of
nearest-neighbor-like classification of time series. Our guiding hypothesis is
that in many applications, such as forecasting which topics will become trends
on Twitter, there aren't actually that many prototypical time series to begin
with, relative to the number of time series we have access to, e.g., topics
become trends on Twitter only in a few distinct manners whereas we can collect
massive amounts of Twitter data. To operationalize this hypothesis, we propose
a latent source model for time series, which naturally leads to a "weighted
majority voting" classification rule that can be approximated by a
nearest-neighbor classifier. We establish nonasymptotic performance guarantees
of both weighted majority voting and nearest-neighbor classification under our
model accounting for how much of the time series we observe and the model
complexity. Experimental results on synthetic data show weighted majority
voting achieving the same misclassification rate as nearest-neighbor
classification while observing less of the time series. We then use weighted
majority to forecast which news topics on Twitter become trends, where we are
able to detect such "trending topics" in advance of Twitter 79% of the time,
with a mean early advantage of 1 hour and 26 minutes, a true positive rate of
95%, and a false positive rate of 4%. Comment: Advances in Neural Information Processing Systems (NIPS 2013)
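The contrast between weighted majority voting and nearest-neighbor classification described in this abstract can be illustrated with a minimal sketch. The abstract does not state the weighting function, so the code below assumes each labeled example votes with weight exp(-gamma * squared Euclidean distance) and ignores time shifts; the function names and the `gamma` parameter are hypothetical and are not taken from the paper.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_majority_vote(observed, examples, gamma=1.0):
    """Classify `observed` (a prefix of a time series) as +1 or -1.

    `examples` is a list of (series, label) pairs with label in {+1, -1}.
    Assumption: every example votes for its own label with weight
    exp(-gamma * distance), comparing only the first len(observed) points.
    """
    score = {+1: 0.0, -1: 0.0}
    for series, label in examples:
        prefix = series[:len(observed)]
        score[label] += math.exp(-gamma * sq_dist(observed, prefix))
    return +1 if score[+1] >= score[-1] else -1


def nearest_neighbor(observed, examples):
    """Return the label of the single closest example (same prefix rule)."""
    _, label = min(
        examples,
        key=lambda e: sq_dist(observed, e[0][:len(observed)]),
    )
    return label


# Toy data: one negative example close to the query, two positives farther away.
toy = [([0.0], -1), ([1.0], +1), ([1.0], +1)]
query = [0.1]
print(weighted_majority_vote(query, toy, gamma=0.0))    # every vote counts equally
print(weighted_majority_vote(query, toy, gamma=100.0))  # closest example dominates
print(nearest_neighbor(query, toy))
```

With `gamma=0.0` the rule reduces to an unweighted majority vote, while a large `gamma` lets the closest example dominate, which is one way to see why a nearest-neighbor classifier can approximate the weighted vote.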
Novel nonparametric method for classifying time series
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (pages 67-68). In supervised classification, one attempts to learn a model of how objects map to labels by selecting the best model from some model space. The choice of model space encodes assumptions about the problem. We propose a setting for model specification and selection in supervised learning based on a latent source model. In this setting, we specify the model by a small collection of unknown latent sources and posit that there is a stochastic model relating latent sources and observations. With this setting in mind, we propose a nonparametric classification method that is entirely unaware of the structure of these latent sources. Instead, our method relies on the data as a proxy for the unknown latent sources. We perform classification by computing the conditional class probabilities for an observation based on our stochastic model. This approach has an appealing and natural interpretation - that an observation belongs to a certain class if it sufficiently resembles other examples of that class. We extend this approach to the problem of online time series classification. In the binary case, we derive an estimator for online signal detection and an associated implementation that is simple, efficient, and scalable. We demonstrate the merit of our approach by applying it to the task of detecting trending topics on Twitter. Using a small sample of Tweets, our method can detect trends before Twitter does 79% of the time, with a mean early advantage of 1.43 hours, while maintaining a 95% true positive rate and a 4% false positive rate. In addition, our method provides the flexibility to perform well under a variety of tradeoffs between types of error and relative detection time. By Stanislav Nikolov. M. Eng.
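The online, binary detection setting described in this abstract can be sketched roughly as follows. The abstract does not give the thesis's estimator, so everything here is an assumption made for illustration: a sliding window over the stream, class evidence computed as a sum of exp(-gamma * squared distance) over the best-matching segment of each reference series, and a detection declared when the positive evidence exceeds `theta` times the negative evidence. The names `class_evidence`, `detect_online`, `window_len`, `gamma`, and `theta` are hypothetical.

```python
import math


def class_evidence(window, references, gamma):
    """Sum over reference series of exp(-gamma * best squared distance).

    Assumption: each reference is at least as long as `window`, and the
    window is compared against every same-length segment of the reference.
    """
    total = 0.0
    n = len(window)
    for ref in references:
        best = min(
            sum((w - r) ** 2 for w, r in zip(window, ref[i:i + n]))
            for i in range(len(ref) - n + 1)
        )
        total += math.exp(-gamma * best)
    return total


def detect_online(stream, positives, negatives,
                  window_len=3, gamma=1.0, theta=1.0):
    """Return the index of the first point at which a detection fires.

    A detection fires when positive evidence > theta * negative evidence
    for the most recent `window_len` points. Returns None if it never fires.
    """
    for t in range(window_len, len(stream) + 1):
        window = stream[t - window_len:t]
        pos = class_evidence(window, positives, gamma)
        neg = class_evidence(window, negatives, gamma)
        if pos > theta * neg:
            return t - 1
    return None


# Toy reference data (invented for illustration, not from the thesis).
positives = [[0, 1, 2, 4, 8]]                    # a series that "takes off"
negatives = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]   # flat series
stream = [0, 0, 0, 1, 2, 4]

print(detect_online(stream, positives, negatives, theta=1.0))
print(detect_online(stream, positives, negatives, theta=10.0))
```

In this toy setup, raising `theta` makes the detector more conservative and delays the detection by one step, which mirrors the tradeoff between error types and relative detection time mentioned in the abstract.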
Probabilistic Model of Onset Detection Explains Paradoxes in Human Time Perception
A very basic computational model is proposed to explain two puzzling findings in the time perception literature. First, spontaneous motor actions are preceded by up to 1–2 s of preparatory activity (Kornhuber and Deecke, 1965). Yet, subjects are only consciously aware of about a quarter of a second of motor preparation (Libet et al., 1983). Why are they not aware of the early part of preparation? Second, psychophysical findings (Spence et al., 2001) support the principle of attention prior entry (Titchener, 1908), which states that attended stimuli are perceived faster than unattended stimuli. However, electrophysiological studies reported little or no corresponding temporal difference between the neural signals for attended and unattended stimuli (McDonald et al., 2005; Vibell et al., 2007). We suggest that the key to understanding these puzzling findings is to think of onset detection in probabilistic terms. The two apparently paradoxical phenomena are naturally predicted by our signal detection theoretic model.
An abnormally enlarged frontal sinus - a case of pneumosinus dilatans
During routine autopsy of a 62-year-old female cadaver, an unusually enlarged frontal sinus was observed. The sinus was abnormally overdeveloped in both width and height, with the sinus cavity extending deep into the frontal tubera. Numerous septa divided the sinus cavity. Given the obvious dilation of the frontal sinus and the absence of localized bone destruction or hyperostosis, this case probably represents a rare condition called "pneumosinus dilatans".
Clinically applicable deep learning for diagnosis and referral in retinal disease
The volume and complexity of diagnostic imaging are increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting.
The genetic history of the Southern Arc: a bridge between West Asia and Europe
By sequencing 727 ancient individuals from the Southern Arc (Anatolia and its neighbors in Southeastern Europe and West Asia) over 10,000 years, we contextualize its Chalcolithic period and Bronze Age (about 5000 to 1000 BCE), when extensive gene flow entangled it with the Eurasian steppe. Two streams of migration transmitted Caucasus and Anatolian/Levantine ancestry northward, and the Yamnaya pastoralists, formed on the steppe, then spread southward into the Balkans and across the Caucasus into Armenia, where they left numerous patrilineal descendants. Anatolia was transformed by intra–West Asian gene flow, with negligible impact of the later Yamnaya migrations. This contrasts with all other regions where Indo-European languages were spoken, suggesting that the homeland of the Indo-Anatolian language family was in West Asia, with only secondary dispersals of non-Anatolian Indo-Europeans from the steppe.