82 research outputs found
A Latent Source Model for Nonparametric Time Series Classification
For classifying time series, a nearest-neighbor approach is widely used in
practice with performance often competitive with or better than more elaborate
methods such as neural networks, decision trees, and support vector machines.
We develop theoretical justification for the effectiveness of
nearest-neighbor-like classification of time series. Our guiding hypothesis is
that in many applications, such as forecasting which topics will become trends
on Twitter, there aren't actually that many prototypical time series to begin
with, relative to the number of time series we have access to, e.g., topics
become trends on Twitter only in a few distinct manners whereas we can collect
massive amounts of Twitter data. To operationalize this hypothesis, we propose
a latent source model for time series, which naturally leads to a "weighted
majority voting" classification rule that can be approximated by a
nearest-neighbor classifier. We establish nonasymptotic performance guarantees
of both weighted majority voting and nearest-neighbor classification under our
model accounting for how much of the time series we observe and the model
complexity. Experimental results on synthetic data show weighted majority
voting achieving the same misclassification rate as nearest-neighbor
classification while observing less of the time series. We then use weighted
majority voting to forecast which news topics on Twitter become trends, where we are
able to detect such "trending topics" in advance of Twitter 79% of the time,
with a mean early advantage of 1 hour and 26 minutes, a true positive rate of
95%, and a false positive rate of 4%. Comment: Advances in Neural Information Processing Systems (NIPS 2013)
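The relationship between the two classification rules named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exponential weight `exp(-gamma * distance)`, the parameter name `gamma`, the squared Euclidean distance, and the assumption that all training series have equal length and are compared without time shifts are choices made here for illustration only. Under these assumptions, every training series votes for its own label with a weight that decays with its distance to the observed prefix, and nearest-neighbor classification keeps only the single closest series.

```python
import numpy as np


def _squared_distances(observed, train_series):
    """Squared Euclidean distance from the observed prefix to each training series."""
    observed = np.asarray(observed, dtype=float)
    # Compare only the portion of each training series that has been observed so far.
    train = np.asarray(train_series, dtype=float)[:, : observed.shape[0]]
    return np.sum((train - observed) ** 2, axis=1)


def weighted_majority_vote(observed, train_series, train_labels, gamma=1.0):
    """Each training series votes for its label with weight exp(-gamma * distance).

    The weighting function and `gamma` are illustrative assumptions.
    """
    labels = np.asarray(train_labels)
    dists = _squared_distances(observed, train_series)
    # Subtracting the minimum distance rescales all weights by the same factor,
    # which avoids underflow and does not change which class wins.
    weights = np.exp(-gamma * (dists - dists.min()))
    classes = np.unique(labels)
    votes = np.array([weights[labels == c].sum() for c in classes])
    return classes[int(np.argmax(votes))]


def nearest_neighbor(observed, train_series, train_labels):
    """Return the label of the single closest training series."""
    labels = np.asarray(train_labels)
    dists = _squared_distances(observed, train_series)
    return labels[int(np.argmin(dists))]
```

In this sketch, a large `gamma` concentrates almost all of the weight on the closest training series, so the weighted vote then returns the same label as `nearest_neighbor` whenever the closest series is unique.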
Novel nonparametric method for classifying time series
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (pages 67-68). In supervised classification, one attempts to learn a model of how objects map to labels by selecting the best model from some model space. The choice of model space encodes assumptions about the problem. We propose a setting for model specification and selection in supervised learning based on a latent source model. In this setting, we specify the model by a small collection of unknown latent sources and posit that there is a stochastic model relating latent sources and observations. With this setting in mind, we propose a nonparametric classification method that is entirely unaware of the structure of these latent sources. Instead, our method relies on the data as a proxy for the unknown latent sources. We perform classification by computing the conditional class probabilities for an observation based on our stochastic model. This approach has an appealing and natural interpretation: an observation belongs to a certain class if it sufficiently resembles other examples of that class. We extend this approach to the problem of online time series classification. In the binary case, we derive an estimator for online signal detection and an associated implementation that is simple, efficient, and scalable. We demonstrate the merit of our approach by applying it to the task of detecting trending topics on Twitter. Using a small sample of Tweets, our method can detect trends before Twitter does 79% of the time, with a mean early advantage of 1.43 hours, while maintaining a 95% true positive rate and a 4% false positive rate. In addition, our method provides the flexibility to perform well under a variety of tradeoffs between types of error and relative detection time. By Stanislav Nikolov. M. Eng.
Probabilistic Model of Onset Detection Explains Paradoxes in Human Time Perception
A very basic computational model is proposed to explain two puzzling findings in the time perception literature. First, spontaneous motor actions are preceded by up to 1–2 s of preparatory activity (Kornhuber and Deecke, 1965). Yet, subjects are only consciously aware of about a quarter of a second of motor preparation (Libet et al., 1983). Why are they not aware of the early part of preparation? Second, psychophysical findings (Spence et al., 2001) support the principle of attention prior entry (Titchener, 1908), which states that attended stimuli are perceived faster than unattended stimuli. However, electrophysiological studies reported no or little corresponding temporal difference between the neural signals for attended and unattended stimuli (McDonald et al., 2005; Vibell et al., 2007). We suggest that the key to understanding these puzzling findings is to think of onset detection in probabilistic terms. The two apparently paradoxical phenomena are naturally predicted by our signal detection theoretic model
An abnormally enlarged frontal sinus - a case of pneumosinus dilatans
During routine autopsy of a 62-year-old female cadaver, an unusually enlarged frontal sinus was observed. The sinus was abnormally over-developed in both width and height, with the sinus cavity spreading deeply into the frontal tubera. Numerous septa divided the sinus cavity. Given the obvious dilation of the frontal sinus and the lack of localized bone destruction and hyperostosis, this case probably represents a rare condition called pneumosinus dilatans
An automated system for continuous monitoring of CO2 geosequestration using multi-well offset VSP with permanent seismic sources and receivers: Stage 3 of the CO2CRC Otway Project
A permanent automated continuous seismic monitoring system for CO2 geosequestration was installed at the CO2CRC Otway Project site (Victoria, Australia) in early 2020. The system is composed of five deviated ∼1600 m deep wells equipped with distributed acoustic sensing (DAS) acting as seismic receivers and nine seismic orbital vibrators (SOVs) as seismic sources. DAS recording is performed continuously by three iDASv3 units. Each SOV operates for 2.5 h at a time, and hence all SOVs operating sequentially (during daytime only) produce a single vintage every two days. Each vintage consists of 45 offset VSP transects covering predicted CO2 plume migration paths over an area of ∼0.7 km2. Automated data processing implemented on-site reduces data size from ∼1.3 TB/day to ∼500 MB/day, with the results transmitted to the office daily. The repeatability analysis based on pre-injection data (acquired from May to October 2020, before the injection start in December 2020) shows that variability of SOV performance is the main source of non-repeatability while borehole measurements are stable. The NRMS difference of an SOV waveform could reach 20 to 100 % within a few days. However, deconvolution of the seismograms with the waveform of the direct wave reduces the non-repeatability to within 10–15 % NRMS
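The NRMS figures quoted in this abstract can be read against the commonly used definition of normalized RMS difference between two repeated traces, sketched below. The abstract does not state which definition or time windows were used, so the formula `200 * RMS(a - b) / (RMS(a) + RMS(b))` and the function name `nrms_percent` are assumptions made for illustration.

```python
import numpy as np


def nrms_percent(trace_a, trace_b):
    """Normalized RMS difference between two traces, in percent.

    Assumed definition: 200 * RMS(a - b) / (RMS(a) + RMS(b)).
    Identical traces give 0; traces of opposite polarity give 200.
    """
    a = np.asarray(trace_a, dtype=float)
    b = np.asarray(trace_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("traces must have the same shape")

    def rms(x):
        return np.sqrt(np.mean(x ** 2))

    denom = rms(a) + rms(b)
    if denom == 0.0:
        # Both traces are all zeros, so there is no difference to report.
        return 0.0
    return 200.0 * rms(a - b) / denom
```

Under this definition, lower values mean better repeatability, which is why a drop from the 20 to 100 % range to 10–15 % is reported as an improvement.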
Association of BMI, lipid-lowering medication, and age with prevalence of type 2 diabetes in adults with heterozygous familial hypercholesterolaemia: a worldwide cross-sectional study
Background: Statins are the cornerstone treatment for patients with heterozygous familial hypercholesterolaemia but research suggests they could increase the risk of type 2 diabetes in the general population. A low prevalence of type 2 diabetes was reported in some familial hypercholesterolaemia cohorts, raising the question of whether these patients are protected against type 2 diabetes. Obesity is a well-known risk factor for the development of type 2 diabetes. We aimed to investigate the associations of known key determinants of type 2 diabetes with its prevalence in people with heterozygous familial hypercholesterolaemia. Methods: This worldwide cross-sectional study used individual-level data from the EAS FHSC registry and included adults older than 18 years with a clinical or genetic diagnosis of heterozygous familial hypercholesterolaemia who had data available on age, BMI, and diabetes status. Those with known or suspected homozygous familial hypercholesterolaemia and type 1 diabetes were excluded. The main outcome was prevalence of type 2 diabetes overall and by WHO region, and in relation to obesity (BMI ≥30·0 kg/m2) and lipid-lowering medication as predictors. The study population was divided into 12 risk categories based on age (tertiles), obesity, and receiving statins, and the risk of type 2 diabetes was investigated using logistic regression. Findings: Among 46 683 adults with individual-level data in the FHSC registry, 24 784 with heterozygous familial hypercholesterolaemia were included in the analysis from 44 countries. 19 818 (80%) had a genetically confirmed diagnosis of heterozygous familial hypercholesterolaemia. Type 2 diabetes prevalence in the total population was 5·7% (1415 of 24 784), with 4·1% (817 of 19 818) in the genetically diagnosed cohort.
Higher prevalence of type 2 diabetes was observed in the Eastern Mediterranean (58 [29·9%] of 194), South-East Asia and Western Pacific (214 [12·0%] of 1785), and the Americas (166 [8·5%] of 1955) than in Europe (excluding the Netherlands; 527 [8·0%] of 6579). Advancing age, a higher BMI category (obesity and overweight), and use of lipid-lowering medication were associated with a higher risk of type 2 diabetes, independent of sex and LDL cholesterol. Among the 12 risk categories, the probability of developing type 2 diabetes was higher in people in the highest risk category (aged 55–98 years, with obesity, and receiving statins; OR 74·42 [95% CI 47·04–117·73]) than in those in the lowest risk category (aged 18–38 years, without obesity, and not receiving statins). Those who did not have obesity, even if they were in the upper age tertile and receiving statins, had lower risk of type 2 diabetes (OR 24·42 [15·57–38·31]). The corresponding results in the genetically diagnosed cohort were OR 65·04 (40·67–104·02) for those with obesity in the highest risk category and OR 20·07 (12·73–31·65) for those without obesity. Interpretation: Adults with heterozygous familial hypercholesterolaemia in most WHO regions have a higher type 2 diabetes prevalence than in Europe. Obesity markedly increases the risk of diabetes associated with age and use of statins in these patients. Our results suggest that heterozygous familial hypercholesterolaemia does not protect against type 2 diabetes, hence managing obesity is essential to reduce type 2 diabetes in this patient population. Funding: Pfizer, Amgen, MSD, Sanofi-Aventis, Daiichi-Sankyo, and Regeneron
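The prevalence percentages reported in this abstract follow directly from the case counts and group sizes it gives. The short sketch below recomputes them as simple arithmetic. The counts are copied from the abstract, while the function name `prevalence_percent` and the dictionary labels are introduced here only for illustration. It does not reproduce the logistic regression behind the odds ratios.

```python
def prevalence_percent(cases, total):
    """Prevalence as a percentage: 100 * cases / total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * cases / total


# (cases, group size) pairs as reported in the abstract.
reported_counts = {
    "total population": (1415, 24784),
    "genetically diagnosed cohort": (817, 19818),
    "Eastern Mediterranean": (58, 194),
    "South-East Asia and Western Pacific": (214, 1785),
    "Americas": (166, 1955),
    "Europe (excluding the Netherlands)": (527, 6579),
}

for group, (cases, total) in reported_counts.items():
    print(f"{group}: {prevalence_percent(cases, total):.1f}%")
```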
Clinically applicable deep learning for diagnosis and referral in retinal disease
The volume and complexity of diagnostic imaging is increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting