Tolerance analysis approach based on the classification of uncertainty (aleatory / epistemic)
Uncertainty is ubiquitous in tolerance analysis problems. This paper deals with the formulation of tolerance analysis and, more particularly, with the uncertainty that must be taken into account in the foundation of this formulation. It presents: (i) a brief overview of uncertainty classification: aleatory uncertainty comes from inherently random natural phenomena, while epistemic uncertainty comes from a lack of knowledge; (ii) a formulation of the tolerance analysis problem based on this classification; and (iii) its development: aleatory uncertainty is modeled by probability distributions while epistemic uncertainty is modeled by intervals; Monte Carlo simulation is employed for probabilistic analysis while nonlinear optimization is used for interval analysis. “AHTOLA” project (ANR-11- MONU-013
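The split between probabilistic and interval analysis can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: a hypothetical one-dimensional stack-up (`gap = housing - (a + b)`), normally distributed part dimensions for the aleatory side, and an unknown mean shift of the housing known only to lie in an interval for the epistemic side. A coarse grid search stands in for the nonlinear optimization mentioned in the abstract.

```python
import random

def failure_probability(shift, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(gap < 0) for one fixed epistemic value.

    The same seed is reused for every shift (common random numbers), so
    differences between estimates reflect the shift, not sampling noise.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        # Aleatory uncertainty: hypothetical part dimensions in mm.
        a = rng.gauss(10.0, 0.05)
        b = rng.gauss(20.0, 0.05)
        housing = rng.gauss(30.2 + shift, 0.05)
        if housing - (a + b) < 0.0:  # parts do not fit
            failures += 1
    return failures / n_samples

def probability_bounds(interval, n_grid=11):
    """Bound the failure probability over the epistemic interval.

    A grid search is used here as a simple stand-in for an optimizer.
    """
    lo, hi = interval
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    probs = [failure_probability(s) for s in grid]
    return min(probs), max(probs)

# Epistemic uncertainty: the mean shift is only known to lie in [-0.1, 0.1].
p_min, p_max = probability_bounds((-0.1, 0.1))
print(f"failure probability lies in [{p_min:.4f}, {p_max:.4f}]")
```

The result is an interval of failure probabilities rather than a single number, which is the practical consequence of keeping the two kinds of uncertainty separate.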
Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes
PURPOSE: The medical literature relevant to germline genetics is growing
exponentially. Clinicians need tools for monitoring and prioritizing the literature
to understand the clinical implications of the pathogenic genetic variants. We
developed and evaluated two machine learning models to classify abstracts as
relevant to the penetrance (risk of cancer for germline mutation carriers) or
prevalence of germline genetic mutations. METHODS: We conducted literature
searches in PubMed and retrieved paper titles and abstracts to create an
annotated dataset for training and evaluating the two machine learning
classification models. Our first model is a support vector machine (SVM) which
learns a linear decision rule based on the bag-of-ngrams representation of each
title and abstract. Our second model is a convolutional neural network (CNN)
which learns a complex nonlinear decision rule based on the raw title and
abstract. We evaluated the performance of the two models on the classification
of papers as relevant to penetrance or prevalence. RESULTS: For penetrance
classification, we annotated 3740 paper titles and abstracts and used 60% for
training the model, 20% for tuning the model, and 20% for evaluating the model.
The SVM model achieves 89.53% accuracy (percentage of papers that were
correctly classified) while the CNN model achieves 88.95% accuracy. For
prevalence classification, we annotated 3753 paper titles and abstracts. The
SVM model achieves 89.14% accuracy while the CNN model achieves 89.13%
accuracy. CONCLUSION: Our models achieve high accuracy in classifying abstracts
as relevant to penetrance or prevalence. By facilitating literature review,
this tool could help clinicians and researchers keep abreast of the burgeoning
knowledge of gene-cancer associations and keep the knowledge bases for clinical
decision support tools up to date.
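The first model described above, a linear decision rule over a bag-of-ngrams representation, can be sketched in a few lines. This is an illustrative toy and not the authors' implementation: the six example titles and their labels are invented, the classes are deliberately trivial to separate, and the linear SVM is trained with a Pegasos-style stochastic subgradient method chosen here only because it fits in plain Python.

```python
import random
from collections import Counter

def ngrams(text, n_max=2):
    """Bag-of-ngrams: counts of all unigrams and bigrams in a text."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

def train_linear_svm(examples, epochs=200, lam=0.01, seed=0):
    """Stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, t = {}, 0
    for _ in range(epochs):
        order = list(range(len(examples)))
        rng.shuffle(order)
        for idx in order:
            feats, y = examples[idx]
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(w.get(f, 0.0) * v for f, v in feats.items())
            for f in w:  # shrink weights (regularization step)
                w[f] *= 1.0 - eta * lam
            if margin < 1.0:  # hinge loss is active: move toward the example
                for f, v in feats.items():
                    w[f] = w.get(f, 0.0) + eta * y * v
    return w

def predict(w, text):
    """Return +1 (penetrance) or -1 (prevalence) for a new text."""
    score = sum(w.get(f, 0.0) * v for f, v in ngrams(text).items())
    return 1 if score > 0 else -1

# Invented toy titles: +1 = relevant to penetrance, -1 = relevant to prevalence.
TOY_DATA = [
    ("lifetime breast cancer risk for BRCA1 mutation carriers", 1),
    ("cumulative risk estimates among carriers by age seventy", 1),
    ("age specific penetrance for Lynch syndrome risk among carriers", 1),
    ("prevalence of germline variants in unselected patients", -1),
    ("frequency of pathogenic variants in a population cohort", -1),
    ("prevalence of founder variants in a screening cohort", -1),
]

w = train_linear_svm([(ngrams(text), y) for text, y in TOY_DATA])
print(predict(w, "cancer risk for mutation carriers"))
print(predict(w, "prevalence of pathogenic variants in patients"))
```

A realistic setup would also hold out tuning and evaluation splits, as the 60%/20%/20% division in the abstract does, rather than inspecting predictions on a handful of strings.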
Construction of direction selectivity in V1: from simple to complex cells
Despite detailed knowledge about the anatomy and physiology of the primary visual cortex (V1), the immense number of feed-forward and recurrent connections onto a given V1 neuron make it difficult to understand how the physiological details relate to a given neuron’s functional properties. Here, we focus on a well-known functional property of many V1 complex cells: phase-invariant direction selectivity (DS). While the energy model explains its construction at the conceptual level, it remains unclear how the mathematical operations described in this model are implemented by cortical circuits. To understand how DS of complex cells is constructed in cortex, we apply a nonlinear modeling framework to extracellular data from macaque V1. We use a modification of spike-triggered covariance (STC) analysis to identify multiple biologically plausible "spatiotemporal features" that either excite or suppress a cell. We demonstrate that these features represent the true inputs to the neuron more accurately, and the resulting nonlinear model compactly describes how these inputs are combined to result in the functional properties of the cell. In a population of 59 neurons, we find that both simple and complex V1 cells are selective to combinations of excitatory and suppressive motion features. Because the strength of DS and simple/complex classification is well predicted by our models, we can use simulations with inputs matching thalamic and simple cells to assess how individual model components contribute to these measures. Our results unify experimental observations regarding the construction of DS from thalamic feed-forward inputs to V1: based on the differences between excitatory and inhibitory inputs, they suggest a connectivity diagram for simple and complex cells that sheds light on the mechanism underlying the DS of cortical cells. 
More generally, they illustrate how stage-wise nonlinear combination of multiple features gives rise to the processing of more abstract visual information.
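The energy model that the abstract refers to at the conceptual level can be made concrete with a small numerical sketch. It is the textbook construction (two space-time oriented filters in quadrature, squared and summed), not the STC-derived model fitted in the paper. The grid size, frequencies, and function names are assumptions chosen so that the gratings complete whole cycles on the grid.

```python
import math

N = 16                     # grid size in space and time (assumed)
K = 2 * math.pi * 2 / N    # spatial frequency: 2 cycles across the grid
W = 2 * math.pi * 2 / N    # temporal frequency: 2 cycles across the grid

def grating(direction, phase):
    """Drifting grating; direction=+1 is preferred, -1 is the opposite."""
    return [[math.cos(K * x - direction * W * t + phase) for x in range(N)]
            for t in range(N)]

def linear_response(stimulus, odd):
    """Dot product of the stimulus with one space-time oriented filter."""
    f = math.sin if odd else math.cos
    return sum(stimulus[t][x] * f(K * x - W * t)
               for t in range(N) for x in range(N))

def motion_energy(stimulus):
    """Sum of squared responses of the quadrature (even/odd) filter pair."""
    even = linear_response(stimulus, odd=False)
    odd = linear_response(stimulus, odd=True)
    return even ** 2 + odd ** 2

# A single linear filter depends on stimulus phase; the energy does not.
preferred = [motion_energy(grating(+1, p)) for p in (0.0, 1.0, 2.5)]
opposite = motion_energy(grating(-1, 0.7))
print(preferred, opposite)
```

On this grid the energy for the preferred direction equals (N²/2)² for every phase, up to rounding, while the opposite direction gives essentially zero. A single linear filter, by contrast, swings between positive and negative values as the phase changes, which is the simple-cell-like behavior the squaring and summing removes.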
Narrative structure analysis with education and training videos for e-learning
This paper deals with the problem of structuralizing education and training videos for high-level semantics extraction and nonlinear media presentation in e-learning applications. Drawing guidance from production knowledge in instructional media, we propose six main narrative structures employed in education and training videos for both motivation and demonstration during learning and practical training. We devise a powerful audiovisual feature set, accompanied by a hierarchical decision tree-based classification system to determine and discriminate between these structures. Based on a two-tiered hierarchical model, we demonstrate that we can achieve an accuracy of 84.7% on a comprehensive set of education and training video data.
Highly comparative feature-based time-series classification
A highly comparative, feature-based approach to time series classification is
introduced that uses an extensive database of algorithms to extract thousands
of interpretable features from time series. These features are derived from
across the scientific time-series analysis literature, and include summaries of
time series in terms of their correlation structure, distribution, entropy,
stationarity, scaling properties, and fits to a range of time-series models.
After computing thousands of features for each time series in a training set,
those that are most informative of the class structure are selected using
greedy forward feature selection with a linear classifier. The resulting
feature-based classifiers automatically learn the differences between classes
using a reduced number of time-series properties, and circumvent the need to
calculate distances between time series. Representing time series in this way
results in orders of magnitude of dimensionality reduction, allowing the method
to perform well on very large datasets containing long time series or time
series of different lengths. For many of the datasets studied, classification
performance exceeded that of conventional instance-based classifiers, including
one nearest neighbor classifiers using Euclidean distances and dynamic time
warping. Most importantly, the features selected provide an understanding
of the properties of the dataset, insight that can guide further scientific
investigation.
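Greedy forward feature selection, as described above, can be sketched on a toy problem. The four features, the synthetic data (white noise versus an AR(1) process with matched variance), and the nearest-centroid rule used as the linear classifier are all assumptions for illustration. The paper draws on thousands of features and does not specify this particular classifier in the abstract.

```python
import random
import statistics

def lag1_autocorr(ts):
    """Lag-1 autocorrelation of a time series."""
    m = statistics.fmean(ts)
    den = sum((a - m) ** 2 for a in ts)
    num = sum((a - m) * (b - m) for a, b in zip(ts, ts[1:]))
    return num / den if den else 0.0

# A tiny, hypothetical feature library; each maps a series to one number.
FEATURES = {
    "lag1_autocorr": lag1_autocorr,
    "mean": statistics.fmean,
    "std": statistics.pstdev,
    "range": lambda ts: max(ts) - min(ts),
}

def centroid_accuracy(rows, labels, names):
    """Training accuracy of a nearest-centroid rule on the chosen features."""
    classes = sorted(set(labels))
    cents = {c: [statistics.fmean(r[n] for r, y in zip(rows, labels) if y == c)
                 for n in names] for c in classes}
    correct = 0
    for r, y in zip(rows, labels):
        pred = min(classes, key=lambda c: sum(
            (r[n] - m) ** 2 for n, m in zip(names, cents[c])))
        correct += pred == y
    return correct / len(rows)

def greedy_forward_selection(rows, labels, max_features=2):
    """Add one feature at a time, keeping the one that most improves accuracy."""
    selected, best = [], 0.0
    while len(selected) < max_features:
        scores = {n: centroid_accuracy(rows, labels, selected + [n])
                  for n in FEATURES if n not in selected}
        name, score = max(scores.items(), key=lambda kv: kv[1])
        if score <= best:  # no improvement: stop early
            break
        selected.append(name)
        best = score
    return selected, best

def make_series(rng, correlated):
    """White noise, or AR(1) noise scaled to the same marginal variance."""
    x, out = rng.gauss(0, 1), []
    for _ in range(rng.randint(80, 120)):  # series lengths may differ
        x = 0.8 * x + 0.6 * rng.gauss(0, 1) if correlated else rng.gauss(0, 1)
        out.append(x)
    return out

rng = random.Random(0)
labels = [0] * 20 + [1] * 20
series = [make_series(rng, correlated=bool(y)) for y in labels]
rows = [{n: f(ts) for n, f in FEATURES.items()} for ts in series]
selected, best = greedy_forward_selection(rows, labels)
print(selected, best)
```

Because every series is reduced to a fixed-length feature vector, series of different lengths need no alignment or distance computation, and the selected feature names describe what distinguishes the classes. A fuller version would score candidates by cross-validation and standardize features before combining them.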