Multi-class protein fold classification using a new ensemble machine learning approach.
Protein structure classification represents an important process in understanding the associations
between sequence and structure as well as possible functional and evolutionary relationships.
Recent structural genomics initiatives and other high-throughput experiments have populated the
biological databases at a rapid pace. The volume of structural data has made traditional methods
such as manual inspection of protein structures infeasible. Machine learning has been
widely applied in bioinformatics and has had considerable success in this research area. This work
proposes a novel ensemble machine learning method that improves the coverage of the classifiers
under the multi-class imbalanced sample sets by integrating knowledge induced from different base
classifiers, and we illustrate this idea in classifying multi-class SCOP protein fold data. We have
compared our approach with PART and show that our method improves the sensitivity of the
classifier in protein fold classification. Furthermore, we have extended this method to learning over
multiple data types, preserving the independence of their corresponding data sources, and show
that our new approach performs at least as well as the traditional technique over a single joined
data source. These experimental results are encouraging, and the approach can be applied to other
bioinformatics problems similarly characterised by multi-class imbalanced data sets held in multiple
data sources.
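The abstract above reports improved sensitivity on imbalanced multi-class data. As a minimal sketch of why per-class sensitivity matters there (this is not the authors' ensemble method, and the fold labels and predictions are hypothetical toy values), the following computes sensitivity for each class and its macro average:

```python
from collections import Counter


def per_class_sensitivity(y_true, y_pred):
    """Return {class: TP / (TP + FN)} for every class present in y_true."""
    support = Counter(y_true)          # number of true examples per class
    true_positives = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            true_positives[truth] += 1
    return {label: true_positives[label] / support[label] for label in support}


def macro_sensitivity(y_true, y_pred):
    """Unweighted mean of the per-class sensitivities."""
    scores = per_class_sensitivity(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels: fold "a" is common, fold "c" is rare.
y_true = ["a", "a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "a", "a", "b", "a", "a"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(per_class_sensitivity(y_true, y_pred))
print(accuracy, macro_sensitivity(y_true, y_pred))
```

In this toy case the overall accuracy looks acceptable while the rare fold is never recovered, which is the kind of gap a sensitivity-oriented evaluation exposes.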
Designing a fruit identification algorithm in orchard conditions to develop robots using video processing and majority voting based on hybrid artificial neural network
Identifying fruits on trees is the first step in developing garden robots for purposes
such as fruit harvesting and site-specific spraying. Because of the natural conditions in fruit
orchards and the irregularity of the objects in them, working under controlled conditions
is very difficult. As a result, these operations must be performed under natural conditions of both
lighting and background. Because the other garden robot operations depend on the
fruit identification stage, this step must be performed precisely. Therefore, the purpose of this
paper was to design an identification algorithm in orchard conditions using a combination of video
processing and majority voting based on different hybrid artificial neural networks. The different
steps of designing this algorithm were: (1) recording video of different plum orchards at different
light intensities; (2) converting the recorded videos into frames; (3) extracting different color
properties from pixels; (4) selecting effective properties from the extracted color properties using
hybrid artificial neural network-harmony search (ANN-HS); and (5) classification using majority
voting based on three classifiers of artificial neural network-bees algorithm (ANN-BA), artificial
neural network-biogeography-based optimization (ANN-BBO), and artificial neural network-firefly
algorithm (ANN-FA). The most effective features selected by the hybrid ANN-HS consisted of the third
channel in hue saturation lightness (HSL) color space, the second channel in lightness chroma hue
(LCH) color space, the first channel in L*a*b* color space, and the first channel in hue saturation
intensity (HSI). The results showed that the accuracy of the majority voting method in the best execution
and in 500 executions was 98.01% and 97.20%, respectively. Based on different performance evaluation
criteria of the classifiers, it was found that the majority voting method had a higher performance.
Funding: European Union (EU) under Erasmus+ project entitled
“Fostering Internationalization in Agricultural Engineering in Iran and Russia” [FARmER] with grant
number 585596-EPP-1-2017-1-DE-EPPKA2-CBHE-JP
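Two of the steps listed in the abstract above, extracting a color property from a pixel and combining three classifiers by majority voting, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names and the example votes are hypothetical, and the standard-library colorsys module stands in for whatever color conversion the paper used.

```python
import colorsys
from collections import Counter


def hsl_lightness(r, g, b):
    """Lightness (the third HSL channel) of an 8-bit RGB pixel, in [0, 1]."""
    # colorsys returns the channels in (hue, lightness, saturation) order.
    _, lightness, _ = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return lightness


def majority_vote(predictions):
    """Return the label chosen by the most classifiers for one pixel."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical votes for one pixel from the three hybrid classifiers.
votes = {"ANN-BA": "fruit", "ANN-BBO": "background", "ANN-FA": "fruit"}
label = majority_vote(list(votes.values()))
print(label, hsl_lightness(255, 0, 0))
```

With three voters and two classes a tie cannot occur, which is one practical reason to use an odd number of classifiers.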
Using Topological Data Analysis for diagnosis pulmonary embolism
Pulmonary Embolism (PE) is a common and potentially lethal condition. Most
patients die within the first few hours from the event. Despite diagnostic
advances, delays and underdiagnosis in PE are common. To increase the diagnostic
performance in PE, current diagnostic work-up of patients with suspected acute
pulmonary embolism usually starts with the assessment of clinical pretest
probability using plasma d-Dimer measurement and clinical prediction rules. The
most validated and widely used clinical decision rules are the Wells and Geneva
Revised scores. We aimed to develop a new clinical prediction rule (CPR) for PE
based on topological data analysis and an artificial neural network. Filter or
wrapper methods for feature reduction cannot be applied to our dataset, because
these algorithms require datasets without missing data. Instead, we applied
topological data analysis (TDA) to overcome the hurdle of processing datasets
with missing data. A topological
network was developed using the Iris software (Ayasdi, Inc., Palo Alto). The PE
patient topology identified two areas in the pathological group and hence two
distinct clusters of PE patient populations. Additionally, the topological
network detected several sub-groups among healthy patients that are likely
affected by non-PE diseases. TDA was further used to identify the key
features best associated with a PE diagnosis, and this information was used
to define the input space for a back-propagation artificial neural
network (BP-ANN). It is shown that the area under the curve (AUC) of BP-ANN is
greater than the AUCs of the scores (Wells and revised Geneva) used among
physicians. The results demonstrate that topological data analysis and the BP-ANN,
when used in combination, can produce better predictive models than the Wells or
revised Geneva scoring systems for the analyzed cohort.
Comment: 18 pages, 5 figures, 6 tables. arXiv admin note: text overlap with
arXiv:cs/0308031 by other authors without attribution
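The comparison described in the abstract above rests on the AUC of each scoring method. A minimal sketch of that quantity, assuming the rank-based definition (the probability that a randomly chosen positive case scores above a randomly chosen negative one, with ties counted as one half), is given below. The labels and scores are hypothetical toy values, not data from the paper.

```python
def auc(labels, scores):
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy cohort: 1 = PE confirmed, 0 = PE excluded.
labels = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]   # e.g. a network's output
rule_scores = [3, 1, 1, 2, 1, 0]                # e.g. an integer point score

print(auc(labels, model_scores), auc(labels, rule_scores))
```

The pairwise loop is quadratic in the cohort size, which is acceptable for a sketch; a sort-based version would be preferable for large datasets.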
Cross-modal Recurrent Models for Weight Objective Prediction from Multimodal Time-series Data
We analyse multimodal time-series data corresponding to weight, sleep and
steps measurements. We focus on predicting whether a user will successfully
achieve his/her weight objective. For this, we design several deep long
short-term memory (LSTM) architectures, including a novel cross-modal LSTM
(X-LSTM), and demonstrate their superiority over baseline approaches. The
X-LSTM improves parameter efficiency by processing each modality separately and
allowing for information flow between them by way of recurrent
cross-connections. We present a general hyperparameter optimisation technique
for X-LSTMs, which allows us to significantly improve on the LSTM and a prior
state-of-the-art cross-modal approach, using a comparable number of parameters.
Finally, we visualise the model's predictions, revealing implications about
latent variables in this task.
Comment: To appear in NIPS ML4H 2017 and NIPS TSW 201
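The abstract above says that processing each modality separately improves parameter efficiency. A rough way to see how stream width affects the parameter budget is to count parameters. The sketch below uses the standard count for one LSTM layer (four gates, one bias vector per gate; some frameworks store two) and assumes, purely for illustration, that each modality stream also receives the hidden outputs of the other streams as extra inputs. That wiring and the layer sizes are assumptions, not the X-LSTM architecture from the paper.

```python
def lstm_params(input_dim, hidden_dim):
    """Weights and biases of one standard LSTM layer with four gates."""
    return 4 * (hidden_dim * (input_dim + hidden_dim) + hidden_dim)


def cross_modal_params(input_dims, hidden_dim):
    """Total for one LSTM per modality, each also fed the other streams'
    hidden outputs (an assumed form of cross-connection)."""
    cross_inputs = hidden_dim * (len(input_dims) - 1)
    return sum(lstm_params(d + cross_inputs, hidden_dim) for d in input_dims)


# Hypothetical sizes: three scalar modalities (weight, sleep, steps).
joint = lstm_params(input_dim=3, hidden_dim=60)                # one wide LSTM
streams = cross_modal_params(input_dims=[1, 1, 1], hidden_dim=20)
separate = sum(lstm_params(1, 20) for _ in range(3))           # no cross links

print(joint, streams, separate)
```

Under these assumed sizes, three narrow streams with cross-connections stay within the budget of the single wide layer, while fully separate streams use far fewer parameters but share no information.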