6 Access Methods and Query Processing Techniques
The performance of a database management system (DBMS) is fundamentally dependent on the access methods and query processing techniques available to the system. Traditionally, relational DBMSs have relied on well-known access methods, such as the ubiquitous B+-tree, hashing with chaining, and, in som
Protein Tertiary Model Assessment Using Granular Machine Learning Techniques
The automatic prediction of a protein's three-dimensional structure from its amino acid sequence has become one of the most important and most researched fields in bioinformatics. Because models are predictions rather than experimental structures determined with known accuracy, it is vital to estimate model quality. We attempt to solve this problem using machine learning techniques and information from both the sequence and the structure of the protein. The goal is to build a machine that learns structures from the PDB and, when given a new model, predicts whether it belongs to the same class as the PDB structures (correct or incorrect protein models). Different subsets of the PDB (Protein Data Bank) are considered for evaluating the prediction potential of the machine learning methods. Here we show two such machines, one using support vector machines (SVMs) and another using fuzzy decision trees (FDTs). First, using a preliminary encoding style, the SVM reached around 70% accuracy in protein model quality assessment, and the improved fuzzy decision tree (IFDT) reached above 80% accuracy. To reduce computational overhead, a multiprocessor environment and a basic feature selection method are used in the SVM-based machine learning algorithm.
Next, an enhanced scheme is introduced using a new encoding style. In the new style, information such as the amino acid substitution matrix, polarity, secondary structure, and relative distance between alpha carbon atoms is collected through spatial traversal of the 3D structure to form training vectors. This guarantees that the properties of alpha carbon atoms that are close together in 3D space, and thus interacting, are used in vector formation. With the fuzzy decision tree, we obtained a training accuracy of around 90%. Prediction accuracy and execution time improve significantly compared with the previous encoding technique. This outcome motivates us to continue exploring effective machine learning algorithms for accurate protein model quality assessment.
Finally, these machines are tested using CASP8 and CASP9 templates and compared with other CASP competitors, with promising results. We further discuss the importance of model quality assessment and other information from proteins that could be considered for the same.
Automated identification of river hydromorphological features using UAV high resolution aerial imagery
European legislation is driving the development of methods for river ecosystem protection in light of concerns over water quality and ecology. Key to their success is the accurate and rapid characterisation of physical features (i.e., hydromorphology) along the river. Image pattern recognition techniques have been successfully used for this purpose. The reliability of the methodology depends on both the quality of the aerial imagery and the pattern recognition technique used. Recent studies have demonstrated the potential of Unmanned Aerial Vehicles (UAVs) to increase the quality of the imagery by capturing high-resolution photography. Similarly, Artificial Neural Networks (ANNs) have been shown to be a high-precision tool for automated recognition of environmental patterns. This paper presents a UAV-based framework for the identification of hydromorphological features from high-resolution RGB aerial imagery using a novel classification technique based on ANNs. The framework is developed for a 1.4 km river reach along the river Dee in Wales, United Kingdom. For this purpose, a Falcon 8 octocopter was used to gather 2.5 cm resolution imagery. The results show that the accuracy of the framework is above 81%, performing particularly well at recognising vegetation. These results support the use of UAVs for environmental policy implementation and demonstrate the potential of ANNs and RGB imagery for high-precision river monitoring and river management.
Is Big Data Sufficient for a Reliable Detection of Non-Technical Losses?
Non-technical losses (NTL) occur during the distribution of electricity in
power grids and include, but are not limited to, electricity theft and faulty
meters. In emerging countries, they may range up to 40% of the total
electricity distributed. In order to detect NTLs, machine learning methods are
used that learn irregular consumption patterns from customer data and
inspection results. The Big Data paradigm followed in modern machine learning
reflects the desire to derive better conclusions simply by analyzing more
data, without needing to look at theory and models. However, the
sample of inspected customers may be biased, i.e. it does not represent the
population of all customers. As a consequence, machine learning models trained
on these inspection results are biased as well and therefore lead to unreliable
predictions of whether customers cause NTL or not. In machine learning, this
issue is called covariate shift and has not been addressed in the literature on
NTL detection yet. In this work, we present a novel framework for quantifying
and visualizing covariate shift. We apply it to a commercial data set from
Brazil that consists of 3.6M customers and 820K inspection results. We show
that some features have a stronger covariate shift than others, making
predictions less reliable. In particular, previous inspections focused on
certain neighborhoods or customer classes and were not sufficiently
spread among the population of customers. This framework is about to be
deployed in a commercial product for NTL detection.
Comment: Proceedings of the 19th International Conference on Intelligent
System Applications to Power Systems (ISAP 2017)
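
The abstract above does not say how the framework measures covariate shift, so the following is only an illustrative sketch and not the authors' method. It assumes categorical features and uses the Jensen-Shannon divergence (a swapped-in, standard measure) to compare the distribution of each feature among inspected customers with its distribution in the whole customer population. The feature names, the toy data, and the function names are all hypothetical.

```python
import math
from collections import Counter


def distribution(values, categories):
    """Relative frequency of each category in a list of values."""
    counts = Counter(values)
    total = len(values)
    return [counts[c] / total for c in categories]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return (kl(p, m) + kl(q, m)) / 2


def shift_per_feature(population, inspected):
    """Score each feature by how differently it is distributed in the
    inspected sample compared with the full population (0 = no shift)."""
    scores = {}
    for feature in population[0]:
        pop_vals = [row[feature] for row in population]
        ins_vals = [row[feature] for row in inspected]
        categories = sorted(set(pop_vals) | set(ins_vals))
        scores[feature] = js_divergence(
            distribution(pop_vals, categories),
            distribution(ins_vals, categories),
        )
    return scores


# Hypothetical toy data: inspections cover only neighborhood "A",
# but both customer classes equally.
population = [
    {"neighborhood": "A", "customer_class": "residential"},
    {"neighborhood": "A", "customer_class": "commercial"},
    {"neighborhood": "B", "customer_class": "residential"},
    {"neighborhood": "B", "customer_class": "commercial"},
]
inspected = [
    {"neighborhood": "A", "customer_class": "residential"},
    {"neighborhood": "A", "customer_class": "commercial"},
]

scores = shift_per_feature(population, inspected)
for feature, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {score:.3f}")
```

In this toy case the neighborhood feature receives a clearly positive score while the customer class scores zero, mirroring the abstract's observation that some features are more strongly shifted than others.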
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
Computerized Analysis of Magnetic Resonance Images to Study Cerebral Anatomy in Developing Neonates
The study of cerebral anatomy in developing neonates is of great importance for
the understanding of brain development during the early period of life. This
dissertation therefore focuses on three challenges in the modelling of cerebral
anatomy in neonates during brain development. The methods that have been
developed all use Magnetic Resonance Images (MRI) as source data.
To facilitate study of vascular development in the neonatal period, a set of image
analysis algorithms is developed to automatically extract and model cerebral
vessel trees. The whole process consists of cerebral vessel tracking from
automatically placed seed points, vessel tree generation, and vasculature
registration and matching. These algorithms have been tested on clinical Time-of-
Flight (TOF) MR angiographic datasets.
To facilitate study of the neonatal cortex, a complete cerebral cortex segmentation
and reconstruction pipeline has been developed. Segmentation of the neonatal
cortex is not effectively done by existing algorithms designed for the adult brain
because the contrast between grey and white matter is reversed. This causes pixels
containing tissue mixtures to be incorrectly labelled by conventional methods. The
neonatal cortical segmentation method that has been developed is based on a novel
expectation-maximization (EM) method with explicit correction for mislabelled
partial volume voxels. Based on the resulting cortical segmentation, an implicit
surface evolution technique is adopted for the reconstruction of the cortex in
neonates. The performance of the method is investigated by performing a detailed
landmark study.
To facilitate study of cortical development, a cortical surface registration algorithm
for aligning the cortical surface is developed. The method first inflates extracted
cortical surfaces and then performs a non-rigid surface registration using free-form
deformations (FFDs) to remove residual misalignment. Validation experiments using
data labelled by an expert observer demonstrate that the method can capture local
changes and follow the growth of specific sulcus
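
For readers unfamiliar with EM-based tissue classification, the sketch below shows a plain two-class Gaussian mixture fitted to scalar intensities with expectation-maximization. It is not the dissertation's method: it omits the explicit correction for mislabelled partial volume voxels and any spatial or anatomical priors, and the intensity values, initial parameters, and function names are hypothetical.

```python
import math


def gaussian(x, mean, var):
    """Density of a 1D normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_class(intensities, means, variances, weights, iterations=50):
    """Fit a Gaussian mixture to scalar intensities with plain EM.

    Returns the fitted means, variances, and a hard label per intensity.
    No partial volume correction is applied (assumption of this sketch).
    """
    means, variances, weights = list(means), list(variances), list(weights)
    n = len(intensities)
    for _ in range(iterations):
        # E-step: posterior probability of each class for each intensity.
        resp = []
        for x in intensities:
            likes = [w * gaussian(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(likes)
            resp.append([like / total for like in likes])
        # M-step: re-estimate class parameters from the posteriors.
        for k in range(len(means)):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, intensities)) / nk
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, intensities)) / nk
            variances[k] = max(var, 1e-6)  # guard against collapse to zero
            weights[k] = nk / n
    labels = [max(range(len(means)), key=lambda k: r[k]) for r in resp]
    return means, variances, labels


# Hypothetical intensities drawn from two well-separated tissue classes.
intensities = [10, 11, 12, 9, 10, 30, 31, 29, 32, 30]
means, variances, labels = em_two_class(
    intensities, means=[5.0, 35.0], variances=[25.0, 25.0], weights=[0.5, 0.5]
)
print(means, labels)
```

A voxel containing a mixture of two tissues would fall between the two class means and could be assigned to either one, which is the kind of mislabelling the abstract says its correction step addresses.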
A Deep Belief Network and Case Reasoning Based Decision Model for Emergency Rescue
The frequent occurrence of major public emergencies in China has caused significant human and economic losses. To carry out successful rescue operations in such emergencies, decisions need to be made as efficiently as possible. Using earthquakes as an example of a public emergency, this paper combines the Deep Belief Network (DBN) and Case-Based Reasoning (CBR) models to improve the case representation and case retrieval steps in the decision-making process, then designs and constructs a decision-making model. The validity of the model is then verified by an example. The results of this study can be applied to maximize the efficiency of emergency rescue decisions.