
    Hierarchical Classification of Scientific Taxonomies with Autonomous Underwater Vehicles

    Autonomous Underwater Vehicles (AUVs) have catalysed a significant shift in the way marine habitats are studied. It is now possible to deploy an AUV from a ship and capture tens of thousands of georeferenced images in a matter of hours. There is a growing body of research investigating ways to automatically apply semantic labels to these data, with two goals: first, to relieve the time-consuming and error-prone task of manually labelling a large number of images; and second, to change AUV surveys from being geographically defined (based on a pre-planned route) to permitting the AUV to adapt its mission plan in response to semantic observations. This thesis focusses on frameworks that permit a unified machine learning approach with applicability to a wide range of geographic areas and diverse areas of interest for marine scientists. This can be addressed through hierarchical classification, in which machine learning algorithms are trained to predict not just a binary or multi-class outcome, but a hierarchy of related output labels which are not mutually exclusive, such as a scientific taxonomy. In order to investigate classification on larger hierarchies with greater geographic diversity, the BENTHOZ-2015 data set was assembled as part of a collaboration between five Australian research groups. Existing labelled data were re-mapped to the CATAMI hierarchy: in total, more than 400,000 point labels conforming to a hierarchy of around 150 classes. The common hierarchical classification approach of building a network of binary classifiers was applied to the BENTHOZ-2015 data set, and a novel application of Bayesian Network theory and probability calibration was used as a theoretical foundation for the approach, resulting in improved classifier performance. This was extended to a more complex hidden-node Bayesian Network structure, which permits the inclusion of additional sensor modalities and tuning for better performance in particular geographic regions.
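The core mechanism above, a network of per-node binary classifiers whose calibrated outputs are chained down the taxonomy, can be sketched as follows. The hierarchy, class names, and probabilities here are illustrative stand-ins, not values from BENTHOZ-2015 or the CATAMI scheme:

```python
# Toy sketch: hierarchical classification via one binary classifier per
# node, chaining conditional probabilities top-down through a taxonomy.
# All class names and probabilities below are invented for illustration.

HIERARCHY = {
    "Biota": ["Cnidaria", "Porifera"],
    "Cnidaria": ["Corals"],
}

def local_prob(node, features):
    """Stand-in for a calibrated binary classifier: P(node | parent, x)."""
    table = {"Biota": 0.9, "Cnidaria": 0.7, "Porifera": 0.2, "Corals": 0.8}
    return table[node]

def marginal_probs(features, root="Biota"):
    """Walk the hierarchy top-down: P(node) = P(node | parent) * P(parent)."""
    probs = {root: local_prob(root, features)}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in HIERARCHY.get(parent, []):
            probs[child] = local_prob(child, features) * probs[parent]
            stack.append(child)
    return probs

probs = marginal_probs(features=None)
```

Chaining in this way guarantees the hierarchy constraint that a child label is never more probable than its parent, which is one motivation for the Bayesian Network framing.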

    Automatic synthesis of fuzzy systems: An evolutionary overview with a genetic programming perspective

    Studies in Evolutionary Fuzzy Systems (EFSs) began in the 1990s and have experienced rapid development since then, with applications to areas such as pattern recognition, curve-fitting and regression, forecasting, and control. An EFS results from the combination of a Fuzzy Inference System (FIS) with an Evolutionary Algorithm (EA). This relationship can be established for multiple purposes: fine-tuning of a FIS's parameters, selection of fuzzy rules, learning a rule base or membership functions from scratch, and so forth. Each facet of this relationship creates a strand in the literature, such as membership function fine-tuning or fuzzy rule-based learning, and the purpose here is to outline some of what has been done in each aspect. Special focus is given to Genetic Programming-based EFSs by providing a taxonomy of the main architectures available, as well as by pointing out the gaps that still prevail in the literature. The concluding remarks address further topics of current research and trends, such as interpretability analysis, multiobjective optimization, and synthesis of a FIS through evolving methods.
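As a toy illustration of the fine-tuning facet (not an example from the survey itself), the following evolutionary loop adjusts the peak of a triangular membership function to fit target membership values; all data, parameters, and the fuzzy set "warm" are invented:

```python
# Toy EFS sketch: an elitist evolutionary loop fine-tunes the peak `b`
# of a triangular membership function to match target memberships.
import random

def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Invented targets: the fuzzy set "warm" should peak near 22 degrees.
samples = [(18, 0.2), (20, 0.6), (22, 1.0), (24, 0.6), (26, 0.2)]

def fitness(b):
    # Negative squared error between the candidate set and the targets.
    return -sum((triangular(x, 15, b, 29) - mu) ** 2 for x, mu in samples)

random.seed(0)
population = [random.uniform(16, 28) for _ in range(20)]
for _ in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                      # elitist selection
    population = parents + [p + random.gauss(0, 0.5)
                            for p in parents for _ in range(3)]

best = max(population, key=fitness)
```

Real EFSs evolve whole rule bases or GP trees rather than a single scalar, but the select-mutate-evaluate loop is the same.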

    Automatic detection of constraints in software documentation

    2021 Fall. Includes bibliographical references. Software documentation is an important resource when maintaining and evolving software, as it supports developers in program understanding. To keep it up to date, developers need to verify that all the constraints affected by a change in source code are consistently described in the documentation. The process of detecting all the constraints in the documentation and cross-checking them against the source code is time-consuming. An approach capable of automatically identifying software constraints in documentation could facilitate this process. In this thesis, we explore different machine learning algorithms to build binary classification models that assign sentences extracted from software documentation to one of two categories: constraints and non-constraints. The models are trained on a data set of 368 manually tagged sentences from four open-source software systems. We evaluate the performance of the different models (Decision Tree, Naive Bayes, Support Vector Machine, fine-tuned BERT) based on precision, recall, and F1-score. Our best model (a decision tree featuring bigrams) achieved 74.0% precision, 83.8% recall, and an F1-score of 0.79. This suggests that our results are promising and that it is possible to build machine-learning-based models for the automatic detection of constraints in software documentation.
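A minimal sketch of the bigram-based idea: the single modal-verb split below is a hand-written stand-in for the learned decision tree, and the sentences and word lists are invented, not taken from the thesis's data set:

```python
# Toy constraint detector: bigram features over documentation sentences,
# with constraint sentences flagged by modal-verb bigrams ("must be").
# The rule stands in for a learned decision-tree split; it is invented.

def bigrams(sentence):
    """Lowercased adjacent word pairs, the features of the best model."""
    words = sentence.lower().split()
    return [(words[i], words[i + 1]) for i in range(len(words) - 1)]

MODAL_FIRST = {"must", "shall", "should", "cannot"}

def is_constraint(sentence):
    # One split: does any bigram start with a constraint-signalling modal?
    return any(w1 in MODAL_FIRST for w1, _ in bigrams(sentence))

docs = [
    ("The buffer size must be a power of two.", True),
    ("This module parses configuration files.", False),
]
predictions = [is_constraint(s) for s, _ in docs]
```

In the thesis the split points are learned from the 368 tagged sentences rather than written by hand; this only shows why bigrams carry signal for the task.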

    Learning Ontology Relations by Combining Corpus-Based Techniques and Reasoning on Data from Semantic Web Sources

    The manual construction of formal domain conceptualizations (ontologies) is labor-intensive. Ontology learning, by contrast, provides (semi-)automatic ontology generation from input data such as domain text. This thesis proposes a novel approach for learning labels of non-taxonomic ontology relations. It combines corpus-based techniques with reasoning on Semantic Web data. Corpus-based methods apply vector space similarity of verbs co-occurring with labeled and unlabeled relations to calculate relation label suggestions from a set of candidates. A meta ontology in combination with Semantic Web sources such as DBpedia and OpenCyc allows reasoning to improve the suggested labels. An extensive formal evaluation demonstrates the superior accuracy of the presented hybrid approach.
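The corpus-based step can be sketched as follows, with invented verb co-occurrence counts and relation names: a label is suggested for an unlabeled relation by cosine similarity between verb count vectors:

```python
# Sketch of the corpus-based step: each relation is represented by
# counts of verbs co-occurring with it; the unlabeled relation borrows
# the label of the most cosine-similar labeled one. Data is invented.
import math

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

labelled = {
    "hasLocation": {"located": 5, "situated": 3, "found": 2},
    "hasFounder":  {"founded": 6, "established": 4, "created": 1},
}
unlabelled = {"located": 2, "found": 4, "situated": 1}

suggestion = max(labelled, key=lambda r: cosine(labelled[r], unlabelled))
```

The thesis then refines such suggestions by reasoning over a meta ontology and sources like DBpedia, which is not shown here.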

    Towards the detection and analysis of performance regression introducing code changes

    In contemporary software development, developers commonly conduct regression testing to ensure that code changes do not affect software quality. Performance regression testing is an emerging research area within the regression testing domain in software engineering. It aims to maintain the system's performance. Conducting performance regression testing is known to be expensive. It is also complex, considering the increasing volume of committed code and the number of team members developing simultaneously. Many automated regression testing techniques have been proposed in prior research. However, challenges in locating and resolving performance regression in practice still exist. Directing regression testing to the commit level provides a way to locate the root cause, yet it hinders the development process. This thesis outlines motivations and solutions for locating the root causes of performance regression. First, we challenge a deterministic state-of-the-art approach by expanding the testing data to find areas for improvement. The deterministic approach was found to be limited in searching for the best regression-locating rule. Thus, we present two stochastic approaches to develop models that can learn from historical commits. The goal of the first stochastic approach is to view the research problem as a search-based optimization problem seeking to reach the highest detection rate. We apply different multi-objective evolutionary algorithms and conduct a comparison between them. This thesis also investigates whether simplifying the search space by combining objectives would achieve comparable results. The second stochastic approach addresses the severe class imbalance any system could have, since code changes introducing regression are rare but costly. We formulate the identification of problematic commits that introduce performance regression as a binary classification problem that handles class imbalance.
Further, the thesis provides an exploratory study on the challenges developers face in resolving performance regression. The study is based on questions posted on a technical forum and directed at performance regression. We collected around 2k questions discussing the regression of software execution time, all of which were manually analyzed. The study resulted in a categorization of the challenges. We also discuss the difficulty level of performance regression issues within the development community. This study provides insights to help developers avoid regression causes during software design and implementation.
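One standard way to handle such class imbalance, shown here as a hedged sketch with invented commit data rather than the thesis's actual technique, is to randomly oversample the rare regression-introducing commits before training:

```python
# Illustrative imbalance handling: random oversampling of the minority
# class (commits that introduce a performance regression, label 1).
# Commit IDs and the 95/5 split are invented for the example.
import random

commits = [("c%d" % i, 0) for i in range(95)] + \
          [("r%d" % i, 1) for i in range(5)]

random.seed(1)
majority = [c for c in commits if c[1] == 0]
minority = [c for c in commits if c[1] == 1]

# Resample the minority with replacement until the classes are balanced.
balanced = majority + [random.choice(minority) for _ in range(len(majority))]
```

Alternatives such as undersampling the majority class or cost-sensitive learning address the same problem; which of these the thesis adopts is not stated in the abstract.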

    Time-slice analysis of dyadic human activity

    Recognizing human activities from video data is routinely leveraged for surveillance and human-computer interaction applications. The main focus has been classifying videos into one of k action classes from fully observed videos. However, intelligent systems must make decisions under uncertainty and based on incomplete information. This need motivates us to introduce the problem of analysing the uncertainty associated with human activities and to move to a new level of generality in the action analysis problem. We also present the problem of time-slice activity recognition, which aims to explore human activity at a small temporal granularity. Time-slice recognition is able to infer human behaviours from a short temporal window. It has been shown that temporal slice analysis is helpful for motion characterization and for video content representation in general. These studies motivate us to consider time-slices for analysing the uncertainty associated with human activities. We report to what degree of certainty each activity is occurring throughout the video, from definitely not occurring to definitely occurring. In this research, we propose three frameworks for time-slice analysis of dyadic human activity under uncertainty. i) We present a new family of spatio-temporal descriptors which are optimized for early prediction with time-slice action annotations. Our predictive spatio-temporal interest point (Predict-STIP) representation is based on the intuition of temporal contingency between time-slices. ii) We exploit state-of-the-art techniques to extract interest points in order to represent time-slices. We also present an accumulative uncertainty measure to depict the uncertainty associated with partially observed videos for the task of early activity recognition. iii) We use Convolutional Neural Network-based unary and pairwise relations between human body joints in each time-slice. The unary term captures the local appearance of the joints while the pairwise term captures the local contextual relations between the parts. We extract these features from each frame in a time-slice and examine different temporal aggregations to generate a descriptor for the whole time-slice. Furthermore, we create a novel dataset which is annotated at multiple short temporal windows, allowing the modelling of the inherent uncertainty in time-slice activity recognition. All three methods have been evaluated on the TAP dataset. Experimental results demonstrate the effectiveness of our framework in the analysis of dyadic activities under uncertainty.
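The time-slice splitting and temporal aggregation steps can be sketched as follows, with invented scalar per-frame features standing in for the CNN-based unary and pairwise descriptors:

```python
# Toy sketch: split a video's frame features into fixed-length
# time-slices, then mean-pool each slice into a single descriptor.
# The scalar per-frame features are invented stand-ins for CNN features.

def time_slices(frames, slice_len):
    """Chop the frame sequence into consecutive windows of slice_len."""
    return [frames[i:i + slice_len] for i in range(0, len(frames), slice_len)]

def aggregate(slice_frames):
    """One temporal aggregation choice: mean-pool over the time-slice."""
    return sum(slice_frames) / len(slice_frames)

frames = [0.1, 0.3, 0.2, 0.8, 0.9, 0.7]   # one feature value per frame
descriptors = [aggregate(s) for s in time_slices(frames, 3)]
```

Mean pooling is just one of the temporal aggregations the thesis compares; max pooling or concatenation would slot into `aggregate` the same way.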

    Intelligent data leak detection through behavioural analysis

    In this paper we discuss a solution to detect data leaks in an intelligent and unobtrusive way through real-time analysis of the user's behaviour while handling classified information. The work is based on experiences with real-world use cases, and a variety of data preparation and data analysis techniques have been tried. Results show the feasibility of the approach, but also the necessity of correlating with other security events to improve precision.
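As a minimal illustration of behaviour-based leak detection (the paper's actual features and models are not specified in the abstract; the baseline, events, and threshold below are invented), a session can be flagged when the user's classified-file access count is a statistical outlier against their own history:

```python
# Hypothetical behavioural baseline check: flag a session whose
# classified-file access count deviates strongly (z-score) from the
# user's own historical sessions. All numbers are invented.
import statistics

baseline = [3, 4, 2, 5, 3, 4, 3]   # files accessed in past sessions
current = 15                        # files accessed in this session

mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)
z = (current - mean) / stdev        # how unusual is this session?
suspicious = z > 3.0
```

In practice such a per-user score would be correlated with other security events, as the paper notes, to keep the precision acceptable.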