Advanced Visual Computing for Image Saliency Detection
Saliency detection is a category of computer vision algorithms that aims to identify the most salient object in a given image. Existing saliency detection methods can generally be categorized as bottom-up or top-down, and deep neural networks (DNNs) have begun to be applied to saliency detection in recent years. However, the limitations of existing methods, such as problematic prior assumptions, inefficient feature integration and the absence of high-level feature learning, prevent them from achieving superior performance. In this thesis, to address these limitations, we propose multiple novel models with favorable performance. Specifically, we first systematically review the development of saliency detection and related work, and then propose four new methods, two based on low-level image features and two based on DNNs. The regularized random walks ranking method (RR) and its reversion-correction-improved version (RCRR) are based on conventional low-level image features and exhibit higher accuracy and robustness in extracting the image-boundary-based foreground / background queries, while the background search and foreground estimation (BSFE) and dense and sparse labeling (DSL) methods are based on DNNs, which have shown dominant advantages in high-level image feature extraction as well as in combining multi-dimensional features. Each of the proposed methods is evaluated by extensive experiments, and all of them perform favorably against the state of the art, especially the DSL method, which achieves remarkably higher performance than sixteen state-of-the-art methods (ten conventional methods and six learning-based methods) on six well-recognized public datasets. The success of the proposed methods reveals further potential applications of saliency detection in real-life computer vision tasks.
Forecasting monthly airline passenger numbers with small datasets using feature engineering and a modified principal component analysis
In this study, a machine learning approach based on time series models, different feature engineering, feature extraction, and feature derivation is proposed to improve air passenger forecasting. Different types of datasets were created to extract new features from the core data. An experiment was undertaken with artificial neural networks to test the performance of neurons in the hidden layer, to optimise the dimensions of all layers and to obtain an optimal choice of connection weights, so that the nonlinear optimisation problem could be solved directly. A method of tuning deep learning models using H2O (a feature-rich, open source machine learning platform known for its R and Spark integration and its ease of use) is also proposed, where the trained network model is built from samples of selected features from the dataset in order to ensure diversity of the samples and to improve training. A successful application of deep learning requires setting numerous parameters in order to achieve greater model accuracy. The number of hidden layers and the number of neurons in each layer are key parameters of such a network. Hyper-parameter, grid search, and random hyper-parameter approaches aid in setting these important parameters. Moreover, a new ensemble strategy is suggested that shows potential to optimise parameter settings and hence save computational resources throughout the tuning process of the models. The main objective, besides improving the performance metric, is to obtain a distribution on some hold-out datasets that resembles the original distribution of the training data. Particular attention is focused on creating a modified version of Principal Component Analysis (PCA) using a different correlation matrix, obtained by a different correlation coefficient based on kinetic energy, to derive new features. The data were collected from several airline datasets to build a deep prediction model for forecasting airline passenger numbers.
Preliminary experiments show that fine-tuning provides an efficient approach for tuning the ultimate number of hidden layers and the number of neurons in each layer when compared with the grid search method. Similarly, the results show that the modified version of PCA is more effective in data dimension reduction, class separability, and classification accuracy than traditional PCA.
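The comparison between exhaustive grid search and random hyper-parameter search over the number of hidden layers and neurons per layer can be illustrated with a minimal sketch. The function `validation_error` is a hypothetical stand-in for training a network and scoring it on hold-out data; its shape and the candidate values below are assumptions for illustration, not taken from the study, and no H2O calls are used.

```python
import itertools
import random


def validation_error(n_layers, n_neurons):
    """Hypothetical stand-in for training a network and scoring it on hold-out data."""
    # Toy error surface with its minimum at 3 layers of 64 neurons (an assumption).
    return (n_layers - 3) ** 2 + ((n_neurons - 64) / 32) ** 2


def grid_search(layer_options, neuron_options):
    """Evaluate every combination and return (best_error, (layers, neurons))."""
    candidates = itertools.product(layer_options, neuron_options)
    return min((validation_error(l, n), (l, n)) for l, n in candidates)


def random_search(layer_options, neuron_options, n_trials, seed=0):
    """Evaluate n_trials randomly drawn combinations and return the best one."""
    rng = random.Random(seed)
    trials = [
        (rng.choice(layer_options), rng.choice(neuron_options))
        for _ in range(n_trials)
    ]
    return min((validation_error(l, n), (l, n)) for l, n in trials)


layers = [1, 2, 3, 4, 5]
neurons = [16, 32, 64, 128, 256]

grid_best = grid_search(layers, neurons)                    # 25 evaluations
random_best = random_search(layers, neurons, n_trials=10)   # 10 evaluations
```

Grid search always finds the best point on the grid but its cost grows with the product of the option counts; random search trades that guarantee for a fixed evaluation budget.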
Extended pipeline for content-based feature engineering in music genre recognition
We present a feature engineering pipeline for the construction of musical
signal characteristics, to be used for the design of a supervised model for
musical genre identification. The key idea is to extend the traditional
two-step process of extraction and classification with additive stand-alone
phases which are no longer organized in a waterfall scheme. The whole system is
realized by traversing backtrack arrows and cycles between various stages. In
order to give a compact and effective representation of the features, the
standard early temporal integration is combined with other selection and
extraction phases: on the one hand, the selection of the most meaningful
characteristics based on information gain, and on the other hand, the inclusion
of the nonlinear correlation between this subset of features, determined by an
autoencoder. The results of the experiments conducted on the GTZAN dataset reveal a
noticeable contribution of this methodology towards the model's performance in
the classification task.
Comment: ICASSP 201
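The information-gain selection step mentioned in the abstract can be sketched as follows. This is a minimal illustration that assumes the features have already been discretised into bins; the feature names and toy labels are hypothetical, and the autoencoder stage is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(features, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(
        features,
        key=lambda name: information_gain(features[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical, pre-binned toy data (not from GTZAN).
genres = ["rock", "rock", "jazz", "jazz"]
binned_features = {
    "tempo_bin": ["fast", "fast", "slow", "slow"],      # separates the genres
    "loudness_bin": ["high", "low", "high", "low"],     # carries no information
}
selected = select_top_k(binned_features, genres, k=1)
```

Continuous audio descriptors would need a binning or thresholding step before this measure applies; how the paper handles that is not stated in the abstract.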
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
Recently, substantial research effort has focused on how to apply CNNs or
RNNs to better extract temporal patterns from videos, so as to improve the
accuracy of video classification. In this paper, however, we show that temporal
information, especially longer-term patterns, may not be necessary to achieve
competitive results on common video classification datasets. We investigate the
potential of a purely attention based local feature integration. Accounting for
the characteristics of such features in video classification, we propose a
local feature integration framework based on attention clusters, and introduce
a shifting operation to capture more diverse signals. We carefully analyze and
compare the effect of different attention mechanisms, cluster sizes, and the
use of the shifting operation, and also investigate the combination of
attention clusters for multimodal integration. We demonstrate the effectiveness
of our framework on three real-world video classification datasets. Our model
achieves competitive results across all of these. In particular, on the
large-scale Kinetics dataset, our framework obtains an excellent single model
accuracy of 79.4% in terms of the top-1 and 94.0% in terms of the top-5
accuracy on the validation set. The attention clusters are the backbone of our
winning solution at ActivityNet Kinetics Challenge 2017. Code and models will be
released soon.
Comment: The backbone of the winner solution at ActivityNet Kinetics Challenge
201
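The idea of integrating local features purely through attention, without modelling temporal order, can be sketched as below. This is a minimal illustration under stated assumptions: each attention unit scores a frame feature by a dot product with a weight vector (hypothetical here, learned in practice), and a cluster concatenates the outputs of several units. The shifting operation is omitted because the abstract does not define it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_unit(local_features, weight_vector):
    """Attention-weighted sum of local features using dot-product scores."""
    scores = [
        sum(w * x for w, x in zip(weight_vector, feature))
        for feature in local_features
    ]
    weights = softmax(scores)
    dim = len(local_features[0])
    return [
        sum(a * feature[d] for a, feature in zip(weights, local_features))
        for d in range(dim)
    ]


def attention_cluster(local_features, weight_vectors):
    """Concatenate the outputs of several attention units into one descriptor."""
    descriptor = []
    for weight_vector in weight_vectors:
        descriptor.extend(attention_unit(local_features, weight_vector))
    return descriptor


# Three hypothetical 2-D frame features and two attention units.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cluster_weights = [[0.0, 0.0], [5.0, -5.0]]
video_descriptor = attention_cluster(frames, cluster_weights)
```

Because each unit is a weighted sum, shuffling the frames leaves the descriptor unchanged up to floating-point rounding, which matches the abstract's point that long-term temporal order may not be needed.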