Developing a comprehensive framework for multimodal feature extraction
Feature extraction is a critical component of many applied data science
workflows. In recent years, rapid advances in artificial intelligence and
machine learning have led to an explosion of feature extraction tools and
services that allow data scientists to cheaply and effectively annotate their
data along a vast array of dimensions, ranging from detecting faces in images
to analyzing the sentiment expressed in coherent text. Unfortunately, the
proliferation of powerful feature extraction services has been mirrored by a
corresponding expansion in the number of distinct interfaces to feature
extraction services. In a world where nearly every new service has its own API,
documentation, and/or client library, data scientists who need to combine
diverse features obtained from multiple sources are often forced to write and
maintain ever more elaborate feature extraction pipelines. To address this
challenge, we introduce a new open-source framework for comprehensive
multimodal feature extraction. Pliers is an open-source Python package that
supports standardized annotation of diverse data types (video, images, audio,
and text), and is expressly designed with both ease of use and extensibility in mind.
Users can apply a wide range of pre-existing feature extraction tools to their
data in just a few lines of Python code, and can also easily add their own
custom extractors by writing modular classes. A graph-based API enables rapid
development of complex feature extraction pipelines that output results in a
single, standardized format. We describe the package's architecture, detail its
major advantages over previous feature extraction toolboxes, and use a sample
application to a large functional MRI dataset to illustrate how pliers can
significantly reduce the time and effort required to construct sophisticated
feature extraction workflows while increasing code clarity and maintainability.
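The idea of modular extractors that emit results in one standardized format can be sketched in plain Python. This is not the pliers API: the names `Extractor`, `WordCountExtractor`, `MeanWordLengthExtractor`, and `run_extractors` are hypothetical and serve only to illustrate the design described in the abstract.

```python
from abc import ABC, abstractmethod


class Extractor(ABC):
    """Hypothetical base class: maps one stimulus to a dict of named features."""

    @abstractmethod
    def extract(self, stimulus: str) -> dict:
        ...


class WordCountExtractor(Extractor):
    def extract(self, stimulus: str) -> dict:
        return {"word_count": len(stimulus.split())}


class MeanWordLengthExtractor(Extractor):
    def extract(self, stimulus: str) -> dict:
        words = stimulus.split()
        mean_len = sum(len(w) for w in words) / len(words) if words else 0.0
        return {"mean_word_length": mean_len}


def run_extractors(stimuli, extractors):
    """Apply every extractor to every stimulus and return flat, uniform rows."""
    rows = []
    for stim in stimuli:
        for ext in extractors:
            for feature, value in ext.extract(stim).items():
                rows.append({
                    "stimulus": stim,
                    "extractor": type(ext).__name__,
                    "feature": feature,
                    "value": value,
                })
    return rows


rows = run_extractors(
    ["feature extraction is fun"],
    [WordCountExtractor(), MeanWordLengthExtractor()],
)
```

Adding a new feature in this sketch means writing one more subclass; the output rows keep the same four keys regardless of which extractor produced them.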
Nonparametric Feature Extraction from Dendrograms
We propose a nonparametric approach to feature extraction from dendrograms.
Minimax distance measures correspond to building a dendrogram with the
single-linkage criterion and defining specific forms of a level function and a
distance function over it. We therefore extend this method to arbitrary
dendrograms. We develop a generalized framework wherein different distance
measures can be inferred from different types of dendrograms, level functions
and distance functions. Via an appropriate embedding, we compute a vector-based
representation of the inferred distances, in order to enable many numerical
machine learning algorithms to employ such distances. Then, to address the
model selection problem, we study the aggregation of different dendrogram-based
distances in solution space and in representation space, respectively, in the
spirit of deep representations. In the first approach, taking the clustering
problem as an example, we build a graph with positive and negative edge weights
according to the consistency of the clustering labels of different objects
among different solutions, in the context of ensemble methods. Then, we use an
efficient variant of correlation clustering to produce the final clusters. In
the second approach, we investigate the sequential combination of different
distances and features, in the spirit of multi-layered
architectures, to obtain the final features. Finally, we demonstrate the
effectiveness of our approach via several numerical studies.
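The link between Minimax distances and single linkage can be illustrated with a small sketch. It is not the authors' implementation: the function name `minimax_distances` and the toy data are assumptions. The Minimax distance between two objects is the smallest possible "largest edge" over all paths connecting them, which equals the height at which the two objects first merge in a single-linkage dendrogram.

```python
def minimax_distances(dist):
    """All-pairs minimax distances via a Floyd-Warshall-style update.

    dist is a symmetric matrix (list of lists) of pairwise base distances.
    m[i][j] becomes the smallest achievable largest edge over all paths
    from i to j in the complete graph weighted by dist.
    """
    n = len(dist)
    m = [row[:] for row in dist]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(m[i][k], m[k][j])  # bottleneck of the path through k
                if via_k < m[i][j]:
                    m[i][j] = via_k
    return m


# Toy data (assumed): four points on a line, base distance = absolute difference.
points = [0.0, 1.0, 2.0, 10.0]
base = [[abs(a - b) for b in points] for a in points]
mm = minimax_distances(base)
```

In this toy example the first three points are chained by gaps of 1, so their mutual Minimax distances are 1, while every path to the fourth point must cross the gap of 8.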
Feature Extraction and Classification of Automatically Segmented Lung Lesion Using Improved Toboggan Algorithm
The accurate detection of lung lesions from computed tomography (CT) scans is essential for clinical diagnosis and provides valuable information for the treatment of lung cancer. However, fully automatic lesion detection is difficult to achieve. Here, a novel segmentation algorithm is proposed: an improved toboggan algorithm with a three-step framework comprising automatic seed point selection, multi-constraint lesion extraction, and lesion refinement. Feature descriptors such as local binary pattern (LBP), wavelet, contourlet, and grey level co-occurrence matrix (GLCM) are then applied to each region of interest of the segmented lung lesion image to extract texture features such as contrast, homogeneity, energy, and entropy, together with statistical features such as mean, variance, standard deviation, and convolution of modulated and normal frequencies. Finally, support vector machine (SVM) and K-nearest neighbour (KNN) classifiers are applied to classify the abnormal region based on the extracted features, and their performance is compared. An accuracy of 97.8% is obtained with the SVM classifier, higher than that of the KNN classifier. This approach does not require any human interaction for lesion detection. Thus, the improved toboggan algorithm can achieve precise lung lesion segmentation in CT images, and the extracted features help to classify the lesion region of the lungs efficiently.
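The GLCM texture features named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: the function name `glcm_features`, the horizontal offset, the symmetric counting, and the toy images are all assumptions. Energy is taken here as the sum of squared matrix entries; some libraries instead use its square root.

```python
import math


def glcm_features(image, levels, dr=0, dc=1):
    """Texture features from a normalized, symmetric grey level co-occurrence matrix.

    image: 2-D list of integer grey levels in range(levels).
    (dr, dc): pixel offset that defines which neighbor pairs are counted.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count both directions (symmetric GLCM)
    total = sum(map(sum, counts))
    p = [[v / total for v in row] for row in counts]

    contrast = homogeneity = energy = entropy = 0.0
    for i in range(levels):
        for j in range(levels):
            pij = p[i][j]
            contrast += (i - j) ** 2 * pij
            homogeneity += pij / (1 + (i - j) ** 2)
            energy += pij ** 2
            if pij > 0:
                entropy -= pij * math.log2(pij)
    return {"contrast": contrast, "homogeneity": homogeneity,
            "energy": energy, "entropy": entropy}


# Toy images (assumed): a uniform patch and a checkerboard patch.
flat = [[1, 1, 1], [1, 1, 1]]
checker = [[0, 1, 0], [1, 0, 1]]
flat_feats = glcm_features(flat, levels=2)
checker_feats = glcm_features(checker, levels=2)
```

The uniform patch gives zero contrast and zero entropy, while the checkerboard gives higher contrast and entropy. Feature vectors of this kind are what a classifier such as SVM or KNN would then take as input.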
Video Feature Extraction Based on Modified LLE Using Adaptive Nearest Neighbor Approach
Locally linear embedding (LLE) is an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional data. LLE attempts to discover nonlinear structure in high-dimensional data by exploiting the local symmetries of linear reconstructions. In this paper, video feature extraction is performed using a modified LLE along with an adaptive nearest neighbor approach to find the nearest neighbors and the connected components. The proposed feature extraction method is applied to a video. The resulting video feature description provides a new tool for video analysis.
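For reference, the standard (unmodified) LLE formulation can be written as two least-squares problems. The abstract does not specify the paper's modification or its adaptive neighbor rule, so the notation below is an assumption: $x_i$ are the input points, $\mathcal{N}(i)$ is the neighbor set of point $i$, $W$ holds the reconstruction weights, and $y_i$ are the low-dimensional embeddings.

```latex
% Step 1: reconstruct each point from its neighbors
\min_{W} \; \sum_{i} \Bigl\| x_i - \sum_{j \in \mathcal{N}(i)} W_{ij}\, x_j \Bigr\|^2
\quad \text{subject to} \quad \sum_{j \in \mathcal{N}(i)} W_{ij} = 1 \;\; \text{for all } i

% Step 2: find embeddings that preserve the same weights
\min_{Y} \; \sum_{i} \Bigl\| y_i - \sum_{j \in \mathcal{N}(i)} W_{ij}\, y_j \Bigr\|^2
\quad \text{subject to} \quad \sum_{i} y_i = 0, \qquad \frac{1}{N} \sum_{i} y_i y_i^{\top} = I
```

An adaptive nearest neighbor approach would change how $\mathcal{N}(i)$ is chosen for each point, while the two optimization steps keep the same form.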
Face Detection with Effective Feature Extraction
There is an abundant literature on face detection due to its important role
in many vision applications. Since Viola and Jones proposed the first real-time
AdaBoost based face detector, Haar-like features have been adopted as the
method of choice for frontal face detection. In this work, we show that simple
features other than Haar-like features can also be applied for training an
effective face detector. Since a single feature is not discriminative enough to
separate faces from difficult non-faces, we further improve the generalization
performance of our simple features by introducing feature co-occurrences. We
demonstrate that our proposed features yield a performance improvement compared
to Haar-like features. In addition, our findings indicate that features play a
crucial role in the ability of the system to generalize. Comment: 7 pages. Conference version published in Asian Conf. Comp. Vision
201
A new kernel method for hyperspectral image feature extraction
Hyperspectral imagery provides abundant spectral information for remote discrimination of subtle differences in ground cover. However, the increasing number of spectral dimensions, together with information redundancy, makes the analysis and interpretation of hyperspectral images a challenge. Feature extraction is therefore a very important step in hyperspectral image processing. Feature extraction methods aim to reduce the dimensionality of the data while preserving as much information as possible. In particular, nonlinear feature extraction methods (e.g. the kernel minimum noise fraction (KMNF) transformation) have been reported to benefit many applications of hyperspectral remote sensing, owing to their good preservation of high-order structures in the original data. However, conventional KMNF and its extensions have limitations in noise fraction estimation during feature extraction, which leads to poor performance in downstream applications. This paper proposes a novel nonlinear feature extraction method for hyperspectral images. Instead of estimating the noise fraction from nearest-neighborhood information (within a sliding window), the proposed method explores the use of image segmentation. The approach benefits both noise fraction estimation and information preservation, and enables a significant improvement in classification. Experimental results on two real hyperspectral images demonstrate the efficiency of the proposed method. Compared to conventional KMNF, the method improves classification on the two hyperspectral images by 8% and 11%. This nonlinear feature extraction method can also be applied to other disciplines where high-dimensional data analysis is required.