Multi-label Ferns for Efficient Recognition of Musical Instruments in Recordings
In this paper we introduce multi-label ferns, and apply this technique for
automatic classification of musical instruments in audio recordings. We compare
the performance of our proposed method to a set of binary random ferns, using
jazz recordings as input data. Our main result is much faster
classification and a higher F-score. We also achieve a substantial reduction of the
model size.
Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks
We present a novel learning-based approach to estimate the
direction-of-arrival (DOA) of a sound source using a convolutional recurrent
neural network (CRNN) trained via regression on synthetic data and Cartesian
labels. We also describe an improved method to generate synthetic data to train
the neural network using state-of-the-art sound propagation algorithms that
model specular as well as diffuse reflections of sound. We compare our model
against three other CRNNs trained using different formulations of the same
problem: classification on categorical labels, and regression on spherical
coordinate labels. In practice, our model achieves up to 43% decrease in
angular error over prior methods. The use of diffuse reflection results in 34%
and 41% reduction in angular prediction errors on LOCATA and SOFA datasets,
respectively, over prior methods based on image-source methods. Our method
results in an additional 3% error reduction over prior schemes that use
classification-based networks, and uses 36% fewer network parameters.
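The contrast between Cartesian and spherical-coordinate labels in the abstract above can be illustrated with a small sketch. It is not the authors' code: the function names and the azimuth/elevation convention are assumptions made for illustration, and the wrap-around behaviour shown is one general property of angular labels, not a claim taken from the abstract.

```python
import math

def spherical_to_cartesian(azimuth_deg, elevation_deg):
    """Unit vector for a direction given as azimuth/elevation in degrees.

    Assumed convention: azimuth in the x-y plane, elevation measured from it.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def angular_error_deg(u, v):
    """Angle in degrees between two 3-D vectors (need not be unit length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding
    return math.degrees(math.acos(cos_angle))

# Two directions that straddle the azimuth wrap-around at +/-180 degrees.
truth = spherical_to_cartesian(179.0, 0.0)
pred = spherical_to_cartesian(-179.0, 0.0)

naive_gap = abs(179.0 - (-179.0))          # difference of the raw azimuth labels
true_gap = angular_error_deg(truth, pred)  # angle between the two unit vectors
```

Here the raw azimuth labels differ by 358 degrees although the two directions are only about 2 degrees apart on the sphere, whereas the Cartesian unit vectors are nearly identical.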
Manifold Learning in MR spectroscopy using nonlinear dimensionality reduction and unsupervised clustering
Purpose: To investigate whether nonlinear dimensionality reduction improves unsupervised classification of 1H MRS brain tumor data compared with a linear method. Methods: In vivo single-voxel 1H magnetic resonance spectroscopy (55 patients) and 1H magnetic resonance spectroscopy imaging (MRSI) (29 patients) data were acquired from histopathologically diagnosed gliomas. Data reduction using Laplacian eigenmaps (LE) or independent component analysis (ICA) was followed by k-means clustering or agglomerative hierarchical clustering (AHC) for unsupervised learning to assess tumor grade and for tissue type segmentation of MRSI data. Results: An accuracy of 93% in classification of glioma grade II and grade IV, with 100% accuracy in distinguishing tumor and normal spectra, was obtained by LE with unsupervised clustering, but not with the combination of k-means and ICA. With 1H MRSI data, LE provided a more linear distribution of data for cluster analysis and better cluster stability than ICA. LE combined with k-means or AHC provided 91% accuracy for classifying tumor grade and 100% accuracy for identifying normal tissue voxels. Color-coded visualization of normal brain, tumor core, and infiltration regions was achieved with LE combined with AHC. Conclusion: The LE method is promising for unsupervised clustering to separate brain and tumor tissue with automated color-coding for visualization of 1H MRSI data after cluster analysis.
Classification accuracy increase using multisensor data fusion
The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.)
but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to the
confusion of materials such as different roofs, pavements, roads, etc. and therefore may provide wrong interpretation and use of classification
products. Employment of hyperspectral data is another solution, but its low spatial resolution (compared to multispectral
data) restricts its usage for many applications. Another improvement can be achieved by fusing multisensor data, since
this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed
for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution
SAR and multispectral data fusion for automatic classification in urban areas. Single polarization TerraSAR-X (SpotLight mode) and
multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised
clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network).
This framework provides a suitable way to combine multisource data following consensus theory. The classification is not influenced
by the limitations of dimensionality, and the computational complexity primarily depends on the dimensionality reduction step. Fusion
of single polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allows different types
of urban objects to be classified into predefined classes of interest with increased accuracy. A comparison to classification results
of WorldView-2 multispectral data (8 spectral bands) is provided, and the numerical evaluation of the method in comparison to other
established methods illustrates the advantage in classification accuracy for many classes such as buildings, low vegetation, sport
objects, forest, roads, railroads, etc.
Feature Reduction using a Singular Value Decomposition for the Iterative Guided Spectral Class Rejection Hybrid Classifier
Feature reduction in a remote sensing dataset is often desirable to decrease the processing
time required to perform a classification and improve overall classification accuracy. This work introduces
a feature reduction method based on the singular value decomposition (SVD). This feature reduction
technique was applied to training data from two multitemporal datasets of Landsat TM/ETM+ imagery
acquired over a forested area in Virginia, USA and Rondonia, Brazil. Subsequent parallel iterative guided
spectral class rejection (pIGSCR) forest/nonforest classifications were performed to determine the quality
of the feature reduction. The classifications of the Virginia data were five times faster using SVD-based
feature reduction without affecting the classification accuracy. Feature reduction using the SVD was also
compared to feature reduction using principal components analysis (PCA). The highest average accuracies
for the Virginia dataset (88.34%) and for the Rondonia dataset (93.31%) were achieved using the SVD.
The results presented here indicate that SVD-based feature reduction can produce statistically significantly
better classifications than PCA.
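As a rough illustration of the comparison in the abstract above, the sketch below reduces a feature matrix with a truncated SVD and with PCA. The abstract does not specify how its SVD method is constructed, so this assumes the simplest reading: the SVD is applied to the uncentered training matrix, while PCA is the same projection after mean-centering. The function names and the synthetic data are hypothetical stand-ins, not the Landsat datasets.

```python
import numpy as np

def svd_reduce(X, k):
    """Project rows of X (samples x features) onto the top-k right singular vectors."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:k]               # k orthonormal directions in feature space
    return X @ basis.T, basis

def pca_reduce(X, k):
    """PCA expressed as the same projection after centering each feature."""
    mean = X.mean(axis=0)
    Z, components = svd_reduce(X - mean, k)
    return Z, components, mean

# Synthetic stand-in for a training matrix (200 samples, 12 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12)) + 5.0

X_svd, basis = svd_reduce(X, 3)
X_pca, components, mean = pca_reduce(X, 3)
```

Either reduced matrix can then be passed to a classifier in place of the original features; new samples are projected with the stored `basis` (or with `components` after subtracting `mean`).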
An Empirical Study On Sampling Approaches For 3D Image Classification Using Deep Learning
A 3D classification method requires more training data than a 2D image classification method to achieve good performance. These training data usually come in the form of multiple 2D images (e.g., slices in a CT scan) or point clouds (e.g., 3D CAD modeling) for volumetric object representation. The amount of data required for this higher-dimensional problem comes at the cost of more processing time and space. This problem can be mitigated with data size reduction (i.e., sampling). In this thesis, we empirically study and compare the classification performance and deep learning training time of PointNet, which uses uniform random sampling and farthest point sampling; SampleNet, which uses a reduction approach based on a weighted average of nearest-neighbor points; and the Multi-view Convolutional Neural Network (MVCNN). Contrary to recent research claiming that SampleNet performs outright better than the simple sampling approaches used by PointNet, our experimental results show that SampleNet may not significantly reduce processing time, and yet it achieves poorer classification performance. Additionally, resolution reduction for the views in MVCNN achieves poor accuracy compared to view reduction. Moreover, our experimental results show that the simple sampling approaches used by PointNet, as well as simple view reduction with a multi-view classifier, can maintain accuracy while decreasing processing time for the 3D classification task.
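The two simple sampling approaches named in the abstract above can be sketched in a few lines. This is a minimal pure-Python illustration, not the thesis code or the PointNet implementation; the function names, the `seed` and `start` parameters, and the toy point cloud are assumptions.

```python
import math
import random

def uniform_random_sample(points, k, seed=0):
    """Pick k points uniformly at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(points, k)

def farthest_point_sample(points, k, start=0):
    """Greedy farthest point sampling: each new point maximises its
    distance to the nearest point already chosen. Assumes k <= len(points)."""
    chosen = [start]
    # min_dist[i] = distance from point i to its nearest already-chosen point
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return [points[i] for i in chosen]

# Toy cloud: three points clustered near the origin and two far away.
cloud = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0), (10, 0, 0), (0, 10, 0)]
fps_subset = farthest_point_sample(cloud, 3)
random_subset = uniform_random_sample(cloud, 3)
```

On this toy cloud, farthest point sampling keeps the starting point and then the two distant points, so the subset spans the extent of the shape, while uniform random sampling has no such coverage guarantee.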
Optimized kernel minimum noise fraction transformation for hyperspectral image classification
This paper presents an optimized kernel minimum noise fraction transformation (OKMNF) for feature extraction of hyperspectral imagery. The proposed approach is based on the kernel minimum noise fraction (KMNF) transformation, which is a nonlinear dimensionality reduction method. KMNF can map the original data into a higher dimensional feature space and provide a small number of quality features for classification and other post-processing. Noise estimation is an important component of KMNF. Noise is often estimated based on a strong relationship between adjacent pixels. However, hyperspectral images have limited spatial resolution and usually contain a large number of mixed pixels, which makes the spatial information less reliable for noise estimation. This is the main reason that KMNF generally shows unstable performance in feature extraction for classification. To overcome this problem, this paper uses more accurate noise estimation to improve KMNF. We propose two new methods to estimate noise accurately. Moreover, we propose a framework to improve noise estimation, in which both spectral and spatial de-correlation are exploited. Experimental results, obtained using a variety of hyperspectral images, indicate that the proposed OKMNF is superior to some other related dimensionality reduction methods in most cases. Compared to the conventional KMNF, the proposed OKMNF achieves significant improvements in overall classification accuracy.
- …