973,874 research outputs found

    Multi-label Ferns for Efficient Recognition of Musical Instruments in Recordings

    In this paper we introduce multi-label ferns and apply the technique to automatic classification of musical instruments in audio recordings. We compare the performance of our proposed method to a set of binary random ferns, using jazz recordings as input data. Our main result is much faster classification and a higher F-score. We also achieve a substantial reduction of the model size.

    Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks

    We present a novel learning-based approach to estimate the direction-of-arrival (DOA) of a sound source using a convolutional recurrent neural network (CRNN) trained via regression on synthetic data and Cartesian labels. We also describe an improved method to generate synthetic data to train the neural network using state-of-the-art sound propagation algorithms that model specular as well as diffuse reflections of sound. We compare our model against three other CRNNs trained using different formulations of the same problem: classification on categorical labels, and regression on spherical coordinate labels. In practice, our model achieves up to a 43% decrease in angular error over prior methods. The use of diffuse reflection results in 34% and 41% reductions in angular prediction error on the LOCATA and SOFA datasets, respectively, over prior methods based on image-source methods. Our method results in an additional 3% error reduction over prior schemes that use classification-based networks, and we use 36% fewer network parameters.
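
The contrast between Cartesian and spherical regression labels can be illustrated with a small sketch. It converts azimuth and elevation into a unit vector and measures the angle between two directions. The function names and the angle convention (azimuth in the x-y plane measured from the x-axis, elevation measured from that plane) are assumptions made for illustration and are not taken from the paper.

```python
import math

def spherical_to_cartesian(azimuth_deg, elevation_deg):
    """Return the unit vector for a direction given in degrees (assumed convention)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def angular_error_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

# Two nearly identical directions on either side of the azimuth wrap-around.
truth = spherical_to_cartesian(179.0, 0.0)
pred = spherical_to_cartesian(-179.0, 0.0)

azimuth_label_gap = abs(179.0 - (-179.0))   # naive difference of spherical labels
true_gap = angular_error_deg(truth, pred)   # actual angle between the directions
print(azimuth_label_gap, true_gap)
```

In this example the raw azimuth labels differ by 358 degrees even though the directions are about 2 degrees apart, which is one plausible reason a Cartesian target may be easier to regress than a spherical one.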

    Manifold Learning in MR spectroscopy using nonlinear dimensionality reduction and unsupervised clustering

    Purpose: To investigate whether nonlinear dimensionality reduction improves unsupervised classification of 1H MRS brain tumor data compared with a linear method. Methods: In vivo single-voxel 1H magnetic resonance spectroscopy (55 patients) and 1H magnetic resonance spectroscopy imaging (MRSI) (29 patients) data were acquired from histopathologically diagnosed gliomas. Data reduction using Laplacian eigenmaps (LE) or independent component analysis (ICA) was followed by k-means clustering or agglomerative hierarchical clustering (AHC) for unsupervised learning to assess tumor grade and for tissue type segmentation of MRSI data. Results: An accuracy of 93% in classification of glioma grade II and grade IV, with 100% accuracy in distinguishing tumor and normal spectra, was obtained by LE with unsupervised clustering, but not with the combination of k-means and ICA. With 1H MRSI data, LE provided a more linear distribution of data for cluster analysis and better cluster stability than ICA. LE combined with k-means or AHC provided 91% accuracy for classifying tumor grade and 100% accuracy for identifying normal tissue voxels. Color-coded visualization of normal brain, tumor core, and infiltration regions was achieved with LE combined with AHC. Conclusion: The LE method is promising for unsupervised clustering to separate brain and tumor tissue with automated color-coding for visualization of 1H MRSI data after cluster analysis.

    Classification accuracy increase using multisensor data fusion

    The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.), but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to the confusion of materials such as different roofs, pavements, and roads, and may therefore lead to wrong interpretation and use of classification products. Employing hyperspectral data is another solution, but its low spatial resolution (compared with multispectral data) restricts its usage for many applications. Another improvement can be achieved by fusing multisensor data, since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of multisource data combination following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the step of dimensionality reduction. Fusion of single polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allows different types of urban objects to be classified into predefined classes of interest with increased accuracy. A comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided, and the numerical evaluation of the method in comparison to other established methods illustrates the advantage in classification accuracy for many classes such as buildings, low vegetation, sport objects, forest, roads, and railroads.

    Feature Reduction using a Singular Value Decomposition for the Iterative Guided Spectral Class Rejection Hybrid Classifier

    Feature reduction in a remote sensing dataset is often desirable to decrease the processing time required to perform a classification and improve overall classification accuracy. This work introduces a feature reduction method based on the singular value decomposition (SVD). This feature reduction technique was applied to training data from two multitemporal datasets of Landsat TM/ETM+ imagery acquired over a forested area in Virginia, USA and Rondonia, Brazil. Subsequent parallel iterative guided spectral class rejection (pIGSCR) forest/nonforest classifications were performed to determine the quality of the feature reduction. The classifications of the Virginia data were five times faster using SVD-based feature reduction without affecting the classification accuracy. Feature reduction using the SVD was also compared to feature reduction using principal components analysis (PCA). The highest average accuracies for the Virginia dataset (88.34%) and for the Rondonia dataset (93.31%) were achieved using the SVD. The results presented here indicate that SVD-based feature reduction can produce statistically significantly better classifications than PCA.
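
As a rough illustration of SVD-based feature reduction, the sketch below projects a samples-by-features matrix onto its leading right singular vectors. The abstract does not describe the paper's exact procedure, so this is a generic truncated-SVD projection on uncentered, synthetic data. The function name `svd_reduce`, the choice of `k`, and the data itself are assumptions for illustration only.

```python
import numpy as np

def svd_reduce(X, k):
    """Project rows of X (samples x features) onto the top-k right singular vectors."""
    # Thin SVD: X = U @ diag(s) @ Vt, singular values sorted in descending order.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:k].T            # features x k, orthonormal columns
    return X @ basis, basis

rng = np.random.default_rng(0)

# Synthetic stand-in for a multitemporal image stack (NOT real Landsat data):
# 200 pixels, 12 bands, generated from 3 latent factors plus small noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 12))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 12))

X_reduced, basis = svd_reduce(X, k=3)

# Relative reconstruction error indicates how much information the 3 features keep.
reconstruction = X_reduced @ basis.T
rel_err = np.linalg.norm(X - reconstruction) / np.linalg.norm(X)
print(X_reduced.shape, rel_err)
```

A classifier would then be trained on `X_reduced` instead of `X`; the smaller feature count is what would account for a speed-up of the kind the abstract reports.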

    An Empirical Study On Sampling Approaches For 3D Image Classification Using Deep Learning

    A 3D classification method requires more training data than a 2D image classification method to achieve good performance. These training data usually come in the form of multiple 2D images (e.g., slices in a CT scan) or point clouds (e.g., 3D CAD modeling) for volumetric object representation. The amount of data required for this higher-dimensional problem comes at the cost of more processing time and space. This problem can be mitigated with data size reduction (i.e., sampling). In this thesis, we empirically study and compare the classification performance and deep learning training time of PointNet, which utilizes uniform random sampling and farthest point sampling; SampleNet, which utilizes a reduction approach based on a weighted average of nearest neighbor points; and the Multi-view Convolution Neural Network (MVCNN). Contrary to recent research claiming that SampleNet performs outright better than the simple sampling approaches used by PointNet, our experimental results show that SampleNet may not significantly reduce processing time and yet achieves poorer classification performance. Additionally, resolution reduction for the views in MVCNN achieves poor accuracy when compared to view reduction. Moreover, our experimental results show that the simple sampling approaches used by PointNet, as well as simple view reduction in a multi-view classifier, can maintain accuracy while decreasing processing time for the 3D classification task.
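
The two simple sampling approaches named in the abstract can be sketched in a few lines. This is a plain-Python illustration on a synthetic point cloud, not the thesis code or the PointNet implementation. The function names, the `start` and `seed` parameters, and the cloud itself are assumptions.

```python
import random

def uniform_random_sampling(points, m, seed=0):
    """Pick m points uniformly at random without replacement."""
    return random.Random(seed).sample(points, m)

def farthest_point_sampling(points, m, start=0):
    """Greedily pick m points, each one farthest from the points picked so far."""
    if not 0 < m <= len(points):
        raise ValueError("m must be between 1 and len(points)")

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    selected = [start]
    # min_sq[i] is the squared distance from point i to its nearest selected point.
    min_sq = [sq_dist(p, points[start]) for p in points]
    while len(selected) < m:
        nxt = max(range(len(points)), key=min_sq.__getitem__)
        selected.append(nxt)
        min_sq = [min(d, sq_dist(p, points[nxt])) for d, p in zip(min_sq, points)]
    return [points[i] for i in selected]

# Synthetic cloud of 500 points in the unit cube (illustrative data only).
rng = random.Random(42)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(500)]

fps = farthest_point_sampling(cloud, 32)
urs = uniform_random_sampling(cloud, 32)
print(len(fps), len(urs))
```

Farthest point sampling costs O(n·m) distance evaluations but tends to cover the shape more evenly, whereas uniform random sampling is cheaper and may leave sparse regions unrepresented.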

    Optimized kernel minimum noise fraction transformation for hyperspectral image classification

    This paper presents an optimized kernel minimum noise fraction transformation (OKMNF) for feature extraction of hyperspectral imagery. The proposed approach is based on the kernel minimum noise fraction (KMNF) transformation, which is a nonlinear dimensionality reduction method. KMNF can map the original data into a higher dimensional feature space and provide a small number of quality features for classification and some other post processing. Noise estimation is an important component in KMNF. Noise is often estimated based on a strong relationship between adjacent pixels. However, hyperspectral images have limited spatial resolution and usually have a large number of mixed pixels, which makes the spatial information less reliable for noise estimation. This is the main reason that KMNF generally shows unstable performance in feature extraction for classification. To overcome this problem, this paper uses a more accurate noise estimation method to improve KMNF. We propose two new methods that estimate noise more accurately. Moreover, we also propose a framework to improve noise estimation, where both spectral and spatial de-correlation are exploited. Experimental results, conducted using a variety of hyperspectral images, indicate that the proposed OKMNF is superior to some other related dimensionality reduction methods in most cases. Compared to the conventional KMNF, the proposed OKMNF yields significant improvements in overall classification accuracy.