
    Ant Colony Optimization Based Subset Feature Selection in Speech Processing: Constructing Graphs with Degree Sequences

    Feature selection, the process of selecting the most discriminating feature subset, is an essential practice in speech processing that significantly affects classification performance. However, the sheer number of features present in speech processing makes feature selection difficult. Moreover, finding the optimal feature subset is an NP-hard problem, since the search space contains 2^n candidate subsets. A good search strategy is therefore required to avoid evaluating a large number of combinations over the whole space of feature subsets. As a result, many heuristic search algorithms have been developed in recent years to address this NP-hard problem. One family of metaheuristic algorithms applied to the feature selection problem in many application domains is Ant Colony Optimization (ACO), which is inspired by the foraging behavior of real ants. The success of an ACO-based feature selection algorithm depends on the choice of the construction graph with respect to runtime behavior. While most ACO-based feature selection algorithms use fully connected construction graphs, this paper proposes an ACO-based algorithm that uses graphs with prescribed degree sequences. In this method, the degree sequence of the graph representing the search space is predicted, and a construction graph satisfying that degree sequence is generated. This direction in graph representation for ACO algorithms may reduce the computational complexity from O(n^2) to O(nm), where m is the number of edges. The paper outlines popular optimization-based feature selection algorithms used in speech processing applications, reviews the ACO algorithm and its main variants, explains ACO-based feature selection and surveys its application to various speech processing tasks, and finally proposes a degree-based graph construction for ACO algorithms.
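
    As a rough illustration of the idea (not the paper's algorithm), the sketch below runs ACO-style subset selection over a construction graph generated from a prescribed degree sequence instead of a fully connected graph; the fitness function, parameter values, and helper names are assumptions for the example.

```python
# Hypothetical sketch: ACO-style feature-subset search on a construction graph
# built from a prescribed degree sequence (not a clique). Requires numpy, networkx.
import numpy as np
import networkx as nx

def evaluate_subset(X, y, subset):
    # Stand-in fitness: mean |correlation| with the target minus a size penalty.
    # A real wrapper would train and score a classifier here.
    corrs = [abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset]
    return float(np.mean(corrs)) - 0.01 * len(subset)

def aco_select(X, y, degree_sequence, n_ants=10, n_iters=20, subset_size=5, rho=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Construction graph from the prescribed degree sequence (configuration model,
    # simplified to drop parallel edges and self-loops).
    G = nx.Graph(nx.configuration_model(degree_sequence, seed=seed))
    G.remove_edges_from(nx.selfloop_edges(G))
    tau = {tuple(sorted(e)): 1.0 for e in G.edges}      # pheromone per edge
    best, best_score = None, -np.inf
    for _ in range(n_iters):
        trails = []
        for _ in range(n_ants):
            current = int(rng.integers(X.shape[1]))
            subset = [current]
            while len(subset) < subset_size:
                nbrs = [v for v in G.neighbors(current) if v not in subset]
                if not nbrs:
                    break
                w = np.array([tau[tuple(sorted((current, v)))] for v in nbrs])
                current = int(rng.choice(nbrs, p=w / w.sum()))
                subset.append(current)
            score = evaluate_subset(X, y, subset)
            trails.append((subset, score))
            if score > best_score:
                best, best_score = subset, score
        tau = {e: (1 - rho) * t for e, t in tau.items()}  # evaporation
        for subset, score in trails:                      # deposit along each ant's trail
            for u, v in zip(subset, subset[1:]):
                tau[tuple(sorted((u, v)))] += max(score, 0.0)
    return sorted(best), best_score

# Toy run: 30 features, every node given degree 4 in the construction graph.
X = np.random.default_rng(1).normal(size=(200, 30))
y = (X[:, 0] + X[:, 3] > 0).astype(float)
print(aco_select(X, y, degree_sequence=[4] * 30))
```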

    Mining Brain Networks using Multiple Side Views for Neurological Disorder Identification

    Mining discriminative subgraph patterns from graph data has attracted great interest in recent years and has a wide variety of applications in disease diagnosis, neuroimaging, and related areas. Most research on subgraph mining focuses on the graph representation alone. However, in many real-world applications, side information is available along with the graph data. For example, in neurological disorder identification, in addition to the brain networks derived from neuroimaging data, hundreds of clinical, immunologic, serologic, and cognitive measures may also be documented for each subject. These measures compose multiple side views encoding a tremendous amount of supplemental information for diagnostic purposes, yet they are often ignored. In this paper, we study the problem of discriminative subgraph selection using multiple side views and propose a novel solution to find an optimal set of subgraph features for graph classification by exploring a plurality of side views. We derive a feature evaluation criterion, named gSide, to estimate the usefulness of subgraph patterns based upon side views. We then develop a branch-and-bound algorithm, called gMSV, to efficiently search for optimal subgraph features by integrating the subgraph mining process with the procedure of discriminative feature selection. Empirical studies on graph classification tasks for neurological disorders using brain networks demonstrate that subgraph patterns selected by the multi-side-view guided subgraph selection approach can effectively boost graph classification performance and are relevant to disease diagnosis. Comment: in Proceedings of the IEEE International Conference on Data Mining (ICDM) 201
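
    A toy illustration of the side-view idea (not the gSide criterion or the gMSV search themselves): candidate subgraph-indicator features are scored by how well they agree with subject similarities computed from each side view. All names and the scoring function below are assumptions made for the example.

```python
# Toy side-view-guided scoring of candidate subgraph-indicator features.
# Assumption: a feature that takes similar values for subjects who are similar
# in the side views is preferred; the real gSide criterion differs.
import numpy as np

def side_view_score(feature, side_views, gamma=1.0):
    """feature: vector (n_subjects,) indicating presence of a subgraph per subject.
    side_views: list of (n_subjects, d_v) arrays of clinical/cognitive measures."""
    score = 0.0
    for V in side_views:
        # Subject-subject similarity within this side view (RBF kernel).
        sq_dists = ((V[:, None, :] - V[None, :, :]) ** 2).sum(-1)
        W = np.exp(-gamma * sq_dists)
        # Penalize features that disagree for subjects similar in the side view.
        diff = (feature[:, None] - feature[None, :]) ** 2
        score -= (W * diff).sum()
    return score

# Rank a few candidate subgraph features by their agreement with the side views.
rng = np.random.default_rng(0)
n = 40
side_views = [rng.normal(size=(n, 5)), rng.normal(size=(n, 3))]
candidates = {f"subgraph_{k}": rng.integers(0, 2, size=n).astype(float) for k in range(4)}
ranked = sorted(candidates, key=lambda k: side_view_score(candidates[k], side_views), reverse=True)
print(ranked)
```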

    Joint Hypergraph Learning and Sparse Regression for Feature Selection

    In this paper, we propose a unified framework for improved structure estimation and feature selection. Most existing graph-based feature selection methods utilise a static representation of the structure of the available data based on the Laplacian matrix of a simple graph. Here, on the other hand, we perform data structure learning and feature selection simultaneously. To improve the estimation of the manifold representing the structure of the selected features, we use a higher-order description of the neighbourhood structures present in the available data via hypergraph learning. This allows those features which participate in the most significant higher-order relations to be selected, and the remainder discarded, through a sparsification process. We formulate a single objective function to capture and regularise the hypergraph weight estimation and feature selection processes. Finally, we present an optimization algorithm to recover the hypergraph weights and a sparse set of feature selection indicators. This approach offers a number of advantages. First, by adjusting the hypergraph weights, we preserve high-order neighborhood relations reflected in the original data, which cannot be modeled by a simple graph. Moreover, our objective function captures the global discriminative structure of the features in the data. Comprehensive experiments on 9 benchmark data sets show that our method achieves statistically significant improvement over state-of-the-art feature selection methods, supporting the effectiveness of the proposed method.
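
    The sketch below illustrates only one ingredient of such a framework under simplifying assumptions: a hypergraph whose hyperedges are k-nearest-neighbour groups of samples, its normalized Laplacian (in the style of Zhou et al.), and a per-feature smoothness score over that hypergraph. The paper's joint hypergraph-weight learning and sparse regression objective is not reproduced here.

```python
# Hypergraph construction and a hypergraph-smoothness feature score; a partial,
# assumption-laden stand-in for the paper's joint objective.
import numpy as np

def hypergraph_laplacian(X, k=5):
    """Normalized hypergraph Laplacian with kNN hyperedges and unit hyperedge weights."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    H = np.zeros((n, n))                      # column j = hyperedge of sample j's neighbourhood
    for j in range(n):
        H[np.argsort(d2[:, j])[:k + 1], j] = 1.0
    Dv = H.sum(1)                             # vertex degrees
    De = H.sum(0)                             # hyperedge degrees
    S = (H / De) @ H.T                        # H W De^-1 H^T with W = I
    D_inv_sqrt = 1.0 / np.sqrt(Dv)
    return np.eye(n) - D_inv_sqrt[:, None] * S * D_inv_sqrt[None, :]

def hypergraph_smoothness_scores(X, L):
    """Lower score = feature varies more smoothly over the hypergraph structure."""
    Xc = (X - X.mean(0)) / (X.std(0) + 1e-12)
    return np.array([Xc[:, j] @ L @ Xc[:, j] for j in range(X.shape[1])])

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))
L = hypergraph_laplacian(X)
print(np.argsort(hypergraph_smoothness_scores(X, L)))  # most structure-consistent features first
```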

    MCFS: Min-cut-based feature-selection

    In this paper, we present MCFS (Min-Cut-based feature-selection), a feature-selection algorithm based on representing the features of a dataset by means of a directed graph. The main contribution of our work is to show the usefulness of a general graph-processing technique for the feature-selection problem on classification datasets. The vertices of the graphs used herein are the features together with two special-purpose vertices, one denoting high correlation with the dataset's class and the other denoting low correlation with the class. The edges are functions of the correlations among the features and between the features and the class. A classic max-flow min-cut algorithm is applied to this graph, and the cut returned by the algorithm provides the selected features. We have compared the results of our proposal with well-known feature-selection techniques. Our algorithm obtains results statistically similar to those achieved by the other techniques in terms of the number of features selected, while significantly improving accuracy. Funding: Ministerio de Ciencia, Innovación y Universidades RTI2018-098062-A-I00; Ministerio de Economía y Competitividad TIN2017-82113-C2-1-
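
    A hypothetical sketch of the min-cut idea (the exact capacity functions and graph construction in MCFS may differ): features are vertices, two terminal vertices stand for high and low correlation with the class, edge capacities come from absolute correlations, and the features on the "high" side of the minimum cut are selected.

```python
# Min-cut-style feature split; an illustrative sketch, not the MCFS algorithm itself.
import numpy as np
import networkx as nx

def min_cut_feature_selection(X, y):
    p = X.shape[1]
    corr_y = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(p)])
    G = nx.DiGraph()
    for j in range(p):
        # Terminal "HIGH": affinity to the class; terminal "LOW": the complement.
        G.add_edge("HIGH", j, capacity=corr_y[j])
        G.add_edge(j, "LOW", capacity=1.0 - corr_y[j])
        for i in range(j):
            # Mutually correlated features are pushed toward the same side of the cut.
            c = abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
            G.add_edge(i, j, capacity=c)
            G.add_edge(j, i, capacity=c)
    cut_value, (high_side, low_side) = nx.minimum_cut(G, "HIGH", "LOW")
    return sorted(f for f in high_side if f != "HIGH")

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 12))
y = (X[:, 0] + X[:, 4] > 0).astype(float)
print(min_cut_feature_selection(X, y))
```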

    Identifying the most informative features using a structurally interacting elastic net

    Feature selection can efficiently identify the most informative features with respect to the target feature used in training. However, state-of-the-art vector-based methods are unable to encapsulate the relationships between feature samples in the feature selection process, leading to significant information loss. To address this problem, we propose a new graph-based structurally interacting elastic net method for feature selection. Specifically, we commence by constructing feature graphs that can incorporate pairwise relationships between samples. With the feature graphs to hand, we propose a new information-theoretic criterion to measure the joint relevance of different pairwise feature combinations with respect to the target feature graph representation. This measure is used to obtain a structural interaction matrix whose elements represent the proposed information-theoretic measure between feature pairs. We then formulate a new optimization model by combining the structural interaction matrix with an elastic net regression model for the feature subset selection problem. This allows us to (a) preserve the information of the original vectorial space, (b) remedy the information loss of the original feature space caused by using a graph representation, and (c) promote a sparse solution while encouraging correlated features to be selected. Because the proposed optimization problem is non-convex, we develop an efficient alternating direction method of multipliers (ADMM) algorithm to locate the optimal solutions. Extensive experiments on various datasets demonstrate the effectiveness of the proposed method.
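
    As a loosely analogous sketch (an assumption-laden stand-in, not the paper's formulation or its ADMM solver): mutual information serves as the information-theoretic relevance proxy used to build a pairwise interaction matrix, whose graph Laplacian becomes the quadratic term of an elastic-net-style objective solved here by a simple proximal-gradient loop.

```python
# Interaction-regularized elastic-net sketch: L1 sparsity plus a graph penalty
# w^T L_S w that pulls strongly interacting features toward similar weights.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def interaction_laplacian(X, y):
    rel = mutual_info_regression(X, y)           # per-feature relevance to the target
    S = np.outer(np.sqrt(rel), np.sqrt(rel))     # simple symmetric interaction proxy
    np.fill_diagonal(S, 0.0)
    return np.diag(S.sum(1)) - S                 # graph Laplacian of the interaction graph

def structural_elastic_net(X, y, L_S, alpha=0.05, beta=0.5, n_iters=1000):
    n, p = X.shape
    Q = X.T @ X / n + beta * L_S
    step = 1.0 / np.linalg.norm(Q, 2)
    w = np.zeros(p)
    for _ in range(n_iters):
        w = w - step * (Q @ w - X.T @ y / n)                        # gradient on smooth part
        w = np.sign(w) * np.maximum(np.abs(w) - step * alpha, 0.0)  # soft-threshold (L1)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))
y = 2 * X[:, 1] + X[:, 6] + 0.1 * rng.normal(size=200)
w = structural_elastic_net(X, y, interaction_laplacian(X, y))
print("selected:", np.flatnonzero(np.abs(w) > 1e-3))
```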

    Homophily Outlier Detection in Non-IID Categorical Data

    Most existing outlier detection methods assume that the outlier factors (i.e., outlierness scoring measures) of data entities (e.g., feature values and data objects) are Independent and Identically Distributed (IID). This assumption does not hold in real-world applications where the outlierness of different entities is dependent on each other and/or drawn from different probability distributions (non-IID). This may lead to the failure to detect important outliers that are too subtle to be identified without considering the non-IID nature of the data. The issue is even more pronounced in challenging contexts, e.g., high-dimensional data with many noisy features. This work introduces a novel outlier detection framework, and two of its instances, to identify outliers in categorical data by capturing non-IID outlier factors. Our approach first defines distribution-sensitive outlier factors and incorporates them and their interdependence into a value-value graph-based representation. It then models an outlierness propagation process on the value graph to learn the outlierness of feature values. The learned value outlierness allows for either direct outlier detection or outlying feature selection. The graph representation and mining approach is employed here to capture the rich non-IID characteristics. Our empirical results on 15 real-world data sets with different levels of data complexity show that (i) the proposed outlier detection methods significantly outperform five state-of-the-art methods at the 95%/99% confidence level, achieving 10%-28% AUC improvement on the 10 most complex data sets; and (ii) the proposed feature selection methods significantly outperform three competing methods in enabling subsequent outlier detection by two different existing detectors. Comment: To appear in the Data Mining and Knowledge Discovery Journal
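
    The toy sketch below captures only the general flavour under stated assumptions: categorical values form a value-value graph weighted by co-occurrence, a frequency-based initial outlierness is propagated over that graph in PageRank style, and each object's score aggregates the learned outlierness of its values. The paper's distribution-sensitive factors and its two concrete instances are different.

```python
# Value-graph outlierness propagation; an illustrative sketch, not the paper's method.
import numpy as np
import pandas as pd

def value_graph_outlierness(df, damping=0.85, n_iters=50):
    values = [(c, v) for c in df.columns for v in df[c].unique()]
    idx = {val: i for i, val in enumerate(values)}
    n = len(values)
    W = np.zeros((n, n))
    for _, row in df.iterrows():                         # co-occurrence weights between values
        present = [idx[(c, row[c])] for c in df.columns]
        for a in present:
            for b in present:
                if a != b:
                    W[a, b] += 1.0
    P = W / np.maximum(W.sum(1, keepdims=True), 1e-12)   # row-stochastic transition matrix
    freq = np.array([(df[c] == v).mean() for c, v in values])
    init = 1.0 - freq                                    # rarer values start more outlying
    init /= init.sum()
    s = init.copy()
    for _ in range(n_iters):                             # personalized-PageRank-style propagation
        s = (1 - damping) * init + damping * (P.T @ s)
    value_out = dict(zip(values, s))
    # Object score = mean learned outlierness of the object's values.
    return df.apply(lambda row: np.mean([value_out[(c, row[c])] for c in df.columns]), axis=1)

toy = pd.DataFrame({"colour": ["red"] * 8 + ["blue"], "shape": ["round"] * 8 + ["star"]})
print(value_graph_outlierness(toy))
```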

    Graphs from features: tree-based graph layout for feature analysis

    Feature analysis has become a critical task in data analysis and visualization. Graph structures are very flexible in terms of representation and may encode important information about features, but producing layouts adequate for analysis tasks is challenging. In this study, we propose and develop similarity-based graph layouts with the purpose of locating relevant patterns in sets of features, thus supporting feature analysis and selection. In the first step of the strategy, we apply a tree layout to accomplish node placement and provide an overview based on feature similarity. Drawing the remaining graph edges on demand then reveals further groupings and relationships among features. We evaluate these groupings and relationships in terms of their effectiveness in exploring feature sets for data analysis. Correlation of features with a target categorical attribute and feature ranking are added to support the task. Multidimensional projections are employed to plot the dataset based on selected attributes, revealing the effectiveness of the feature set. Our results show that the tree-graph layout framework allows for a number of observations that are very important in user-centric feature selection and are not easy to obtain with other available tools: it provides a way of finding relevant and irrelevant features, spurious sets of noisy features, groups of similar features, and opposing features, all of which are essential in different data analysis scenarios. Case studies in application areas centered on document, image, and sound data demonstrate the ability of the framework to quickly reach a satisfactory compact representation from a larger feature set.
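
    A minimal sketch of the graph side of such a pipeline, with assumed helper names and a generic spring layout standing in for the paper's tree layout: features become nodes, absolute correlation provides edge weights, a maximum-weight spanning tree gives the backbone used for placement, and the remaining strong edges are kept to be drawn on demand.

```python
# Feature similarity graph with a spanning-tree backbone; layout and rendering
# details from the paper are omitted.
import numpy as np
import networkx as nx

def feature_tree_graph(X, feature_names, extra_edge_threshold=0.4):
    C = np.abs(np.corrcoef(X, rowvar=False))
    G = nx.Graph()
    p = len(feature_names)
    for i in range(p):
        for j in range(i + 1, p):
            G.add_edge(feature_names[i], feature_names[j], weight=C[i, j])
    tree = nx.maximum_spanning_tree(G)            # similarity backbone for node placement
    extra = [(u, v, d["weight"]) for u, v, d in G.edges(data=True)
             if not tree.has_edge(u, v) and d["weight"] >= extra_edge_threshold]
    pos = nx.spring_layout(tree, seed=0)          # stand-in for the paper's tree layout
    return tree, extra, pos

rng = np.random.default_rng(0)
base = rng.normal(size=(300, 3))
X = np.column_stack([base[:, 0], base[:, 0] + 0.1 * rng.normal(size=300),
                     base[:, 1], base[:, 2]])
tree, extra, pos = feature_tree_graph(X, ["f0", "f0_copy", "f1", "f2"])
print(list(tree.edges(data="weight")), extra)
```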

    Political Text Scaling Meets Computational Semantics

    During the last fifteen years, automatic text scaling has become one of the key tools of the Text as Data community in political science. Prominent text scaling algorithms, however, rely on the assumption that latent positions can be captured just by leveraging information about word frequencies in the documents under study. We challenge this traditional view and present a new, semantically aware text scaling algorithm, SemScale, which combines recent developments in computational linguistics with unsupervised graph-based clustering. We conduct an extensive quantitative analysis over a collection of speeches from the European Parliament in five different languages and from two different legislative terms, and show that a scaling approach relying on semantic document representations is often better at capturing known underlying political dimensions than the established frequency-based (i.e., symbolic) scaling method. We further validate our findings through a series of experiments focused on text preprocessing and feature selection, document representation, scaling of party manifestos, and a supervised extension of our algorithm. To catalyze further research on this new branch of text scaling methods, we release a Python implementation of SemScale along with all included data sets and evaluation procedures. Comment: Updated version - accepted for Transactions on Data Science (TDS)
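
    SemScale itself is not reproduced here; the sketch below is a generic semantically aware scaling pipeline under stated assumptions: TF-IDF plus truncated SVD stands in for the semantic document representations, a cosine-similarity graph is built over documents, and the Fiedler vector of its Laplacian provides one-dimensional latent positions.

```python
# Generic graph-based text scaling sketch (not the SemScale algorithm).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

def scale_documents(texts, n_components=2):
    tfidf = TfidfVectorizer().fit_transform(texts)
    Z = TruncatedSVD(n_components=n_components, random_state=0).fit_transform(tfidf)
    Z /= np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12
    W = np.clip(Z @ Z.T, 0.0, None)          # cosine-similarity graph (negative ties dropped)
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W                # graph Laplacian
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1]                        # Fiedler vector as the latent 1-D position

speeches = [
    "we must cut taxes and reduce regulation for business",
    "lower taxes and free markets drive growth",
    "we need stronger public services and higher social spending",
    "invest in healthcare, education and welfare for all",
]
print(scale_documents(speeches))
```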