Adaptive feature selection based on the most informative graph-based features
In this paper, we propose a novel method to adaptively select the most informative and least redundant feature subset, which has strong discriminating power with respect to the target label. Unlike most traditional methods, which use vectorial features, our proposed approach is based on graph-based features and thus incorporates the relationships between feature samples into the feature selection process. To efficiently encapsulate the main characteristics of the graph-based features, we probe each graph structure using a steady-state random walk and compute the probability distribution of the walk visiting the vertices. Furthermore, we propose a new information-theoretic criterion to measure the joint relevance of different pairwise feature combinations with respect to the target feature, through the Jensen-Shannon divergence between the probability distributions from the random walks on different graphs. By solving a quadratic programming problem, we use the new measure to automatically locate the subset of the most informative features that have both low redundancy and strong discriminating power. Unlike most existing state-of-the-art feature selection methods, the proposed information-theoretic feature selection method can accommodate both continuous and discrete target features. Experiments on the problem of P2P lending platforms in China demonstrate the effectiveness of the proposed method.
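The two building blocks named in this abstract, the steady-state random walk distribution and the Jensen-Shannon divergence, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graphs, the function names, and the use of unweighted adjacency matrices are assumptions made for illustration, and the paper's relevance criterion and quadratic program are not reproduced. The sketch relies only on the standard result that, on an undirected graph, the steady-state probability of visiting a vertex is proportional to its degree.

```python
import math

def steady_state(adj):
    """Steady-state visiting probabilities of a random walk on an undirected
    graph given as an adjacency matrix: P(v) = deg(v) / sum of all degrees."""
    degrees = [sum(row) for row in adj]
    total = sum(degrees)
    if total == 0:
        raise ValueError("graph has no edges")
    return [d / total for d in degrees]

def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: H((p + q) / 2) - (H(p) + H(q)) / 2."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2

# Hypothetical toy graphs on the same four vertices (illustrative only).
path_graph = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
star_graph = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]

p = steady_state(path_graph)
q = steady_state(star_graph)
divergence = js_divergence(p, q)
```

With natural logarithms the divergence is symmetric and lies between 0 and ln 2, so it is zero for identical distributions and grows as the two graphs' visiting patterns differ.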
Unsupervised Feature Selection with Adaptive Structure Learning
The problem of feature selection has attracted considerable interest in the
past decade. Traditional unsupervised methods select the features which can
faithfully preserve the intrinsic structures of data, where the intrinsic
structures are estimated using all the input features of data. However, the
estimated intrinsic structures are unreliable/inaccurate when the redundant and
noisy features are not removed. Therefore, we face a dilemma here: one needs the
true structures of data to identify the informative features, and one needs the
informative features to accurately estimate the true structures of data. To
address this, we propose a unified learning framework which performs structure
learning and feature selection simultaneously. The structures are adaptively
learned from the results of feature selection, and the informative features are
reselected to preserve the refined structures of data. By leveraging the
interactions between these two essential tasks, we are able to capture accurate
structures and select more informative features. Experimental results on many
benchmark data sets demonstrate that the proposed method outperforms many
state-of-the-art unsupervised feature selection methods.
Designing Algorithms for Optimization of Parameters of Functioning of Intelligent System for Radionuclide Myocardial Diagnostics
We explored how the number of complex components of the fast Fourier transform, used in analyzing polar maps from radionuclide examination of the myocardium at rest and under stress, influences the functional efficiency of a system for diagnosing myocardial pathologies. We determined the values that are optimal in the information sense, which makes it possible to increase the efficiency of the algorithms that form the diagnostic decision rules by reducing the capacity of the dictionary of recognition features. Information-extreme sequential cluster algorithms for selecting a dictionary of features that contains both quantitative and categorical features were developed, and their results were compared. Modifications of the dictionary selection algorithms were suggested that both speed up the search for the dictionary that is optimal in the information sense and reduce its capacity by 40 %. We obtained decision rules that are faultless on the training matrix and whose accuracy in exam mode asymptotically approaches the limit. It was experimentally confirmed that the proposed training algorithm for the diagnostic system reduced the minimum representative volume of the training matrix from 300 to 81 implementation vectors of the recognition classes of the functional state of the myocardium.
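The idea of limiting the number of complex Fourier components used as recognition features can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the polar maps are sampled, so the input here is a hypothetical one-dimensional intensity profile, the function names are invented, and a plain discrete Fourier transform stands in for the fast algorithm (it yields the same coefficients). The information-extreme selection of the optimal number of components is not reproduced.

```python
import cmath

def dft(signal):
    """Discrete Fourier transform of a real or complex sequence (O(n^2))."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def spectral_features(signal, n_components):
    """Keep the first n_components complex coefficients and flatten them into
    a real feature vector of (real, imaginary) pairs."""
    features = []
    for coefficient in dft(signal)[:n_components]:
        features.extend((coefficient.real, coefficient.imag))
    return features

# Hypothetical intensity profile; real data would come from a polar map.
profile = [1.0, 1.0, 1.0, 1.0]
features = spectral_features(profile, n_components=2)
```

Each retained complex component contributes two real features, so lowering `n_components` directly shrinks the feature dictionary that the decision rules are trained on.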