103,140 research outputs found

    Efficient feature selection filters for high-dimensional data

    Get PDF
    Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be cornputationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional. datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, being able to act as pre-processors for computationally intensive methods to focus their attention on smaller subsets of promising features. The experimental results, with up to 10(5) features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster

    Комбинированная схема отбора признаков для разработки банковских моделей

    Get PDF
    Machine learning methods have been successful in various aspects of bank lending. Banks have accumulated huge amounts of data about borrowers over the years of application. On the one hand, this made it possible to predict borrower behavior more accurately, on the other, it gave rise to the problem a problem of data redundancy, which greatly complicates the model development. Methods of feature selection, which allows to improve the quality of models, are apply to solve this problem. Feature selection methods can be divided into three main types: filters, wrappers, and embedded methods. Filters are simple and time-efficient methods that may help discover one-dimensional relations. Wrappers and embedded methods are more effective in feature selection, because they account for multi-dimensional relationships, but these methods are resource-consuming and may fail to process large samples with many features. In this article, the authors propose a combined feature selection scheme (CFSS), in which the first stages of selection use coarse filters, and on the final — wrappers for high-quality selection. This architecture lets us increase the quality of selection and reduce the time necessary to process large multi-dimensional samples, which are used in the development of industrial models. Experiments conducted by authors for four types of bank modelling tasks (survey scoring, behavioral scoring, customer response to cross-selling, and delayed debt collection) have shown that the proposed method better than classical methods containing only filters or only wrappers.Методы машинного обучения успешно применяются в самых разных областях банковского кредитования. За годы их применения банки накопили огромные массивы данных о заемщиках. С одной стороны, это позволило более точно предсказывать поведение заемщика, с другой — породило проблему избыточности данных, которая сильно усложняет разработку моделей. Чтобы решить эту проблему применяют методы отбора признаков, позволяющие повысить качество моделей. Эти методы делятся на три типа: фильтры, обертки и вложения. Фильтры являются простыми и быстрыми методами, при использовании которых можно находить одномерные зависимости. Обертки и вложения позволяют более качественно проводить отбор признаков, поскольку учитывают многомерную зависимость, однако данные методы требуют значительных вычислительных ресурсов и могут плохо работать на больших высокоразмерных выборках. В данной статье авторы предлагают схему комбинированного отбора признаков CFSS, в которой на первых этапах отбора используются грубые фильтры, а на финальных — обертки для более качественного отбора. Такая архитектура повышает качество и скорость отбора признаков на больших высокоразмерных выборках для промышленного моделирования в банковских задачах. Проведенные авторами эксперименты для четырех типов банковских задач (анкетный скоринг, поведенческий скоринг, отклик клиентов на кросс-сейл предложение и взыскание просроченной задолженности) показали, что предложенный метод работает лучше, чем классические методы, содержащие только фильтры или только обертки

    Exemplar Based Deep Discriminative and Shareable Feature Learning for Scene Image Classification

    Full text link
    In order to encode the class correlation and class specific information in image representation, we propose a new local feature learning approach named Deep Discriminative and Shareable Feature Learning (DDSFL). DDSFL aims to hierarchically learn feature transformation filter banks to transform raw pixel image patches to features. The learned filter banks are expected to: (1) encode common visual patterns of a flexible number of categories; (2) encode discriminative information; and (3) hierarchically extract patterns at different visual levels. Particularly, in each single layer of DDSFL, shareable filters are jointly learned for classes which share the similar patterns. Discriminative power of the filters is achieved by enforcing the features from the same category to be close, while features from different categories to be far away from each other. Furthermore, we also propose two exemplar selection methods to iteratively select training data for more efficient and effective learning. Based on the experimental results, DDSFL can achieve very promising performance, and it also shows great complementary effect to the state-of-the-art Caffe features.Comment: Pattern Recognition, Elsevier, 201

    Mental state estimation for brain-computer interfaces

    Get PDF
    Mental state estimation is potentially useful for the development of asynchronous brain-computer interfaces. In this study, four mental states have been identified and decoded from the electrocorticograms (ECoGs) of six epileptic patients, engaged in a memory reach task. A novel signal analysis technique has been applied to high-dimensional, statistically sparse ECoGs recorded by a large number of electrodes. The strength of the proposed technique lies in its ability to jointly extract spatial and temporal patterns, responsible for encoding mental state differences. As such, the technique offers a systematic way of analyzing the spatiotemporal aspects of brain information processing and may be applicable to a wide range of spatiotemporal neurophysiological signals

    Feature and Variable Selection in Classification

    Full text link
    The amount of information in the form of features and variables avail- able to machine learning algorithms is ever increasing. This can lead to classifiers that are prone to overfitting in high dimensions, high di- mensional models do not lend themselves to interpretable results, and the CPU and memory resources necessary to run on high-dimensional datasets severly limit the applications of the approaches. Variable and feature selection aim to remedy this by finding a subset of features that in some way captures the information provided best. In this paper we present the general methodology and highlight some specific approaches.Comment: Part of master seminar in document analysis held by Marcus Eichenberger-Liwick

    Coding of details in very low bit-rate video systems

    Get PDF
    In this paper, the importance of including small image features at the initial levels of a progressive second generation video coding scheme is presented. It is shown that a number of meaningful small features called details should be coded, even at very low data bit-rates, in order to match their perceptual significance to the human visual system. We propose a method for extracting, perceptually selecting and coding of visual details in a video sequence using morphological techniques. Its application in the framework of a multiresolution segmentation-based coding algorithm yields better results than pure segmentation techniques at higher compression ratios, if the selection step fits some main subjective requirements. Details are extracted and coded separately from the region structure and included in the reconstructed images in a later stage. The bet of considering the local background of a given detail for its perceptual selection breaks the concept ofPeer ReviewedPostprint (published version
    corecore