
    Stable Feature Selection for Biomarker Discovery

    Feature selection techniques have long been the workhorse of biomarker discovery applications. Surprisingly, the stability of feature selection with respect to sampling variations has long been under-considered, and only recently has this issue received growing attention. In this article, we review existing stable feature selection methods for biomarker discovery using a generic hierarchical framework. We have two objectives: (1) to provide an overview of this new yet fast-growing topic as a convenient reference; and (2) to categorize existing methods under an expandable framework for future research and development.
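
    As a rough illustration of what "stability with respect to sampling variations" can mean in practice, the sketch below scores a set of feature-selection runs by their average pairwise Jaccard similarity. This is one commonly used stability measure, chosen here as an assumption; the abstract does not say which measures the review covers. The function names and the gene identifiers are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of feature names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def selection_stability(selected_sets):
    """Average pairwise Jaccard similarity across selection runs.

    Each element of `selected_sets` is the feature subset chosen on one
    resampled version of the data (assumed to be computed elsewhere).
    """
    pairs = list(combinations(selected_sets, 2))
    if not pairs:
        raise ValueError("need at least two selected feature sets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections from three subsamples of the same dataset.
runs = [
    {"g1", "g2", "g3"},
    {"g1", "g2", "g4"},
    {"g1", "g2", "g3"},
]
stability = selection_stability(runs)
print(round(stability, 3))
```

    A score near 1 means the runs select almost the same features; a score near 0 means the selections barely overlap.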

    Recurrent Pixel Embedding for Instance Grouping

    We introduce a differentiable, end-to-end trainable framework, consisting of two novel components, for solving pixel-level grouping problems such as instance segmentation. First, we regress pixels into a hyper-spherical embedding space so that pixels from the same group have high cosine similarity while those from different groups have similarity below a specified margin. We analyze the choice of embedding dimension and margin, relating them to theoretical results on the problem of distributing points uniformly on the sphere. Second, to group instances, we utilize a variant of mean-shift clustering, implemented as a recurrent neural network parameterized by kernel bandwidth. This recurrent grouping module is differentiable and has convergent dynamics and a probabilistic interpretation. Backpropagating the group-weighted loss through this module allows learning to focus only on correcting embedding errors that won't be resolved during subsequent clustering. Our framework, while conceptually simple and theoretically rich, is also practically effective and computationally efficient. We demonstrate substantial improvements over state-of-the-art instance segmentation for object proposal generation, and show the benefits of the grouping loss for classification tasks such as boundary detection and semantic segmentation.
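
    The grouping step can be pictured with a small, non-learned sketch: unit vectors are repeatedly moved toward the kernel-weighted mean of the data and re-projected onto the sphere. This is not the authors' recurrent module. The kernel form `exp((cos_sim - 1) / bandwidth)`, the fixed iteration count, and all names below are assumptions made for illustration.

```python
import math


def normalize(v):
    """Project a vector onto the unit sphere."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]


def cosine(u, v):
    """Cosine similarity of two unit vectors (a plain dot product)."""
    return sum(a * b for a, b in zip(u, v))


def mean_shift_sphere(points, bandwidth=0.2, iterations=10):
    """Run a fixed number of mean-shift updates on unit vectors.

    The data stay fixed and each mode estimate moves toward the
    kernel-weighted mean of the data, then is re-normalized.
    """
    data = [normalize(p) for p in points]
    modes = [list(p) for p in data]
    for _ in range(iterations):
        updated = []
        for m in modes:
            # Assumed kernel: weight decays as cosine similarity drops.
            weights = [math.exp((cosine(m, p) - 1.0) / bandwidth) for p in data]
            mean = [
                sum(w * p[d] for w, p in zip(weights, data))
                for d in range(len(m))
            ]
            updated.append(normalize(mean))
        modes = updated
    return modes


# Hypothetical 2-D embeddings: three near (1, 0), three near (0, 1).
embeddings = [
    (1.0, 0.0), (0.98, 0.1), (0.97, -0.12),
    (0.0, 1.0), (0.1, 0.99), (-0.08, 1.0),
]
modes = mean_shift_sphere(embeddings)
same_group = cosine(modes[0], modes[1])
different_group = cosine(modes[0], modes[3])
print(same_group > 0.999, different_group < 0.5)
```

    After a few updates, embeddings from the same group collapse to nearly the same direction, so thresholding cosine similarity between modes is enough to assign group labels in this toy case.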

    Combination of Multiple Bipartite Ranking for Web Content Quality Evaluation

    Web content quality estimation is crucial to various web content processing applications. Our previous work applied Bagging + C4.5, a combination of many point-wise ranking models, to achieve the best results on the ECML/PKDD Discovery Challenge 2010. In this paper, we combine multiple pair-wise bipartite ranking learners to solve the multi-partite ranking problem for web quality estimation. In the encoding stage, we present the ternary encoding and the binary coding, each extending a rank value to L - 1 values (L is the number of different ranking values). For the decoding, we discuss the combination of ranking results from multiple bipartite ranking models with predefined weighting and adaptive weighting. The experiments on the ECML/PKDD 2010 Discovery Challenge datasets show that binary coding + predefined weighting yields the highest performance of all four combinations and, furthermore, that it is better than the best results reported in the ECML/PKDD 2010 Discovery Challenge competition. Comment: 17 pages, 8 figures, 2 tables
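
    To make the encode/decode idea concrete, the sketch below expands a rank into L - 1 binary labels and combines the scores of L - 1 bipartite rankers with fixed weights. The threshold-style encoding ("is the rank greater than k?") and the uniform default weights are assumptions; the abstract does not specify the exact coding or weighting schemes. Function names are hypothetical.

```python
def binary_encode(rank, num_ranks):
    """Expand a rank in 0..num_ranks-1 into num_ranks-1 binary labels.

    Label k answers "is the rank greater than k?", so each position
    defines one bipartite (two-class) ranking problem.
    """
    if not 0 <= rank < num_ranks:
        raise ValueError("rank out of range")
    return [1 if rank > k else 0 for k in range(num_ranks - 1)]


def combine_scores(scores, weights=None):
    """Weighted sum of the scores produced by the bipartite rankers.

    `weights` plays the role of a predefined weighting; uniform weights
    are used when none are given (an assumption for this sketch).
    """
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("one weight per bipartite ranker is required")
    return sum(w * s for w, s in zip(weights, scores))


# With L = 4 rank values, each rank maps to 3 binary labels.
codes = [binary_encode(r, 4) for r in range(4)]
print(codes)

# Hypothetical scores from the 3 bipartite rankers for one document.
final_score = combine_scores([0.9, 0.6, 0.1])
print(final_score)
```

    Documents are then ordered by the combined score, so a document that most bipartite rankers place on the "higher" side ends up near the top.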

    A Survey on Soft Subspace Clustering

    Subspace clustering (SC) is a promising clustering technique that identifies clusters based on their associations with subspaces in high-dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and are well accepted by the scientific community, SSC algorithms are relatively new but have gained more attention in recent years due to their better adaptability. In this paper, a comprehensive survey of existing SSC algorithms and recent developments is presented. The SSC algorithms are classified systematically into three main categories, namely, conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted and the potential future development of SSC is also discussed. Comment: This paper has been published in Information Sciences Journal in 201

    Adaptive Target Recognition: A Case Study Involving Airport Baggage Screening

    This work addresses the question of whether it is possible to design a computer-vision-based automatic threat recognition (ATR) system that can adapt to changing specifications of a threat without having to create a new ATR each time. The changes in threat specifications, which may be warranted by intelligence reports and world events, typically concern the physical characteristics of what constitutes a threat: its material composition, its shape, its method of concealment, etc. Here we present our design of an adaptive ATR (AATR) system that can adapt to changing specifications in materials characterization (meaning density, as measured by its x-ray attenuation coefficient), mass, and thickness. Our design uses a two-stage cascaded approach, in which the first stage is characterized by a high recall rate over the entire range of possibilities for the threat parameters that are allowed to change. The purpose of the second stage is then to fine-tune the performance of the overall system for the current threat specifications. The computational effort of this fine-tuning to achieve a desired PD/PFA rate is far less than what it would take to create a new classifier with the same overall performance for the new set of threat specifications.

    Transportation Life Cycle Assessment Synthesis: Life Cycle Assessment Learning Module Series

    The Life Cycle Assessment Learning Module Series is a set of narrated, self-advancing slideshows on various topics related to environmental life cycle assessment (LCA). This research project produced the first 27 such modules, which are freely available for download on the CESTiCC website http://cem.uaf.edu/cesticc/publications/lca.aspx. Each module is roughly 15-20 minutes in length and is intended for various uses, such as course components, the main lecture material in a dedicated LCA course, or independent learning in support of research projects. The series is organized into four overall topical areas, each of which contains a group of overview modules and a group of detailed modules. The A and α groups cover the international standards that define LCA. The B and β groups focus on environmental impact categories. The G and γ groups identify software tools for LCA and provide some tutorials for their use. The T and τ groups introduce topics of interest in the field of transportation LCA. These include overviews of how LCA is frequently applied in that sector, literature reviews, specific considerations, and software tutorials. Future modules in this category will feature methodological developments and case studies specific to the transportation sector.