Stable Feature Selection for Biomarker Discovery
Feature selection techniques have been used as the workhorse in biomarker
discovery applications for a long time. Surprisingly, the stability of feature
selection with respect to sampling variations has long been under-considered.
Only recently has this issue begun to receive more attention.
In this article, we review existing stable feature selection methods for
biomarker discovery using a generic hierarchical framework. We have two
objectives: (1) providing an overview of this new yet fast-growing topic for
convenient reference; (2) categorizing existing methods under an expandable
framework for future research and development.
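As a rough illustration of what stability with respect to sampling variations means, the sketch below reruns a feature selector on bootstrap resamples and averages the pairwise Jaccard similarity of the selected feature sets. The selector (`top_k_by_mean_difference`), the Jaccard-based score, and all function names are assumptions made for this example; they are not taken from the reviewed article.

```python
import random
from itertools import combinations


def top_k_by_mean_difference(samples, labels, k):
    """Toy selector (assumed): rank features by |mean(class 1) - mean(class 0)|."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        pos = [x[j] for x, y in zip(samples, labels) if y == 1]
        neg = [x[j] for x, y in zip(samples, labels) if y == 0]
        if not pos or not neg:
            # A resample may miss one class entirely; score the feature as 0.
            scores.append(0.0)
        else:
            scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(samples, labels, k, n_resamples=20, seed=0):
    """Average pairwise Jaccard similarity of feature sets chosen on bootstrap
    resamples. Requires n_resamples >= 2."""
    rng = random.Random(seed)
    n = len(samples)
    subsets = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        subsets.append(
            top_k_by_mean_difference(
                [samples[i] for i in idx], [labels[i] for i in idx], k
            )
        )
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A score near 1 means the selector picks almost the same features on every resample; a score near 0 means the selected set changes substantially with the sample.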
Recurrent Pixel Embedding for Instance Grouping
We introduce a differentiable, end-to-end trainable framework for solving
pixel-level grouping problems such as instance segmentation. It consists of two
novel components. First, we regress pixels into a hyper-spherical embedding
space so that pixels from the same group have high cosine similarity while
those from different groups have similarity below a specified margin. We
analyze the choice of embedding dimension and margin, relating them to
theoretical results on the problem of distributing points uniformly on the
sphere. Second, to group instances, we utilize a variant of mean-shift
clustering, implemented as a recurrent neural network parameterized by kernel
bandwidth. This recurrent grouping module is differentiable, enjoys convergent
dynamics and probabilistic interpretability. Backpropagating the group-weighted
loss through this module allows learning to focus on only correcting embedding
errors that won't be resolved during subsequent clustering. Our framework,
while conceptually simple and theoretically abundant, is also practically
effective and computationally efficient. We demonstrate substantial
improvements over state-of-the-art instance segmentation for object proposal
generation, and we show the benefits of the grouping loss for
classification tasks such as boundary detection and semantic segmentation.
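The grouping step described above can be sketched as a fixed number of mean-shift updates on the unit sphere, where each update plays the role of one recurrent step. This is a minimal sketch under assumptions: the exponential cosine-similarity kernel, the default `bandwidth`, and the function names are chosen for illustration and are not confirmed to match the authors' formulation.

```python
import math


def normalize(v):
    """Project a vector onto the unit sphere (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(c * c for c in v))
    return list(v) if norm == 0.0 else [c / norm for c in v]


def mean_shift_step(points, bandwidth):
    """One update: move each point to the kernel-weighted mean of all points,
    then renormalize. Kernel (assumed): exp(bandwidth * cosine similarity)."""
    updated = []
    for p in points:
        weights = [
            math.exp(bandwidth * sum(a * b for a, b in zip(p, q))) for q in points
        ]
        mean = [
            sum(w * q[d] for w, q in zip(weights, points)) for d in range(len(p))
        ]
        updated.append(normalize(mean))
    return updated


def recurrent_grouping(points, bandwidth=10.0, n_steps=10):
    """Apply the same update n_steps times, like an unrolled recurrent module."""
    points = [normalize(p) for p in points]
    for _ in range(n_steps):
        points = mean_shift_step(points, bandwidth)
    return points
```

Because every step uses only sums, exponentials, and a normalization, the same computation written in an automatic-differentiation framework would let gradients flow through the grouping, which is the property the abstract relies on.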
Combination of Multiple Bipartite Ranking for Web Content Quality Evaluation
Web content quality estimation is crucial to various web content processing
applications. Our previous work applied Bagging + C4.5 to achieve the best
results on the ECML/PKDD Discovery Challenge 2010, an approach that combines
many point-wise ranking models. In this paper, we combine multiple pair-wise
bipartite ranking learners to solve the multi-partite ranking problem for
web quality estimation. In the encoding stage, we present the ternary encoding and
the binary coding extending each rank value to (L is the number of the
different ranking values). For decoding, we discuss the combination of
multiple ranking results from multiple bipartite ranking models with
predefined weighting and adaptive weighting. The experiments on the ECML/PKDD
2010 Discovery Challenge datasets show that binary coding +
predefined weighting yields the highest performance among all four
combinations and, furthermore, outperforms the best results reported in
the ECML/PKDD 2010 Discovery Challenge competition.
Comment: 17 pages, 8 figures, 2 tables
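To make the encode-then-decode idea concrete, the sketch below reduces a multi-partite ranking problem to several bipartite ones and merges their scores with fixed weights. The abstract does not spell out the code it uses, so the threshold code in `binary_encode` (bit k is 1 when the rank exceeds k) is an assumed, commonly used choice, and the weighted sum in `combine_scores` is only one possible reading of predefined weighting. All names are hypothetical.

```python
def binary_encode(rank, n_ranks):
    """Assumed threshold code: bit k is 1 when rank > k, for k = 1..n_ranks-1.
    Each bit defines the labels of one bipartite ranking problem."""
    return [1 if rank > k else 0 for k in range(1, n_ranks)]


def combine_scores(bipartite_scores, weights):
    """Decode one item: weighted sum of the scores from each bipartite model."""
    if len(bipartite_scores) != len(weights):
        raise ValueError("need exactly one weight per bipartite model")
    return sum(w * s for w, s in zip(weights, bipartite_scores))


def rank_items(score_table, weights):
    """Return item ids ordered from highest to lowest combined score.

    score_table maps an item id to its list of bipartite model scores.
    """
    combined = {
        item: combine_scores(scores, weights)
        for item, scores in score_table.items()
    }
    return sorted(combined, key=combined.get, reverse=True)
```

With fixed weights chosen in advance, the decoding step needs no extra training; an adaptive scheme would instead fit the weights to validation data.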
A Survey on Soft Subspace Clustering
Subspace clustering (SC) is a promising clustering technology to identify
clusters based on their associations with subspaces in high dimensional spaces.
SC can be classified into hard subspace clustering (HSC) and soft subspace
clustering (SSC). While HSC algorithms have been extensively studied and well
accepted by the scientific community, SSC algorithms are relatively new but
gaining more attention in recent years due to better adaptability. In this
paper, a comprehensive survey of existing SSC algorithms and recent
developments is presented. The SSC algorithms are classified systematically
into three main categories, namely, conventional SSC (CSSC), independent SSC
(ISSC) and extended SSC (XSSC). The characteristics of these algorithms are
highlighted and the potential future development of SSC is also discussed.
Comment: This paper has been published in Information Sciences Journal in 201
Adaptive Target Recognition: A Case Study Involving Airport Baggage Screening
This work addresses the question of whether it is possible to design a
computer-vision based automatic threat recognition (ATR) system so that it can
adapt to changing specifications of a threat without having to create a new ATR
each time. The changes in threat specifications, which may be warranted by
intelligence reports and world events, are typically regarding the physical
characteristics of what constitutes a threat: its material composition, its
shape, its method of concealment, etc. Here we present our design of an AATR
system (Adaptive ATR) that can adapt to changing specifications of a threat's
material characterization (meaning density, as measured by its x-ray attenuation
coefficient), mass, and thickness. Our design uses a two-stage cascaded
approach, in which the first stage is characterized by a high recall rate over
the entire range of possibilities for the threat parameters that are allowed to
change. The purpose of the second stage is to then fine-tune the performance of
the overall system for the current threat specifications. The computational
effort for this fine-tuning for achieving a desired PD/PFA rate is far less
than what it would take to create a new classifier with the same overall
performance for the new set of threat specifications.
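The two-stage cascade can be pictured with the minimal sketch below. It is an illustration under strong assumptions: both stages are written as simple range checks on density, mass, and thickness, whereas the system described above uses trained classifiers, and the `Candidate` and `Spec` types, their fields, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A segmented object with the three adaptable parameters (hypothetical)."""
    density: float
    mass: float
    thickness: float


@dataclass(frozen=True)
class Spec:
    """A threat specification expressed as simple bounds (hypothetical)."""
    density_range: tuple
    min_mass: float
    min_thickness: float


def matches(candidate, spec):
    """True when the candidate falls inside every bound of the specification."""
    low, high = spec.density_range
    return (
        low <= candidate.density <= high
        and candidate.mass >= spec.min_mass
        and candidate.thickness >= spec.min_thickness
    )


def cascade(candidates, widest_spec, current_spec):
    """Stage 1 keeps anything inside the widest allowed parameter range (high
    recall); stage 2 narrows the survivors to the current specification."""
    stage_one = [c for c in candidates if matches(c, widest_spec)]
    return [c for c in stage_one if matches(c, current_spec)]
```

In this sketch, a change in the threat definition only means passing a different `current_spec`; the first stage stays untouched, which mirrors the claim that adapting is cheaper than building a new classifier.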
Transportation Life Cycle Assessment Synthesis: Life Cycle Assessment Learning Module Series
The Life Cycle Assessment Learning Module Series is a set of narrated, self-advancing slideshows on various topics related to environmental life cycle assessment (LCA). This research project produced the first 27 such modules, which are freely available for download on the CESTiCC website http://cem.uaf.edu/cesticc/publications/lca.aspx. Each module is roughly 15-20 minutes in length and is intended for various uses, such as course components, the main lecture material in a dedicated LCA course, or independent learning in support of research projects. The series is organized into four overall topical areas, each of which contains a group of overview modules and a group of detailed modules. The A and α groups cover the international standards that define LCA. The B and β groups focus on environmental impact categories. The G and γ groups identify software tools for LCA and provide some tutorials for their use. The T and τ groups introduce topics of interest in the field of transportation LCA, including overviews of how LCA is frequently applied in that sector, literature reviews, specific considerations, and software tutorials. Future modules in this category will feature methodological developments and case studies specific to the transportation sector.