276 research outputs found
Feature selection and nearest centroid classification for protein mass spectrometry
BACKGROUND: The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry. RESULTS: This study examines the performance of the nearest centroid classifier coupled with the following feature selection algorithms. The Student t-test, the Kolmogorov-Smirnov test, and the P-test are univariate statistics used for filter-based feature ranking. From the wrapper approaches, we tested sequential forward selection and a modified version of sequential backward selection. Embedded approaches included shrunken nearest centroid and a novel version of boosting-based feature selection that we developed. In addition, we tested several dimensionality reduction approaches, namely principal component analysis and principal component analysis coupled with linear discriminant analysis. To fairly assess each algorithm, evaluation was done using stratified cross-validation with an internal leave-one-out cross-validation loop for automated feature selection. Comprehensive experiments, conducted on five popular cancer data sets, revealed that the less advocated sequential forward selection and boosted feature selection algorithms produce the most consistent results across all data sets. In contrast, the state-of-the-art performance reported on isolated data sets for several of the studied algorithms does not hold across all data sets.
CONCLUSION: This study tested a number of popular feature selection methods using the nearest centroid classifier and found that several reportedly state-of-the-art algorithms in fact perform rather poorly when tested via stratified cross-validation. The revealed inconsistencies provide clear evidence that algorithm evaluation should be performed on several data sets using a consistent (i.e., non-randomized, stratified) cross-validation procedure in order for the conclusions to be statistically sound.
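The pipeline described in this abstract (filter-based feature ranking followed by a nearest centroid classifier, with selection repeated inside each cross-validation fold) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy data, and the use of plain leave-one-out in place of the paper's stratified cross-validation with an inner leave-one-out loop. It is not the authors' implementation.

```python
import math
from statistics import mean, variance

def t_scores(X, y):
    """Absolute pooled-variance Student t-statistic per feature (labels 0/1)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (len(a) + len(b) - 2)
        se = math.sqrt(pooled * (1 / len(a) + 1 / len(b)))
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores

def select_top_k(X, y, k):
    """Indices of the k features with the largest t-statistic."""
    scores = t_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def fit_centroids(X, y, features):
    """Per-class mean vector, restricted to the selected features."""
    centroids = {}
    for label in set(y):
        rows = [row for row, l in zip(X, y) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    return centroids

def predict(centroids, features, row):
    """Assign the class whose centroid is nearest in Euclidean distance."""
    point = [row[j] for j in features]
    return min(centroids, key=lambda c: math.dist(point, centroids[c]))

def loo_accuracy(X, y, k):
    """Leave-one-out accuracy; features are re-selected inside every fold
    so the held-out sample never influences the ranking."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        features = select_top_k(X_train, y_train, k)
        centroids = fit_centroids(X_train, y_train, features)
        hits += predict(centroids, features, X[i]) == y[i]
    return hits / len(X)

# Hypothetical toy data: only feature 0 separates the two classes.
X = [[1.0, 5.0, 0.2], [1.2, 4.0, 0.1], [0.9, 6.0, 0.3],
     [3.0, 5.5, 0.2], [3.2, 4.5, 0.15], [2.9, 5.0, 0.25]]
y = [0, 0, 0, 1, 1, 1]
print(select_top_k(X, y, 1), loo_accuracy(X, y, 1))
```

Re-running the selection inside each fold is the point of the sketch: ranking features on the full data set before splitting would leak information from the held-out sample into the evaluation.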
A network flow algorithm for just-in-time project scheduling
We show the polynomial solvability of the PERT-COST project scheduling problem in the case of: (i) the objective being a piecewise-linear, convex (possibly non-monotone) function of the job durations as well as of job start/finish times, and (ii) the precedence relations between jobs being presented in the form of a general (not necessarily acyclic) directed graph with arc lengths of any sign. For the latter problem, we present a network flow algorithm (of pseudo-linear complexity) which is easy to implement and which behaves well when the objective values grow slowly with the growth of the problem size while the number of breakpoints in the objective grows fast.
Windsor Park: The Sinking Streets
At the encouragement of Nevada State Senator Dina Neal and law professors Ngai Pindell and Frank Fritz, undergraduate and graduate UNLV film students under the tutelage of film professor Brett Levner donned their masks and returned to the field to interview documentary subjects and bring awareness to a local community in the shadows searching for hope.
Universal computing by DNA origami robots in a living animal
Biological systems are collections of discrete molecular objects that move around and collide with each other. Cells carry out elaborate processes by precisely controlling these collisions, but developing artificial machines that can interface with and control such interactions remains a significant challenge. DNA is a natural substrate for computing and has been used to implement a diverse set of mathematical problems [1-3], logic circuits [4-6] and robotics [7-9]. The molecule also naturally interfaces with living systems, and different forms of DNA-based biocomputing have previously been demonstrated [10-13]. Here we show that DNA origami [14-16] can be used to fabricate nanoscale robots that are capable of dynamically interacting with each other [17-18] in a living animal. The interactions generate logical outputs, which are relayed to switch molecular payloads on or off. As a proof-of-principle, we use the system to create architectures that emulate various logic gates (AND, OR, XOR, NAND, NOT, CNOT, and a half adder). Following an ex vivo prototyping phase, we successfully employed the DNA origami robots in living cockroaches (Blaberus discoidalis) to control a molecule that targets the cells of the animal.
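For readers unfamiliar with the gates named in this abstract, the sketch below writes out their Boolean behaviour on 0/1 inputs. It only illustrates the logic the robot architectures are said to emulate; it does not model the DNA origami mechanism, and the function names are chosen here for illustration.

```python
def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def XOR(a, b):
    return a ^ b

def NOT(a):
    return 1 - a

def NAND(a, b):
    return NOT(AND(a, b))

def CNOT(control, target):
    """Controlled NOT: flips the target bit only when the control bit is 1."""
    return control, XOR(control, target)

def half_adder(a, b):
    """Adds two bits; returns (sum, carry) built from XOR and AND."""
    return XOR(a, b), AND(a, b)

# Print the half-adder truth table.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

The half adder shows how the simpler gates compose: the sum bit is the XOR of the inputs and the carry bit is their AND.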
Barcoding cells using cell-surface programmable DNA-binding domains
We develop here a novel approach to barcode large numbers of cells through cell-surface expression of programmable zinc-finger DNA-binding domains (sZFs). We show that sZFs enable double-stranded DNA to sequence-specifically label living cells, and we also develop a sequential tagging approach to image >3 cell types in situ using just 3 fluorophores. Finally, we demonstrate their broad versatility through their ability to serve as surrogate reporters and to facilitate selective cell capture and targeting.
An improved FPTAS for mobile agent routing with time constraints
Author name used in this publication: T. C. E. Cheng. Version of Record. Published.
On PERT Networks with Alternatives
Management of projects often requires decisions concerning the choice of alternative activities. Then, the completion time of the whole project (i.e. the makespan) is computed. In this paper, we aim at selecting the required activities simultaneously with the computation of the makespan. This problem is referred to as PERT Problem with Alternatives (PPA). The corresponding model is similar to a conventional PERT graph, except that two types of nodes are involved to represent either the choice between activities, or the fact that a set of activities should be completed before starting another set of activities. A formalization of the problem and some important properties concerning the optimal solution are given. Several well-solvable cases of the problem and a powerful decomposition algorithm running in polynomial time are presented. This decomposition is applicable for solving many real-life problems.
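One way to picture a model with two node types is the sketch below. It rests on assumptions that the abstract does not state: the graph is acyclic, a node of type "AND" is reached only when all incoming activities are complete (latest arrival), and a node of type "OR" is reached through the quickest of its alternative incoming activities (earliest arrival). The function name, the data layout and the example project are hypothetical, and the paper's decomposition algorithm is not reproduced here.

```python
from functools import lru_cache

def makespan(nodes, arcs, sink):
    """Earliest time at which `sink` is reached.

    nodes: dict mapping node name -> "AND" or "OR"
    arcs:  list of (predecessor, successor, duration) activities
    Assumes an acyclic graph; nodes without incoming arcs start at time 0.
    """
    incoming = {name: [] for name in nodes}
    for pred, succ, duration in arcs:
        incoming[succ].append((pred, duration))

    @lru_cache(maxsize=None)
    def earliest(node):
        arrivals = [earliest(pred) + duration for pred, duration in incoming[node]]
        if not arrivals:
            return 0
        # AND: wait for every incoming activity; OR: take the best alternative.
        return max(arrivals) if nodes[node] == "AND" else min(arrivals)

    return earliest(sink)

# Hypothetical project: node "c" offers a choice between two routes.
nodes = {"start": "AND", "a": "AND", "b": "AND", "c": "OR", "d": "AND", "end": "AND"}
arcs = [
    ("start", "a", 3), ("start", "b", 5),
    ("a", "c", 2), ("b", "c", 1),      # alternatives leading into "c"
    ("c", "end", 4),
    ("start", "d", 7), ("d", "end", 1),
]
print(makespan(nodes, arcs, "end"))
```

In this toy project the choice node is reached at time 5 through "a" rather than at time 6 through "b", and the final node must wait for both of its incoming activities.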
On-line Part Scheduling in a Surface Treatment System
A real-time scheduling algorithm is proposed which guarantees an optimal makespan to each part that arrives in a line of chemical baths for surface treatment. We first consider the case when the treatment periods are much greater than the transportation times, which allows us to neglect these times. We then extend our approach to the case when transportation times cannot be neglected. Some numerical examples are provided to illustrate this approach.