EVNet: An Explainable Deep Network for Dimension Reduction
Dimension reduction (DR) is commonly utilized to capture the intrinsic
structure and transform high-dimensional data into low-dimensional space while
retaining meaningful properties of the original data. It is used in various
applications, such as image recognition, single-cell sequencing analysis, and
biomarker discovery. However, contemporary non-parametric and parametric DR
techniques suffer from significant shortcomings, such as the inability to
preserve both global and local features, and poor generalization performance.
Regarding explainability, it is crucial to understand how each component and
each input feature contributes to the embedding results, so that critical
components can be identified and the embedding process diagnosed. To
address these problems, we have developed a deep neural network method called
EVNet, which provides not only excellent structure-preserving performance but
also explainability of the DR process. EVNet starts with
data augmentation and a manifold-based loss function to improve embedding
performance. The explanation is based on saliency maps and aims to examine the
trained EVNet parameters and contributions of components during the embedding
process. The proposed techniques are integrated with a visual interface that
helps users adjust EVNet to achieve better DR performance and explainability.
The interactive visual interface makes it easier to illustrate the data
features, compare different DR techniques, and investigate DR. An in-depth
experimental comparison shows that EVNet consistently outperforms
state-of-the-art methods in both performance measures and explainability.
Comment: 18 pages, 15 figures, accepted by TVC
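The saliency-based explanation described in the abstract can be sketched in a few lines: score each input feature by how strongly a small perturbation of it moves the learned embedding. The following is a minimal, hypothetical illustration; the toy two-layer embedder and finite-difference scoring stand in for EVNet's actual trained network and saliency maps:

```python
# Hypothetical sketch: feature saliency for a parametric DR embedder.
# The embedder here is a random toy MLP, not EVNet's trained network.
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" embedder: 10-D input -> 2-D embedding via a small MLP.
W1 = rng.normal(size=(10, 8))
W2 = rng.normal(size=(8, 2))

def embed(x):
    return np.tanh(x @ W1) @ W2

def saliency(x, eps=1e-4):
    """Finite-difference sensitivity of the embedding to each input feature."""
    base = embed(x)
    scores = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        xp = x.copy()
        xp[i] += eps                      # perturb one feature
        scores[i] = np.linalg.norm(embed(xp) - base) / eps
    return scores

x = rng.normal(size=10)
s = saliency(x)
print(s.shape)  # (10,) -- one sensitivity score per input feature
```

Features with high scores move the embedding most and are the natural candidates to highlight in a visual interface.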
Nonnegative principal component analysis for mass spectral serum profiles and biomarker discovery
Background: As a novel cancer diagnostic paradigm, mass spectrometric serum proteomic pattern diagnostics has been reported to be superior to conventional serologic cancer biomarkers. However, its clinical use is not yet fully validated. An important factor preventing this young technology from becoming a mainstream cancer diagnostic paradigm is that robustly identifying cancer molecular patterns from high-dimensional protein expression data remains a challenge in machine learning and oncology research. As a well-established dimension reduction technique, PCA is widely integrated into pattern recognition analysis to discover cancer molecular patterns. However, its global feature selection mechanism prevents it from capturing local features, which may make high-performance proteomic pattern discovery difficult, because only features explaining global data behavior are used to train the learning machine.
Methods: In this study, we develop a nonnegative principal component analysis algorithm and present a nonnegative principal component analysis based support vector machine algorithm with sparse coding to conduct high-performance proteomic pattern classification. Moreover, we propose a nonnegative principal component analysis based filter-wrapper biomarker capturing algorithm for mass spectral serum profiles.
Results: We demonstrate the superiority of the proposed algorithm by comparison with six peer algorithms on four benchmark datasets. Moreover, we illustrate that nonnegative principal component analysis can be effectively used to capture meaningful biomarkers.
Conclusion: Our analysis suggests that nonnegative principal component analysis effectively conducts local feature selection for mass spectral profiles and contributes to improved sensitivity and specificity in downstream classification, as well as to meaningful biomarker discovery.
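The core idea of nonnegative PCA can be sketched as projected gradient ascent: maximize the explained variance while projecting the loading vector onto the nonnegative orthant after every step, which tends to produce sparse, interpretable loadings. The following is a minimal, hypothetical illustration; the paper's actual algorithm, and its SVM and filter-wrapper extensions, are more involved:

```python
# Hypothetical sketch: leading nonnegative principal component via
# projected gradient ascent (not the paper's full NPCA algorithm).
import numpy as np

def nonneg_pc1(X, iters=500, lr=0.01):
    """First nonnegative principal component of centered data X (n x d)."""
    C = np.cov(X, rowvar=False)          # d x d sample covariance
    w = np.abs(np.random.default_rng(1).normal(size=C.shape[0]))
    w /= np.linalg.norm(w)
    for _ in range(iters):
        w += lr * (C @ w)                # ascend the variance w^T C w
        w = np.maximum(w, 0.0)           # project onto nonnegative orthant
        w /= np.linalg.norm(w)           # project back onto the unit sphere
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 0] += 3 * rng.normal(size=200)      # feature 0 carries most variance
w = nonneg_pc1(X - X.mean(axis=0))
print(w.min() >= 0, abs(np.linalg.norm(w) - 1) < 1e-8)  # True True
```

Because the clipped loadings concentrate on a few features, the nonzero entries point directly at candidate biomarkers, in the spirit of the filter-wrapper capture step.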
Simultaneous Learning of Nonlinear Manifold and Dynamical Models for High-dimensional Time Series
The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship, as the temporal constraints provide valuable neighborhood information for dimensionality reduction and, conversely, the low-dimensional space allows dynamics to be learnt efficiently. Solving the two tasks simultaneously allows important information to be exchanged between them. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. The proposed solution approximates the nonlinear manifold and dynamics using piecewise linear models. The interactions among the linear models are captured in a graphical model. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework against competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.
National Science Foundation (IIS 0308213, IIS 0329009, CNS 0202067)
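The piecewise-linear modeling idea can be sketched as a switching linear system: partition the low-dimensional state space into regions and apply a separate linear transition model in each. The following is a minimal, hypothetical illustration; the paper additionally couples this with manifold learning and a graphical model over the local models, which this sketch omits:

```python
# Hypothetical sketch: piecewise-linear dynamics in a 2-D latent space.
# Region centers and per-region transition matrices are made up here.
import numpy as np

centers = np.array([[-1.0, 0.0], [1.0, 0.0]])        # region centers
A = np.stack([0.9 * np.eye(2), 0.5 * np.eye(2)])     # per-region dynamics

def step(x):
    """Apply the linear model of whichever region x falls in."""
    k = np.argmin(np.linalg.norm(centers - x, axis=1))  # nearest center
    return A[k] @ x

x = np.array([2.0, 0.5])
traj = [x]
for _ in range(5):
    traj.append(step(traj[-1]))
print(len(traj))  # 6 states in the simulated trajectory
```

In the full model, the region assignment would be a latent switching variable inferred jointly with the manifold coordinates rather than a hard nearest-center rule.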