Hyperspectral colon tissue cell classification

Author: Rajpoot Kashif
Rajpoot Nasir M. (Nasir Mahmood)
Turner Martin J.
Publication date: 01/01/2004

A novel algorithm to discriminate between normal and malignant tissue cells of the human colon is presented. Microscopic-level images of human colon tissue cells were acquired using hyperspectral imaging technology at contiguous wavelength intervals of visible light. While hyperspectral imagery provides a wealth of information, its large size normally means high computational complexity. Several methods exist to avoid the so-called curse of dimensionality and hence reduce the computational complexity. In this study, we experimented with Principal Component Analysis (PCA) and two modifications of Independent Component Analysis (ICA). In the first stage of the algorithm, the extracted components are used to separate four constituent parts of the colon tissue: nuclei, cytoplasm, lamina propria, and lumen. The segmentation is performed in an unsupervised fashion using the nearest centroid clustering algorithm. In the second stage of the classification algorithm, the segmented image is used to exploit the spatial relationship between the labeled constituent parts. Experimental results using supervised Support Vector Machine (SVM) classification based on multiscale morphological features show that normal and malignant tissue cells can be discriminated with a reasonable degree of accuracy.
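
The unsupervised segmentation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "nearest centroid clustering" means a k-means-style loop (assign each feature vector to its closest centroid, then recompute the centroids); the abstract does not give the exact procedure, and the function name, parameters, and toy data below are hypothetical, not taken from the paper.

```python
import random

def nearest_centroid_clustering(points, k, iterations=20, seed=0):
    """Cluster feature vectors by repeatedly assigning each one to its nearest centroid.

    Assumption: a k-means-style loop; the paper's exact variant is not given here.
    """
    rng = random.Random(seed)
    # Start from k distinct points chosen at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with the index of its closest centroid
        # (squared Euclidean distance).
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy data: two well-separated groups of 2-D feature vectors (hypothetical values).
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = nearest_centroid_clustering(pixels, k=2)
```

In the setting the abstract describes, each point would be a pixel's vector of extracted PCA or ICA components and k would be 4, one cluster per tissue constituent.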

Warwick Research Archives Portal Repository

Lazy stochastic principal component analysis

Author: Li Li
Nguyen Dinh
Wojnowicz Michael
Zhao Xuan
Publication date: 21/09/2017

Stochastic principal component analysis (SPCA) has become a popular dimensionality reduction strategy for large, high-dimensional datasets. We derive a simplified algorithm, called Lazy SPCA, which has reduced computational complexity and is better suited for large-scale distributed computation. We prove that SPCA and Lazy SPCA find the same approximations to the principal subspace, and that the pairwise distances between samples in the lower-dimensional space are invariant to whether SPCA is executed lazily or not. Empirical studies find downstream predictive performance to be identical for both methods, and superior to random projections, across a range of predictive models (linear regression, logistic lasso, and random forests). In our largest experiment with 4.6 million samples, Lazy SPCA reduced 43.7 hours of computation to 9.9 hours. Overall, Lazy SPCA relies exclusively on matrix multiplications, besides an operation on a small square matrix whose size depends only on the target dimensionality. Comment: To be published in: 2017 IEEE International Conference on Data Mining Workshops (ICDMW)

arXiv.org e-Print Archive

Input variable selection in time-critical knowledge integration applications: A review, analysis, and recommendation paper

Author: A. Mousavi
S. Poslad
S. Tavakoli
Publication venue: 'Elsevier BV'
Publication date: 01/10/2013

This is the post-print version of the final paper published in Advanced Engineering Informatics. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2013 Elsevier B.V.

The purpose of this research is twofold: first, to undertake a thorough appraisal of existing Input Variable Selection (IVS) methods within the context of time-critical and computation resource-limited dimensionality reduction problems; second, to demonstrate improvements to, and the application of, a recently proposed time-critical sensitivity analysis method called EventTracker to an environment science industrial use-case, i.e., sub-surface drilling. Producing time-critical accurate knowledge about the state of a system (effect) under computational and data acquisition (cause) constraints is a major challenge, especially if the knowledge required is critical to the system operation where the safety of operators or integrity of costly equipment is at stake. Understanding and interpreting a chain of interrelated events, predicted or unpredicted, that may or may not result in a specific state of the system is the core challenge of this research. The main objective is then to identify which set of input data signals has a significant impact on the set of system state information (i.e., output). Through a cause-effect analysis technique, the proposed technique supports the filtering of unsolicited data that can otherwise clog up the communication and computational capabilities of a standard supervisory control and data acquisition system. The paper analyzes the performance of input variable selection techniques from a series of perspectives. It then expands the categorization and assessment of sensitivity analysis methods in a structured framework that takes into account the relationship between inputs and outputs, the nature of their time series, and the computational effort required. The outcome of this analysis is that established methods have limited suitability for use in time-critical variable selection applications. By way of a geological drilling monitoring scenario, the suitability of the proposed EventTracker Sensitivity Analysis method for use in high-volume and time-critical input variable selection problems is demonstrated.

Brunel University Research Archive

A dimensionality reduction method to select the most representative daylight illuminance distributions

Author: Jakubiec John Alstan
Kent Michael G
Schiavon Stefano
Publication venue: eScholarship, University of California
Publication date: 14/01/2020

One challenge when evaluating daylight distribution is dealing with the large amount of temporal and spatial data, visualisations and variability in illuminances that are assessed in buildings. Using a dimensionality reduction method based on principal component analysis, we identified the most representative annual daylight distributions. We modelled a rectangular room containing an analysis grid of 3200 illuminance sensor points and simulated 3285 different temporal daylight conditions using an annual occupancy schedule ranging from 08:00 to 17:00 with one-hour sampling intervals in two locations: Singapore and Oakland, California. Our approach explained 98% of the illuminance variability with three daylight distributions in Singapore, and 92% with six in Oakland, California. Our dimensionality reduction strategy was also generalised to a complex building geometry, showing the utility of the method. We think this approach can be used to provide a more efficient and reliable method to analyse daylight performance in building practice.
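
The idea of retaining only as many principal components as are needed to explain a target share of the variability can be sketched as follows. This is a generic illustration of the cumulative explained-variance criterion, not the authors' procedure; the function name and the eigenvalues are hypothetical and are not data from the study.

```python
def components_needed(eigenvalues, threshold):
    """Return the smallest number of leading components whose cumulative
    explained-variance ratio reaches `threshold` (a fraction in (0, 1]).

    `eigenvalues` are the variances along each principal component, i.e. the
    eigenvalues of the data covariance matrix.
    """
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)

# Hypothetical eigenvalues, chosen only to illustrate the calculation.
example_eigenvalues = [70.0, 20.0, 8.0, 1.5, 0.5]
n_components = components_needed(example_eigenvalues, threshold=0.95)
```

With these made-up values the first two components explain 90% of the variance and the first three explain 98%, so a 95% target requires three components.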

eScholarship - University of California

Random Indexing K-tree

Author: De Vine Lance
De Vries Christopher M.
Geva Shlomo
Publication date: 01/01/2009

Random Indexing (RI) K-tree is the combination of two algorithms for clustering. Many large scale problems exist in document clustering. RI K-tree scales well with large inputs due to its low complexity. It also exhibits features that are useful for managing a changing collection. Furthermore, it solves previous issues with sparse document vectors when using K-tree. The algorithms and data structures are defined, explained and motivated. Specific modifications to K-tree are made for use with RI. Experiments have been executed to measure quality. The results indicate that RI K-tree improves document cluster quality over the original K-tree algorithm. Comment: 8 pages, ADCS 2009; Hyperref and cleveref LaTeX packages conflicted. Removed clevere

arXiv.org e-Print Archive

Queensland University of Technology ePrints Archive

Optimal set of EEG features for emotional state classification and trajectory visualization in Parkinson's disease

Author: Ibrahim Norlinah Mohamed
Mohamad Khairiyah
Murugappan Murugappan
Omar Mohd Iqbal
Palaniappan Ramaswamy
Sundaraj Kenneth
Yuvaraj Rajamanickam
Publication venue: 'Elsevier BV'
Publication date: 31/07/2014

In addition to classic motor signs and symptoms, individuals with Parkinson's disease (PD) are characterized by emotional deficits. Ongoing brain activity can be recorded by electroencephalography (EEG) to discover the links between emotional states and brain activity. This study used machine-learning algorithms to categorize emotional states in PD patients compared with healthy controls (HC) using EEG. Twenty non-demented PD patients and 20 healthy age-, gender-, and education level-matched controls viewed happiness, sadness, fear, anger, surprise, and disgust emotional stimuli while fourteen-channel EEG was being recorded. Multimodal stimuli (a combination of audio and visual) were used to evoke the emotions. To classify the EEG-based emotional states and visualize the changes of emotional states over time, this paper compares four kinds of EEG features for emotional state classification and proposes an approach to track the trajectory of emotion changes with manifold learning. From the experimental results using our EEG data set, we found that (a) the bispectrum feature is superior to the other three kinds of features, namely power spectrum, wavelet packet and nonlinear dynamical analysis; (b) higher frequency bands (alpha, beta and gamma) play a more important role in emotion activities than lower frequency bands (delta and theta) in both groups; and (c) the trajectory of emotion changes can be visualized by reducing subject-independent features with manifold learning. This provides a promising way of visualizing a patient's emotional state in real time and leads to a practical system for noninvasive assessment of the emotional impairments associated with neurological disorders.

Kent Academic Repository