

Characteristics of predictor sets found using differential prioritization

Author: A Bhattacharjee
C Ding
CH Ooi
Chia Huey Ooi
DT Ross
EJ Yeoh
H Chai
I Guyon
J Khan
K Munagala
L Yu
M Park
MA Hall
Madhu Chetty
RC Beavis
S Dudoit
S Ramaswamy
SA Armstrong
Shyh Wei Teng
T Li
TA Knijnenburg
TR Golub
YH Yang
Publication venue: BioMed Central
Publication date: 01/06/2007

Background: Feature selection plays an undeniably important role in classification problems involving high dimensional datasets such as microarray datasets. For filter-based feature selection, two well-known criteria used in forming predictor sets are relevance and redundancy. However, there is a third criterion which is at least as important as the other two in affecting the efficacy of the resulting predictor sets. This criterion is the degree of differential prioritization (DDP), which varies the emphases on relevance and redundancy depending on the value of the DDP. Previous empirical works on publicly available microarray datasets have confirmed the effectiveness of the DDP in molecular classification. We now propose to establish the fundamental strengths and merits of the DDP-based feature selection technique. This is to be done through a simulation study which involves vigorous analyses of the characteristics of predictor sets found using different values of the DDP from toy datasets designed to mimic real-life microarray datasets.
Results: A simulation study employing analytical measures such as the distance between classes before and after transformation using principal component analysis is implemented on toy datasets. From these analyses, the necessity of adjusting the differential prioritization based on the dataset of interest is established. This conclusion is supported by comparisons against both simplistic rank-based selection and state-of-the-art equal-priorities scoring methods, which demonstrate the superiority of the DDP-based feature selection technique. Reapplying similar analyses to real-life multiclass microarray datasets provides further confirmation of our findings and of the significance of the DDP for practical applications.
Conclusion: The findings have been achieved based on analytical evaluations, not empirical evaluation involving classifiers, thus providing further basis for the usefulness of the DDP and validating the need for unequal priorities on relevance and redundancy during feature selection for microarray datasets, especially highly multiclass datasets.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A fuzzy-QFD approach for the enhancement of work equipment safety: a case study in the agriculture sector

Author: Fargnoli Mario
Haber Nicolas
Lombardi Mara
Publication venue: Inderscience Publishers
Publication date: 01/01/2018

The paper proposes a design for safety methodology based on the use of the Quality Function Deployment (QFD) method, focusing on the need to identify and analyse risks related to a working task in an effective manner, i.e. considering the specific work activities related to such a task. To reduce the drawbacks of subjectivity while augmenting the consistency of judgements, the QFD was augmented by both the Delphi method and the fuzzy logic approach. To verify such an approach, it was implemented through a case study in the agricultural sector. While the proposed approach needs to be validated through further studies in different contexts, its positive results in performing hazard analysis and risk assessment in a comprehensive and thorough manner can contribute practically to the scientific knowledge on the application of QFD in design for safety activities

Archivio della ricerca- Università di Roma La Sapienza

MRT Supportive Housing Evaluation: Final Report on Targeting of MRT-SH Services

Author: Gullick Margaret
Mahmud Mir Nahid
McGinnis Sandra
Polvere Lauren
Rees Chris E.
Publication venue: Scholars Archive
Publication date: 01/01/2020

University at Albany, State University of New York (SUNY): Scholars Archive

Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data

Author: Chetty Madhu
Ooi Chia Huey
Teng Shyh Wei
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Due to the large number of genes in a typical microarray dataset, feature selection looks set to play an important role in reducing noise and computational cost in gene expression-based tissue classification while improving accuracy at the same time. Surprisingly, this does not appear to be the case for all multiclass microarray datasets. The reason is that many feature selection techniques applied on microarray datasets are either rank-based and hence do not take into account correlations between genes, or are wrapper-based, which require high computational cost, and often yield difficult-to-reproduce results. In studies where correlations between genes are considered, attempts to establish the merit of the proposed techniques are hampered by evaluation procedures which are less than meticulous, resulting in overly optimistic estimates of accuracy. RESULTS: We present two realistically evaluated correlation-based feature selection techniques which incorporate, in addition to the two existing criteria involved in forming a predictor set (relevance and redundancy), a third criterion called the degree of differential prioritization (DDP). DDP functions as a parameter to strike the balance between relevance and redundancy, providing our techniques with the novel ability to differentially prioritize the optimization of relevance against redundancy (and vice versa). This ability proves useful in producing optimal classification accuracy while using reasonably small predictor set sizes for nine well-known multiclass microarray datasets. CONCLUSION: For multiclass microarray datasets, especially the GCM and NCI60 datasets, DDP enables our filter-based techniques to produce accuracies better than those reported in previous studies which employed similarly realistic evaluation procedures

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Can’t Take a Joke? The Asymmetrical Nature of the Politicized Sense of Humor

Author: Gans Roger
Publication venue: DOCS@RWU
Publication date: 13/10/2015

In an effort to tease out possible expressions of dispositional differences in people of different political ideologies, this study uses media preference and consumption data from the 2008 National Annenberg Election Survey (NAES08-Online) to examine characteristics of audiences for a range of television shows and genres. The individual shows include two political satires, The Daily Show with Jon Stewart, and The Colbert Report; a late-night comedy/variety show, The Tonight Show with Jay Leno; a hospital-based ensemble situation comedy, Scrubs; two animated comedies, The Simpsons, and The Family Guy; and two action-oriented dramas, 24, and CSI: Miami. The genres include comedies, dramas, sports and documentaries. The results of a series of one-way ANOVAs and regression analyses supported the hypotheses that conservatives do not enjoy humor as much as liberals, and that they enjoy political humor even less than non-political humor

DOCS@RWU

HELIN Digital Commons

GSAE: an autoencoder with embedded gene-set nodes for genomics functional characterization

Author: Chen Hung-I Harry
Chen Yidong
Chiu Yu-Chiao
Huang Yufei
Zhang Songyao
Zhang Tinghe
Publication venue: Springer Science and Business Media LLC
Publication date: 22/12/2018

Bioinformatics tools have been developed to interpret gene expression data at the gene set level, and these gene set based analyses improve the biologists' capability to discover functional relevance of their experiment design. While elucidating gene set individually, inter gene sets association is rarely taken into consideration. Deep learning, an emerging machine learning technique in computational biology, can be used to generate an unbiased combination of gene set, and to determine the biological relevance and analysis consistency of these combining gene sets by leveraging large genomic data sets. In this study, we proposed a gene superset autoencoder (GSAE), a multi-layer autoencoder model with the incorporation of a priori defined gene sets that retain the crucial biological features in the latent layer. We introduced the concept of the gene superset, an unbiased combination of gene sets with weights trained by the autoencoder, where each node in the latent layer is a superset. Trained with genomic data from TCGA and evaluated with their accompanying clinical parameters, we showed gene supersets' ability of discriminating tumor subtypes and their prognostic capability. We further demonstrated the biological relevance of the top component gene sets in the significant supersets. Using autoencoder model and gene superset at its latent layer, we demonstrated that gene supersets retain sufficient biological information with respect to tumor subtypes and clinical prognostic significance. Superset also provides high reproducibility on survival analysis and accurate prediction for cancer subtypes. Comment: Presented in the International Conference on Intelligent Biology and Medicine (ICIBM 2018) at Los Angeles, CA, USA and published in BMC Systems Biology 2018, 12(Suppl 8):14

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Reactive stroma and trastuzumab resistance in HER2-positive early breast cancer

Author: Ameye Lieveke
Azaria Amos
Denkert Carsten
Dvash Efrat
Joensuu Heikki
Kellokumpu-Lehtinen Pirkko-Liisa
Loi Sherene
Loibl Sibylle
Michiels Stefan
Pondé Noam
Richard Francois
Salgado Roberto
Salmon Asher
Salmon-Divon Mali
Sonnenblick Amir
Sotiriou Christos
Van den Eynden Gert
Zahavi Tamar
Publication date: 01/01/2020

We investigated the value of reactive stroma as a predictor for trastuzumab resistance in patients with early HER2-positive breast cancer receiving adjuvant therapy. The pathological reactive stroma and the mRNA gene signatures that reflect reactive stroma in 209 HER2-positive breast cancer samples from the FinHer adjuvant trial were evaluated. Levels of stromal gene signatures were determined as a continuous parameter, and pathological reactive stromal findings were defined as stromal predominant breast cancer (SPBC; >= 50% stromal) and correlated with distant disease-free survival. Gene signatures associated with reactive stroma in HER2-positive early breast cancer (N = 209) were significantly associated with trastuzumab resistance in estrogen receptor (ER)-negative tumors (hazard ratio [HR] = 1.27, p interaction = 0.014 [DCN], HR = 1.58, p interaction = 0.027 [PLAU], HR = 1.71, p interaction = 0.019 [HER2STROMA, novel HER2 stromal signature]), but not in ER-positive tumors (HR = 0.73, p interaction = 0.47 [DCN], HR = 0.71, p interaction = 0.73 [PLAU], HR = 0.84, p interaction = 0.36 [HER2STROMA]). Pathological evaluation of HER2-positive/ER-negative tumors suggested an association between SPBC and trastuzumab resistance. Reactive stroma did not correlate with tumor-infiltrating lymphocytes (TILs), and the expected benefit from trastuzumab in patients with high levels of TILs was pronounced only in tumors with low stromal reactivity (SPBC

DI-fusion

Helsingin yliopiston digitaalinen arkisto

University of Melbourne Institutional Repository

Mapping evolutionary process: a multi-taxa approach to conservation prioritization

Author: Buermann Wolfgang
Cameron Susan E
Chan Janice
Fuller Trevon
Graham Catherine H
Jarrín-V Pablo
Kieswetter Charles M
Mason Eliza
Milá Borja
Peralvo Manuel
Pollinger John P
Saatchi Sassan
Schlunegger Jasmin
Schneider Christopher J
Schweizer Rena
Smith Thomas B
Thomassen Henri A
Wang Ophelia
Wayne Robert K
Publication venue: Blackwell Publishing Ltd
Publication date: 01/01/2011

Human-induced land use changes are causing extensive habitat fragmentation. As a result, many species are not able to shift their ranges in response to climate change and will likely need to adapt in situ to changing climate conditions. Consequently, a prudent strategy to maintain the ability of populations to adapt is to focus conservation efforts on areas where levels of intraspecific variation are high. By doing so, the potential for an evolutionary response to environmental change is maximized. Here, we use modeling approaches in conjunction with environmental variables to model species distributions and patterns of genetic and morphological variation in seven Ecuadorian amphibian, bird, and mammal species. We then used reserve selection software to prioritize areas for conservation based on intraspecific variation or species-level diversity. Reserves selected using species richness and complementarity showed little overlap with those based on genetic and morphological variation. Priority areas for intraspecific variation were mainly located along the slopes of the Andes and were largely concordant among species, but were not well represented in existing reserves. Our results imply that in order to maximize representation of intraspecific variation in reserves, genetic and morphological variation should be included in conservation prioritization

OPUS Augsburg

Crossref

PubMed Central

Carolina Digital Repository