Mining SOM expression portraits: Feature selection and integrating concepts of molecular function

Author: Hans Binder
Henry Wirth
Martin von Bergen
Publication date: 04/12/2011

Background: 
Self-organizing maps (SOM) enable the straightforward portrayal of high-dimensional data from large sample collections as sample-specific images. Analysis of the image texture yields so-called spot-clusters of co-expressed genes, which require subsequent significance filtering and functional interpretation. We address feature selection in terms of the gene ranking problem and the interpretation of the obtained spot-related lists using concepts of molecular function.

Results: 
Different expression scores, based either on simple fold-change measures or on regularized Student's t-statistics, are applied to spot-related gene lists and compared with special emphasis on the error characteristics of microarray expression data. The spot-clusters are analyzed using different methods of gene set enrichment analysis with the focus on overexpression and/or overrepresentation of predefined sets of genes. Metagene-related overrepresentation of selected gene sets was mapped into the SOM images to assign gene function to different regions. Alternatively, we estimated set-related overexpression profiles over all samples studied using a gene set enrichment score. It was also applied to the spot-clusters to generate lists of enriched gene sets. We used the tissue body index data set, a collection of expression data of human tissues, as an illustrative example. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. In addition, we display special sets of housekeeping genes and of consistently weakly and highly expressed genes using SOM data filtering.

Conclusions:
The presented methods allow the comprehensive downstream analysis of SOM-transformed expression data in terms of cluster-related gene lists and enriched gene sets for functional interpretation. SOM clustering makes it possible either to define new gene sets using selected SOM spots or to verify and/or amend existing ones.
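
The overrepresentation of a predefined gene set in a spot-related gene list can be illustrated with a one-sided hypergeometric test. This is a minimal sketch under assumptions: the abstract does not say which overrepresentation statistic the authors use, and the function name and the example counts below are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(universe_size, set_size, list_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe_size: number of genes measured in total
    set_size:      number of those genes that belong to the predefined gene set
    list_size:     number of genes in the spot-related list
    overlap:       number of genes shared by the list and the set
    """
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    # Sum the probability mass of all overlaps at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = overrepresentation_pvalue(
    universe_size=20000, set_size=150, list_size=200, overlap=12
)
print(f"overrepresentation p-value: {p_value:.3g}")
```

A small p-value suggests that the gene set occurs in the spot more often than expected by chance. When many gene sets are tested against many spots, the p-values would normally need a multiple-testing correction.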

Springer - Publisher Connector

Nature Precedings

Applying Genetic Algorithm to Generation of High-Dimensional Item Response Data

Author: ByoungWook Kim
JaMee Kim
WonGyu Lee
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Item response data is nm-dimensional data based on the responses made by m examinees to a questionnaire consisting of n items. It is used to estimate the ability of examinees and the item parameters in educational evaluation. For estimates to be valid, the simulation input data must reflect reality. This paper presents an effective combination of the genetic algorithm (GA) and Monte Carlo methods for generating item response data, as simulation input data, that is similar to real data. To this end, we generated four types of item response data using Monte Carlo and the GA and evaluated how closely the item parameters (item difficulty and discrimination) of the generated data match those of the real item response data. We adopt two measures, root mean square error and Kullback-Leibler divergence, to compare item parameters between the real data and the four types of generated data. The results show that applying the GA to an initial population generated by Monte Carlo is the most effective way to generate item response data similar to real item response data. This study is meaningful in that we found that the GA contributes to the generation of more realistic simulation input data.
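
The Monte Carlo step and the two comparison measures named in this abstract can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the two-parameter logistic (2PL) response model, the function names, and the example parameter values are all hypothetical choices, and the genetic algorithm itself is not shown.

```python
import math
import random


def simulate_responses(abilities, difficulties, discriminations, rng):
    """Monte Carlo draw of an n-items by m-examinees matrix of 0/1 responses.

    Assumes a 2PL model: P(correct) = 1 / (1 + exp(-a * (theta - b))).
    """
    data = []
    for a, b in zip(discriminations, difficulties):
        row = []
        for theta in abilities:
            p_correct = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            row.append(1 if rng.random() < p_correct else 0)
        data.append(row)
    return data


def rmse(estimated, reference):
    """Root mean square error between two equally long parameter vectors."""
    squared = sum((e - r) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(squared / len(reference))


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical example: 3 items answered by 5 examinees.
rng = random.Random(0)
abilities = [rng.gauss(0.0, 1.0) for _ in range(5)]
responses = simulate_responses(abilities, [-1.0, 0.0, 1.0], [1.0, 1.2, 0.8], rng)
print(responses)
```

In a comparison like the one described, `rmse` and `kl_divergence` would be applied to item parameters estimated from the generated data and from the real data. The estimation step is outside the scope of this sketch.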

Crossref

Directory of Open Access Journals

Prediction Of Antimicrobial Peptides Based On Sequence Alignment And Secondary Structure Sequence And Segment Sequence

Author: Soh Meng Wah
Publication date: 01/08/2015

Antimicrobial peptides (AMPs) are natural peptides that are important for the immune system. Researchers are interested in designing alternative drugs with AMPs because more bacteria are becoming resistant to the available antibiotics. However, experiments to extract AMPs from protein sequences are time-consuming and costly. Thus, a computational tool that predicts novel AMPs effectively and accurately is in high demand to provide more candidates and useful insights for drug design. In this study, a new algorithm is proposed as a computational tool that integrates the sequence alignment method with secondary structure sequence (SSS) and segment sequence (SS) features. Sequence alignment classifies test sequences based on the maximum high-scoring segment pair (HSP) score predicted by the Basic Local Alignment Search Tool for proteins (BLASTP). The results of the sequence alignment phase are 91.02% for the normal dataset, 80.88% for the training set with <0.7 sequence similarity, and 96.02% for the CAMP dataset. The sequence alignment method cannot predict all sequences, so the unpredicted sequences are then predicted using the SSS and SS features. Feature extraction and feature selection are performed to obtain the features, which are used to train an SVM model that classifies each sequence as AMP or non-AMP. The overall results of the independent test are 83.27% for the normal dataset, 71.83% for the dataset with <0.7 sequence similarity, and 91.49% for the CAMP dataset. Compared with the second phase of past research that also combines with the sequence alignment method, this research has a relatively low yield (<27%) contributed by prediction using the SSS and SS features only. This indicates that the proposed algorithm is not suitable for use as an AMP predictor.

Repository@USM

Advances in manufacturing technology – XXII

Author: Cheng K
Harrison DJ
Makatsoris H
Publication venue: Brunel University
Publication date: 01/01/2008

Brunel University Research Archive

Optical biopsy identification and grading of gliomas using label-free visible resonance Raman spectroscopy.

Author: Alfano Robert R
Cheng Gangge
Liu Cheng-Hui
Shi Lingyan
Wang Kai
Wu Binlin
Yu Xinguang
Zhang Chunyuan
Zhang Lin
Zhao Mingyue
Zhou Yan
Zhu Ke
Zong Rui
Publication venue: eScholarship, University of California
Publication date: 01/09/2019

Glioma is one of the most refractory types of brain tumor. Accurate tumor boundary identification and complete resection of the tumor are essential for glioma removal during brain surgery. We present a method based on visible resonance Raman (VRR) spectroscopy to identify glioma margins and grades. A set of diagnostic spectral biomarker features is presented based on tissue composition changes revealed by VRR. The Raman spectra include molecular vibrational fingerprints of carotenoids, tryptophan, amide I/II/III, proteins, and lipids. These basic in situ spectral biomarkers are used to identify the tissue from the interface between brain cancer and normal tissue and to evaluate glioma grades. The VRR spectra are also analyzed using principal component analysis for dimension reduction and feature detection and a support vector machine for classification. The cross-validated sensitivity, specificity, and accuracy for distinguishing glioma tissues from normal brain tissues are found to be 100%, 96.3%, and 99.6%, respectively. The area under the receiver operating characteristic curve for the classification is about 1.0. The accuracies for distinguishing normal, low-grade (grades I and II), and high-grade (grades III and IV) gliomas are found to be 96.3%, 53.7%, and 84.1% for the three groups, respectively, along with a total accuracy of 75.1%. A set of criteria for differentiating normal human brain tissues from normal control tissues is proposed and used to identify brain cancer margins, yielding a diagnostic sensitivity of 100% and specificity of 71%. Our study demonstrates the potential of VRR as a label-free optical molecular histopathology method for in situ boundary line judgment at the margins during brain surgery.
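
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from the counts of a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: tumor samples classified as tumor     fn: tumor samples classified as normal
    tn: normal samples classified as normal   fp: normal samples classified as tumor
    """
    sensitivity = tp / (tp + fn)                # fraction of tumor samples detected
    specificity = tn / (tn + fp)                # fraction of normal samples cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all samples correct
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = diagnostic_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Accuracy depends on the mix of tumor and normal samples in the test set, so it can sit close to the sensitivity when tumor samples dominate.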

eScholarship - University of California

Optimal design for real-time quantitative monitoring of sand in gas flowline using computational intelligence assisted design framework

Author: Aminu Kuda Tijjani
Chen Yi
McGlinchey Don
Publication venue: 'Elsevier BV'
Publication date: 01/06/2019

ResearchOnline@GCU

An artificial intelligence tool for heterogeneous team formation in the classroom

Author: Juan M. Alberola
Elena del Val
Victor Sanchez-Anguix
Alberto Palomares
Maria Dolores Teruel
Publication venue: 'Elsevier BV'
Publication date: 16/04/2016

There is increasing interest in the development of teamwork skills in the educational context. This growing interest is motivated by the pedagogical effectiveness of teamwork and by the fact that, in labour contexts, enterprises organize their employees in teams to carry out complex projects. Despite its crucial importance in the classroom and industry, there is a lack of support for the team formation process. Not only do many factors influence team performance, but the problem becomes exponentially costly if teams are to be optimized. In this article, we propose a tool that aims to fill this gap. It combines artificial intelligence techniques such as coalition structure generation and Bayesian learning with Belbin's role theory to facilitate the generation of working groups in an educational context. This tool improves on current state-of-the-art proposals in three ways: i) it takes into account the feedback of other teammates, instead of self-perception questionnaires, in order to establish the most predominant role of a student; ii) it handles uncertainty with regard to each student's predominant team role; iii) it is iterative, since it considers information from several interactions in order to improve the estimation of role assignments. We tested the performance of the proposed tool in an experiment involving students who took part in three different team activities. The experiments suggest that the proposed tool is able to improve different teamwork aspects such as team dynamics and student satisfaction.

arXiv.org e-Print Archive

Crossref

RiuNet

Coventry University Pure Portal

2014 Annual Research Symposium Abstract Book

Author: Trinity College
Publication venue: Trinity College Digital Repository
Publication date: 01/04/2014

2014 annual volume of abstracts for science research projects conducted by students at Trinity College

Trinity College

AI driven B-cell Immunotherapy Design

Author: Ascher David B.
da Silva Bruna Moreira
Geard Nicholas
Pires Douglas E. V.
Publication date: 03/09/2023

Antibodies, a prominent class of approved biologics, play a crucial role in detecting foreign antigens. The effectiveness of antigen neutralisation and elimination hinges upon the strength, sensitivity, and specificity of the paratope-epitope interaction, which demands resource-intensive experimental techniques for characterisation. In recent years, artificial intelligence and machine learning methods have made significant strides, revolutionising the prediction of protein structures and their complexes. The past decade has also witnessed the evolution of computational approaches aiming to support immunotherapy design. This review focuses on the progress of machine learning-based tools and their frameworks in the domain of B-cell immunotherapy design, encompassing linear and conformational epitope prediction, paratope prediction, and antibody design. We mapped the most commonly used data sources, evaluation metrics, and method availability and thoroughly assessed their significance and limitations, discussing the main challenges ahead.

arXiv.org e-Print Archive