Benchmark of structured machine learning methods for microbial identification from mass-spectrometry data

Author: Mahé Pierre
Vert Jean-Philippe
Vervier Kévin
Veyrieras Jean-Baptiste
Publication date: 13/05/2015

Microbial identification is a central issue in microbiology, in particular in the fields of infectious disease diagnosis and industrial quality control. The concept of species is tightly linked to the concept of biological and clinical classification, where the proximity between species is generally measured in terms of evolutionary distances and/or clinical phenotypes. Surprisingly, the information provided by this well-known hierarchical structure is rarely used by machine learning-based automatic microbial identification systems. Structured machine learning methods were recently proposed to take into account the structure embedded in a hierarchy and use it as additional a priori information, and could therefore help improve microbial identification systems. We test and compare several state-of-the-art machine learning methods for microbial identification on a new Matrix-Assisted Laser Desorption/Ionization Time-of-Flight mass spectrometry (MALDI-TOF MS) dataset. The benchmark includes both standard methods and structured methods that leverage knowledge of the underlying hierarchical structure in the learning process. Our results show that although some methods perform better than others, structured methods do not consistently perform better than their "flat" counterparts. We postulate that this is partly because standard methods already reach a high level of accuracy in this context, and because they mainly confuse species close to each other in the tree, a case where using the known hierarchy is not helpful.

arXiv.org e-Print Archive

HAL-MINES ParisTech

Rigid surface operators and S-duality: some proposals

Author: D. Hézard
D.H. Collingwood
G. Lusztig
J. Gomis
M. Henningson
N. Drukker
N. Wyllard
Niclas Wyllard
P.C. Argyres
S. Gukov
Publication venue: 'IOP Publishing'
Publication date: 01/01/2009

We study surface operators in the N=4 supersymmetric Yang-Mills theories with gauge groups SO(n) and Sp(2n). As recently shown by Gukov and Witten, these theories have a class of rigid surface operators which are expected to be related by S-duality. The rigid surface operators are of two types, unipotent and semisimple. We make explicit proposals for how the S-duality map should act on unipotent surface operators. We also discuss semisimple surface operators and make some proposals for certain subclasses of such operators. Comment: 27 pages. v2: minor changes, added referenc

arXiv.org e-Print Archive

Crossref

Chalmers Research

Chalmers Publication Library

Roughness of molecular property landscapes and its impact on modellability

Author: Aldeghi Matteo
Coley Connor W.
Frey Nathan
Graff David E.
Jordan Kirk E.
Morrone Joseph A.
Pyzer-Knapp Edward O.
Publication date: 19/07/2022

In molecular discovery and drug design, structure-property relationships and activity landscapes are often qualitatively or quantitatively analyzed to guide the navigation of chemical space. The roughness (or smoothness) of these molecular property landscapes is one of their most studied geometric attributes, as it can characterize the presence of activity cliffs, with rougher landscapes generally expected to pose tougher optimization challenges. Here, we introduce a general, quantitative measure for describing the roughness of molecular property landscapes. The proposed roughness index (ROGI) is loosely inspired by the concept of fractal dimension and strongly correlates with the out-of-sample error achieved by machine learning models on numerous regression tasks. Comment: 17 pages, 6 figures, 2 tables (SI with 17 pages, 16 figures

arXiv.org e-Print Archive

Interoperability of fingerprint sensors and matching algorithms

Author: Lugini Luca
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2014

Biometric systems are widely deployed in governmental, military and commercial/civilian applications. A multitude of sensors and matching algorithms is available from different vendors. This creates a competitive market for these products, which is good for consumers but emphasizes the importance of interoperability. In fingerprint recognition, interoperability is the ability of a system to work with a diverse set of fingerprint devices. Variations induced by fingerprint sensors include image resolution, scanning area, gray levels, etc. Such variations can affect the quality of the extracted features and cross-device matching performance. This is true even when dealing with fingerprint sensors of the same sensing technology. In this thesis, we perform a large-scale empirical study of the status of interoperability between fingerprint sensors and assess the performance consequences when interoperability is lacking. Additionally, we develop a method to increase interoperability in fingerprint-based recognition systems deploying optical fingerprint sensors. A set of features to measure differences in fingerprint acquisition is designed and evaluated. Finally, different fusion schemes based on machine learning are tested and evaluated in order to exploit the designed set of features. Experimental results show that the proposed approach is able to reduce cross-device match error rates by a significant margin.

The Research Repository @ WVU (West Virginia University)

Application of Graph Neural Networks and graph descriptors for graph classification

Author: Adamczyk Jakub
Publication date: 07/11/2022

Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, encourage rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential. The design of strong baselines is an indispensable element of such work. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which have emerged as a de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design a fair experimental evaluation protocol and choose an appropriate collection of datasets. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive at many conclusions, which shed new light on the performance and quality of novel algorithms. We investigate the application of the Jumping Knowledge GNN architecture to graph classification, which proves to be an efficient tool for improving base graph neural network architectures. Multiple improvements to baseline models are also proposed and experimentally verified, which constitutes an important contribution to the field of fair model comparison. Comment: Master's thesis submitted at AGH University of Science and Technolog

arXiv.org e-Print Archive

11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.

Author: Abel R
Achenbach J
Adikwu UM
Ain QU
Al-Yamori R
Alhalabi Z
Aniceto N
Ansideri F
Baker D
Balducci A
Banting L
Barilla J
Barrett I
Basu D
Baumann K
Bender A
Berg E
Bergström F
Bermudez M
Bietz S
Bodnarchuk MS
Boeckler FM
Bojarski AJ
Borbulevych OY
Buchholz M
Bulusu KC
Bureau R
Böckler FM
Böttcher S
Büttner FM
Cao Q
Cappel D
Cheeseright T
Clark RD
Clark T
Da Costa FB
Dahlgren M
De Graaf C
Demuth H-U
Dorfman R
Dubrucq K
Ecker GF
Edman K
Egelkraut-Holtus M
Eid S
Eigner-Pitto V
Engel J
Engkvist O
Epple M
Essex JW
Evers A
Exner TE
Fan T-P
Fechner U
Finkelmann AR
Firaha DS
Firth M
Fourches D
Fraaije JH
Frach R
Fraczkiewicz R
Freitas A
Friedrich N-O
Friesner R
Fu X
Fuchs JE
Fulle S
Furtado F
Garg P
Gervasio FL
Ghafourian T
Glen R
Gracia RS
Grebner C
Guallar V
Göller AH
Günther MB
Günther S
Güssregen S
Haensele E
Heidrich J
Heil J
Hennig S
Herrmann G
Hessler G
Hilbig M
Himmler H-J
Hoffgaard F
Hogner A
Hollóczki O
Horinek D
Hošek P
Husch T
Ibezim A
Ihlenfeldt WD
Jardin C
Judson P
Jäger C
Kalinowski L
Kalliokoski T
Kast SM
Kibies P
Kirchmair J
Kirchner B
Kireeva N
Klute W
Koch O
Koch P
Kohlbacher O
Kolb P
Korth M
Kos A
Kramer C
Krilov G
Krotzky T
Kuhn H
Kuhn MA
Kurczab R
Kühne R
Lange A
Lanig H
Laufer S
Levine Z
Li X
Lifongo LL
Lin T
Lisurek M
Lokajíček MV
Mackey M
Masek BB
Mathea M
Matter H
Mbah CJ
Mbaze LM
McWilliams L
Mervin L
Mervin LH
Mittal S
Mohamad-Zobir SZ
Montanari F
Moser D
Mrugalla F
Mullen R
Murray DC
Nagy S
Nahum O
Naß A
Nguyen QD
Nogueira MS
Ntie-Kang F
Nwodo NJ
Oliveira Santos JS-D
Oliveira TB
Omoto K
Onlia I
Ostroumov D
Owen RM
Panecka J
Patel H
Pervov VS
Petrov A
Pisaková H
Pleik S
Polokoff M
Pongratz T
Pretzel J
Proschak E
Pryde DC
Pöhner IA
Rarey M
Rauh D
Renner G
Richmond NJ
Rickmeyer T
Rippmann F
Ross GA
Ruff M
Rupp B
Saladino G
Saleh N
Sandmann A
Schall C
Schmidt D
Schmidt TC
Schmidt TJ
Schmidtke P
Schneider G
Schomburg KT
Schram J
Schulz R
Schütter C
Segler MHS
Senderowitz H
Shaikh N
Shea J-E
Sherman W
Sievers-Engler A
Simoben CV
Simr P
Sippl W
Smith S
Solovev VP
Soltanshahi F
Sommer K
Sotriffer CA
Spiwok V
Stehle T
Steinbrecher TB
Steudle A
Sticht H
Strohfeldt S
Sánchez-García E
Tautermann CS
Torda AE
Torella R
Truszkowski A
Turk S
Tyrchan C
Ulander J
Van den Broek K
Van Oeyen A
Volkamer A
Wade RC
Waldman M
Waller MP
Wang L
Warszycki D
Weber J
Wessjohann L
Westerhoff LM
Whitley DC
Wieczorek V
Wolber G
Yosipof A
Zdrazil B
Zielesny A
Zimmermann MO
Zoufir A
Śmieja M
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/03/2016

Spiral - Imperial College Digital Repository

Application of LANDSAT to the surveillance of lake eutrophication in the Great Lakes basin

Author: Adams M. S.
Gannon J. E.
Rogers R. H.
Scherz J. P.
Smith V. E.
Woelkerling W. J.

The author has identified the following significant results. A step-by-step procedure for establishing and monitoring the trophic status of inland lakes with the use of LANDSAT data, surface sampling, laboratory analysis, and aerial observations was demonstrated. The biomass was related to chlorophyll-a concentrations, water clarity, and trophic state. A procedure was developed for using surface sampling, LANDSAT data, and linear regression equations to produce a color-coded image of large lakes showing the distribution and concentrations of water quality parameters that cause eutrophication, as well as parameters that indicate its effects. Cover categories readily derived from LANDSAT were those for which loading rates were available and which were known to have major effects on the quality and quantity of runoff and on lake eutrophication. Urban, barren land, cropland, grassland, forest, wetlands, and water were included.

NASA Technical Reports Server

Drug side-effect prediction using machine learning methods

Author: Khan Muhammad
Publication date: 11/12/2017

Drug toxicity (or adverse side effects) is a pressing health problem and an impediment to the development of therapeutically effective drugs. Despite many ongoing efforts to determine toxicity beforehand, computational prediction of drug side-effects remains a challenging task. This thesis presents an approach to predicting side-effects by utilizing side-information sources for the drugs, while simultaneously comparing state-of-the-art machine learning methods to improve accuracy. Specifically, the thesis implements a data-analysis pipeline for obtaining side-information that is useful for the prediction task. The thesis then formulates drug side-effect prediction as a machine learning problem: given disease indications and structural features (as side-information sources) of drugs for which some measurements of side-effects exist, predict the side-effects of a new drug. As case studies, the prediction accuracies are compared for ten different side-effects using linear as well as non-linear machine learning methods. The thesis summarizes three key findings. First, the drug side-information sources are predictive of the side-effects. Second, non-linear methods show improved prediction accuracies compared to their linear analogs. Third, the integration of disease indications and structural features with a principled machine learning approach further improves the drug side-effect predictions. However, the current study limits the analysis by assuming that side-effects are independent. In the future, modeling the joint relationships of several side-effects could yield stronger predictions and help to better understand the underlying biological mechanisms.

Aaltodoc Publication Archive

A machine learning based drug discovery pipeline: finding new therapies for Cystic Fibrosis

Author: Sousa Paulo Nuno Hilário Teixeira de
Publication date: 01/01/2019

Master's thesis, Bioinformatics and Computational Biology, Universidade de Lisboa, Faculdade de Ciências, 2019. Technological progress and the growing availability of public data have led to the development of robust machine learning-based methodologies for predicting compound activity. These methodologies are faster, more efficient, and less costly than traditional drug discovery methods. Cystic Fibrosis (CF) is a progressive autosomal disease for which new therapies are urgently needed. Mutations in the CFTR gene in CF patients lead to deficient production of the CFTR anion-transport membrane channel, causing ionic imbalances and abnormal fluid transport. CF affects several organs, most severely the lungs, and lung problems are usually the cause of premature death. The most prevalent and relevant mutation in CF is the deletion of phenylalanine 508 (F508del-CFTR). For this reason, the main drug discovery efforts aim to correct or mitigate the effects of this mutation. A methodology was created using classification and regression machine learning models, based on support vector machines and Random Forests, to discover compounds with therapeutic potential in CF from publicly accessible compound databases. The most promising compounds were selected and tested in the laboratory for their effect on F508del-CFTR, using immunofluorescence assays with automated high-throughput screening microscopy and analysis, based on the efficiency of F508del-CFTR trafficking to the plasma membrane. The 10 best-performing compounds in this assay were validated by Western blot and compared with two known F508del-CFTR corrector compounds. Four compounds were identified as promising therapeutic compounds for CF.

Universidade de Lisboa: Repositório.UL