3,495 research outputs found

Updates in metabolomics tools and resources: 2014-2015

Author: Misra Biswapriya B.
van der Hooft Justin
Publication venue: Wiley
Publication date: 01/01/2016

Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platform (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets, which creates the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources (tools, software, and databases) is currently lacking. Thus, here we provide an overview of freely available, open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of recent developments, in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS- and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.

Enlighten

Bringing Liveness to Design Patterns

Author: Filipe Oliveira e Sousa Ferreira de Lemos
Publication date: 22/07/2020

Repositório Aberto da Universidade do Porto

Automatic differentiation in machine learning: a survey

Author: Baydin Atilim Gunes
Pearlmutter Barak A.
Radul Alexey Andreyevich
Siskind Jeffrey Mark
Publication date: 01/01/2018

Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings. Comment: 43 pages, 5 figures
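As a rough illustration of what "evaluating derivatives of numeric functions expressed as computer programs" can look like, the following is a minimal sketch of forward-mode AD using dual numbers. It is not taken from the survey; the names `Dual`, `_lift`, `sin`, `derivative`, and the sample function `f` are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (hypothetical helper type)."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, i.e. with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    """Sine on dual numbers via the chain rule."""
    x = _lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    # Ordinary program code; derivatives propagate through each operation.
    return x * x + 3 * x + sin(x)


print(derivative(f, 2.0))  # analytically: 2*2 + 3 + cos(2)
```

The derivative is obtained by running the program itself on augmented values, with no symbolic expression manipulation and no finite-difference approximation; reverse mode, of which backpropagation is a special case, follows a different evaluation order that this sketch does not cover.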

arXiv.org e-Print Archive

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Oxford University Research Archive

BlogForever D2.6: Data Extraction Methodology

Author: Banos V.
Davis R.
Gkotsis G.
Pincent E.
Stepanyan K.
Publication date: 25/10/2013

This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

GEML: A Grammar-based Evolutionary Machine Learning Approach for Design-Pattern Detection

Author: Barbudo Rafael
Ramírez Aurora
Romero José Raúl
Servant Francisco
Publication date: 13/01/2024

Design patterns (DPs) are recognised as a good practice in software development. However, the lack of appropriate documentation often hampers traceability, and their benefits are blurred among thousands of lines of code. Automatic methods for DP detection have become relevant but are usually based on the rigid analysis of either software metrics or specific properties of the source code. We propose GEML, a novel detection approach based on evolutionary machine learning using software properties of diverse nature. Firstly, GEML makes use of an evolutionary algorithm to extract those characteristics that better describe the DP, formulated in terms of human-readable rules, whose syntax is conformant with a context-free grammar. Secondly, a rule-based classifier is built to predict whether new code contains a hidden DP implementation. GEML has been validated over five DPs taken from a public repository recurrently adopted by machine learning studies. Then, we increase this number up to 15 diverse DPs, showing its effectiveness and robustness in terms of detection capability. An initial parameter study served to tune a parameter setup whose performance guarantees the general applicability of this approach without the need to adjust complex parameters to a specific pattern. Finally, a demonstration tool is also provided. Comment: 27 pages, 18 tables, 10 figures, journal paper

arXiv.org e-Print Archive

canSAR: an integrated cancer public translational research and drug discovery resource

Author: Al-Lazikani Bissan
Bulusu Krishna C.
Halling-Brown Mark D.
Patel Mishal
Tym Joe E.
Publication venue: Oxford University Press
Publication date: 01/01/2011

canSAR is a fully integrated cancer research and drug discovery resource developed to utilize the growing publicly available biological annotation, chemical screening, RNA interference screening, expression, amplification and 3D structural data. Scientists can, in a single place, rapidly identify biological annotation of a target, its structural characterization, expression levels and protein interaction data, as well as suitable cell lines for experiments, potential tool compounds and similarity to known drug targets. canSAR has, from the outset, been completely use-case driven, which has dramatically influenced the design of the back-end and the functionality provided through the interfaces. The Web interface at http://cansar.icr.ac.uk provides flexible, multipoint entry into canSAR. This allows easy access to the multidisciplinary data within, including target and compound synopses, bioactivity views and expert tools for chemogenomic, expression and protein interaction network data.

CiteSeerX

PubMed Central

Institute of Cancer Research Repository

11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.

Author: Abel R
Achenbach J
Adikwu UM
Ain QU
Al-Yamori R
Alhalabi Z
Aniceto N
Ansideri F
Baker D
Balducci A
Banting L
Barilla J
Barrett I
Basu D
Baumann K
Bender A
Berg E
Bergström F
Bermudez M
Bietz S
Bodnarchuk MS
Boeckler FM
Bojarski AJ
Borbulevych OY
Buchholz M
Bulusu KC
Bureau R
Böckler FM
Böttcher S
Büttner FM
Cao Q
Cappel D
Cheeseright T
Clark RD
Clark T
Da Costa FB
Dahlgren M
De Graaf C
Demuth H-U
Dorfman R
Dubrucq K
Ecker GF
Edman K
Egelkraut-Holtus M
Eid S
Eigner-Pitto V
Engel J
Engkvist O
Epple M
Essex JW
Evers A
Exner TE
Fan T-P
Fechner U
Finkelmann AR
Firaha DS
Firth M
Fourches D
Fraaije JH
Frach R
Fraczkiewicz R
Freitas A
Friedrich N-O
Friesner R
Fu X
Fuchs JE
Fulle S
Furtado F
Garg P
Gervasio FL
Ghafourian T
Glen R
Gracia RS
Grebner C
Guallar V
Göller AH
Günther MB
Günther S
Güssregen S
Haensele E
Heidrich J
Heil J
Hennig S
Herrmann G
Hessler G
Hilbig M
Himmler H-J
Hoffgaard F
Hogner A
Hollóczki O
Horinek D
Hošek P
Husch T
Ibezim A
Ihlenfeldt WD
Jardin C
Judson P
Jäger C
Kalinowski L
Kalliokoski T
Kast SM
Kibies P
Kirchmair J
Kirchner B
Kireeva N
Klute W
Koch O
Koch P
Kohlbacher O
Kolb P
Korth M
Kos A
Kramer C
Krilov G
Krotzky T
Kuhn H
Kuhn MA
Kurczab R
Kühne R
Lange A
Lanig H
Laufer S
Levine Z
Li X
Lifongo LL
Lin T
Lisurek M
Lokajíček MV
Mackey M
Masek BB
Mathea M
Matter H
Mbah CJ
Mbaze LM
McWilliams L
Mervin L
Mervin LH
Mittal S
Mohamad-Zobir SZ
Montanari F
Moser D
Mrugalla F
Mullen R
Murray DC
Nagy S
Nahum O
Naß A
Nguyen QD
Nogueira MS
Ntie-Kang F
Nwodo NJ
Oliveira Santos JS-D
Oliveira TB
Omoto K
Onlia I
Ostroumov D
Owen RM
Panecka J
Patel H
Pervov VS
Petrov A
Pisaková H
Pleik S
Polokoff M
Pongratz T
Pretzel J
Proschak E
Pryde DC
Pöhner IA
Rarey M
Rauh D
Renner G
Richmond NJ
Rickmeyer T
Rippmann F
Ross GA
Ruff M
Rupp B
Saladino G
Saleh N
Sandmann A
Schall C
Schmidt D
Schmidt TC
Schmidt TJ
Schmidtke P
Schneider G
Schomburg KT
Schram J
Schulz R
Schütter C
Segler MHS
Senderowitz H
Shaikh N
Shea J-E
Sherman W
Sievers-Engler A
Simoben CV
Simr P
Sippl W
Smith S
Solovev VP
Soltanshahi F
Sommer K
Sotriffer CA
Spiwok V
Stehle T
Steinbrecher TB
Steudle A
Sticht H
Strohfeldt S
Sánchez-García E
Tautermann CS
Torda AE
Torella R
Truszkowski A
Turk S
Tyrchan C
Ulander J
Van den Broek K
Van Oeyen A
Volkamer A
Wade RC
Waldman M
Waller MP
Wang L
Warszycki D
Weber J
Wessjohann L
Westerhoff LM
Whitley DC
Wieczorek V
Wolber G
Yosipof A
Zdrazil B
Zielesny A
Zimmermann MO
Zoufir A
Śmieja M
Publication venue: Springer Science and Business Media LLC
Publication date: 25/03/2016

Spiral - Imperial College Digital Repository

Scoping study to determine the data sources on biodiversity in diet and food intake : final report

Author: Bronselaer Antoon
De Baets Bernard
De Tré Guy
Lachat Carl
Smith Kathrin W
Van Camp John
Van Damme Patrick
Vanhove Wouter
Publication venue: Ghent University
Publication date: 01/01/2014

Ghent University Academic Bibliography

Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry

Author: A Makarov
AA Pontet
AJ Dempster
AL Rockwood
AM Richard
AW Jensen
B Seebass
BG Buchanan
C Djerassi
C Steinbeck
DA Laws
DL Olson
DL Wheeler
DR Scott
DS Wishart
F Csizmadia
H Budzikiewicz
HE Dayringer
J Braun
J Chen
J Lederberg
JC Lindon
JF Zhang
JJ Irwin
JK Senior
JL Faulon
JM Halket
JR De Laeter
L Sleno
M Badertscher
MD Soffer
ME Elyashberg
MP Balogh
N Huang
O Fiehn
Oliver Fiehn
P Murray-Rust
QY Wu
RG Dromey
S Heuerding
S Noury
S Omura
SE Stein
SR Heller
T Fink
T Kind
T Morikawa
Tobias Kind
V Wray
W Windig
WD Ihlenfeldt
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. RESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the complement of all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Third, 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80–99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries.
CONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain an unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65–81%. Corresponding software and supplemental data are available for download from the authors' website.
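To make the filtering idea more concrete, here is a minimal sketch of two of the checks mentioned in the abstract: a hydrogen/carbon ratio test and a mass accuracy test in ppm. It is not the authors' software. The function names are hypothetical, the H/C bounds in `hc_range` are illustrative placeholders rather than the thresholds used in the paper, and only the 3 ppm default comes from the abstract. The monoisotopic masses are standard rounded values.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, rounded.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
    "S": 31.9720707,
}


def parse_formula(formula):
    """Parse a flat formula such as 'C6H12O6' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(counts):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[element] * n for element, n in counts.items())


def ppm_error(measured, theoretical):
    """Relative mass deviation in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def passes_filters(formula, measured_mass, max_ppm=3.0, hc_range=(0.2, 3.1)):
    """Apply an H/C ratio check and a mass accuracy check.

    hc_range is an illustrative assumption, not a value from the paper.
    """
    counts = parse_formula(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        return False  # the ratio check is undefined without carbon
    hc_ratio = counts.get("H", 0) / carbon
    if not hc_range[0] <= hc_ratio <= hc_range[1]:
        return False
    theoretical = monoisotopic_mass(counts)
    return abs(ppm_error(measured_mass, theoretical)) <= max_ppm


print(passes_filters("C6H12O6", 180.0634))
```

A full implementation would also need the remaining rules (element count restrictions, LEWIS and SENIOR checks, isotopic pattern matching, heteroatom ratios, and so on), which are beyond the scope of this sketch.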

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central