Exploration of Reaction Pathways and Chemical Transformation Networks
For the investigation of chemical reaction networks, the identification of
all relevant intermediates and elementary reactions is mandatory. Many
algorithmic approaches exist that perform such explorations efficiently and in an
automated fashion. These approaches differ in their application range, the level of
completeness of the exploration, as well as the amount of heuristics and human
intervention required. Here, we describe and compare the different approaches
based on these criteria. Future directions leveraging the strengths of chemical
heuristics, human interaction, and physical rigor are discussed.
Comment: 48 pages, 4 figures
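As a rough illustration of the exploration loop such approaches share, the sketch below runs a breadth-first search over a toy reaction network, pruning elementary steps whose estimated barrier exceeds a heuristic cutoff. The species names, barrier values, and the `apply_elementary_steps` generator are hypothetical stand-ins for a real rule-based or electronic-structure engine, not any specific method from the review.

```python
from collections import deque

# Hypothetical elementary-step generator: maps a species to
# (product, estimated_barrier) pairs. A real exploration would call
# a rule-based or electronic-structure engine here.
def apply_elementary_steps(species):
    toy_network = {
        "A": [("B", 0.8), ("C", 2.5)],
        "B": [("D", 0.5)],
        "C": [("D", 0.3)],
        "D": [],
    }
    return toy_network[species]

def explore(start, barrier_cutoff=1.0):
    """Breadth-first exploration of intermediates and elementary
    reactions, pruning steps above the barrier cutoff (a simple
    energy heuristic that keeps the network tractable)."""
    seen = {start}
    reactions = []
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for product, barrier in apply_elementary_steps(s):
            if barrier > barrier_cutoff:
                continue  # heuristic pruning of high-barrier steps
            reactions.append((s, product, barrier))
            if product not in seen:
                seen.add(product)
                queue.append(product)
    return seen, reactions

species, reactions = explore("A")
```

Loosening the cutoff admits more elementary steps (here, the A-to-C channel) at the cost of a larger network, which is the completeness-versus-cost trade-off the abstract describes.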
The Future of Computation
``The purpose of life is to obtain knowledge, use it to live with as much
satisfaction as possible, and pass it on with improvements and modifications to
the next generation.'' This may sound philosophical, and the interpretation of
words may be subjective, yet it is fairly clear that this is what all living
organisms--from bacteria to human beings--do in their lifetime. Indeed, this
can be adopted as the information theoretic definition of life. Over billions
of years, biological evolution has experimented with a wide range of physical
systems for acquiring, processing and communicating information. We are now in
a position to make the principles behind these systems mathematically precise,
and then extend them as far as laws of physics permit. Therein lies the future
of computation, of ourselves, and of life.
Comment: 7 pages, RevTeX. Invited lecture at the Workshop on Quantum
Information, Computation and Communication (QICC-2005), IIT Kharagpur, India,
February 200
Computer aided framework for designing bio-based commodity molecules with enhanced properties
We investigate the use of a computer-aided molecular design (CAMD) approach for enhancing the properties of existing molecules by modifying their chemical structure to match target property values. Tailoring molecules requires aggregating knowledge disseminated across the whole chemical enterprise hierarchy, from the manager level to the chemists and chemical engineers, each with different backgrounds and perceptions of what the ideal molecule would be. We therefore propose a framework that allows the search to match all requirements while capitalizing on this knowledge spread among actors with different backgrounds, with the help of SBVR (Semantics of Business Vocabulary and Rules) and OCL (Object Constraint Language). In the context of using biomass as the feedstock, we discuss the coupling of CAMD tools with computer-aided organic synthesis tools so as to propose enhanced bio-sourced molecule candidates that could be synthesized through eco-friendly pathways. Finally, we evaluate the sustainability of the molecules and of the whole decision process. Specific applications concerning bio-sourced molecules are presented: derivatives of the chemical platform molecule itaconic acid proposed as substitutes for the solvents N-methyl-2-pyrrolidone (NMP) and dimethylformamide (DMF), and derivatives of lipids to be used as biolubricants.
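At its core, CAMD is a generate-and-test search over candidate structures against target property windows. The sketch below illustrates that loop with a toy group-contribution property model; the functional groups, their contribution values, and the target window are purely illustrative assumptions, not data from the paper.

```python
from itertools import combinations_with_replacement

# Toy group-contribution table (illustrative values, not real data):
# each group adds a fixed increment to a hypothetical target property.
GROUP_CONTRIBUTION = {"-CH3": 0.5, "-OH": 1.0, "-COOH": 2.0, "-ester": 1.5}

def estimate_property(groups):
    """Group-contribution estimate: sum of the groups' increments."""
    return sum(GROUP_CONTRIBUTION[g] for g in groups)

def camd_search(target_lo, target_hi, max_groups=3):
    """Generate-and-test CAMD loop: enumerate group combinations and
    keep candidates whose estimated property falls in the target window."""
    hits = []
    for n in range(1, max_groups + 1):
        for combo in combinations_with_replacement(GROUP_CONTRIBUTION, n):
            p = estimate_property(combo)
            if target_lo <= p <= target_hi:
                hits.append((combo, p))
    return hits

candidates = camd_search(2.25, 2.75)
```

In the framework the abstract describes, the property window would be derived from SBVR/OCL-encoded requirements gathered from the different actors, and the surviving candidates would then be screened for synthesizability.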
Effect of promoter architecture on the cell-to-cell variability in gene expression
According to recent experimental evidence, the architecture of a promoter,
defined as the number, strength and regulatory role of the operators that
control the promoter, plays a major role in determining the level of
cell-to-cell variability in gene expression. These quantitative experiments
call for a corresponding modeling effort that addresses the question of how
changes in promoter architecture affect noise in gene expression in a
systematic rather than case-by-case fashion. In this article, we make such a
systematic investigation, based on a simple microscopic model of gene
regulation that incorporates stochastic effects. In particular, we show how
operator strength and operator multiplicity affect this variability. We examine
different modes of transcription factor binding to complex promoters
(cooperative, independent, simultaneous) and how each of these affects the
level of variability in transcription product from cell-to-cell. We propose
that direct comparison between in vivo single-cell experiments and theoretical
predictions for the moments of the probability distribution of mRNA number per
cell can discriminate between different kinetic models of gene regulation.
Comment: 35 pages, 6 figures, Submitted
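As a hedged illustration of how moments of the mRNA distribution can discriminate kinetic models, the sketch below runs a Gillespie simulation of the standard two-state (telegraph) promoter. The rate constants are illustrative assumptions; the point is that a Fano factor well above 1 distinguishes this bursty model from a constitutive promoter, whose steady-state mRNA distribution is Poisson with Fano factor exactly 1.

```python
import random

def gillespie_telegraph(k_on, k_off, k_tx, gamma, t_end, seed=0):
    """Gillespie simulation of a two-state (telegraph) promoter:
    OFF <-> ON switching, transcription only from ON, first-order
    mRNA degradation. Returns the time-averaged mean and Fano factor
    of the mRNA copy number."""
    rng = random.Random(seed)
    on, m, t = False, 0, 0.0
    w_sum = w_m = w_m2 = 0.0  # time-weighted moment accumulators
    while t < t_end:
        rates = [k_on * (not on), k_off * on, k_tx * on, gamma * m]
        total = sum(rates)
        dt = rng.expovariate(total)
        # accumulate time-weighted moments over the waiting interval
        w_sum += dt
        w_m += m * dt
        w_m2 += m * m * dt
        t += dt
        r = rng.uniform(0, total)  # pick which reaction fires
        if r < rates[0]:
            on = True
        elif r < rates[0] + rates[1]:
            on = False
        elif r < rates[0] + rates[1] + rates[2]:
            m += 1
        else:
            m -= 1
    mean = w_m / w_sum
    fano = (w_m2 / w_sum - mean * mean) / mean
    return mean, fano

mean, fano = gillespie_telegraph(k_on=0.1, k_off=0.9, k_tx=50.0,
                                 gamma=1.0, t_end=5000.0)
```

With these parameters the promoter is rarely ON but transcribes in bursts, so the simulated Fano factor greatly exceeds the Poisson value of 1; measuring such super-Poissonian noise in single cells is exactly the kind of moment-based discrimination the abstract proposes.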
Evolutionary Computation and QSAR Research
The successful high-throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. Virtual molecular filtering and screening relies heavily on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with predicted toxic effects or poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning, and artificial intelligence. QSAR modeling relies on three main steps: codification of the molecular structure into molecular descriptors, selection of the variables relevant to the analyzed activity, and search for the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid the variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. It explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, selection methods for high-dimensional data in QSAR, methods to build QSAR models, current evolutionary feature selection methods and their applications in QSAR, and future trends toward joint or multi-task feature selection methods.
Funding: Instituto de Salud Carlos III (PIO52048; RD07/0067/0005); Ministerio de Industria, Comercio y Turismo (TSI-020110-2009-53); Galicia, Consellería de Economía e Industria (10SIN105004P)
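A minimal sketch of the variable-selection step using a genetic algorithm is shown below. The toy dataset, the correlation-based fitness function, and the size penalty are all assumptions made for illustration; a real QSAR workflow would score each descriptor subset by fitting and cross-validating a regression model.

```python
import random

random.seed(1)

# Toy QSAR data (illustrative, not a real dataset): 60 molecules,
# 5 descriptors; the activity depends on descriptors 0 and 2 only.
N_MOL, N_DESC = 60, 5
X = [[random.gauss(0, 1) for _ in range(N_DESC)] for _ in range(N_MOL)]
y = [2.0 * x[0] - 1.5 * x[2] + random.gauss(0, 0.5) for x in X]

def r_squared(j):
    """Squared Pearson correlation between descriptor j and activity."""
    xs = [x[j] for x in X]
    mx, my = sum(xs) / N_MOL, sum(y) / N_MOL
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, y))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def fitness(mask):
    """Reward correlated descriptors, penalize subset size (a crude
    stand-in for a cross-validated regression objective)."""
    return sum(r_squared(j) for j in range(N_DESC) if mask >> j & 1) \
        - 0.1 * bin(mask).count("1")

def ga_select(pop_size=20, generations=30, p_mut=0.1):
    """Genetic algorithm over descriptor-subset bitmasks."""
    pop = [random.randrange(1 << N_DESC) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]        # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_DESC)   # one-point crossover
            child = (a & ((1 << cut) - 1)) | (b & ~((1 << cut) - 1))
            for j in range(N_DESC):             # bit-flip mutation
                if random.random() < p_mut:
                    child ^= 1 << j
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = ga_select()
selected = [j for j in range(N_DESC) if best >> j & 1]
```

Encoding a subset as a bitmask keeps crossover and mutation trivial, which is why bitstring chromosomes are the usual choice for feature selection with genetic algorithms.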
The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4
In recent years, groundbreaking advancements in natural language processing
have culminated in the emergence of powerful large language models (LLMs),
which have showcased remarkable capabilities across a vast array of domains,
including the understanding, generation, and translation of natural language,
and even tasks that extend beyond language processing. In this report, we delve
into the performance of LLMs within the context of scientific discovery,
focusing on GPT-4, the state-of-the-art language model. Our investigation spans
a diverse range of scientific areas encompassing drug discovery, biology,
computational chemistry (density functional theory (DFT) and molecular dynamics
(MD)), materials design, and partial differential equations (PDE). Evaluating
GPT-4 on scientific tasks is crucial for uncovering its potential across
various research domains, validating its domain-specific expertise,
accelerating scientific progress, optimizing resource allocation, guiding
future model development, and fostering interdisciplinary research. Our
exploration methodology primarily consists of expert-driven case assessments,
which offer qualitative insights into the model's comprehension of intricate
scientific concepts and relationships, and occasionally benchmark testing,
which quantitatively evaluates the model's capacity to solve well-defined
domain-specific problems. Our preliminary exploration indicates that GPT-4
exhibits promising potential for a variety of scientific applications,
demonstrating its aptitude for handling complex problem-solving and knowledge
integration tasks. Broadly speaking, we evaluate GPT-4's knowledge base,
scientific understanding, scientific numerical calculation abilities, and
various scientific prediction capabilities.
Comment: 230-page report; 181 pages for main content
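The benchmark-testing side of such an evaluation can be scored very simply, for example by normalized exact match on closed-form questions. The sketch below is a hypothetical illustration of that protocol; the benchmark items and answers are invented, not taken from the report.

```python
def exact_match_accuracy(predictions, references):
    """Score model answers on a closed-form benchmark by normalized
    exact match: lowercase and collapse whitespace before comparing."""
    def norm(s):
        return " ".join(s.lower().split())
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / len(references)

# Hypothetical benchmark items; the predictions would come from the
# model under test.
refs = ["4", "H2O", "second order"]
preds = ["4", "h2o", "first order"]
acc = exact_match_accuracy(preds, refs)
```

Exact match is only suitable for well-defined, short-answer tasks; the expert-driven case assessments the abstract describes cover exactly the open-ended questions such a metric cannot grade.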