Integration and analysis of large scale data in chemical biology
much lower molecular weight than macromolecules like proteins or DNA. Small molecules are grouped into different
families according to their physico-chemical or functional properties, and they can be either natural (like lipids) or
synthetic (like drugs). Only a strikingly small fraction of the small-molecule universe has been characterized, and
very little is known about it. For instance, we know that lipids can act as scaffolding and energy-storage
compounds, and that biological membranes differ in their lipid composition. However, we do not know whether this
composition influences biological functions such as protein recruitment to membranes and cellular transport.
Chemical biology uses chemicals to explore biological systems. Advances in the synthesis of large
chemical libraries and in high-throughput screening have led to technologies capable of studying protein-lipid
interactions at large scale and under physiological conditions. Answering such questions has therefore become possible, but
it presents many new computational challenges. One is to establish methods that automatically classify
interactions as binding or non-binding with minimal input from human experts. Another is to use unsupervised
clustering methods to identify groups of lipids and proteins exhibiting similar patterns and to link them to similar
biological functions.
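The two challenges above can be illustrated with a minimal sketch. It is not the pipeline described in this work: the score threshold, the Jaccard similarity measure, the single-linkage grouping, and all names and toy values are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations


def binarize(scores, threshold):
    """Call an interaction 'binding' when its score reaches the threshold."""
    return {
        protein: frozenset(lipid for lipid, s in row.items() if s >= threshold)
        for protein, row in scores.items()
    }


def jaccard(a, b):
    """Jaccard similarity of two sets of bound lipids (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(profiles, min_similarity):
    """Single-linkage grouping: two proteins are linked when the Jaccard
    similarity of their binding profiles is at least min_similarity."""
    parent = {p: p for p in profiles}

    def find(p):
        # Follow parent links to the cluster representative (with path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, q in combinations(profiles, 2):
        if jaccard(profiles[p], profiles[q]) >= min_similarity:
            parent[find(p)] = find(q)

    groups = {}
    for p in profiles:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy scores, not LiMA data.
scores = {
    "protA": {"PI(4,5)P2": 0.9, "PS": 0.8, "PC": 0.1},
    "protB": {"PI(4,5)P2": 0.7, "PS": 0.6, "PC": 0.2},
    "protC": {"PI(4,5)P2": 0.1, "PS": 0.2, "PC": 0.9},
}
profiles = binarize(scores, threshold=0.5)
clusters = cluster(profiles, min_similarity=0.5)
print(clusters)
```

In this toy example protA and protB bind the same lipids and fall into one group, while protC forms its own. A real analysis would need a calibrated classifier in place of the fixed threshold.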
To tackle these challenges, I have developed a computational pipeline that performs a technical and functional analysis
of the readouts produced by the high-throughput technology LiMA. Applying it to a screen of 94 proteins and 122
lipid combinations that yielded more than 10,000 interactions, I have demonstrated that cooperativity is a key mechanism
for membrane recruitment and that it applies to most PH domains. Furthermore, I have identified a conserved
motif that confers on PH domains the ability to be recruited to organellar membranes and that is linked to cellular transport
functions. Two amino acids of this motif are found mutated in some human cancers, and we predicted and confirmed
that these mutations could induce discrete changes in binding affinities in vitro and protein mislocalization in vivo.
These results represent milestones in the field of protein-lipid interactions.
While we are progressing toward a global understanding of protein-lipid interactions, data on the bioactivities of
small molecules are accumulating at a tremendous speed. In vitro data on interactions with targets are complemented
by other molecular and phenotypic readouts, such as gene expression profiles or toxicity readouts. The diversity
of screening technologies, together with major efforts to collect the resulting data in public databases, has created
unprecedented opportunities for chemo-informatics work to integrate these data and make new inferences. For instance,
is the protein target profile of a drug correlated with a given phenotype? Can we predict the side effects of a drug
based on its toxicology readouts? In this context, I have developed CART, a computational platform that
addresses major chemo-informatics challenges in order to answer such questions. CART integrates many resources covering
molecular and phenotypic readouts, and annotates sets of chemical names with these integrated resources. CART
includes state-of-the-art full-text search engine technologies in order to match chemical names with very high speed
and accuracy. Importantly, CART is a scalable resource that can cope with the increasing number of new chemical
annotation resources, and therefore constitutes a major contribution to chemical biology.
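The name-matching step can be pictured with a small inverted index, the basic data structure behind full-text search engines. This is only a sketch of the general technique and does not describe CART's implementation: the `NameIndex` class, its methods, and the compound identifiers are hypothetical.

```python
import re
from collections import defaultdict


def tokens(name):
    """Lowercase a chemical name and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", name.lower())


class NameIndex:
    """Hypothetical index mapping chemical names and synonyms to compound ids."""

    def __init__(self):
        self._exact = {}                    # normalized full name -> compound id
        self._postings = defaultdict(set)   # token -> ids of compounds using it

    def add(self, compound_id, name):
        toks = tokens(name)
        self._exact[" ".join(toks)] = compound_id
        for t in toks:
            self._postings[t].add(compound_id)

    def lookup(self, query):
        """Return the exact match if one exists; otherwise return the sorted ids
        of compounds whose indexed names together contain every query token."""
        toks = tokens(query)
        if not toks:
            return []
        key = " ".join(toks)
        if key in self._exact:
            return [self._exact[key]]
        hits = set.intersection(*(self._postings.get(t, set()) for t in toks))
        return sorted(hits)


# Hypothetical identifiers and synonyms for illustration only.
index = NameIndex()
index.add("C1", "Acetylsalicylic acid")
index.add("C1", "Aspirin")
index.add("C2", "Salicylic acid")

print(index.lookup("ASPIRIN"))
print(index.lookup("acetylsalicylic-acid"))
print(index.lookup("acid"))
```

Normalizing case and punctuation before lookup lets spelling variants of the same name resolve to one compound, and token postings give a fallback when no exact name matches.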