89 research outputs found
Network analysis and data mining in food science: the emergence of computational gastronomy
Abstract The rapidly growing body of publicly available data on food chemistry and food usage can be analysed using data mining and network analysis methods. Here we discuss how these approaches can yield new insights both into the sensory perception of food and the anthropology of culinary practice. We also show that this development is part of a larger trend. Over the past two decades large-scale data analysis has revolutionized the biological sciences, which have experienced an explosion of experimental data as a result of the advent of high-throughput technology. Large datasets are also changing research methodologies in the social sciences due to the data generated by mobile communication technology and online social networks. Even the arts and humanities are seeing the establishment of ‘digital humanities’ research centres in order to cope with the increasing digitization of literary and historical sources. We argue that food science is likely to be one of the next beneficiaries of large-scale data analysis, perhaps resulting in fields such as ‘computational gastronomy’. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are
Using small samples to estimate neutral component size and robustness in the genotype-phenotype map of RNA secondary structure.
In genotype-phenotype (GP) maps, the genotypes that map to the same phenotype are usually not randomly distributed across the space of genotypes, but instead are predominantly connected through one-point mutations, forming network components that are commonly referred to as neutral components (NCs). Because of their impact on evolutionary processes, the characteristics of these NCs, like their size or robustness, have been studied extensively. Here, we introduce a framework that allows the estimation of NC size and robustness in the GP map of RNA secondary structure. The advantage of this framework is that it only requires small samples of genotypes and their local environment, which also allows experimental realizations. We verify our framework by applying it to the exhaustively analysable GP map of RNA sequence length L = 15, and benchmark it against an existing method by applying it to longer, naturally occurring functional non-coding RNA sequences. Although it is specific to the RNA secondary structure GP map in the first place, our framework can probably be transferred and adapted to other sequence-to-structure GP maps. MW was supported by the EPSRC and the Gatsby Charitable Foundation. SEA was supported by the Gatsby Charitable Foundation
Phenotypes can be robust and evolvable if mutations have non-local effects on sequence constraints.
The mapping between biological genotypes and phenotypes plays an important role in evolution, and understanding the properties of this mapping is crucial to determine the outcome of evolutionary processes. One of the most striking properties observed in several genotype-phenotype (GP) maps is the positive correlation between the robustness and evolvability of phenotypes. This implies that a phenotype can be strongly robust against mutations and at the same time evolvable to a diverse range of alternative phenotypes. Here, we examine the causes for this positive correlation by introducing two analytically tractable GP map models that follow the principles of real biological GP maps. The first model is based on gene-like GP maps, reflecting the way in which genetic sequences are organized into protein-coding genes, and the second one is based on the GP map of RNA secondary structure. For both models, we find that a positive correlation between phenotype robustness and evolvability only emerges if mutations at one sequence position can have non-local effects on the sequence constraints at another position. This highlights that non-local effects of mutations are closely related to the coexistence of robustness and evolvability in phenotypes, and are likely to be an important feature of many biological GP maps
Neutral components show a hierarchical community structure in the genotype-phenotype map of RNA secondary structure.
Genotype-phenotype (GP) maps describe the relationship between biological sequences and structural or functional outcomes. They can be represented as networks in which genotypes are the nodes, and one-point mutations between them are the edges. The genotypes that map to the same phenotype form subnetworks consisting of one or multiple disjoint connected components, so-called neutral components (NCs). For the GP map of RNA secondary structure, the NCs have been found to exhibit distinctive network features that can affect the dynamical processes taking place on them. Here, we focus on the community structure of RNA secondary structure NCs. Building on previous findings, we introduce a method to reveal the hierarchical community structure solely from the sequence constraints and composition of the genotypes that form a given NC. Thereby, we obtain modularity values similar to common community detection algorithms, which are much more complex. From this knowledge, we endorse a sampling method that allows a fast exploration of the different communities of a given NC. Furthermore, we introduce a way to estimate the community structure from genotype samples, which is useful when an exhaustive analysis of the NC is not feasible, as is the case for longer sequence lengths. MW was supported by the EPSRC and the Gatsby Charitable Foundation. SEA was supported by the Gatsby Charitable Foundation and the Alan Turing Institute
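The network picture described in this abstract (genotypes as nodes, one-point mutations as edges, neutral components as connected components of equal phenotype) can be illustrated with a minimal Python sketch. It is not the authors' method: the function `toy_phenotype` is a hypothetical stand-in that counts G/C bases instead of folding RNA, and all names here are assumptions chosen for illustration only.

```python
from collections import deque
from itertools import product

ALPHABET = "AUGC"


def toy_phenotype(genotype: str) -> int:
    """Hypothetical phenotype (NOT RNA folding): number of G/C bases."""
    return sum(base in "GC" for base in genotype)


def neighbours(genotype: str):
    """Yield every genotype reachable by a single one-point mutation."""
    for i, base in enumerate(genotype):
        for alt in ALPHABET:
            if alt != base:
                yield genotype[:i] + alt + genotype[i + 1:]


def neutral_components(length: int) -> dict:
    """Map each phenotype to a list of its neutral components (sets of genotypes)."""
    seen = set()
    components = {}
    for letters in product(ALPHABET, repeat=length):
        start = "".join(letters)
        if start in seen:
            continue
        phenotype = toy_phenotype(start)
        component = {start}
        seen.add(start)
        queue = deque([start])
        # Breadth-first search restricted to neighbours with the same phenotype.
        while queue:
            current = queue.popleft()
            for other in neighbours(current):
                if other not in seen and toy_phenotype(other) == phenotype:
                    seen.add(other)
                    component.add(other)
                    queue.append(other)
        components.setdefault(phenotype, []).append(component)
    return components


def robustness(genotype: str) -> float:
    """Fraction of one-point mutants that keep the same phenotype."""
    mutants = list(neighbours(genotype))
    same = sum(toy_phenotype(m) == toy_phenotype(genotype) for m in mutants)
    return same / len(mutants)


if __name__ == "__main__":
    for phenotype, comps in sorted(neutral_components(3).items()):
        print(phenotype, [len(c) for c in comps])
```

Even in this toy map a single phenotype can split into several disjoint neutral components, which is the situation the abstract describes. Exhaustive enumeration like this is only feasible for short sequences, which is why sample-based estimates matter for longer ones.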
Temperature in complex networks
Various statistical-mechanics approaches to complex networks have been proposed to describe expected topological properties in terms of ensemble averages. Here we extend this formalism by introducing the fundamental concept of graph temperature, controlling the degree of topological optimization of a network. We recover the temperature-dependent version of various important models as particular cases of our approach, and show examples where, remarkably, the onset of a percolation transition, a scale-free degree distribution, correlations and clustering can be understood as natural properties of an optimized (low-temperature) topology. We then apply our formalism to real weighted networks and we compute their temperature, finding that various techniques used to extract information from complex networks are again particular cases of our approach
Evolution of interface binding strengths in simplified model of protein quaternary structure.
The self-assembly of proteins into protein quaternary structures is of fundamental importance to many biological processes, and protein misassembly is responsible for a wide range of proteopathic diseases. In recent years, abstract lattice models of protein self-assembly have been used to simulate the evolution and assembly of protein quaternary structure, and to provide a tractable way to study the genotype-phenotype map of such systems. Here we generalize these models by representing the interfaces as mutable binary strings. This simple change enables us to model the evolution of interface strengths, interface symmetry, and deterministic assembly pathways. Using the generalized model we are able to reproduce two important results established for real protein complexes: The first is that protein assembly pathways are under evolutionary selection to minimize misassembly. The second is that the assembly pathway of a complex mirrors its evolutionary history, and that both can be derived from the relative strengths of interfaces. These results demonstrate that the generalized lattice model offers a powerful new idealized framework to facilitate the study of protein self-assembly processes and their evolution
Networks for all
A report on the Cold Spring Harbor Laboratory/Wellcome Trust conference on Network Biology, Hinxton, UK, 27-31 August 2008
Optimal scales in weighted networks
The analysis of networks characterized by links with heterogeneous intensity or weight suffers from two long-standing problems of arbitrariness. On one hand, the definitions of topological properties introduced for binary graphs can be generalized in non-unique ways to weighted networks. On the other hand, even when a definition is given, there is no natural choice of the (optimal) scale of link intensities (e.g. the money unit in economic networks). Here we show that these two seemingly independent problems can be regarded as intimately related, and propose a common solution to both. Using a formalism that we recently proposed in order to map a weighted network to an ensemble of binary graphs, we introduce an information-theoretic approach leading to the least biased generalization of binary properties to weighted networks, and at the same time fixing the optimal scale of link intensities. We illustrate our method on various social and economic networks. Comment: Accepted for presentation at SocInfo 2013, Kyoto, 25-27 November 2013 (http://www.socinfo2013.org)
The Network Turn
This Element contends that networks are a category of study that cuts across traditional academic barriers, uniting diverse disciplines through a shared understanding of complexity in our world. This title is also available as Open Access on Cambridge Core