Searching for network modules
When analyzing complex networks, a key target is to uncover their modular
structure, which means searching for a family of modules, namely node subsets
each spanning a subnetwork more densely connected than the average. This work
proposes a novel type of objective function for graph clustering, in the form
of a multilinear polynomial whose coefficients are determined by network
topology. It may be thought of as a potential function, to be maximized, taking
its values on fuzzy clusterings or families of fuzzy subsets of nodes over
which every node distributes a unit membership. When suitably parametrized,
this potential is shown to attain its maximum when every node concentrates its
entire unit membership on some module. The output is thus a partition, while the
original discrete optimization problem is turned into a continuous version
that admits alternative search strategies. An instance of the problem is a
pseudo-Boolean function assigning real-valued cluster scores to node subsets;
modularity maximization is used to exemplify a so-called quadratic form, in
which the scores of singletons and pairs fully determine the scores of larger
clusters and the resulting multilinear polynomial potential function has
degree 2. After considering further quadratic instances, different
from modularity and obtained by interpreting network topology in alternative
manners, a greedy local-search strategy for the continuous framework is
analytically compared with an existing greedy agglomerative procedure for the
discrete case. Overlapping is finally discussed in terms of multiple runs, i.e.
several local searches with different initializations. Comment: 10 pages
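As a minimal sketch of the quadratic instance named in the abstract above, the snippet below computes Newman-Girvan modularity for a hard partition of an undirected, unweighted graph. The function name `modularity`, the toy graph `EDGES`, and the two partitions are illustrative assumptions, not taken from the paper, and the paper's continuous (fuzzy) potential is not reproduced here.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman-Girvan modularity of a hard partition.

    edges: list of undirected edges (u, v) without duplicates or self-loops.
    membership: dict mapping each node to its cluster label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # number of edges with both ends in a cluster
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1

    total_degree = defaultdict(int)  # sum of node degrees per cluster
    for node, k in degree.items():
        total_degree[membership[node]] += k

    # Q = sum over clusters of (internal edge fraction - expected fraction)
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Hypothetical toy graph: two triangles joined by a single bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
TWO_MODULES = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
ONE_MODULE = {node: "a" for node in range(6)}

print(modularity(EDGES, TWO_MODULES))
print(modularity(EDGES, ONE_MODULE))
```

Only pairwise terms (edges and degree products) enter the score, which is the sense in which modularity is quadratic. On this toy graph the two-triangle partition scores higher than the single all-inclusive cluster.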
Markov clustering versus affinity propagation for the partitioning of protein interaction graphs
Background: Genome-scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. This task is commonly executed using clustering procedures, which aim at detecting densely connected regions within the interaction graphs. There exists a wealth of clustering algorithms, some of which have been applied to this problem. One of the most successful clustering procedures in this context has been the Markov Cluster algorithm (MCL), which was recently shown to outperform a number of other procedures, some of which were specifically designed for partitioning protein interaction graphs. A promising novel clustering procedure termed Affinity Propagation (AP) was recently shown to be particularly effective, and much faster than other methods, for a variety of problems, but has not yet been applied to partitioning protein interaction graphs. Results: In this work we compare the performance of the Affinity Propagation (AP) and Markov Clustering (MCL) procedures. To this end we derive an unweighted network of protein-protein interactions from a set of 408 protein complexes from S. cerevisiae, hand-curated in-house, and evaluate the performance of the two clustering algorithms in recalling the annotated complexes. In doing so, the parameter space of each algorithm is sampled in order to select optimal values for these parameters, and the robustness of the algorithms is assessed by quantifying the level of complex recall as interactions are randomly added to or removed from the network to simulate noise. To evaluate the performance on a weighted protein interaction graph, we also apply the two algorithms to the consolidated protein interaction network of S. cerevisiae, derived from genome-scale purification experiments, and to versions of this network in which varying proportions of the links have been randomly shuffled. Conclusion: Our analysis shows that the MCL procedure is significantly more tolerant to noise and behaves more robustly than the AP algorithm. The advantage of MCL over AP is dramatic for unweighted protein interaction graphs, as AP displays severe convergence problems on the majority of the unweighted graph versions that we tested, whereas MCL continues to identify meaningful clusters, albeit fewer of them, as the level of noise in the graph increases. MCL thus remains the method of choice for identifying protein complexes from binary interaction networks.
Differential analysis of longitudinal methicillin-resistant Staphylococcus aureus colonization in relation to microbial shifts in the nasal microbiome of neonatal piglets
Methicillin-resistant Staphylococcus aureus (MRSA) is an important human pathogen and often colonizes pigs. To lower the risk of MRSA transmission to humans, a reduction of MRSA prevalence and/or load in pig farms is needed. The nasal microbiome contains commensal species that may protect against MRSA colonization and may be used to develop competitive exclusion strategies. To obtain a comprehensive understanding of the species that compete with MRSA in the developing porcine nasal microbiome, and the moment of MRSA colonization, we analyzed nasal swabs from piglets in two litters. The swabs were taken longitudinally, starting directly after birth until 6 weeks. Both 16S rRNA and tuf gene sequencing data with different phylogenetic resolutions and complementary culture-based and quantitative real-time PCR (qPCR)-based MRSA quantification data were collected. We employed a compositionally aware bioinformatics approach (CoDaSeq - rmcorr) for analysis of longitudinal measurements of the nasal microbiota. The richness and diversity in the developing nasal microbiota increased over time, albeit with a reduction of Firmicutes and Actinobacteria, and an increase of Proteobacteria. Coabundant groups (CAGs) of species showing strong positive and negative correlation with colonization of MRSA and S. aureus were identified. Combining 16S rRNA and tuf gene sequencing provided greater Staphylococcus species resolution, which is necessary to inform strategies with potential protective effects against MRSA colonization in pigs. IMPORTANCE: The large reservoir of methicillin-resistant Staphylococcus aureus (MRSA) in pig farms imposes a significant zoonotic risk. An effective strategy to reduce MRSA colonization in pig farms is competitive exclusion whereby MRSA colonization can be reduced by the action of competing bacterial species. We complemented 16S rRNA gene sequencing with Staphylococcus-specific tuf gene sequencing to identify species anticorrelating with MRSA colonization. This approach allowed us to elucidate microbiome dynamics and identify species that are negatively and positively associated with MRSA, potentially suggesting a route for its competitive exclusion.
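Compositionally aware analyses of sequencing counts commonly begin with a centered log-ratio (CLR) transform. The sketch below shows that transform for a single sample. Whether the study applied exactly this step, and the `pseudocount` default used here to avoid taking the log of zero, are assumptions of this example rather than details from the abstract.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Each value is the log of the (pseudocount-adjusted) count minus the mean
    of those logs, i.e. the log-ratio to the sample's geometric mean.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


# Hypothetical counts for four taxa in one nasal swab.
sample = [120, 30, 0, 850]
print(clr(sample))
```

The transformed values sum to zero and, without a pseudocount, are unchanged when every count is multiplied by the same factor. This is why the transform suits data in which only relative abundances are informative.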
Can Parents Afford to Work? Childcare Costs, Tax-Benefit Policies and Work Incentives
Childcare policies play a crucial role in helping parents reconcile care and employment-related tasks. This paper quantifies the net cost of purchasing full-time centre-based childcare in OECD countries taking into account a wide range of influences on household budgets, including fees charged by childcare providers as well as childcare-related tax concessions and cash benefits available to parents. Building on these calculations, family resources are evaluated for different employment situations in order to assess the financial trade-offs between work and staying at home. Results are disaggregated to identify the policy features that present barriers to work for parents whose employment decisions are known to be particularly responsive to financial work incentives: lone parents and second earners with young children requiring care. The results indicate that the cost of purchasing childcare services should be analysed in conjunction with other social and fiscal policies that affect family incomes. While childcare fees can be very high, high prices may not impede employment if tax-benefit systems incorporate well-balanced provisions that help parents pay for these services. Conversely, even highly subsidised childcare markets can leave parents with little financial gain from employment if high tax burdens or benefit claw-back rates give rise to adverse work incentives
BNP controls early load-dependent regulation of SERCA through calcineurin
Heart failure is characterised by reduced expression of sarcoplasmic reticulum calcium-ATPase (SERCA) and increased expression of B-type natriuretic peptide (BNP). The present study was performed to investigate the causality of this inverse relationship under in vivo conditions in the transverse aortic constriction (TAC) mouse model. Left ventricular SERCA-mRNA expression was significantly upregulated in TAC by 32% after 6 h, but not different from sham after 24 h. Serum proANP and BNP levels were increased in TAC after 24 h (BNP +274%, p < 0.01; proANP +60%, p < 0.05), but only proANP levels were increased after 6 h (+182%, p < 0.01). cGMP levels were only increased 24 h after TAC (+307%, p < 0.01), but not 6 h after TAC. BNP infusion inhibited the increase in SERCA expression 6 h after TAC. In BNP-receptor-knockout animals (GC-A), the expression of SERCA was still significantly increased 24 h after TAC at the mRNA level by 35% (p < 0.05), as well as at the protein level by 25% (p < 0.05). MCIP expression as an indicator of calcineurin activity was regulated in parallel to SERCA after 6 and 24 h. MCIP-mRNA was increased by 333% 6 h after TAC, but not significantly different from sham after 24 h. In the GC-A-KO mice, MCIP-mRNA was significantly increased in TAC compared to WT after 24 h. In mice with BNP infusion, MCIP was significantly lower 6 h after TAC compared to control animals. In conclusion, mechanical load leads to an upregulation of SERCA expression. This is followed by upregulation of natriuretic peptides with subsequent suppression of SERCA upregulation. Elevated natriuretic peptides may suppress SERCA expression by inhibition of calcineurin activity via activation of GC-A.
Objective sequence-based subfamily classifications of mouse homeodomains reflect their in vitro DNA-binding preferences
Classifying proteins into subgroups with similar molecular function on the basis of sequence is an important step in deriving reliable functional annotations computationally. So far, however, available classification procedures have been evaluated against protein subgroups that are defined by experts using mainly qualitative descriptions of molecular function. Recently, in vitro DNA-binding preferences to all possible 8-nt DNA sequences have been measured for 178 mouse homeodomains using protein-binding microarrays, offering the unprecedented opportunity of evaluating the classification methods against quantitative measures of molecular function. To this end, we automatically derive homeodomain subtypes from the DNA-binding data and independently group the same domains using sequence information alone. We test five sequence-based methods, which use different sequence-similarity measures and algorithms to group sequences. Results show that methods that optimize the classification robustness reflect well the detailed functional specificity revealed by the experimental data. In some of these classifications, 73–83% of the subfamilies exactly correspond to, or are completely contained in, the function-based subtypes. Our findings demonstrate that certain sequence-based classifications are capable of yielding very specific molecular function annotations. The availability of quantitative descriptions of molecular function, such as DNA-binding data, will be a key factor in exploiting this potential in the future. Funding: Canadian Institutes of Health Research (MOP#82940); SickKids Foundation; Ontario Research Fund; National Science Foundation (U.S.); National Human Genome Research Institute (U.S.) (R01 HG003985)
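The agreement measure quoted in the abstract above (the share of sequence-based subfamilies that exactly match, or are completely contained in, a function-based subtype) can be sketched as a simple set-containment count. The function name and the toy domain identifiers below are hypothetical, and the authors' actual evaluation procedure may differ in its details.

```python
def fraction_contained(sequence_groups, function_subtypes):
    """Share of sequence-based groups that are subsets of some functional subtype.

    A group counts if it equals a subtype or lies entirely inside one.
    """
    subtype_sets = [set(s) for s in function_subtypes]
    hits = sum(
        1
        for group in sequence_groups
        if any(set(group) <= subtype for subtype in subtype_sets)
    )
    return hits / len(sequence_groups)


# Hypothetical identifiers; not real homeodomain names.
sequence_groups = [["hd1", "hd2"], ["hd3"], ["hd4", "hd5"]]
function_subtypes = [["hd1", "hd2", "hd3"], ["hd4"], ["hd5"]]

print(fraction_contained(sequence_groups, function_subtypes))
```

Here the first two sequence-based groups fall inside the first functional subtype, while the third group straddles two subtypes, so two of the three groups count as contained.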
Methods for visual mining of genomic and proteomic data atlases
Background: As the volume, complexity and diversity of the information that scientists work with on a daily basis continues to rise, so too does the requirement for new analytic software. The analytic software must solve the dichotomy that exists between the need to allow for a high level of scientific reasoning, and the requirement to have an intuitive and easy-to-use tool which does not require specialist, and often arduous, training to use. Information visualization provides a solution to this problem, as it allows for direct manipulation and interaction with diverse and complex data. The challenge facing bioinformatics researchers is how to apply this knowledge to data sets that are continually growing in a field that is rapidly changing. Results: This paper discusses an approach to the development of visual mining tools capable of supporting the mining of massive data collections used in systems biology research, and also discusses lessons that have been learned providing tools for both local researchers and the wider community. Example tools were developed which are designed to enable the exploration and analyses of both proteomics- and genomics-based atlases. These atlases represent large repositories of raw and processed experiment data generated to support the identification of biomarkers through mass spectrometry (the PeptideAtlas) and the genomic characterization of cancer (The Cancer Genome Atlas). Specifically, the tools are designed to allow for: the visual mining of thousands of mass spectrometry experiments, to assist in designing informed targeted protein assays; and the interactive analysis of hundreds of genomes, to explore the variations across different cancer genomes and cancer types. Conclusions: The mining of massive repositories of biological data requires the development of new tools and techniques. Visual exploration of the large-scale atlas data sets allows researchers to mine data to find new meaning and make sense at scales from single samples to entire populations. Providing linked task-specific views that allow a user to start from points of interest (from diseases to single genes) enables targeted exploration of thousands of spectra and genomes. As the composition of the atlases changes, and our understanding of the biology increases, new tasks will continually arise. It is therefore important to provide the means to make the data available in a suitable manner in as short a time as possible. We have done this through the use of common visualization workflows, into which we rapidly deploy visual tools. These visualizations follow common metaphors where possible to assist users in understanding the displayed data. Rapid development of tools and task-specific views allows researchers to mine large-scale data almost as quickly as it is produced. Ultimately these visual tools enable new inferences, new analyses and further refinement of the large-scale data being provided in atlases such as PeptideAtlas and The Cancer Genome Atlas.