137 research outputs found
A gold standard set of mechanistically diverse enzyme superfamilies
Superfamily and family analyses provide an effective tool for the functional classification of proteins, but must be automated for use on large datasets. We describe a 'gold standard' set of enzyme superfamilies, clustered according to specific sequence, structure, and functional criteria, for use in the validation of family and superfamily clustering methods. The gold standard set represents four fold classes and differing clustering difficulties, and includes five superfamilies, 91 families, 4,887 sequences and 282 structures
Molecular Diversity of Terpene Synthases in the Liverwort Marchantia polymorpha
Marchantia polymorpha is a basal terrestrial land plant, which like most liverworts accumulates structurally diverse terpenes believed to serve in deterring disease and herbivory. Previous studies have suggested that the mevalonate and methylerythritol phosphate pathways, present in evolutionarily diverged plants, are also operative in liverworts. However, the genes and enzymes responsible for the chemical diversity of terpenes have yet to be described. In this study, we used a HMMER search tool to identify 17 putative terpene synthase genes from M. polymorpha transcriptomes. Functional characterization identified four diterpene synthase genes phylogenetically related to those found in diverged plants and nine rather unusual monoterpene and sesquiterpene synthase-like genes. The presence of separate monofunctional diterpene synthases for ent-copalyl diphosphate and ent-kaurene biosynthesis is similar to orthologs found in vascular plants, pushing the date of the underlying gene duplication and neofunctionalization of the ancestral diterpene synthase gene family to >400 million years ago. By contrast, the mono- and sesquiterpene synthases represent a distinct class of enzymes, not related to previously described plant terpene synthases and only distantly so to microbial-type terpene synthases. The absence of a Mg²⁺-binding, aspartate-rich DDXXD motif places these enzymes in a noncanonical family of terpene synthases
Exploring the future of data-driven product design
Connected devices present new opportunities to advance design through data collection in the wild, similar to the way digital services evolve through analytics. However, it is still unclear how live data transmitted by connected devices informs the design of these products, going beyond performance optimisation to support creative practices. Design can be enriched by data captured by connected devices, from usage logs to environmental sensors, and data about the devices and people around them. Through a series of workshops, this paper contributes industry and academia perspectives on the future of data-driven product design. We highlight HCI challenges, issues and implications, including sensemaking and the generation of design insight. We further challenge current notions of data-driven design and envision ways in which future HCI research can develop ways to work with data in the design process in a connected, rich, human manner
Empirical audit and review and an assessment of evidentiary value in research on the psychological consequences of scarcity.
Empirical audit and review is an approach to assessing the evidentiary value of a research area. It involves identifying a topic and selecting a cross-section of studies for replication. We apply the method to research on the psychological consequences of scarcity. Starting with the papers citing a seminal publication in the field, we conducted replications of 20 studies that evaluate the role of scarcity priming in pain sensitivity, resource allocation, materialism, and many other domains. There was considerable variability in the replicability, with some strong successes and other undeniable failures. Empirical audit and review does not attempt to assign an overall replication rate for a heterogeneous field, but rather facilitates researchers seeking to incorporate strength of evidence as they refine theories and plan new investigations in the research area. This method allows for an integration of qualitative and quantitative approaches to review and enables the growth of a cumulative science
The Structure-Function Linkage Database
The Structure–Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure–function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies ‘look alike’, making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity
Objective sequence-based subfamily classifications of mouse homeodomains reflect their in vitro DNA-binding preferences
Classifying proteins into subgroups with similar molecular function on the basis of sequence is an important step in deriving reliable functional annotations computationally. So far, however, available classification procedures have been evaluated against protein subgroups that are defined by experts using mainly qualitative descriptions of molecular function. Recently, in vitro DNA-binding preferences to all possible 8-nt DNA sequences have been measured for 178 mouse homeodomains using protein-binding microarrays, offering the unprecedented opportunity of evaluating the classification methods against quantitative measures of molecular function. To this end, we automatically derive homeodomain subtypes from the DNA-binding data and independently group the same domains using sequence information alone. We test five sequence-based methods, which use different sequence-similarity measures and algorithms to group sequences. Results show that methods that optimize the classification robustness reflect well the detailed functional specificity revealed by the experimental data. In some of these classifications, 73–83% of the subfamilies exactly correspond to, or are completely contained in, the function-based subtypes. Our findings demonstrate that certain sequence-based classifications are capable of yielding very specific molecular function annotations. The availability of quantitative descriptions of molecular function, such as DNA-binding data, will be a key factor in exploiting this potential in the future. Funding: Canadian Institutes of Health Research (MOP#82940); Sickkids Foundation; Ontario Research Fund; National Science Foundation (U.S.); National Human Genome Research Institute (U.S.) (R01 HG003985)
Correction for O’Donnell et al., Empirical audit and review and an assessment of evidentiary value in research on the psychological consequences of scarcity
Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies
Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%–63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with “overprediction” of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation
InterPro in 2019: improving coverage, classification and access to protein sequence annotations
The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities
Enjoy! Assertive Language and Consumer Compliance in (Non)Hedonic Contexts
This paper is concerned with the tension between consumer persuasion and freedom of choice. We study how assertive language (as in the slogan Just do it!) affects consumer compliance in hedonic vs. utilitarian contexts. Previous literature consistently claimed that forceful language would cause reactance and decreased compliance. However, we find in four studies that assertive persuasion is effective in contexts involving hedonic goods and hedonically framed utilitarian goods. Our hypotheses emerge from sociolinguistic research and confirm the relevance of linguistic research in consumer behavior