
    BIOZON: a system for unification, management and analysis of heterogeneous biological data

    BACKGROUND: Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increases rapidly. Amongst the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability. DESCRIPTION: Here we present a system (Biozon) that addresses these problems, and offers biologists a new knowledge resource to navigate through and explore. Biozon unifies multiple biological databases consisting of a variety of data types (such as DNA sequences, proteins, interactions and cellular pathways). It is fundamentally different from previous efforts as it uses a single extensive and tightly connected graph schema wrapped with a hierarchical ontology of documents and relations. Beyond warehousing existing data, Biozon computes and stores novel derived data, such as similarity relationships and functional predictions. The integration of similarity data allows propagation of knowledge through inference and fuzzy searches. Sophisticated query methods that span multiple data types were implemented, and first-of-a-kind biological ranking systems were explored and integrated. CONCLUSION: The Biozon system is an extensive knowledge resource of heterogeneous biological data. Currently, it holds more than 100 million biological documents and 6.5 billion relations between them. The database is accessible through an advanced web interface that supports complex queries, "fuzzy" searches, data materialization and more, online at

    Graph theoretic methods for the analysis of structural relationships in biological macromolecules

    Subgraph isomorphism and maximum common subgraph isomorphism algorithms from graph theory provide an effective and efficient way of identifying structural relationships between biological macromolecules. They thus provide a natural complement to the pattern matching algorithms that are used in bioinformatics to identify sequence relationships. Examples are provided of the use of graph theory to analyze proteins for which three-dimensional crystallographic or NMR structures are available, focusing on the use of the Bron-Kerbosch clique detection algorithm to identify common folding motifs and of the Ullmann subgraph isomorphism algorithm to identify patterns of amino acid residues. Our methods are also applicable to other types of biological macromolecule, such as carbohydrate and nucleic acid structures.
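
The clique detection step named in this abstract can be illustrated with a minimal sketch of the Bron-Kerbosch algorithm (the variant with pivoting). The toy graph, its node labels and the function name `bron_kerbosch` are assumptions made for illustration; the abstract does not say how the authors encode protein structures as graphs.

```python
from typing import Dict, List, Set


def bron_kerbosch(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Return all maximal cliques of an undirected graph without self-loops."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        # r: current clique, p: candidates that can extend it,
        # x: vertices already explored (used to avoid duplicates).
        if not p and not x:
            cliques.append(set(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates;
        # only non-neighbours of the pivot need to be branched on.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Hypothetical toy graph: a triangle a-b-c with a tail c-d-e.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
maximal_cliques = sorted(sorted(c) for c in bron_kerbosch(toy_graph))
```

On this toy graph the maximal cliques are the triangle and the two tail edges. In a structural comparison setting, the largest clique would typically be the one of interest.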

    Trademark Searching Tools and Strategies: Questions for the New Millennium

    The intent of this discussion is to raise questions about trademark searching which will be discussed in future issues of IDEA. I will lead you through the questions raised by my journey through primarily legal literature in treatises and periodicals on the Lexis and Westlaw platforms.

    Enhancing the effectiveness of ligand-based virtual screening using data fusion

    Data fusion is increasingly being used to combine the outputs of different types of sensor. This paper reviews the application of the approach to ligand-based virtual screening, where the sensors to be combined are functions that score molecules in a database on their likelihood of exhibiting some required biological activity. Much of the literature to date involves the combination of multiple similarity searches, although there is also increasing interest in the combination of multiple machine learning techniques. Both approaches are reviewed here, focusing on the extent to which fusion can improve the effectiveness of searching when compared with a single screening mechanism, and on the reasons that have been suggested for the observed performance enhancement.
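
As a rough illustration of combining multiple similarity searches, the sketch below fuses two score lists with two simple rules: the sum of ranks and the maximum score. The abstract does not name the fusion rules reviewed, so both rules, the function names and the molecule scores are assumptions chosen for illustration only.

```python
from typing import Dict, List


def to_ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank molecules from 1 (highest score) downwards; ties broken by name."""
    ordered = sorted(scores, key=lambda m: (-scores[m], m))
    return {m: i + 1 for i, m in enumerate(ordered)}


def fuse_sum_rank(score_lists: List[Dict[str, float]]) -> List[str]:
    """Order molecules by the sum of their ranks across all searches."""
    rank_lists = [to_ranks(s) for s in score_lists]
    molecules = sorted(rank_lists[0])
    totals = {m: sum(r[m] for r in rank_lists) for m in molecules}
    return sorted(molecules, key=lambda m: (totals[m], m))


def fuse_max_score(score_lists: List[Dict[str, float]]) -> List[str]:
    """Order molecules by their best score in any single search."""
    molecules = sorted(score_lists[0])
    best = {m: max(s[m] for s in score_lists) for m in molecules}
    return sorted(molecules, key=lambda m: (-best[m], m))


# Hypothetical similarity scores from two searches over the same database.
search_a = {"m1": 0.9, "m2": 0.4, "m3": 0.7, "m4": 0.1}
search_b = {"m1": 0.5, "m2": 0.3, "m3": 0.8, "m4": 0.6}

by_sum_rank = fuse_sum_rank([search_a, search_b])
by_max_score = fuse_max_score([search_a, search_b])
```

The two rules disagree on the top molecule here: `m3` is consistently near the top of both lists, while `m1` has the single highest score. That difference is the kind of behaviour a comparison of fusion rules would examine.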

    Identification of a New Family of Enzymes with Potential O-acetylpeptidoglycan esterase activity in both Gram-positive and Gram-negative bacteria

    Background: The metabolism of the rigid bacterial cell wall heteropolymer peptidoglycan is a dynamic process requiring continuous biosynthesis and maintenance involving the coordination of both lytic and synthetic enzymes. The O-acetylation of peptidoglycan has been proposed to provide one level of control on these activities as this modification inhibits the action of the major endogenous lytic enzymes, the lytic transglycosylases. The O-acetylation of peptidoglycan also inhibits the activity of the lysozymes which serve as the first line of defense of host cells against the invasion of bacterial pathogens. Despite this central importance, there is a dearth of information regarding peptidoglycan O-acetylation and nothing has previously been reported on its de-acetylation. Results: Homology searches of the genome databases have permitted this first report on the identification of a potential family of O-Acetylpeptidoglycan esterases (Ape). These proteins, encoded in the genomes of a variety of both Gram-negative and Gram-positive bacteria, including a number of important human pathogens such as species of Neisseria, Helicobacter, Campylobacter, and Bacillus anthracis, have been organized into three families based on amino acid sequence similarities, with family 1 being further divided into three sub-families. The genes encoding these proteins are shown to be clustered with Peptidoglycan O-acetyltransferases (Pat) and, in some cases, together with other genes involved in cell wall metabolism. Representative bacteria that encode the Ape proteins were experimentally shown to produce O-acetylated peptidoglycan. Conclusion: The hypothetical proteins encoded by the pat and ape genes have been organized into families based on sequence similarities. The Pat proteins have sequence similarity to Pseudomonas aeruginosa AlgI, an integral membrane protein known to participate in the O-acetylation of the exopolysaccharide, alginate. As none of the bacteria that harbor the pat genes produce alginate, we propose that the Pat proteins serve to O-acetylate peptidoglycan, which is known to be a maturation event occurring in the periplasm. The Ape sequences have amino acid sequence similarity to the CAZy CE 3 carbohydrate esterases, a family previously known to be composed of only O-acetylxylan esterases. They are predicted to contain the α/β hydrolase fold associated with the GDSL and TesA hydrolases and they possess the signature motifs associated with the catalytic residues of the CE3 esterases. Specific signature sequence motifs were identified for the Ape proteins which led to their organization into distinct families. We propose that by expressing both Pat and Ape enzymes, bacteria would be able to obtain a high level of localized control over the degradation of peptidoglycan through the attachment and removal of O-linked acetate. This would facilitate the efficient insertion of pores and flagella, localize spore formation, and control the level of general peptidoglycan turnover.

    Maximized Posteriori Attributes Selection from Facial Salient Landmarks for Face Recognition

    This paper presents a robust and dynamic face recognition technique based on the extraction and matching of devised probabilistic graphs drawn on SIFT features related to independent face areas. The face matching strategy is based on matching individual salient facial graphs, characterized by SIFT features, as connected to facial landmarks such as the eyes and the mouth. In order to reduce face matching errors, the Dempster-Shafer decision theory is applied to fuse the individual matching scores obtained from each pair of salient facial features. The proposed algorithm is evaluated with the ORL and the IITK face databases. The experimental results demonstrate the effectiveness and potential of the proposed face recognition technique, also in the case of partially occluded faces. Comment: 8 pages, 2 figures
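
The fusion step can be sketched with Dempster's rule of combination. The two-hypothesis frame ("match" / "non-match"), the mass values and the function name `combine` are assumptions for illustration; the abstract does not describe how matching scores are converted into belief masses.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two mass functions with Dempster's rule of combination."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise so the remaining masses sum to one.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


MATCH = frozenset({"match"})
NON_MATCH = frozenset({"non-match"})
EITHER = MATCH | NON_MATCH  # mass left uncommitted

# Hypothetical evidence from two facial regions.
eyes: Mass = {MATCH: 0.6, EITHER: 0.4}
mouth: Mass = {MATCH: 0.5, NON_MATCH: 0.2, EITHER: 0.3}

fused = combine(eyes, mouth)
```

With these illustrative masses, the conflicting mass is 0.12, and the fused belief in "match" is 0.68 / 0.88, higher than either region reports on its own.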

    Measuring industry-science links through inventor-author relations: A profiling method

    In this pilot study we examine the performance of text-based profiling in recovering a set of validated inventor-author links. In a first step we match patents and publications solely based on their similarity in content. Next, we compare inventor and author names on the highest ranked matches for the occurrence of name matches. Finally, we compare these candidate matches with the names listed in a validated set of inventor-author names. Our text-based profile methodology performs significantly better than a random matching of patents and publications, suggesting that text-based profiling is a valuable complementary tool to the name searches used in previous studies. Keywords: innovation; industry-science links; text-based profiling
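
The first two steps described above (content matching, then name comparison on the top-ranked matches) can be sketched as follows. The bag-of-words cosine similarity, the record layout and all sample records are assumptions for illustration; the abstract does not specify the similarity measure or the data format used.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def cosine(a: str, b: str) -> float:
    """Cosine similarity between two texts using raw word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def candidate_links(
    patents: List[Dict], publications: List[Dict], top_k: int = 1
) -> List[Tuple[str, str, str]]:
    """Return (patent id, publication id, name) for name matches among
    the top_k most similar publications of each patent."""
    links = []
    for pat in patents:
        ranked = sorted(
            publications,
            key=lambda pub: cosine(pat["text"], pub["text"]),
            reverse=True,
        )
        for pub in ranked[:top_k]:
            shared = set(pat["inventors"]) & set(pub["authors"])
            links.extend((pat["id"], pub["id"], name) for name in sorted(shared))
    return links


# Hypothetical records, invented for this example.
patents = [
    {"id": "P1", "text": "enzyme catalysed synthesis of polymer",
     "inventors": ["j doe", "a smith"]},
]
publications = [
    {"id": "A1", "text": "polymer synthesis with an enzyme catalyst",
     "authors": ["j doe", "b lee"]},
    {"id": "A2", "text": "trademark searching strategies",
     "authors": ["a smith"]},
]

links = candidate_links(patents, publications)
```

Here "a smith" appears on both a patent and a publication, but the pair is not proposed because the texts are unrelated. Name matching is applied only after the content match, which is the ordering the abstract describes.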

    TarO : a target optimisation system for structural biology

    This work was funded by the UK Biotechnology and Biological Sciences Research Council (BBSRC) Structural Proteomics of Rational Targets (SPoRT) initiative (Grant BBS/B/14434). Funding to pay the Open Access publication charges for this article was provided by BBSRC. TarO (http://www.compbio.dundee.ac.uk/taro) offers a single point of reference for key bioinformatics analyses relevant to selecting proteins or domains for study by structural biology techniques. The protein sequence is analysed by 17 algorithms and compared to 8 databases. TarO gathers putative homologues, including orthologues, and then obtains predictions of properties for these sequences including crystallisation propensity, protein disorder and post-translational modifications. Analyses are run on a high-performance computing cluster, the results integrated, stored in a database and accessed through a web-based user interface. Output is in tabulated format and in the form of an annotated multiple sequence alignment (MSA) that may be edited interactively in the program Jalview. TarO also simplifies the gathering of additional annotations via the Distributed Annotation System, both from the MSA in Jalview and through links to Dasty2. Routes to other information gateways are included, for example to relevant pages from UniProt, COG and the Conserved Domains Database. Open access to TarO is available from a guest account with private accounts for academic use available on request. Future development of TarO will include further analysis steps and integration with the Protein Information Management System (PIMS), a sister project in the BBSRC Structural Proteomics of Rational Targets initiative. Publisher PDF. Peer reviewed.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.