    Comparative venom-gland transcriptomics and venom proteomics of four Sidewinder Rattlesnake (Crotalus cerastes) lineages reveal little differential expression despite individual variation

    Changes in gene expression can rapidly influence adaptive traits in the early stages of lineage diversification. Venom is an adaptive trait composed of numerous toxins used for prey capture and defense. Snake venoms can vary widely between conspecific populations, but the influence of lineage diversification on such compositional differences is unknown. To explore venom differentiation in the early stages of lineage diversification, we used RNA-seq and mass spectrometry to characterize Sidewinder Rattlesnake (Crotalus cerastes) venom. We generated the first venom-gland transcriptomes and complementary venom proteomes for eight individuals collected across the United States and tested for expression differences across life history traits and between subspecific, mitochondrial, and phylotranscriptomic hypotheses. Sidewinder venom was composed primarily of hemorrhagic toxins, with few cases of differential expression attributable to life history or lineage hypotheses. However, phylotranscriptomic lineage comparisons more than doubled instances of significant expression differences compared to all other factors. Nevertheless, only 6.4% of toxins were differentially expressed overall, suggesting that shallow divergence has not led to major changes in Sidewinder venom composition. Our results demonstrate the need for consensus venom-gland transcriptomes based on multiple individuals and highlight the potential for discrepancies in differential expression between different phylogenetic hypotheses.

    Coplanar Repeats by Energy Minimization

    This paper proposes an automated method to detect, group and rectify arbitrarily-arranged coplanar repeated elements via energy minimization. The proposed energy functional combines several features that model how planes with coplanar repeats are projected into images and captures global interactions between different coplanar repeat groups and scene planes. An inference framework based on a recent variant of α-expansion is described and fast convergence is demonstrated. We compare the proposed method to two widely-used geometric multi-model fitting methods using a new dataset of annotated images containing multiple scene planes with coplanar repeats in varied arrangements. The evaluation shows a significant improvement in the accuracy of rectifications computed from coplanar repeats detected with the proposed method versus those detected with the baseline methods. Comment: 14 pages with supplemental materials attached

    PRISE2: software for designing sequence-selective PCR primers and probes.

    Background: PRISE2 is a new software tool for designing sequence-selective PCR primers and probes. To achieve a high level of selectivity, PRISE2 allows the user to specify a collection of target sequences that the primers are supposed to amplify, as well as non-target sequences that should not be amplified. The program emphasizes primer selectivity on the 3' end, which is crucial for selective amplification of conserved sequences such as rRNA genes. In PRISE2, users can specify desired properties of primers, including length, GC content, and others. They can interactively manipulate the list of candidate primers to choose primer pairs that are best suited for their needs. A similar process is used to add probes to selected primer pairs. More advanced features include, for example, the capability to define a custom mismatch penalty function. PRISE2 is equipped with a graphical, user-friendly interface, and it runs on Windows, Macintosh or Linux machines. Results: PRISE2 has been tested on two very similar strains of the fungus Dactylella oviparasitica, and it was able to create highly selective primers and probes for each of them, demonstrating the ability to create useful sequence-selective assays. Conclusions: PRISE2 is a user-friendly, interactive software package that can be used to design high-quality selective primers for PCR experiments. In addition to choosing primers, users have an option to add a probe to any selected primer pair, enabling design of TaqMan and other primer-probe based assays. PRISE2 can also be used to design probes for FISH and other hybridization-based assays.

    Clustering with shallow trees

    We propose a new method for hierarchical clustering based on the optimisation of a cost function over trees of limited depth, and we derive a message-passing method that solves it efficiently. The method and algorithm can be interpreted as a natural interpolation between two well-known approaches, namely single linkage and the recently presented Affinity Propagation. We analyze with this general scheme three biological/medical structured datasets (human population based on genetic information, proteins based on sequences and verbal autopsies) and show that the interpolation technique provides new insight. Comment: 11 pages, 7 figures

    Query Workload-Aware Index Structures for Range Searches in 1D, 2D, and High-Dimensional Spaces

    Most current database management systems are optimized for single query execution. Yet queries often come as part of a query workload. Therefore, there is a need for index structures that can take into consideration the existence of multiple queries in a query workload and efficiently produce accurate results for the entire query workload. These index structures should be scalable to handle large amounts of data as well as large query workloads. The main objective of this dissertation is to create and design scalable index structures that are optimized for range query workloads. Range queries are an important type of query with wide-ranging applications. There are no existing index structures that are optimized for efficient execution of range query workloads. There are also unique challenges that need to be addressed for range queries in 1D, 2D, and high-dimensional spaces. In this work, I introduce novel cost models, index selection algorithms, and storage mechanisms that can tackle these challenges and efficiently process a given range query workload in 1D, 2D, and high-dimensional spaces. In particular, I introduce the index structures HCS (for 1D spaces), cSHB (for 2D spaces), and PSLSH (for high-dimensional spaces), which are designed specifically to efficiently handle range query workloads and the unique challenges arising from their respective spaces. I experimentally show the effectiveness of the proposed index structures by comparing them with state-of-the-art techniques. Dissertation/Thesis: Doctoral Dissertation, Computer Science 201

    Schema-aware keyword search on linked data

    Keyword search is a popular technique for querying the ever-growing repositories of RDF graph data on the Web. This is because users do not need to master complex query languages (e.g., SQL, SPARQL) and do not need to know the underlying structure of the data on the Web to compose their queries. Keyword search is simple and flexible. However, it is at the same time ambiguous, since a keyword query can be interpreted in different ways. This feature of keyword search poses at least two challenges: (a) identifying relevant results among a multitude of candidate results, and (b) dealing with the performance scalability issue of the query evaluation algorithms. In the literature, multiple schema-unaware approaches are proposed to cope with the above challenges. Some of them identify as relevant only those candidate results which maintain the keyword instances in close proximity. Other approaches filter out irrelevant results using their structural characteristics, or rank the retrieved results and apply top-k processing based on statistical information about the data. In any case, these approaches cannot disambiguate the query to identify the intent of the user, and they cannot scale satisfactorily when the size of the data and the number of query keywords grow. In recent years, different approaches have tried to exploit the schema (structural summary) of the RDF (Resource Description Framework) data graph to address the problems above. In this context, an original hierarchical clustering technique is introduced in this dissertation. This approach clusters the results based on a semantic interpretation of the keyword instances and takes advantage of relevance feedback from the user. The clustering hierarchy uses pattern graphs, which are structured queries that cluster together result graphs with the same structure. Pattern graphs represent possible interpretations for the keyword query.
By navigating through the hierarchy, the user can select the pattern graph which is relevant to their intent. Nevertheless, structural summaries are approximate representations of the data and, therefore, might return empty answers or miss results which are relevant to the user intent. To address this issue, a novel approach is presented which combines the use of the structural summary and the user feedback with a relaxation technique for pattern graphs to extract additional results potentially of interest to the user. Query caching and multi-query optimization techniques are leveraged for the efficient evaluation of relaxed pattern graphs. Although the approaches which consider the structural summary of the data graph are promising, they require interaction with the user. It is claimed in this dissertation that without additional information from the user, it is not possible to produce results of high quality from keyword search on RDF data with the existing techniques. In this regard, an original keyword query language on RDF data is introduced which allows the user to convey their intention flexibly and effortlessly by specifying cohesive keyword groups. A cohesive group of keywords in a query indicates that its keywords should form a cohesive unit in the query results. It is experimentally demonstrated that cohesive keyword queries improve the result quality effectively and prune the search space of the pattern graphs efficiently compared to traditional keyword queries. Most importantly, these benefits are achieved while retaining the simplicity and the convenience of traditional keyword search. The last issue addressed in this dissertation is the diversification problem for keyword search on RDF data. The goal of diversification is to trade off relevance and diversity in the result set of a keyword query in order to minimize the dissatisfaction of the average user.
Novel metrics are developed for assessing relevance and diversity, along with techniques for the generation of a relevant and diversified set of query interpretations for a keyword query on an RDF data graph. Experimental results show the effectiveness of the metrics and the efficiency of the approach.