
    SNPmplexViewer--toward a cost-effective traceability system

    Abstract

    Background: Beef traceability has become mandatory in many regions of the world and is typically achieved through the use of unique numerical codes on ear tags and animal passports. DNA-based traceability uses the animal's own DNA code to identify it and the products derived from it. Multiplexing 25 SNPs in a single reaction with SNaPshot, a primer-extension-based method, has been used to reduce the expense of genotyping a panel of SNPs useful for identity control.

    Findings: To further decrease SNaPshot's cost, we introduced the Perl script SNPmplexViewer, which facilitates the analysis of trace files for reactions performed without the use of fluorescent size standards. SNPmplexViewer automatically aligns reference and target trace electropherograms, run with and without fluorescent size standards, respectively. SNPmplexViewer produces a modified target trace file containing a normalised trace in which the reference size standards are embedded. SNPmplexViewer also outputs aligned images of the two electropherograms together with a difference profile.

    Conclusions: Modified trace files generated by SNPmplexViewer enable genotyping of SNaPshot reactions performed without fluorescent size standards, using common fragment-sizing software packages. SNPmplexViewer's normalised output may also improve the genotyping software's performance. Thus, SNPmplexViewer is a general free tool enabling the reduction of SNaPshot's cost as well as the fast viewing and comparing of trace electropherograms for fragment analysis. SNPmplexViewer is available at http://cowry.agri.huji.ac.il/cgi-bin/SNPmplexViewer.cgi.

    Text mining and rating prediction with topical user models

    Recent years have seen an abundance of user-generated texts published online. Mining these texts for useful information is a growing research area with many aspects that are yet to be fully explored. Two such aspects, which are investigated in this thesis, are the extraction of implicit information about users to create user models, and the application of these models to tasks that require user information. Our main approach to extracting user information is via topical user models, which represent each author and document with low-dimensional distributions over topics (a topic is a distribution over words). We develop methods that utilise these topical user models to address the following tasks: (1) authorship attribution: identifying which user wrote a given anonymous text; (2) polarity inference: detecting the level of sentiment expressed in a given text; and (3) rating prediction: determining a given user's expected sentiment towards a given item.

    The first task we consider is authorship attribution, where the goal is to identify the authors of anonymous texts. Authorship attribution is one of the most commonly attempted tasks in the authorship analysis field, which, in addition to authorship attribution, also deals with profiling authors by inferring demographic information and personality traits from their texts. Traditionally, research in this field has focused on formal texts, such as essays and novels, but recently more attention has been given to online user-generated texts, such as emails and blogs. Authorship attribution of online user-generated texts is a more challenging task than traditional authorship attribution, because such texts tend to be short and informal, and the number of candidate authors is often larger than in traditional settings. We address this challenge by employing topical user models. In addition to exploring novel ways of applying two popular topic models to this task, we develop a new model that projects users and documents to two disjoint topic spaces. Employing our model in authorship attribution yields state-of-the-art performance on several datasets, which contain either formal texts or online user-generated texts, where the number of candidate authors ranges from three to about 20,000.

    The second task we consider is polarity inference, where the goal is to infer the degree of positive or negative sentiment expressed in texts. Polarity inference is a key task in the sentiment analysis field, which deals with inferring people's sentiments and opinions from texts. Even though the way polarity is expressed often appears to depend on the author, most of the work in this field ignores authors. In this thesis, we introduce a framework that infers the polarity of texts by employing user-specific inference models, where the models can be weighted according to user similarity. We show that our framework outperforms two popular baselines, even when all the base models are given equal weights. In addition, we show that performance can be further improved by considering user similarity in terms of language use (e.g., as captured by topical user models) and rating patterns.

    The third and final task we consider is rating prediction, where the goal is to predict the rating a given user would assign to a given item. Rating prediction is a core component of many recommender systems, which require a way to predict users' future sentiments in order to find and recommend items of personal interest. Recently, rating prediction algorithms that are based on matrix factorisation have become increasingly popular, mainly due to their high accuracy and scalability. However, such algorithms often deliver inaccurate rating predictions for users who submitted only a few ratings. In this thesis, we introduce an extension to the basic matrix factorisation algorithm that considers information about the users when generating rating predictions. We show that employing either demographic information or text-based information (in the form of topical user models) outperforms baselines that consider only ratings, thereby enabling more accurate generation of personalised rating predictions for users who have not submitted many ratings. In the case of topical user models, these predictions are generated without requiring users to explicitly supply any information about themselves and their preferences.

    Awards: Winner of the Mollie Holman Doctoral Medal for Excellence, Faculty of Information Technology, 2012.