
19 research outputs found

The relationship between mitochondrial DNA haplotype and the reproductive capacity of domestic pigs (Sus scrofa domesticus)

Author: Rajasekar Sriram
St. John Justin C.
Tsai Te-Sha
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Frequencies of mtDNA variants for all 33 samples sequenced by Next Generation Sequencing. (XLSX 140 kb)

Crossref

Springer - Publisher Connector

Adelaide Research & Scholarship

PubMed Central

Monash University Research Portal

FigShare

An algebraic approach to rule-based information extraction

Author: Frederick Reiss
Huaiyu Zhu
Rajasekar Krishnamurthy
Shivakumar Vaithyanathan
Sriram Raghavan
Publication date: 01/01/2008

Abstract: Traditional approaches to rule-based information extraction (IE) have primarily been based on regular expression grammars. However, these grammar-based systems have difficulty scaling to large data sets and large numbers of rules. Inspired by traditional database research, we propose an algebraic approach to rule-based IE that addresses these scalability issues through query optimization. The operators of our algebra are motivated by our experience in building several rule-based extraction programs over diverse data sets. We present the operators of our algebra and propose several optimization strategies motivated by the text-specific characteristics of our operators. Finally, we validate the potential benefits of our approach by extensive experiments over real-world blog data.

CiteSeerX

Crossref

Regular expression learning for information extraction

Author: H. V. Jagadish
Rajasekar Krishnamurthy
Shivakumar Vaithyanathan
Sriram Raghavan
Yunyao Li
Publication date: 01/01/2008

Regular expressions have served as the dominant workhorse of practical information extraction for several years. However, there has been little work on reducing the manual effort involved in building high-quality, complex regular expressions for information extraction tasks. In this paper, we propose ReLIE, a novel transformation-based algorithm for learning such complex regular expressions. We evaluate the performance of our algorithm on multiple datasets and compare it against the CRF algorithm. We show that ReLIE, in addition to being an order of magnitude faster, outperforms CRF under conditions of limited training data and cross-domain data. Finally, we show how the accuracy of CRF can be improved by using features extracted by ReLIE.

CiteSeerX

Crossref

Avatar Semantic Search: A Database Approach to Information Retrieval

Author: Eser Kandogan
Huaiyu Zhu
Rajasekar Krishnamurthy
Shivakumar Vaithyanathan
Sriram Raghavan

We present Avatar Semantic Search, a prototype search engine that exploits annotations in the context of classical keyword search. Annotation is performed offline, using high-precision information extraction techniques to extract facts, concepts, and relationships from text. These facts and concepts are represented and indexed in a structured data store. At runtime, keyword queries are interpreted in the context of these extracted facts and converted into one or more precise queries over the structured store. In this demonstration we describe the overall architecture of the Avatar Semantic Search engine. We also demonstrate the superiority of the AVATAR approach over traditional keyword search engines using the Enron email data set and a blog corpus.

CiteSeerX

Additional file 5: of The relationship between mitochondrial DNA haplotype and the reproductive capacity of domestic pigs (Sus scrofa domesticus)

Author: Justin St. John (3550193)
Sriram Rajasekar (3550190)
Te-Sha Tsai (3550196)

Breed distribution across the mtDNA haplotypes for the 216 commercial pigs. (DOCX 42 kb)

FigShare

AVATAR information extraction system

Author: Huaiyu Zhu
Rajasekar Krishnamurthy
Shivakumar Vaithyanathan
Sriram Raghavan
T. S. Jayram

Abstract: The AVATAR Information Extraction System (IES) at the IBM Almaden Research Center enables high-precision, rule-based information extraction from text documents. Drawing from our experience, we propose the use of probabilistic database techniques as the formal underpinnings of information extraction systems so as to maintain high precision while increasing recall. This involves building a framework where rule-based annotators can be mapped to queries in a database system. We use examples from AVATAR IES to describe the challenges in achieving this goal. Finally, we show that deriving precision estimates in such a database system presents a significant challenge for probabilistic database systems.

CiteSeerX