SWI-Prolog and the Web
Where Prolog is commonly seen as a component in a Web application that is
either embedded or communicates using a proprietary protocol, we propose an
architecture where Prolog communicates to other components in a Web application
using the standard HTTP protocol. By avoiding embedding in external Web servers,
development and deployment become much easier. To support this architecture, in
addition to the transfer protocol, we must also support parsing, representing
and generating the key Web document types such as HTML, XML and RDF.
This paper motivates the design decisions in the libraries and extensions to
Prolog for handling Web documents and protocols. The design has been guided by
the requirement to handle large documents efficiently. The described libraries
support a wide range of Web applications ranging from HTML and XML documents to
Semantic Web RDF processing.
To appear in Theory and Practice of Logic Programming (TPLP). Comment: 31 pages, 24 figures and 2 tables.
Constructing a Personal Knowledge Graph from Disparate Data Sources
This thesis revolves around the idea of a Personal Knowledge Graph as a uniform coherent structure of personal data collected from multiple disparate sources: A knowledge base consisting of entities such as persons, events, locations and companies interlinked with semantically meaningful relationships in a graph structure where the user is at its center. The personal knowledge graph is intended to be a valuable resource for a digital personal assistant, expanding its capabilities to answer questions and perform tasks that require personal knowledge about the user.
We explored techniques within Knowledge Representation, Knowledge Extraction/Information Extraction and Information Management for the purpose of constructing such a graph. We show the practical advantages of using Knowledge Graphs for personal information management, utilizing the structure for extracting and inferring answers and for handling resources such as documents, emails and calendar entries.
We have proposed a framework for aggregating user data and shown how existing ontologies can be used to model personal knowledge.
We have shown that a personal knowledge graph based on the user's personal resources is a viable concept; however, we were not able to enrich our personal knowledge graph with knowledge extracted from unstructured private sources. This was mainly due to the sparsity of relevant information, the informal nature of personal correspondence and its lack of context.
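As a schematic illustration of the idea described above (not taken from the thesis; all entity and relation names are invented), a personal knowledge graph can be modelled as subject-predicate-object triples with the user at the centre, and simple questions answered by traversing the graph:

```python
# Minimal sketch of a personal knowledge graph as a set of triples.
# All entities and relations below are invented for illustration.
from collections import defaultdict

triples = [
    ("user", "attended", "ProjectKickoff"),
    ("ProjectKickoff", "heldAt", "Oslo"),
    ("ProjectKickoff", "scheduledOn", "2016-03-14"),
    ("user", "worksFor", "ExampleCorp"),
    ("ExampleCorp", "locatedIn", "Oslo"),
]

# Index by subject so simple questions become graph lookups.
by_subject = defaultdict(list)
for s, p, o in triples:
    by_subject[s].append((p, o))

def where_is(event):
    """Answer 'where was <event>?' by following the heldAt relation."""
    return [o for p, o in by_subject[event] if p == "heldAt"]

print(where_is("ProjectKickoff"))  # ['Oslo']
```

In a real system the triples would come from the user's emails, calendar entries and documents rather than being listed by hand.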
Preliminary evaluation of the CellFinder literature curation pipeline for gene expression in kidney cells and anatomical parts
Biomedical literature curation is the process of automatically and/or manually deriving knowledge from scientific publications and recording it in specialized databases for structured delivery to users. It is a slow, error-prone, complex, costly and, yet, highly important task. Previous experience has shown that text mining can assist in many of its phases, especially in the triage of relevant documents and the extraction of named entities and biological events. Here, we present the curation pipeline of the CellFinder database, a repository of cell research which includes data derived from literature curation and microarrays to identify cell types, cell lines, organs and so forth, and especially patterns in gene expression. The curation pipeline is based on freely available tools for all text mining steps, as well as manual validation of the extracted data. Preliminary results are presented for a data set of 2376 full texts, from which >4500 gene expression events in cells or anatomical parts have been extracted. Validation of half of these data resulted in a precision of ~50%, which indicates that we are on the right track with our pipeline for the proposed task. However, evaluation of the methods shows that there is still room for improvement in named-entity recognition and that a larger and more robust corpus is needed to achieve better performance in event extraction. Database URL: http://www.cellfinder.org
Knowledgebase Representation for Royal Bengal Tiger In The Context of Bangladesh
The Royal Bengal Tiger is one of the most threatened animals in the Sundarbans forest of Bangladesh. In this work we have concentrated on establishing a robust knowledge base for the Royal Bengal Tiger, improving our previous work to make the knowledge-base representation more efficient. We separate tigers from other animals in the collected data using Support Vector Machines (SVM), and manipulate the collected data in a structured way by XML parsing on the Java platform. Our proposed system generates N-Triples from the parsed data. An ontology, constructed with Protégé, contains information about names, places and awards. A straightforward goal of this work is to make the knowledge-base representation of the Royal Bengal Tiger more reliable on the Web. Our experiments show the effectiveness of the knowledge-base construction, which integrates the raw data in a structured way, and our experimental results demonstrate the strength of the system by retrieving information from the ontology in a reliable way.
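The XML-parsing and N-Triple generation step described above can be sketched as follows (the XML schema, element names and URIs here are invented for illustration, not taken from the paper, and Python's standard library stands in for the Java tooling):

```python
# Sketch: parse a small XML record and emit one N-Triple per child element.
# The XML layout and the example.org URIs are invented for illustration.
import xml.etree.ElementTree as ET

xml_data = """
<tigers>
  <tiger id="T1">
    <name>Royal Bengal Tiger</name>
    <habitat>Sundarbans</habitat>
  </tiger>
</tigers>
"""

BASE = "http://example.org/tiger/"

def to_ntriples(xml_text):
    """Turn each <tiger> element into subject-predicate-object lines."""
    lines = []
    root = ET.fromstring(xml_text)
    for tiger in root.findall("tiger"):
        subject = f"<{BASE}{tiger.get('id')}>"
        for child in tiger:
            predicate = f"<{BASE}{child.tag}>"
            obj = f'"{child.text}"'
            lines.append(f"{subject} {predicate} {obj} .")
    return lines

for line in to_ntriples(xml_data):
    print(line)
```

Each emitted line is a valid N-Triple statement that an ontology tool such as Protégé-backed storage could ingest.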
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform
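Exploiting microdata as described above can be illustrated with a minimal extractor (the HTML snippet and property names are invented; this is a sketch using Python's standard library, not the BlogForever implementation):

```python
# Sketch: extract microdata-style itemprop values from blog HTML.
# The HTML snippet and schema.org properties are illustrative only.
from html.parser import HTMLParser

html_doc = """
<article itemscope itemtype="http://schema.org/BlogPosting">
  <h1 itemprop="headline">A post title</h1>
  <span itemprop="author">Jane Doe</span>
</article>
"""

class MicrodataParser(HTMLParser):
    """Collect (itemprop, text) pairs from elements carrying itemprop."""
    def __init__(self):
        super().__init__()
        self.current_prop = None
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            self.current_prop = attrs["itemprop"]

    def handle_data(self, data):
        if self.current_prop and data.strip():
            self.properties[self.current_prop] = data.strip()
            self.current_prop = None

parser = MicrodataParser()
parser.feed(html_doc)
print(parser.properties)  # {'headline': 'A post title', 'author': 'Jane Doe'}
```

A production extractor would also honour itemscope nesting and itemtype, but even this much shows how standards-based markup yields structured fields without supervision.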
Exploration of documents concerning Foundlings in Fafe along XIX Century
Integrated Master's dissertation in Informatics Engineering.
The abandonment of children and newborns is a problem in our society.
In the last few decades, the introduction of contraceptive methods, the development of
social programs and family planning were fundamental to control undesirable pregnancies
and support families in need. But these developments were not enough to solve the
abandonment epidemic.
The anonymous abandonment has a dangerous aspect. In order to preserve the family
identity, a child is usually left in a public place at night. Since children and newborns are
one of the most vulnerable groups in our society, the time between the abandonment and
the assistance of the child is potentially deadly.
The establishment of public institutions in the past, such as the foundling wheel, was
extremely important as a strategy to save lives. These institutions supported the abandoned
children, while simultaneously providing a safer abandonment process, without
compromising the anonymity of the family.
The focus of the Master's Project discussed in this dissertation is the analysis and processing
of nineteenth century documents, concerning the Foundling Wheel of Fafe.
The analysis of sample documents is the initial step in the development of an ontology.
The ontology has a fundamental role in the organization and structure of the information
contained in these historical documents. The identification of concepts and the relationships
between them culminates in a structured knowledge repository. Another important component
is the development of a digital platform, where users are able to access the content stored in
the knowledge repository and explore the digital archive, which incorporates the digitized
version of documents and books from these historical institutions.
The development of this project is important for some reasons. Directly, the implementation
of a knowledge repository and a digital platform preserves information. These
documents are mostly unique records and due to their age and advanced state of degradation,
the substitution of the physical by digital access reduces the wear and tear associated to
each consultation. Additionally, the digital archive facilitates the dissemination of valuable
information. Research groups or the general public are able to use the platform as a tool
to discover the past, by performing biographic, cultural or socio-economic studies over
documents dated to the nineteenth century.
The abandonment of children and newborns is a scourge of our society.
In recent decades, the introduction of contraceptive methods and social programmes
was essential to the development of family planning. Despite these advances, such
programmes have not solved the problem of the abandonment of children and newborns.
Socio-economic problems are the main factor explaining abandonment.
The process of abandoning children has a dangerous aggravating aspect. To protect
the family's identity, it usually takes place in public places and at night. Since children
and newborns are among the most vulnerable groups in society, the time between the
abandonment of the child and its rescue can be too long, and fatal.
The foundling wheel was an institution introduced to make the anonymous abandonment
process safer.
The focus of the Master's Project discussed in this dissertation is the analysis and
processing of nineteenth-century documents concerning the Foundling Wheel of Fafe,
preserved by the Fafe Municipal Archive.
Document analysis is the starting point of the ontology development process. The
ontology plays a fundamental role in organising and structuring the information contained
in the historical documents. Developing the knowledge base consists of identifying the
concepts and relationships present in the documents.
Another fundamental component of this project is the development of a digital platform
that allows users to access the knowledge base. Users can search, explore and add
information to the knowledge base.
This project is important for several reasons. Most directly, the implementation of a
digital platform safeguards and preserves the information contained in the documents.
These documents are the only existing records of their content, and many are in an
advanced state of degradation. Replacing physical access with digital access reduces
the wear associated with each consultation.
The digital platform also allows the information in the document collection to be
disseminated. Researchers and the general public can use this tool to carry out
biographical, cultural and social studies of this historical archive.
Internet based molecular collaborative and publishing tools
The scientific electronic publishing model has hitherto been an Internet-based delivery of electronic articles that are essentially replicas of their paper counterparts. They contain little in the way of added semantics that might better expose the science, assist the peer-review process and facilitate follow-on collaborations, even though the enabling technologies have been around for some time and are mature. This thesis will examine the evolution of chemical electronic publishing over the past 15 years. It will illustrate, with the help of two frameworks, how publishers could exploit these technologies to improve the semantics of chemical journal articles, namely their value-added features and relationships with other chemical resources on the Web.
The first framework is an early exemplar of structured and scalable electronic publishing where a Web content management system and a molecular database are integrated. It employs a test bed of articles from several RSC journals and supporting molecular coordinate and connectivity information. The value of converting 3D molecular expressions in chemical file formats, such as the MOL file, into more generic 3D graphics formats, such as Web3D, is assessed. This exemplar highlights the use of metadata management for bidirectional hyperlink maintenance in electronic publishing.
The second framework repurposes this metadata management concept into a Semantic Web application called SemanticEye. SemanticEye demonstrates how relationships between chemical electronic articles and other chemical resources are established. It adapts the successful semantic model used for digital music metadata management by popular applications such as iTunes. Globally unique identifiers enable relationships to be established between articles and other resources on the Web, and SemanticEye implements two: the Digital Object Identifier (DOI) for articles and the IUPAC International Chemical Identifier (InChI) for molecules. SemanticEye's potential as a framework for seeding collaborations between researchers who have hitherto never met is explored using FOAF, the friend-of-a-friend Semantic Web standard for social networks.
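The identifier-based linking just described can be sketched in miniature (the DOI value below is invented; the InChI string is the standard identifier for water, and the store layout is an assumption, not SemanticEye's actual data model):

```python
# Sketch: link articles to molecules via globally unique identifiers,
# in the spirit of SemanticEye. The DOI is hypothetical.
article_doi = "10.1000/example.2007.001"   # invented DOI for illustration
water_inchi = "InChI=1S/H2O/h1H2"          # standard InChI for water

# A relationship store keyed by identifier: article -> molecules discussed.
mentions = {
    article_doi: [water_inchi],
}

def articles_about(inchi, store):
    """Reverse lookup: which articles mention a given molecule?"""
    return [doi for doi, mols in store.items() if inchi in mols]

print(articles_about(water_inchi, mentions))
```

Because both keys are globally unique, the same lookup works across independently published resources, which is what makes the identifiers suitable glue for seeding collaborations.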
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010)