4,164 research outputs found

    Context Based Indexing in Search Engines: A Review

    The amount of information on today's World Wide Web keeps increasing, and handling it requires an efficient and effective index structure. Most indexing techniques directly match terms from the documents against terms from the query. Granting efficient and fast access to the index is a key issue for the performance of web search engines. The main aim of a search engine is to provide the most relevant documents to users in the minimum possible time. Indexing is performed on the web pages after they have been gathered into a repository by the crawler. The existing architecture of search engines shows that the index is built on the basis of the terms of the document. The indexer extracts the context of the documents collected by the crawler in the repository, using the context repository, thesaurus and ontology repository, and the documents are then indexed.
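The contrast between term-based and context-based indexing described in this abstract can be illustrated with a minimal sketch. It is not the method of the reviewed paper: the `THESAURUS` mapping, the sample documents, and the function names are all hypothetical, and a real system would draw contexts from a thesaurus and ontology repository rather than a hard-coded dictionary.

```python
from collections import defaultdict

# Hypothetical thesaurus: maps a term to a context label.
# This is an assumption for illustration, not data from the paper.
THESAURUS = {
    "jaguar": "animal",
    "leopard": "animal",
    "sedan": "vehicle",
    "truck": "vehicle",
}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def build_term_index(docs):
    """Classic inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_context_index(docs, thesaurus):
    """Context index: context label -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            context = thesaurus.get(term)
            if context is not None:
                index[context].add(doc_id)
    return index


# Hypothetical sample repository, standing in for crawled pages.
docs = {
    1: "The jaguar hunts at night.",
    2: "A leopard rests in the tree.",
    3: "The sedan and the truck share a road.",
}
term_index = build_term_index(docs)
context_index = build_context_index(docs, THESAURUS)
```

In this toy setup a term lookup for `leopard` finds only document 2, while a lookup by the context `animal` finds documents 1 and 2, which is the kind of gain over direct term matching that the abstract points to.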

    P2P and SOA architecture for digital libraries

    Doctorate in Informatics Engineering. In an information-driven society where the volume and value of produced and consumed data assume growing importance, the role of digital libraries gains particular importance. This work analyzes the limitations of current digital library management systems and the opportunities brought by recent distributed computing models. The result of this work is the implementation of the University of Aveiro integrated system for digital libraries and archives. It concludes by analyzing the system in production and proposing a new service-oriented digital library architecture supported by a peer-to-peer infrastructure.

    Web and Electronic Publishing Trends


    DRIVER Technology Watch Report

    This report is part of the Discovery Workpackage (WP4) and is the third of four deliverables. The objective of this report is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. The report consists of two main parts: one focuses on interoperability standards for enhanced publications; the other consists of three subchapters, which give a landscape picture of current and surfacing technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.

    A SEMANTIC GRAPH DATABASE FOR BIM-GIS INTEGRATED INFORMATION MODEL FOR AN INTELLIGENT URBAN MOBILITY WEB APPLICATION

    Over recent years, the usage of semantic web technologies and Resource Description Framework (RDF) data models has increased notably in many fields. Multiple systems use RDF data to describe information resources and semantic associations. RDF data plays a very important role in advanced information retrieval, and graphs are an efficient way to visualize and represent real-world data: many real-time scenarios can be simulated and implemented using graph databases, which efficiently query graphs with multiple attributes representing different domains of knowledge. Given that graph databases are schema-less with efficient storage for semi-structured data, they can provide fast and deep traversals instead of slow RDBMS SQL-based joins, allowing Atomicity, Consistency, Isolation and Durability (ACID) transactions with rollback support, and by utilizing the mathematics of graphs they have enormous potential for fast data extraction and storage of information in the form of nodes and relationships. In this paper, we present an architectural design with a complete implementation of a BIM-GIS integrated RDF graph database. The proposed integration approach is composed of four main phases: ontological BIM and GIS model construction, mapping and semantic integration using interoperable data formats, then an import into a graph database with querying and filtering capabilities. The workflows and transformations of IFC and CityGML schemas into an object graph database model are developed and applied to an intelligent urban mobility web application on a game engine platform to validate the integration methodology.

    Forum Session at the First International Conference on Service Oriented Computing (ICSOC03)

    The First International Conference on Service Oriented Computing (ICSOC) was held in Trento, December 15-18, 2003. The focus of the conference, Service Oriented Computing (SOC), is the new emerging paradigm for distributed computing and e-business processing that has evolved from object-oriented and component computing to enable building agile networks of collaborating business applications distributed within and across organizational boundaries. Of the 181 papers submitted to the ICSOC conference, 10 were selected for the forum session, which took place on December 16, 2003. The papers were chosen based on their technical quality, originality, relevance to SOC and suitability for a poster presentation or a demonstration. This technical report contains the 10 papers presented during the forum session at the ICSOC conference. In particular, the last two papers in the report were submitted as industrial papers.

    Multimedia Information Retrieval

    With recent advances in screen and mass storage technology, together with the on-going advances in computer power, many users of personal computers and low end workstations are now regularly manipulating non-textual information. This information may be in the form of, for example, drawings, graphs, animations, sound, or video. Despite the increased usage of these media on computer systems, there has not been much work on the provision of access methods to non-textual computer based information. An increasingly common method for accessing large document bases of textual information is free text retrieval. In such systems users typically enter natural language queries. These are then matched against the textual documents in the system. It is often possible for the user to re-formulate a query by providing relevance feedback; this usually takes the form of the user informing the system that certain documents are indeed relevant to the current search. This information, together with the original query, is then used by the retrieval engine to provide an improved list of matched documents. Although free text retrieval provides reasonably effective access to large document bases, it does not provide easy access to non-textual information. Various query based access methods to non-textual document bases are presented, but these are all restricted to specific domains and cannot be used in mixed media systems. Hypermedia, on the other hand, is an access method for document bases which is based on the user browsing through the document base rather than issuing queries. A set of interconnected paths is constructed through the base which the user may follow. Although providing poorer access to large document bases, the browsing approach does provide very natural access to non-textual information. The recent explosion in hypermedia systems and discussion has been partly due to the requirement for access to mixed media document bases.
Some work is reported which presents an integration of free text retrieval based queries with hypermedia. This provides a solution to the scaling problem of browsing based systems; these systems provide access to textual nodes by query or by browsing. Non-textual nodes are, however, still only accessible by browsing, either from the starting point of the document base or from a textual document which matched the query. A model of retrieval for non-textual documents is developed; this model is based on a document's context within the hypermedia document base, as opposed to the document's content. If a non-textual document is connected to several textual documents by paths in the hypermedia, then it is likely that the non-textual document will match the query whenever a high enough proportion of the textual documents match. This model of retrieval uses clustering techniques to calculate a descriptor for non-textual nodes so that they may be retrieved directly in response to a query. To establish that this model of retrieval for non-textual documents is worthwhile, an experiment was run which used the text only CACM collection. Each record within the collection was initially treated as if it were non-textual and had a cluster based description calculated based on citations; this cluster based descriptor was then compared with the actual descriptor (calculated from the record's content) to establish how accurate the cluster descriptor was. As a base case the experiment was repeated using randomly created links, as opposed to citations. The results showed that for citation based links the cluster based descriptions had a mean correlation of 0.230 with the content based description (on a range from 0 to 1, where 1 represents a perfect match) and performed approximately six times better than when random links were used (mean random correlation was 0.037).
This shows that citation based cluster descriptions of documents are significantly closer to the actual descriptions than descriptions based on random links, and although the correlation is quite low, the cluster approach provides a useful technique for describing documents. The model of retrieval presented for non-textual documents relies upon a hypermedia structure existing in the document base, since the model cannot work if the documents are not linked together. A user interface to a document base which gives access to a retrieval engine and to hypermedia links can be based around three main categories: browsing only access, using the retrieval engine to support link creation; query only access, using links to provide access to non-text; and query and browsing access. Although the last user interface may initially appear most suitable for a document base which can support queries and browsing, it is also potentially the most complex interface, and may require a more complex model of retrieval for users to successfully search the document base. A set of user tests was carried out to establish user behaviour and to consider interface issues concerning easy access to documents which are held on such document bases. These tests showed that, overall, no access method was clearly better or poorer than any other method. (Abstract shortened by ProQuest.)
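The idea of describing a non-textual node through the textual documents linked to it can be sketched as follows. This is a rough illustration under stated assumptions, not the thesis's actual procedure: the descriptor here is simply the average of the neighbours' term-count vectors, cosine similarity stands in for the unspecified correlation measure, and the node names, texts, and function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())


def cluster_descriptor(node, links, texts):
    """Average the term vectors of the textual documents linked to `node`.

    Assumption: a plain average; the abstract only says that clustering
    techniques are used to calculate the descriptor.
    """
    neighbours = links.get(node, [])
    if not neighbours:
        return {}
    total = Counter()
    for neighbour in neighbours:
        total.update(term_vector(texts[neighbour]))
    return {term: count / len(neighbours) for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical document base: one image node linked to two text nodes.
texts = {
    "a": "graph search index",
    "b": "graph search query",
}
links = {"img": ["a", "b"]}

descriptor = cluster_descriptor("img", links, texts)
score = cosine(descriptor, term_vector("graph search"))
```

With a descriptor of this kind, the image node can be scored against a query directly, even though it has no text of its own. The same `cosine` function could compare a link-derived descriptor with a content-derived one, in the spirit of the CACM experiment described above.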