4,164 research outputs found
Context Based Indexing in Search Engines: A Review
The amount of information on the World Wide Web keeps growing, and handling it requires an efficient and effective index structure. Most indexing techniques directly match terms from the documents against terms from the query. Efficient and fast access to the index is key to the performance of web search engines, whose main aim is to provide the most relevant documents to users in the minimum possible time. Indexing is performed on web pages after the crawler has gathered them into a repository. The existing search engine architecture shows that the index is built on the basis of the terms of the document. In context-based indexing, the indexer extracts the context of the documents collected by the crawler in the repository, using the context repository, thesaurus and ontology repository, and the documents are then indexed.
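As a rough illustration of the contrast this abstract draws between term-based and context-based indexing, the sketch below builds both kinds of index over a toy corpus. The thesaurus entries, the documents, and all function names are hypothetical and are not taken from the paper; a real system would draw contexts from an ontology rather than a flat dictionary.

```python
from collections import defaultdict

# Hypothetical thesaurus: maps a surface term to a context (concept) label.
THESAURUS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "apple": "fruit",
    "banana": "fruit",
}


def build_term_index(docs):
    """Conventional inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def build_context_index(docs, thesaurus):
    """Context index: context label -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            context = thesaurus.get(term)
            if context is not None:
                index[context].add(doc_id)
    return index


def search_by_context(query, index, thesaurus):
    """Return ids of documents sharing a context with any query term."""
    contexts = {thesaurus[t] for t in query.lower().split() if t in thesaurus}
    results = set()
    for context in contexts:
        results |= index.get(context, set())
    return results


# Toy corpus (assumed for illustration only).
docs = {
    1: "red car for sale",
    2: "used automobile dealer",
    3: "fresh banana bread",
}
term_index = build_term_index(docs)
context_index = build_context_index(docs, THESAURUS)

print(term_index["car"])                                    # term match only
print(search_by_context("car", context_index, THESAURUS))   # context match
```

In this toy setup, a plain term lookup for "car" finds only the document containing that exact word, while the context lookup also returns the document that mentions "automobile".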
P2P and SOA architecture for digital libraries
Doctorate in Informatics Engineering (Doutoramento em Engenharia Informática)
In an information-driven society where the volume and value of produced and
consumed data are growing in importance, the role of digital libraries
becomes particularly important. This work analyzes the limitations in current digital
library management systems and the opportunities brought by recent
distributed computing models.
The result of this work is the implementation of the University of Aveiro
integrated system for digital libraries and archives. It concludes by analyzing
the system in production and proposing a new service oriented digital library
architecture supported in a peer-to-peer infrastructure.
DRIVER Technology Watch Report
This report is part of the Discovery Workpackage (WP4) and is the third of four deliverables. Its objective is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. The report consists of two main parts: one focuses on interoperability standards for enhanced publications; the other consists of three subchapters that give a landscape picture of current and emerging technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.
A SEMANTIC GRAPH DATABASE FOR BIM-GIS INTEGRATED INFORMATION MODEL FOR AN INTELLIGENT URBAN MOBILITY WEB APPLICATION
In recent years, the use of semantic web technologies and Resource Description Framework (RDF) data models has increased notably in many fields. Multiple systems use RDF data to describe information resources and semantic associations. RDF data plays a very important role in advanced information retrieval, and graphs are an efficient way to visualize and represent real-world data: many real-time scenarios can be simulated and implemented using graph databases, and graphs with multiple attributes representing different domains of knowledge can be queried efficiently. Because graph databases are schema-less, with efficient storage for semi-structured data, they can provide fast and deep traversals in place of slow SQL-based joins in an RDBMS, while allowing Atomicity, Consistency, Isolation and Durability (ACID) transactions with rollback support. By exploiting the mathematics of graphs, they offer enormous potential for fast data extraction and for storing information in the form of nodes and relationships. In this paper, we present an architectural design with a complete implementation of a BIM-GIS integrated RDF graph database. The proposed integration approach is composed of four main phases: ontological BIM and GIS model construction, mapping and semantic integration using interoperable data formats, then import into a graph database with querying and filtering capabilities. The workflows and transformations of the IFC and CityGML schemas into an object graph database model are developed and applied to an intelligent urban mobility web application on a game engine platform to validate the integration methodology.
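To make the idea of "deep traversals instead of joins" concrete, here is a minimal sketch that stores RDF-style triples as an adjacency list and follows relationships from a building element out to the street network. It uses plain Python rather than any graph database, and every entity and predicate name is invented for illustration; none of them comes from the paper, IFC, or CityGML.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples mixing BIM-like
# and GIS-like entities. All names are illustrative assumptions.
TRIPLES = [
    ("Building_A", "hasStorey", "Storey_1"),
    ("Storey_1", "hasSpace", "Lobby"),
    ("Lobby", "hasExit", "Door_1"),
    ("Door_1", "connectsTo", "Street_X"),
    ("Street_X", "connectsTo", "BusStop_7"),
]


def build_graph(triples):
    """Adjacency list: subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first traversal; returns the node path or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _predicate, neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(TRIPLES)
route = find_path(graph, "Building_A", "BusStop_7")
print(route)
```

Each hop follows stored edges directly, which is the property the abstract contrasts with chaining one relational join per hop. The edges here are directed, so a search in the reverse direction finds no path.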
Forum Session at the First International Conference on Service Oriented Computing (ICSOC03)
The First International Conference on Service Oriented Computing (ICSOC) was held in Trento, December 15-18, 2003. The focus of the conference, Service Oriented Computing (SOC), is the new emerging paradigm for distributed computing and e-business processing that has evolved from object-oriented and component computing to enable building agile networks of collaborating business applications distributed within and across organizational boundaries. Of the 181 papers submitted to the ICSOC conference, 10 were selected for the forum session, which took place on December 16, 2003. The papers were chosen based on their technical quality, originality, relevance to SOC, and suitability for a poster presentation or a demonstration. This technical report contains the 10 papers presented during the forum session at the ICSOC conference. In particular, the last two papers in the report were submitted as industrial papers.
Multimedia Information Retrieval
With recent advances in screen and mass storage technology, together with the ongoing advances in computer power, many users of personal computers and low-end workstations are now regularly manipulating non-textual information. This information may take the form of, for example, drawings, graphs, animations, sound, or video. Despite the increased usage of these media on computer systems, there has not been much work on providing access methods to non-textual computer-based information. An increasingly common method for accessing large document bases of textual information is free text retrieval. In such systems users typically enter natural language queries, which are then matched against the textual documents in the system. It is often possible for the user to reformulate a query by providing relevance feedback; this usually takes the form of the user informing the system that certain documents are indeed relevant to the current search. This information, together with the original query, is then used by the retrieval engine to provide an improved list of matched documents. Although free text retrieval provides reasonably effective access to large document bases, it does not provide easy access to non-textual information. Various query-based access methods to non-textual document bases are presented, but these are all restricted to specific domains and cannot be used in mixed media systems. Hypermedia, on the other hand, is an access method for document bases which is based on the user browsing through the document base rather than issuing queries. A set of interconnected paths is constructed through the base, which the user may follow. Although it provides poorer access to large document bases, the browsing approach does provide very natural access to non-textual information. The recent explosion in hypermedia systems and discussion has been partly due to the requirement for access to mixed media document bases.
Some work is reported which presents an integration of free text retrieval based queries with hypermedia. This provides a solution to the scaling problem of browsing-based systems: these systems provide access to textual nodes by query or by browsing. Non-textual nodes are, however, still only accessible by browsing, either from the starting point of the document base or from a textual document which matched the query. A model of retrieval for non-textual documents is developed; this model is based on a document's context within the hypermedia document base, as opposed to the document's content. If a non-textual document is connected to several textual documents by paths in the hypermedia, then it is likely that the non-textual document will match the query whenever a high enough proportion of the textual documents match. This model of retrieval uses clustering techniques to calculate a descriptor for non-textual nodes so that they may be retrieved directly in response to a query. To establish that this model of retrieval for non-textual documents is worthwhile, an experiment was run which used the text-only CACM collection. Each record within the collection was initially treated as if it were non-textual and had a cluster-based description calculated from its citations; this cluster-based descriptor was then compared with the actual descriptor (calculated from the record's content) to establish how accurate the cluster descriptor was. As a base case the experiment was repeated using randomly created links, as opposed to citations. The results showed that for citation-based links the cluster-based descriptions had a mean correlation of 0.230 with the content-based description (on a range from 0 to 1, where 1 represents a perfect match) and performed approximately six times better than when random links were used (mean random correlation was 0.037).
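The experiment described above can be pictured with a small sketch: a node's cluster-based descriptor is built from the text of the nodes linked to it, then compared with the descriptor computed from the node's own content. The abstract does not say which clustering technique or correlation measure the thesis used, so the averaging of term counts and the cosine similarity below are assumptions, as are the toy texts and function names.

```python
import math
from collections import Counter


def content_descriptor(text):
    """Descriptor from a node's own content: raw term counts."""
    return Counter(text.lower().split())


def cluster_descriptor(neighbour_texts):
    """Descriptor from linked nodes: averaged term counts (assumed scheme)."""
    total = Counter()
    for text in neighbour_texts:
        total.update(content_descriptor(text))
    n = len(neighbour_texts)
    if n == 0:
        return {}
    return {term: count / n for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy record treated "as if it were non-textual" (illustrative data).
actual = content_descriptor("image retrieval system")
cited = cluster_descriptor(["image retrieval methods", "retrieval system design"])
unrelated = cluster_descriptor(["cooking pasta recipes"])

linked_score = cosine(actual, cited)
random_score = cosine(actual, unrelated)

# Ratio of the two mean correlations reported in the abstract.
reported_ratio = 0.230 / 0.037
print(linked_score, random_score, round(reported_ratio, 1))
```

The last line reproduces the arithmetic behind "approximately six times better": 0.230 divided by 0.037 is roughly 6.2.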
This shows that citation-based cluster descriptions of documents are significantly closer to the actual descriptions than descriptions based on random links, and although the correlation is quite low, the cluster approach provides a useful technique for describing documents. The model of retrieval presented for non-textual documents relies upon a hypermedia structure existing in the document base, since the model cannot work if the documents are not linked together. A user interface to a document base which gives access to a retrieval engine and to hypermedia links can be based around three main categories: browsing-only access, which uses the retrieval engine to support link creation; query-only access, which uses links to provide access to non-text; and query and browsing access. Although the last user interface may initially appear most suitable for a document base which can support queries and browsing, it is also potentially the most complex interface, and may require a more complex model of retrieval for users to successfully search the document base. A set of user tests was carried out to establish user behaviour and to consider interface issues concerning easy access to documents held on such document bases. These tests showed that, overall, no access method was clearly better or poorer than any other method. (Abstract shortened by ProQuest.)
Knowledge search for new product development: a multi-agent based methodology
Manufacturers are the leaders in developing new products to drive productivity. Higher productivity means more products from the same materials, energy, labour, and capital. New product development plays a critical role in the success of manufacturing firms. Activities in the product development process depend on the knowledge of new product development team members. Increasingly, many enterprises consider effective knowledge search to be a source of competitive advantage.
This research presents an exploratory case study conducted at an aircraft manufacturer. The investigation uncovered six empirically derived and theoretically informed problems in enterprise knowledge search: (i) the effective web bandwidth limits search speed; (ii) less relevant search results, owing to the word-frequency recognition models of search engines; (iii) unusable techniques for enterprise search; (iv) rigorous security, reliability, and company policy; (v) poor search performance on unstructured enterprise knowledge; (vi) the lack of tacit knowledge sharing. Existing search methodologies have focused on internet search rather than on providing effective search for the enterprise.
The aim of this research is to help manufacturing enterprises meet industrial requirements by providing a methodology and system that can improve information and knowledge search performance in the new product development process. Based on the exploratory case findings, a knowledge search methodology and system has been developed. Agent technology is used to fulfil the requirements of enterprise search. Some initial tests were conducted to better understand implementation issues and the future deployment of the methodology and system in practice.