104,916 research outputs found

    Execution Performance Issues in Full-Text Information Retrieval

    The task of an information retrieval system is to identify documents that will satisfy a user’s information need. Effective fulfillment of this task has long been an active area of research, leading to sophisticated retrieval models for representing information content in documents and queries and measuring similarity between the two. The maturity and proven effectiveness of these systems have resulted in demand for increased capacity, performance, scalability, and functionality, especially as information retrieval is integrated into more traditional database management environments. In this dissertation we explore a number of functionality and performance issues in information retrieval. First, we consider creation and modification of the document collection, concentrating on management of the inverted file index. An inverted file architecture based on a persistent object store is described, and experimental results are presented for inverted file creation and modification. Our architecture provides performance that scales well with document collection size, and the database features supported by the persistent object store provide many solutions to issues that arise during integration of information retrieval into more general database environments. We then turn to query evaluation speed and introduce a new optimization technique for statistical ranking retrieval systems that support structured queries. Experimental results from a variety of query sets show that execution time can be reduced by more than 50% with no noticeable impact on retrieval effectiveness, making these more complex retrieval models attractive alternatives for environments that demand high performance.

    Fritt søk med FAST søkemotor integrert i PostgreSQL relasjonsdatabase (Free-text search with the FAST search engine integrated into the PostgreSQL relational database)

    The focus of this thesis is the relationship between databases and information retrieval systems. As background, the first part gives a general presentation of databases and information retrieval systems (IRS) and some examples of existing efforts to combine the two. While these examples have typically extended either a database system or an IRS to obtain multi-functionality, we have attempted to bridge the two systems. Our prototype integrates FDS (Fast Data Search) into the PostgreSQL database management system as a new index access method. FDS is a powerful and scalable commercial enterprise search platform using a typical search engine query language. PostgreSQL, being open source and a general basis for research, lends itself well to customization. The new index access method provides the database with powerful free-text capabilities while retaining the power of the relational model for structured data. Preliminary results, including a simple performance test, verify the feasibility of the integration and demonstrate the scalability of the prototype. Storage, indexing, updating and search functions are implemented, but ACID properties could not be guaranteed because the external indexing system offers no such guarantee. I also present a prototype for automatic extraction of related structured data in the relational database to XML. Combining these two prototypes, by allowing the extracted information to be searched using the full-text index, makes it possible to search the database without knowledge of the underlying database schema. Finally, I discuss potential extensions of our implementation, namely indexing data other than text, multicolumn indexing, and moving complex evaluation from PostgreSQL to FDS, and suggest how these could be done. The thesis is written in Norwegian.

    BiologicalNetworks: visualization and analysis tool for systems biology

    Systems level investigation of genomic scale information requires the development of truly integrated databases dealing with heterogeneous data, which can be queried for simple properties of genes or other database objects as well as for complex network level properties, for the analysis and modelling of complex biological processes. Towards that goal, we recently constructed PathSys, a data integration platform for systems biology, which provides dynamic integration over a diverse set of databases [Baitaluk et al. (2006) BMC Bioinformatics 7, 55]. Here we describe a server, BiologicalNetworks, which provides visualization, analysis services and an information management framework over PathSys. The server allows easy retrieval, construction and visualization of complex biological networks, including genome-scale integrated networks of protein–protein, protein–DNA and genetic interactions. Most importantly, BiologicalNetworks addresses the need for systematic presentation and analysis of high-throughput expression data by mapping and analysis of expression profiles of genes or proteins simultaneously on to regulatory, metabolic and cellular networks. BiologicalNetworks Server is available at

    The integration of lessons learned knowledge in Building Information Modelling (BIM)

    Lessons learned systems are vital means for integrating construction knowledge into the various phases of the construction project life cycle. Many such systems are tailored towards the owner organisation’s specific needs and workflows to overcome challenges with information collection, documentation and retrieval. Previous works have relied on the development of conventional local and network/cloud-based database management systems to store and retrieve lessons gathered on projects. These lessons learned systems operate independently and have not been developed to take full advantage of the benefits of integration with emerging building information modelling (BIM) technology. As a result, construction professionals lack efficient and speedy retrieval of context-focused information on lessons learned for appropriate use in projects. To tackle this challenge, we propose the integration of lessons learned knowledge management in BIM, in addition to existing 2D-8D modelling of project information. The integration was implemented by embedding a non-structured query system, NoSQL (MongoDB), in a BIM-enabled environment to host lessons learned information linked to model items and 4D modelling project tasks of the digitised model. This goes beyond existing conventional text-based queries and is novel. The system is implemented in the .NET Framework and interfaced with a project management BIM tool, Navisworks Manage. The demonstration with a test case of a federated model from a pre-design school project suggests that lessons learned systems can become an integral part of BIM environments and contribute to enhancing knowledge reuse in projects.

    An OAI-based Digital Library Framework for Biodiversity Information Systems

    Biodiversity information systems (BISs) involve all kinds of heterogeneous data, which include ecological and geographical features. However, available information systems offer very limited support for managing such data in an integrated fashion, and integration is often based on geographic coordinates alone. Furthermore, such systems do not fully support image content management (e.g., photos of landscapes or living organisms), a requirement of many BIS end-users. In order to meet their needs, these users - e.g., biologists, environmental experts - often have to alternate between distinct biodiversity and image information systems to combine information extracted from them. This cumbersome operational procedure is forced on users by the lack of interoperability among these systems, which also hampers the addition of new data sources and cooperation among scientists. The approach presented in this paper to address these issues takes advantage of advances in Digital Library (DL) innovations to integrate networked collections of heterogeneous data. It focuses on creating the basis for a biodiversity information system under the digital library perspective, combining new techniques of content-based image retrieval and database query processing mechanisms. This approach solves the problem of system switching and provides users with a flexible platform from which to tailor a BIS to their needs.

    Remote ID for Rapid Assessment of Flight and Vehicle Information

    The ability to rapidly identify UAS (Unmanned Aircraft Systems) in the field has emerged as a critical need for the integration of small UASs into the national airspace and for counter-UAS operations. This paper proposes an architecture for rapid retrieval of UAS information leveraging NASA's current Unmanned Aircraft System (UAS) Traffic Management (UTM) system. The proposed architecture utilizes UTM components, namely FIMS (Flight Information Management System), USS (UAS Service Supplier), and a vehicle registration and model database, to provide an assessment of the UAS reported in the field, including the ability to distinguish between participating and non-participating UTM actors. Detailed system descriptions are provided and preliminary results from field tests conducted during UTM TCL (Technical Capability Level) 3 are discussed. It is found that 94 percent of the remote ID look-ups were successful. The average time of a look-up is found to be 1.2 seconds. Failure cases are examined and recommendations on next steps to advance UAS remote identification are provided.

    IDL-XML based information sharing model for enterprise integration

    CIM is a mechanized approach to problem solving in an enterprise. Its basis is intercommunication between information systems, in order to provide a faster and more effective decision-making process. These results help minimize human error, improve overall productivity and guarantee customer satisfaction. Most enterprises or corporations started implementing integration by adopting automated solutions in a particular process, department, or area, in isolation from the rest of the physical or intelligent process, leaving systems and equipment unable to share information with each other and with other computer systems. The goal in a manufacturing environment is to have a set of systems that will interact seamlessly with each other within a heterogeneous object framework, overcoming the many barriers (language, platforms, and even physical location) that prevent information sharing. This study identifies the data needs of several information systems of a corporation and proposes a conceptual model to improve the information sharing process and thus Computer Integrated Manufacturing. The architecture proposed in this work provides a methodology for data storage, data retrieval, and data processing in order to provide integration at the enterprise level. There are four layers of interaction in the proposed IXA architecture. The name IXA (IDL - XML Architecture for Enterprise Integration) is derived from the standards and technologies used to define the layers and corresponding functions of each layer. The first layer addresses the systems and applications responsible for data manipulation. The second layer provides the interface definitions to facilitate the interaction between the applications on the first layer. The third layer is where data would be structured using XML to be stored, and the fourth layer is a central repository and its database management system.

    Enterprise Information Systems Integration and Business Process Improvement Initiative: An Empirical Study

    Since the mid- and late 80's, business process improvement (BPI) has become one of the leading methodologies for helping corporations deliver high quality products and services. Businesses are seeking not simply to automate existing operations, but to improve and redesign business processes and capture customers' expectations for products and service delivery. Extensive communication and inter-connectivity arising from adoption of standards and integrated services digital networks (ISDN) has become a major force affecting businesses in fundamental ways (Madnick, 1990; Boar, 1993). The second avenue through which businesses are identifying new opportunities is the availability of databases (Madnick, 1990). By linking inter-organizational, inter-functional, and inter-personal levels of the processes through IS networks, businesses are not only automating their activities, they are also reshaping and improving their business processes (Hammer and Champy, 1993). By accessing enterprise-wide information from databases, IS integration is providing numerous opportunities to coordinate organizational activities by facilitating communication and information exchange across departments without the need to go up and down the vertical chain of command. The use of information networks to access relevant information from databases has been of enormous importance in eliminating duplicate activities, preventing errors from occurring, reducing cycle time in product development, and improving customer responsiveness (Davenport, 1993). A well planned database management system is one of the important requirements for BPI. In most organizations, data architecture has evolved as a result of applications databases in various departments rather than as a well planned data management strategy. Therefore, the resolution of data management problems becomes quite difficult (Goodhue, Quillard, and Rockart, 1988).
    Access to timely, accurate and consistent information is crucial in business process improvement. IS integration, through communication networks and database systems, enables organizations to create and sustain process improvement through timely retrieval of consistent and accurate information. Process improvement can be measured by the extent to which the desired, specified results are produced right the first time (i.e., outcomes with zero defects), the extent to which various processes minimize the consumption of business resources, and the extent to which business processes are easily modified to meet or exceed customers' expectations for products and service delivery. The current study is aimed at developing and empirically testing the relationships between IS integration and BPI. As there are presently only a handful of studies that empirically test the relationship between information systems and BPI, this study is an important step in furthering the scope of the present stage of the IS literature.

    A Molecular Biology Database Digest

    Computational Biology or Bioinformatics has been defined as the application of mathematical and Computer Science methods to solving problems in Molecular Biology that require large scale data, computation, and analysis [18]. As expected, Molecular Biology databases play an essential role in Computational Biology research and development. This paper introduces current Molecular Biology databases, stressing data modeling, data acquisition, data retrieval, and the integration of Molecular Biology data from different sources. This paper is primarily intended for an audience of computer scientists with a limited background in Biology.