114,930 research outputs found

    Knowledge-based document retrieval with application to TEXPROS

    Get PDF
    Document retrieval in an information system is most often accomplished through keyword search. The common technique behind keyword search is indexing. The major drawback of such a search technique is its lack of effectiveness and accuracy. It is very common in a typical keyword search over the Internet to identify hundreds or even thousands of records as the potentially desired records. However, often few of them are relevant to users\u27 interests. This dissertation presents knowledge-based document retrieval architecture with application to TEXPROS. The architecture is based on a dual document model that consists of a document type hierarchy and, a folder organization. Using the knowledge collected during document filing, the search space can be narrowed down significantly. Combining the classical text-based retrieval methods with the knowledge-based retrieval can improve tremendously both search efficiency and effectiveness. With the proposed predicate-based query language, users can more precisely and accurately specify the search criteria and their knowledge about the documents to be retrieved. To assist users formulate a query, a guided search is presented as part of an intelligent user interface. Supported by an intelligent question generator, an inference engine, a question base, and a predicate-based query composer, the guided search collects the most important information known to the user to retrieve the documents that satisfy users\u27 particular interests. A knowledge-based query processing and search engine is presented as the core component in this architecture. Algorithms are developed for the search engine to effectively and efficiently retrieve the documents that match the query. Cache is introduced to speed up the process of query refinement. Theoretical proof and performance analysis are performed to prove the efficiency and effectiveness of this knowledge-based document retrieval approach

    Data Mining in Electronic Commerce

    Full text link
    Modern business is rushing toward e-commerce. If the transition is done properly, it enables better management, new services, lower transaction costs and better customer relations. Success depends on skilled information technologists, among whom are statisticians. This paper focuses on some of the contributions that statisticians are making to help change the business world, especially through the development and application of data mining methods. This is a very large area, and the topics we cover are chosen to avoid overlap with other papers in this special issue, as well as to respect the limitations of our expertise. Inevitably, electronic commerce has raised and is raising fresh research problems in a very wide range of statistical areas, and we try to emphasize those challenges.Comment: Published at http://dx.doi.org/10.1214/088342306000000204 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Web-course search engine : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University

    Get PDF
    The World Wide Web is an amazing place that people's lives more and more rely on. Especially, for the young generation, they spend a significant amount of their play and study time using the Internet. Many tools have been developed to help the educational users in finding educational resources. These tools include various search engines. Web directories and educational domain gateways. Nevertheless, these systems have many weaknesses that made them unsuitable for the specific search needs of the learners. The research presented in this thesis describes the development of the Web-course search engine, which is a friendly, efficient and accurate helper for the learners to get what they want in the vast Internet ocean. The most attractive feature of this system is that the system uses one universal language, which lets the searchers and the resources "communicate" with each other. Then the learner searchers can find the Web-based educational resources that are most fit to their needs and course providers can provide all necessary information about their courseware. This universal language is one widely acceptable Metadata standard. Following the Metadata standard, the system collects exact information about educational resources, provides adequate search parameters for search and returns evaluative results. By using the Web-course search engine, the learners and the other educational users are able to find useful, valuable and related educational resources more effectively and efficiently. Some improvement suggestions of the search mechanism in the World Wide Web have been brought forward for the future research as a result of this project

    A self-adapting latency/power tradeoff model for replicated search engines

    Get PDF
    For many search settings, distributed/replicated search engines deploy a large number of machines to ensure efficient retrieval. This paper investigates how the power consumption of a replicated search engine can be automatically reduced when the system has low contention, without compromising its efficiency. We propose a novel self-adapting model to analyse the trade-off between latency and power consumption for distributed search engines. When query volumes are high and there is contention for the resources, the model automatically increases the necessary number of active machines in the system to maintain acceptable query response times. On the other hand, when the load of the system is low and the queries can be served easily, the model is able to reduce the number of active machines, leading to power savings. The model bases its decisions on examining the current and historical query loads of the search engine. Our proposal is formulated as a general dynamic decision problem, which can be quickly solved by dynamic programming in response to changing query loads. Thorough experiments are conducted to validate the usefulness of the proposed adaptive model using historical Web search traffic submitted to a commercial search engine. Our results show that our proposed self-adapting model can achieve an energy saving of 33% while only degrading mean query completion time by 10 ms compared to a baseline that provisions replicas based on a previous day's traffic

    CHORUS Deliverable 4.5: Report of the 3rd CHORUS Conference

    Get PDF
    The third and last CHORUS conference on Multimedia Search Engines took place from the 26th to the 27th of May 2009 in Brussels, Belgium. About 100 participants from 15 European countries, the US, Japan and Australia learned about the latest developments in the domain. An exhibition of 13 stands presented 16 research projects currently ongoing around the world

    From local laboratory data to public domain database in search of indirect association of diseases: AJAX based gene data search engine.

    Get PDF
    This paper presents an extensible schema for capturing laboratory gene variance data with its meta-data properties in a semi-structured environment. This paper also focuses on the issues of creating a local and task specific component database which is a subset of global data resources. An XML based genetic disorder component database schema is developed with adequate flexibilities to facilitate searching of gene mutation data. A web based search engine is developed that allows researchers to query a set of gene parameters obtained from local XML schema and subsequently allow them to automatically establish a link with the public domain gene databases. The application applies AJAX (Asynchronous Javascript and XML), a cutting-edge web technology, to carry out the gene data searching function

    Middleware Technologies for Cloud of Things - a survey

    Get PDF
    The next wave of communication and applications rely on the new services provided by Internet of Things which is becoming an important aspect in human and machines future. The IoT services are a key solution for providing smart environments in homes, buildings and cities. In the era of a massive number of connected things and objects with a high grow rate, several challenges have been raised such as management, aggregation and storage for big produced data. In order to tackle some of these issues, cloud computing emerged to IoT as Cloud of Things (CoT) which provides virtually unlimited cloud services to enhance the large scale IoT platforms. There are several factors to be considered in design and implementation of a CoT platform. One of the most important and challenging problems is the heterogeneity of different objects. This problem can be addressed by deploying suitable "Middleware". Middleware sits between things and applications that make a reliable platform for communication among things with different interfaces, operating systems, and architectures. The main aim of this paper is to study the middleware technologies for CoT. Toward this end, we first present the main features and characteristics of middlewares. Next we study different architecture styles and service domains. Then we presents several middlewares that are suitable for CoT based platforms and lastly a list of current challenges and issues in design of CoT based middlewares is discussed.Comment: http://www.sciencedirect.com/science/article/pii/S2352864817301268, Digital Communications and Networks, Elsevier (2017

    Middleware Technologies for Cloud of Things - a survey

    Full text link
    The next wave of communication and applications rely on the new services provided by Internet of Things which is becoming an important aspect in human and machines future. The IoT services are a key solution for providing smart environments in homes, buildings and cities. In the era of a massive number of connected things and objects with a high grow rate, several challenges have been raised such as management, aggregation and storage for big produced data. In order to tackle some of these issues, cloud computing emerged to IoT as Cloud of Things (CoT) which provides virtually unlimited cloud services to enhance the large scale IoT platforms. There are several factors to be considered in design and implementation of a CoT platform. One of the most important and challenging problems is the heterogeneity of different objects. This problem can be addressed by deploying suitable "Middleware". Middleware sits between things and applications that make a reliable platform for communication among things with different interfaces, operating systems, and architectures. The main aim of this paper is to study the middleware technologies for CoT. Toward this end, we first present the main features and characteristics of middlewares. Next we study different architecture styles and service domains. Then we presents several middlewares that are suitable for CoT based platforms and lastly a list of current challenges and issues in design of CoT based middlewares is discussed.Comment: http://www.sciencedirect.com/science/article/pii/S2352864817301268, Digital Communications and Networks, Elsevier (2017
    corecore