114,930 research outputs found
Knowledge-based document retrieval with application to TEXPROS
Document retrieval in an information system is most often accomplished through keyword search. The common technique behind keyword search is indexing. The major drawback of such a search technique is its lack of effectiveness and accuracy. A typical keyword search over the Internet often identifies hundreds or even thousands of records as potentially desired, yet few of them are relevant to users' interests.
This dissertation presents a knowledge-based document retrieval architecture with application to TEXPROS. The architecture is based on a dual document model that consists of a document type hierarchy and a folder organization. Using the knowledge collected during document filing, the search space can be narrowed down significantly. Combining the classical text-based retrieval methods with knowledge-based retrieval can greatly improve both search efficiency and effectiveness.
With the proposed predicate-based query language, users can specify more precisely the search criteria and their knowledge about the documents to be retrieved. To help users formulate a query, a guided search is presented as part of an intelligent user interface. Supported by an intelligent question generator, an inference engine, a question base, and a predicate-based query composer, the guided search collects the most important information known to the user in order to retrieve the documents that satisfy the user's particular interests.
A knowledge-based query processing and search engine is presented as the core component of this architecture. Algorithms are developed for the search engine to retrieve the documents that match the query effectively and efficiently. A cache is introduced to speed up the process of query refinement. A theoretical proof and a performance analysis are given to demonstrate the efficiency and effectiveness of this knowledge-based document retrieval approach.
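As a rough illustration of the idea in this abstract (narrowing the search space with filing knowledge before applying a classical text match), the following Python sketch filters documents by type hierarchy and folder first and only then checks keywords. Everything in it is an assumption made for illustration: the `Document` fields, the `TYPE_PARENT` hierarchy, the sample data, and the `retrieve` function are hypothetical and are not taken from TEXPROS or from the dissertation's algorithms.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Document:
    doc_id: str
    doc_type: str   # node in a hypothetical document type hierarchy
    folder: str     # folder assigned when the document was filed
    text: str


# Hypothetical type hierarchy (child type -> parent type), for illustration only.
TYPE_PARENT: Dict[str, str] = {
    "memo": "correspondence",
    "letter": "correspondence",
    "correspondence": "document",
    "report": "document",
}


def is_subtype(doc_type: str, ancestor: str) -> bool:
    """Walk up the hierarchy to test whether doc_type falls under ancestor."""
    current: Optional[str] = doc_type
    while current is not None:
        if current == ancestor:
            return True
        current = TYPE_PARENT.get(current)
    return False


def retrieve(docs: List[Document], wanted_type: str,
             wanted_folder: Optional[str], keywords: List[str]) -> List[str]:
    """Return ids of documents matching type, folder, and all keywords."""
    # Step 1: narrow the search space using knowledge recorded at filing time.
    candidates = [
        d for d in docs
        if is_subtype(d.doc_type, wanted_type)
        and (wanted_folder is None or d.folder == wanted_folder)
    ]
    # Step 2: classical keyword matching, applied only to the reduced set.
    terms = [k.lower() for k in keywords]
    return [d.doc_id for d in candidates
            if all(t in d.text.lower() for t in terms)]


# Made-up sample data.
SAMPLE_DOCS = [
    Document("d1", "memo", "budget", "Travel budget approved for Q3"),
    Document("d2", "report", "budget", "Travel budget report"),
    Document("d3", "letter", "hiring", "Travel reimbursement letter"),
]
```

With the sample data, asking for correspondence in the `budget` folder that mentions "travel" inspects only one candidate's text, whereas a plain keyword search would have to scan all three documents.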
Data Mining in Electronic Commerce
Modern business is rushing toward e-commerce. If the transition is done
properly, it enables better management, new services, lower transaction costs
and better customer relations. Success depends on skilled information
technologists, among whom are statisticians. This paper focuses on some of the
contributions that statisticians are making to help change the business world,
especially through the development and application of data mining methods. This
is a very large area, and the topics we cover are chosen to avoid overlap with
other papers in this special issue, as well as to respect the limitations of
our expertise. Inevitably, electronic commerce has raised and is raising fresh
research problems in a very wide range of statistical areas, and we try to
emphasize those challenges. Comment: Published at http://dx.doi.org/10.1214/088342306000000204 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Web-course search engine : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University
People's lives rely more and more on the World Wide Web. The young generation in particular spends a significant amount of its play and study time on the Internet. Many tools have been developed to help educational users find educational resources, including various search engines, Web directories and educational domain gateways. Nevertheless, these systems have many weaknesses that make them unsuitable for the specific search needs of learners. The research presented in this thesis describes the development of the Web-course search engine, a friendly, efficient and accurate helper that lets learners find what they want in the vast Internet ocean. The most attractive feature of this system is that it uses one universal language, which lets searchers and resources "communicate" with each other: learners can find the Web-based educational resources that best fit their needs, and course providers can provide all necessary information about their courseware. This universal language is a widely accepted metadata standard. Following the metadata standard, the system collects exact information about educational resources, provides adequate search parameters and returns evaluative results. By using the Web-course search engine, learners and other educational users are able to find useful, valuable and related educational resources more effectively and efficiently. As a result of this project, some suggestions for improving search mechanisms on the World Wide Web are put forward for future research.
A self-adapting latency/power tradeoff model for replicated search engines
For many search settings, distributed/replicated search engines deploy a large number of machines to ensure efficient retrieval. This paper investigates how the power consumption of a replicated search engine can be automatically reduced when the system has low contention, without compromising its efficiency. We propose a novel self-adapting model to analyse the trade-off between latency and power consumption for distributed search engines. When query volumes are high and there is contention for the resources, the model automatically increases the necessary number of active machines in the system to maintain acceptable query response times. On the other hand, when the load of the system is low and the queries can be served easily, the model is able to reduce the number of active machines, leading to power savings. The model bases its decisions on examining the current and historical query loads of the search engine. Our proposal is formulated as a general dynamic decision problem, which can be quickly solved by dynamic programming in response to changing query loads. Thorough experiments are conducted to validate the usefulness of the proposed adaptive model using historical Web search traffic submitted to a commercial search engine. Our results show that our proposed self-adapting model can achieve an energy saving of 33% while only degrading mean query completion time by 10 ms compared to a baseline that provisions replicas based on a previous day's traffic
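To make the "dynamic decision problem solved by dynamic programming" more concrete, here is a minimal Python sketch of one possible formulation. It is not the paper's model: the cost terms (`power_cost` per active replica, `latency_penalty` per unit of load above capacity, `switch_cost` per replica turned on or off) and the function `plan_replicas` are assumptions chosen only to show how a plan over forecast query loads could be computed.

```python
from typing import List, Tuple


def plan_replicas(loads: List[float], max_replicas: int, capacity: float,
                  power_cost: float, latency_penalty: float,
                  switch_cost: float) -> Tuple[float, List[int]]:
    """Return (total_cost, plan); plan[t] is the replica count for slot t.

    All cost parameters are hypothetical, for illustration only.
    """
    if not loads:
        return 0.0, []

    def slot_cost(n: int, load: float) -> float:
        # Power grows with active replicas; latency is penalised on overload.
        overload = max(0.0, load - n * capacity)
        return n * power_cost + latency_penalty * overload

    options = range(1, max_replicas + 1)
    # best[n]: minimal cost of all slots so far, ending with n active replicas.
    best = {n: slot_cost(n, loads[0]) for n in options}
    back = []  # back[t][n]: best predecessor count for n in slot t + 1
    for load in loads[1:]:
        new_best, choice = {}, {}
        for n in options:
            prev = min(options,
                       key=lambda m: best[m] + switch_cost * abs(n - m))
            new_best[n] = (best[prev] + switch_cost * abs(n - prev)
                           + slot_cost(n, load))
            choice[n] = prev
        back.append(choice)
        best = new_best

    # Trace the cheapest final state back to the first slot.
    last = min(options, key=lambda n: best[n])
    total = best[last]
    plan = [last]
    for choice in reversed(back):
        plan.append(choice[plan[-1]])
    plan.reverse()
    return total, plan
```

In this sketch, a low `switch_cost` lets the plan track the load closely (few replicas in quiet slots, more in busy ones), while a high `switch_cost` pushes the plan toward a constant replica count sized for the peak.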
CHORUS Deliverable 4.5: Report of the 3rd CHORUS Conference
The third and last CHORUS conference on Multimedia Search Engines took place from the 26th to the 27th of May 2009 in Brussels, Belgium. About 100 participants from 15 European countries, the US, Japan and Australia learned about the latest developments in the domain. An exhibition of 13 stands presented 16 research projects currently ongoing around the world.
From local laboratory data to public domain database in search of indirect association of diseases: AJAX based gene data search engine.
This paper presents an extensible schema for capturing laboratory gene variance data with its metadata properties in a semi-structured environment. It also addresses the issues of creating a local, task-specific component database that is a subset of global data resources. An XML-based genetic disorder component database schema is developed with adequate flexibility to facilitate searching of gene mutation data. A web-based search engine is developed that allows researchers to query a set of gene parameters obtained from the local XML schema and subsequently allows them to establish a link with the public domain gene databases automatically. The application uses AJAX (Asynchronous JavaScript and XML), a cutting-edge web technology, to carry out the gene data searching function.
Middleware Technologies for Cloud of Things - a survey
The next wave of communication and applications relies on the new services
provided by the Internet of Things (IoT), which is becoming an important aspect
of the future of humans and machines. IoT services are a key solution for
providing smart environments in homes, buildings and cities. In the era of a
massive number of connected things and objects with a high growth rate, several
challenges have been raised, such as management, aggregation and storage of the
large volumes of data produced. In order to tackle some of these issues, cloud
computing has been combined with IoT as the Cloud of Things (CoT), which
provides virtually unlimited cloud services to enhance large-scale IoT
platforms. There are several factors to be considered in the design and
implementation of a CoT platform. One of the most important and challenging
problems is the heterogeneity of different objects. This problem can be
addressed by deploying suitable "Middleware". Middleware sits between things
and applications and provides a reliable platform for communication among
things with different interfaces, operating systems, and architectures. The
main aim of this paper is to study the middleware technologies for CoT. Toward
this end, we first present the main features and characteristics of
middlewares. Next we study different architecture styles and service domains.
Then we present several middlewares that are suitable for CoT-based platforms,
and lastly we discuss a list of current challenges and issues in the design of
CoT-based middlewares. Comment: http://www.sciencedirect.com/science/article/pii/S2352864817301268,
Digital Communications and Networks, Elsevier (2017)