Natural language processing

Author: Adams
Amsler
Bangalore
Barker
Benoît
Bian
Bondale
Carrick
Ceric
Chandrasekar
Chang
Charniak
Chen
Chowdhury
Chowdhury
Costantino
Cowie
Craven
Craven
Craven
Dogru
Evans
Feldman
Fernandez
Gaizauskas
Glasgow
Haas
Hayes
Hayes
Hedlund
Herath
Ide
Isahara
Jelinek
Jeong
Jurafsky
Kazakov
Kehler
Khoo
Kim
King
Lange
Lee
Lehmam
Lehtokangas
Lewis
Liddy
Liddy
Lovis
Ma
Magnini
Mani
Manning
Marquez
Martinez
Martinez
McMurchie
Meyer
Mihalcea
Mock
Moens
Morin
Narita
Nerbonne
Oard
Ogura
Oudet
Owei
Paris
Pasero
Pedersen
Perez-Carballo
Petreley
Pirkola
Poesio
Rosenfield
Roux
Say
Scarlett
Schenker
Silber
Smeaton
Smeaton
Smith
Sokol
Song
Sparck Jones
Staab
Stock
Tolle
Trybula
Tsuda
Vickery
Waldrop
Warner
Weigard
Wilks
Wong
Yang
Yang
Zadrozny
Zweigenbaum
Publication venue: 'Wiley'
Publication date: 01/01/2003

Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

Crossref

University of Strathclyde Institutional Repository

OPUS - University of Technology Sydney

Software-implemented attack tolerance for critical information retrieval

Author: Yang Yunwen (Erica)
Publication date: 01/01/2004

The fast-growing reliance of our daily life upon online information services often demands an appropriate level of privacy protection as well as highly available service provision. However, most existing solutions have attempted to address these problems separately. This thesis investigates and presents a solution that provides both privacy protection and fault tolerance for online information retrieval. A new approach to Attack-Tolerant Information Retrieval (ATIR) is developed based on an extension of existing theoretical results for Private Information Retrieval (PIR). ATIR uses replicated services to protect a user's privacy and to ensure service availability. In particular, ATIR can tolerate any collusion of up to t servers for privacy violation and up to ƒ faulty (either crashed or malicious) servers in a system with k replicated servers, provided that k ≥ t + ƒ + 1 where t ≥ 1 and ƒ ≤ t. In contrast to other related approaches, ATIR relies on neither enforced trust assumptions, such as the use of tamper-resistant hardware and trusted third parties, nor an increased number of replicated servers. While the best solution known so far requires k (≥ 3t + 1) replicated servers to cope with t malicious servers and any collusion of up to t servers with an O(n^*^) communication complexity, ATIR uses fewer servers with a much improved communication cost, O(n^{1/2}) (where n is the size of a database managed by a server). The majority of current PIR research resides on a theoretical level. This thesis provides both theoretical schemes and their practical implementations with good performance results. In a LAN environment, it takes well under half a second to use an ATIR service for calculations over data sets with a size of up to 1MB. The performance of the ATIR systems remains at the same level even in the presence of server crashes and malicious attacks.
Both analytical results and experimental evaluation show that ATIR offers an attractive and practical solution for ever-increasing online information applications.
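
The server-count condition quoted in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and does not implement the ATIR protocol: the function names are hypothetical, and treating ƒ as a non-negative count is an assumption. It encodes only the inequality k ≥ t + ƒ + 1 (with t ≥ 1 and ƒ ≤ t) and the k ≥ 3t + 1 figure the abstract gives for the earlier solution.

```python
def atir_config_ok(k: int, t: int, f: int) -> bool:
    """Check the condition from the abstract: k >= t + f + 1, t >= 1, f <= t.

    k: number of replicated servers
    t: maximum number of colluding servers tolerated
    f: maximum number of faulty servers tolerated (assumed to be >= 0)
    """
    if t < 1 or f < 0 or f > t:
        return False
    return k >= t + f + 1


def min_servers_atir(t: int, f: int) -> int:
    """Smallest k that satisfies k >= t + f + 1 (hypothetical helper)."""
    if t < 1 or f < 0 or f > t:
        raise ValueError("requires t >= 1 and 0 <= f <= t")
    return t + f + 1


def min_servers_prior(t: int) -> int:
    """Smallest k for the earlier solution, which the abstract says needs k >= 3t + 1."""
    if t < 1:
        raise ValueError("requires t >= 1")
    return 3 * t + 1
```

For t = 2 and ƒ = 2, for instance, the bound gives 5 servers, against 7 under the 3t + 1 requirement.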

Durham e-Theses

Teaching and learning in information retrieval

Author: Efthimiadis E. N.
Fernandez-Luna J. M.
Huete J. F.
MacFarlane A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/01/2009

A literature review of pedagogical methods for teaching and learning information retrieval is presented. From the analysis of the literature a taxonomy was built and it is used to structure the paper. Information Retrieval (IR) is presented from different points of view: technical levels, educational goals, teaching and learning methods, assessment and curricula. The review is organized around two levels of abstraction which form a taxonomy that deals with the different aspects of pedagogy as applied to information retrieval. The first level looks at the technical level of delivering information retrieval concepts, and at the educational goals as articulated by the two main subject domains where IR is delivered: computer science (CS) and library and information science (LIS). The second level focuses on pedagogical issues, such as teaching and learning methods, delivery modes (classroom, online or e-learning), use of IR systems for teaching, assessment and feedback, and curricula design. The survey, and its bibliography, provides an overview of the pedagogical research carried out in the field of IR. It also provides a guide for educators on approaches that can be applied to improving the student learning experience.

City Research Online

Thesauri: practical guidance for construction

Author: McCulloch E.
Publication venue: 'Emerald'
Publication date: 01/01/2005

Purpose - With the growing recognition that thesauri aid information retrieval, organisations are beginning to adopt, and in many cases, create thesauri. This paper offers some guidance on the construction process. Design/methodology/approach - An opinion piece with a practical focus, based on recent experiences gleaned from consultancy work. Findings - A number of steps can be taken to ensure any thesaurus under construction is fit for purpose. Due consideration is therefore given to aspects such as term selection, structure and notation, thesauri standards, software and Web display issues, thesauri evaluation and maintenance. This paper also notes that creating new subject schemes from scratch, however attractive, contributes to the plethora of terminologies currently in existence and can limit user searching within particular contexts. The decision to create a "new" thesaurus should therefore be taken carefully and observance of standards is paramount. Practical implications - This paper offers advice to assist practitioners in the development of thesauri. Originality/value - Useful guidance for those practitioners new to the area of thesaurus construction is provided, together with an overview of selected key processes involved in the construction of a thesaurus

Crossref

University of Strathclyde Institutional Repository

Evaluating the retrieval effectiveness of Web search engines using a representative query sample

Author: Lewandowski Dirk
Publication date: 09/05/2014

Search engine retrieval effectiveness studies are usually small-scale, using only limited query samples. Furthermore, queries are selected by the researchers. We address these issues by taking a random representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google's and Bing's results based on this sample. Jurors were found through crowdsourcing, and data was collected using specialised software, the Relevance Assessment Tool (RAT). We found that while Google outperforms Bing in both query types, the difference in the performance for informational queries was rather low. However, for navigational queries, Google found the correct answer in 95.3 per cent of cases whereas Bing only found the correct answer 76.6 per cent of the time. We conclude that search engine performance on navigational queries is of great importance, as users in this case can clearly identify queries that have returned correct results. So, performance on this query type may contribute to explaining user satisfaction with search engines.

arXiv.org e-Print Archive

REPOSIT

An inquiry-based learning approach to teaching information retrieval

Author: C. E. Hmelo-Silver
D. Bligh
F. Lazarinis
G. C. Furman
Gareth J. F. Jones
H. Fry
K. McFarlane
M. D. Merrill
M. D. Merrill
P. A. Kirschner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

The study of information retrieval (IR) has increased in interest and importance with the explosive growth of online information in recent years. Learning about IR within formal courses of study enables users of search engines to use them more knowledgeably and effectively, while providing the starting point for the explorations of new researchers into novel search technologies. Although IR can be taught in a traditional manner of formal classroom instruction with students being led through the details of the subject and expected to reproduce this in assessment, the nature of IR as a topic makes it an ideal subject for inquiry-based learning approaches to teaching. In an inquiry-based learning approach students are introduced to the principles of a subject and then encouraged to develop their understanding by solving structured or open problems. Working through solutions in subsequent class discussions enables students to appreciate the availability of alternative solutions as proposed by their classmates. Following this approach students not only learn the details of IR techniques, but significantly, naturally learn to apply them in the solution of problems. In doing this they not only gain an appreciation of alternative solutions to a problem, but also learn how to assess their relative strengths and weaknesses. Developing confidence and skills in problem solving enables student assessment to be structured around the solution of problems. Thus students can be assessed on the basis of their understanding and ability to apply techniques, rather than simply their skill at reciting facts. This has the additional benefit of encouraging general problem solving skills which can be of benefit in other subjects.
This approach to teaching IR was successfully implemented in an undergraduate module where students were assessed in a written examination exploring their knowledge and understanding of the principles of IR and their ability to apply them to solving problems, and a written assignment based on developing an individual research proposal.

Crossref

Irish Universities

DCU Online Research Access Service

Group Invariant Deep Representations for Image Instance Retrieval

Author: Chandrasekhar Vijay
Lin Jie
Morère Olivier
Petta Julie
Poggio Tomaso
Veillard Antoine
Publication date: 11/01/2016

Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining ground on Fisher Vectors (FVs) as state-of-the-art global descriptors for image instance retrieval. While CNN-based descriptors are generally noted for good retrieval performance at lower bitrates, they nevertheless present a number of drawbacks, including a lack of robustness to common object transformations such as rotations compared with their interest point based FV counterparts. In this paper, we propose a method for computing invariant global descriptors from CNNs. Our method implements a recently proposed mathematical theory for invariance in a sensory cortex modeled as a feedforward neural network. The resulting global descriptors can be made invariant to multiple arbitrary transformation groups while retaining good discriminativeness. Based on a thorough empirical evaluation using several publicly available datasets, we show that our method is able to significantly and consistently improve retrieval results every time a new type of invariance is incorporated. We also show that our method, which has few parameters, is not prone to overfitting: improvements generalize well across datasets with different properties with regard to invariances. Finally, we show that our descriptors are able to compare favourably to other state-of-the-art compact descriptors in similar bitranges, exceeding the highest retrieval results reported in the literature on some datasets. A dedicated dimensionality reduction step (quantization or hashing) may be able to further improve the competitiveness of the descriptors.
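
The comparison of global descriptors between a query image and database images, as described in the first sentence of the abstract, can be sketched as a ranking by vector similarity. This is a minimal illustration, not the authors' method: the use of cosine similarity, the function names, and the toy three-dimensional descriptors are all assumptions, and no CNN or invariance computation is involved.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero descriptor as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_database(query, database):
    """Return database image ids ordered from most to least similar to the query.

    database maps an image id to its global descriptor (a list of floats).
    """
    scored = [(cosine_similarity(query, desc), image_id)
              for image_id, desc in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]


# Toy descriptors, made up for illustration only.
toy_database = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.6, 0.8, 0.0],
    "img_c": [0.0, 0.0, 1.0],
}
ranking = rank_database([0.9, 0.1, 0.0], toy_database)
```

Here `ranking` places `img_a` first, since its descriptor points in nearly the same direction as the query.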

arXiv.org e-Print Archive

DSpace@MIT

An introduction to crowdsourcing for language and multimedia technology research

Author: A. Doan
C. Callison-Burch
C. Rashtchian
G. Paolacci
G. Pickard
J. Ross
L. Ahn von
L. Ahn von
M. Larson
O. Alonso
R. Snow
S. Novotney
T. Yan
V.C. Rayker
V.S. Sheng
W. Mason
W. Willett
W.S. Lasecki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Language and multimedia technology research often relies on large manually constructed datasets for training or evaluation of algorithms and systems. Constructing these datasets is often expensive, with significant challenges in terms of recruitment of personnel to carry out the work. Crowdsourcing methods using scalable pools of workers available on demand offer a flexible means of rapid low-cost construction of many of these datasets to support existing research requirements and potentially promote new research initiatives that would otherwise not be possible.

Crossref

Irish Universities

DCU Online Research Access Service