44,710 research outputs found

Managing large collections of data mining models

Author: Alexander Tuzhilin
Bernstein P.
Bing Liu
Han J.
Hornick M.
Krishnan R.
Tuzhilin A.
Will H
Publication venue: 'Association for Computing Machinery (ACM)'

Meeting of the MINDS: an information retrieval research agenda

Author: Allan J.
Callan J.
Clarke C.L.A.
Dumais S.
Evans D.A.
Sanderson M.
Zhai C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/12/2007

Since its inception in the late 1950s, the field of Information Retrieval (IR) has developed tools that help people find, organize, and analyze information. The key early influences on the field are well-known. Among them are H. P. Luhn's pioneering work, the development of the vector space retrieval model by Salton and his students, Cleverdon's development of the Cranfield experimental methodology, Spärck Jones' development of idf, and a series of probabilistic retrieval models by Robertson and Croft. Until the development of the World Wide Web (Web), IR was of greatest interest to professional information analysts such as librarians, intelligence analysts, the legal community, and the pharmaceutical industry.

White Rose Research Online

Illinois Digital Scholarship: Preserving and Accessing the Digital Past, Present, and Future

Author: Grady Michael
Mischo William H.
Sandore Beth
Publication date: 07/04/2004

Since the University's establishment in 1867, its scholarly output has been issued primarily in print, and the University Library and Archives have been readily able to collect, preserve, and provide access to that output. Today, technological, economic, political and social forces are buffeting all means of scholarly communication. Scholars, academic institutions and publishers are engaged in debate about the impact of digital scholarship and open access publishing on the promotion and tenure process. The upsurge in digital scholarship affects many aspects of the academic enterprise, including how we record, evaluate, preserve, organize and disseminate scholarly work. The result has left the Library with no ready means by which to archive digitally produced publications, reports, presentations, and learning objects, much of which cannot be adequately represented in print form. In this incredibly fluid environment of digital scholarship, the critical question of how we will collect, preserve, and manage access to this important part of the University scholarly record demands a rational and forward-looking plan - one that includes perspectives from diverse scholarly disciplines, incorporates significant research breakthroughs in information science and computer science, and makes effective projections for future integration within the Library and computing services as a part of the campus infrastructure. Prepared jointly by the University of Illinois Library and CITES at the University of Illinois at Urbana-Champaign.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Natural language processing

Author: Adams
Amsler
Bangalore
Barker
Benoît
Bian
Bondale
Carrick
Ceric
Chandrasekar
Chang
Charniak
Chen
Chowdhury
Chowdhury
Costantino
Cowie
Craven
Craven
Craven
Dogru
Evans
Feldman
Fernandez
Gaizauskas
Glasgow
Haas
Hayes
Hayes
Hedlund
Herath
Ide
Isahara
Jelinek
Jeong
Jurafsky
Kazakov
Kehler
Khoo
Kim
King
Lange
Lee
Lehmam
Lehtokangas
Lewis
Liddy
Liddy
Lovis
Ma
Magnini
Mani
Manning
Marquez
Martinez
Martinez
McMurchie
Meyer
Mihalcea
Mock
Moens
Morin
Narita
Nerbonne
Oard
Ogura
Oudet
Owei
Paris
Pasero
Pedersen
Perez-Carballo
Petreley
Pirkola
Poesio
Rosenfield
Roux
Say
Scarlett
Schenker
Silber
Smeaton
Smeaton
Smith
Sokol
Song
Sparck Jones
Staab
Stock
Tolle
Trybula
Tsuda
Vickery
Waldrop
Warner
Weigard
Wilks
Wong
Yang
Yang
Zadrozny
Zweigenbaum
Publication venue: 'Wiley'
Publication date: 01/01/2003

Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries; and (iv) evaluation of NLP systems.

Crossref

University of Strathclyde Institutional Repository

OPUS - University of Technology Sydney

Using Biotic Interaction Networks for Prediction in Biodiversity and Emerging Diseases

Author: Autonoma México
Camila González-Rosas
Carlos N. Ibarra-Cerdeña
Christopher R. Stephens
Joaquin Giménez Heau
Victor Sánchez-Cordero
Publication date: 01/01/2008

Networks offer a powerful tool for understanding and visualizing inter-species interactions within an ecology. Previously considered examples, such as trophic networks, are just representations of experimentally observed direct interactions. However, species interactions are so rich and complex that it is not feasible to directly observe more than a small fraction of them. In this paper, using data mining techniques, we show how potential interactions can be inferred from geographic data, rather than by direct observation. An important application area for such a methodology is that of emerging diseases, where, often, little is known about inter-species interactions, such as those between vectors and reservoirs. Here, we show how, using geographic data, biotic interaction networks that model statistical dependencies between species distributions can be used to infer and understand inter-species interactions. Furthermore, we show how such networks can be used to build prediction models, for example for predicting the most important reservoirs of a disease, or the degree of disease risk associated with a geographical area. We illustrate the general methodology by considering an important emerging disease - Leishmaniasis. This data mining approach allows for the use of geographic data to construct inferential biotic interaction networks which can then be used to build prediction models with a wide range of applications in ecology, biodiversity and emerging diseases.

CiteSeerX

Nature Precedings

Managing the KM Trade-Off: Knowledge Centralization versus Distribution

Author: Bonifacio Matteo
Camussone Pierfranco
Publication date: 01/01/2003

KM is more an archipelago of theories and practices than a monolithic approach. We propose a conceptual map that organizes some major approaches to KM according to their assumptions on the nature of knowledge. The paper introduces the two major views on knowledge - objectivist and subjectivist - and explodes each of them into two major approaches to KM: knowledge as a market, and knowledge as intellectual capital (the objectivist perspective); knowledge as mental models, and knowledge as practice (the subjectivist perspective). We argue that the dichotomy between objective and subjective approaches is intrinsic to KM within complex organizations, as each side of the dichotomy responds to different, and often conflicting, needs: on the one hand, the need to maximize the value of knowledge through its replication; on the other hand, the need to keep knowledge appropriate to an increasingly complex and changing environment. Moreover, as a proposal for a deeper discussion, this trade-off will be suggested as the origin of other relevant KM-related trade-offs, which will be listed. Managing these trade-offs will be proposed as a main challenge of KM.

Unitn-eprints Research

Random Indexing K-tree

Author: De Vine Lance
De Vries Christopher M.
Geva Shlomo
Publication date: 01/01/2009

Random Indexing (RI) K-tree is the combination of two algorithms for clustering. Many large scale problems exist in document clustering. RI K-tree scales well with large inputs due to its low complexity. It also exhibits features that are useful for managing a changing collection. Furthermore, it solves previous issues with sparse document vectors when using K-tree. The algorithms and data structures are defined, explained and motivated. Specific modifications to K-tree are made for use with RI. Experiments have been executed to measure quality. The results indicate that RI K-tree improves document cluster quality over the original K-tree algorithm. Comment: 8 pages, ADCS 2009; Hyperref and cleveref LaTeX packages conflicted. Removed cleveref.
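
The Random Indexing half of the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation and does not include K-tree. It assumes the commonly described form of Random Indexing, in which each term gets a sparse ternary index vector and a document vector is the frequency-weighted sum of the index vectors of its terms. The function names and the parameters `dim` and `nonzero` are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def index_vector(term, dim=64, nonzero=8):
    """Sparse ternary index vector for a term (hypothetical helper).

    Seeding the generator with the term makes the vector reproducible,
    so no term-to-vector table has to be stored.
    """
    rng = random.Random(term)
    positions = rng.sample(range(dim), nonzero)
    vec = [0] * dim
    for i, pos in enumerate(positions):
        # Half of the chosen positions get +1, the other half -1.
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def document_vector(tokens, dim=64, nonzero=8):
    """Dense document vector: term-frequency-weighted sum of index vectors."""
    doc = [0] * dim
    for term, count in Counter(tokens).items():
        for i, value in enumerate(index_vector(term, dim, nonzero)):
            doc[i] += count * value
    return doc


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    d1 = document_vector("document clustering with random indexing".split())
    d2 = document_vector("clustering large document collections".split())
    print(len(d1), cosine(d1, d2))
```

Because every document is mapped to a dense vector of fixed length `dim` regardless of vocabulary size, a clustering structure built on top works with short dense vectors rather than very sparse term vectors, which is the issue the abstract says RI addresses for K-tree.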

arXiv.org e-Print Archive

Queensland University of Technology ePrints Archive