The Weight Function in the Subtree Kernel is Decisive
Tree data are ubiquitous because they model a large variety of situations,
e.g., the architecture of plants, the secondary structure of RNA, or the
hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data
is difficult per se. In this paper, we focus on the subtree kernel, a
convolution kernel for tree data introduced by Vishwanathan and Smola in the
early 2000s. More precisely, we investigate the influence of the weight
function from a theoretical perspective and in real data applications. We
establish on a two-class stochastic model that the performance of the subtree
kernel is improved when the weight of leaves vanishes, which motivates the
definition of a new weight function, learned from the data rather than fixed by
the user as is usually done. To this end, we define a unified framework for
computing the subtree kernel from ordered or unordered trees that is
particularly suitable for tuning parameters. We show through eight real data
classification problems the great efficiency of our approach, in particular
for small datasets, which also underlines the high importance of the weight
function. Finally, a visualization tool of the significant features is derived.
Comment: 36 pages
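Concretely, the subtree kernel described above is a weighted sum over all shared subtrees s of the product of their occurrence counts in the two trees, K(T1, T2) = Σ_s w(s)·N_{T1}(s)·N_{T2}(s). A minimal sketch of this idea (the nested-tuple encoding and the example weight function are illustrative assumptions, not the paper's implementation):

```python
from collections import Counter

def subtrees(tree):
    """Enumerate all complete subtrees of a tree encoded as nested tuples.
    A leaf is (), an internal node is the tuple of its children."""
    yield tree
    for child in tree:
        yield from subtrees(child)

def height(tree):
    """Height of a subtree: 0 for a leaf."""
    return 0 if not tree else 1 + max(height(c) for c in tree)

def subtree_kernel(t1, t2, weight):
    """K(t1, t2) = sum over shared subtrees s of weight(s) * count1(s) * count2(s)."""
    c1, c2 = Counter(subtrees(t1)), Counter(subtrees(t2))
    return sum(weight(s) * c1[s] * c2[s] for s in c1.keys() & c2.keys())

# A leaf-vanishing weight: leaves (height 0) get weight 0, internal
# subtrees get an exponentially decaying weight in their height.
leaf_vanishing = lambda s: 0.0 if not s else 0.5 ** height(s)
```

With a leaf-vanishing weight, shared leaves contribute nothing to the kernel, which is the behaviour the abstract's theoretical result favours.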
Applying Wikipedia to Interactive Information Retrieval
There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday web-scale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text.
Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval.
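The link-based semantic relatedness alluded to above can be illustrated with the overlap formula popularized in this line of work (a Normalized Google Distance adapted to Wikipedia in-links, in the style of the Wikipedia Link-based Measure). The function name and data representation below are illustrative:

```python
import math

def wikipedia_relatedness(links_a, links_b, n_articles):
    """Semantic relatedness of two Wikipedia concepts from the overlap of
    the sets of articles that link to them (their in-links), normalized
    by the total number of articles in the corpus."""
    a, b = set(links_a), set(links_b)
    shared = a & b
    if not shared:
        return 0.0
    # Normalized distance: small when the larger in-link set is mostly shared.
    distance = (math.log(max(len(a), len(b))) - math.log(len(shared))) / \
               (math.log(n_articles) - math.log(min(len(a), len(b))))
    return max(0.0, 1.0 - distance)
```

Because it relies only on the link graph, the measure needs no natural language processing, which is exactly the human-readable-knowledge strategy the abstract advocates.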
Selection Bias in News Coverage: Learning it, Fighting it
News entities must select and filter the coverage they broadcast through
their respective channels since the set of world events is too large to be
treated exhaustively. The subjective nature of this filtering induces biases
due to, among other things, resource constraints, editorial guidelines,
ideological affinities, or even the fragmented nature of the information at a
journalist's disposal. The magnitude and direction of these biases are,
however, largely unknown. The absence of ground truth, the sheer size of the
event space, and the lack of an exhaustive set of measurable features make it
difficult to observe the bias directly, to characterize the nature of the
leaning, and to factor it out to ensure neutral coverage of the news. In this
work, we introduce a methodology to capture the latent structure of the media's
decision process on a large scale. Our contribution is threefold. First, we
show media coverage to be predictable using personalization techniques, and
evaluate our approach on a large set of events collected from the GDELT
database. We then show that a personalized and parametrized approach not only
achieves higher accuracy in coverage prediction, but also provides an
interpretable representation of the selection bias. Lastly, we propose a method
that selects a set of sources by leveraging the latent representation. These
selected sources provide more diverse and egalitarian coverage, while still
retaining the most actively covered events.
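The abstract does not specify its personalized predictor, but the idea of learning latent source and event representations from a binary coverage matrix can be sketched with logistic matrix factorization. Everything below (model, names, hyperparameters) is an illustrative stand-in, not the paper's actual method:

```python
import numpy as np

def factorize_coverage(C, k=2, lr=0.05, epochs=500, seed=0):
    """Fit latent factors to a binary coverage matrix C (C[i, j] = 1 if
    source i covered event j) by gradient descent on the cross-entropy
    between C and the logistic reconstruction sigma(S @ E.T)."""
    rng = np.random.default_rng(seed)
    n_sources, n_events = C.shape
    S = 0.1 * rng.standard_normal((n_sources, k))  # per-source taste vectors
    E = 0.1 * rng.standard_normal((n_events, k))   # per-event feature vectors
    for _ in range(epochs):
        P = 1.0 / (1.0 + np.exp(-S @ E.T))  # predicted coverage probabilities
        G = P - C                           # d(cross-entropy)/d(logits)
        gS, gE = G @ E, G.T @ S             # gradients w.r.t. both factors
        S -= lr * gS
        E -= lr * gE
    return S, E
```

In such a model, the rows of S would play the role of an interpretable per-source representation: sources with similar latent vectors make similar coverage decisions, which is the kind of structure the abstract's source-selection step exploits.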
Optimizing a Law School’s Course Schedule
[Excerpt] “Just like other educational institutions, law schools must schedule courses by taking into consideration student needs, faculty resources, and logistical support such as classroom size and equipment needs. Course scheduling is an administrative function, typically handled by an Assistant Dean or an Associate Dean, who works with the faculty and the registrar to balance these considerations in advance of the registration process. Usually, the entire academic year is scheduled in advance, although the spring semester may be labeled tentative until registration begins for that semester. It’s hard to imagine, but some schools even publish a two-year schedule of upper-division courses so that students can plan their entire law school career in advance.
In order to give assistance to those academics involved for the first time in the scheduling process, this article discusses the law school scheduling process and how a scheduling software package has worked to successfully automate what has been seen as one of the most abysmal administrative tasks of an Associate Dean. We first provide a background to course scheduling at a typical law school. We then present a review of the tools for, and literature on, course scheduling, followed by a discussion of how technology can be applied to course scheduling in general, and our outcomes of applying this technology in a law school environment. We close with a brief summary.”
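At its core, course scheduling of the kind described in the excerpt is a constraint satisfaction problem: assign each course a time slot and room so that no room is double-booked and courses that students commonly take together never meet at the same time. A minimal backtracking sketch (the course names and constraint set are illustrative, not the article's software):

```python
from itertools import product

def schedule(courses, slots, rooms, conflicts):
    """Assign each course a (slot, room) pair by backtracking search.
    `conflicts` is a set of frozensets {course_a, course_b} of course
    pairs that must not meet in the same slot (shared student demand)."""
    assignment = {}

    def ok(course, slot, room):
        for other, (s, r) in assignment.items():
            if s == slot and r == room:
                return False  # room double-booked
            if s == slot and frozenset({course, other}) in conflicts:
                return False  # students need to attend both courses
        return True

    def solve(i):
        if i == len(courses):
            return True
        for slot, room in product(slots, rooms):
            if ok(courses[i], slot, room):
                assignment[courses[i]] = (slot, room)
                if solve(i + 1):
                    return True
                del assignment[courses[i]]  # undo and try the next pair
        return False

    return assignment if solve(0) else None
```

Real scheduling packages add soft preferences (faculty availability, classroom equipment, balanced daily load) on top of these hard constraints, typically via integer programming or heuristic search rather than plain backtracking.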
Taking a stance: resistance, faking and Muddling Through
This article focuses on project-based learning in media practice education, identifying three themes of interest. The first questions the recontextualisation of practice from the professional to a pedagogic environment. The second theme questions how much we know about what goes on inside a project and contrasts the ways in which students ‘do’ projects with the ways in which educators idealise project work as a mirror of professional practice. The final theme questions whether processes and procedures external to a project environment may result in a decoupling between professional practice and the everyday formulations of practice enacted by students. While educators may seek to encourage students to simultaneously adopt academic, professional and creative identities, as part of an active and purposeful approach to doing projects, this article questions whether tensions between these identities may actually encourage students to engage in decoupling behaviour. The article aims to encourage media practice educators to reflect on their own use of projects and question the ways in which the identities students claim as learners align with educators’ beliefs and values.
Advances and utility of diagnostic ultrasound in musculoskeletal medicine
Musculoskeletal ultrasound (US) can serve as an excellent imaging modality for the musculoskeletal clinician. Although MRI is more commonly ordered in the United States for musculoskeletal problems, both of these imaging modalities have advantages and disadvantages and can be viewed as complementary rather than adversarial. For diagnostic US, relatively recent advances in technology have improved ultrasound’s ability to diagnose a myriad of musculoskeletal problems with enhanced resolution. The structures most commonly imaged with diagnostic musculoskeletal US include tendon, muscle, nerve, joint, and some osseous pathology. This brief review article will discuss the role of US in imaging various common musculoskeletal disorders and will highlight, where appropriate, how recent technological advances have improved this imaging modality in musculoskeletal medicine. Additionally, clinicians practicing musculoskeletal medicine should be aware of the capabilities as well as the limitations of this unique imaging modality and become familiar with conditions where US may be more advantageous than MRI.