47 research outputs found

Topian 0.1 Reference Manual

Author: Chappelier Jean-Cédric
Eckard Emmanuel
Publication date: 22/01/2008

This document describes Topian ("Topic-based Model layer for Xapian"), a software layer intended to add support for topical models to Xapian.

Infoscience - École polytechnique fédérale de Lausanne

Free Software for research in Information Retrieval and Textual Clustering

Author: Chappelier Jean-Cédric
Eckard Emmanuel
Publication date: 22/01/2008

The document provides an overview of the main Free ("Open Source") software of interest for research in Information Retrieval, as well as some background on the context. It provides guidelines for choosing appropriate tools.

Infoscience - École polytechnique fédérale de Lausanne

Inclusion de sens dans la représentation de documents textuels : état de l'art (Including meaning in the representation of text documents: state of the art)

Author: Chappelier Jean-Cédric
Eckard Emmanuel
Publication date: 22/01/2008

This document gives an overview of the state of the art in the representation of meaning in text documents.

Infoscience - École polytechnique fédérale de Lausanne

Large-scale extraction of brain connectivity from the neuroscientific literature

Author: Chappelier Jean-Cédric
Hill Sean
Richardet Renaud
Telefont Martin
Publication date: 02/08/2017

Motivation: In neuroscience, as in many other scientific domains, the primary form of knowledge dissemination is through published articles. One challenge for modern neuroinformatics is finding methods to make the knowledge from the tremendous backlog of publications accessible for search, analysis and the integration of such data into computational models. A key example of this is metascale brain connectivity, where results are not reported in a normalized repository. Instead, these experimental results are published in natural language, scattered among individual scientific publications. This lack of normalization and centralization hinders the large-scale integration of brain connectivity results. In this article, we present text-mining models to extract and aggregate brain connectivity results from 13.2 million PubMed abstracts and 630 216 full-text publications related to neuroscience. The brain regions are identified with three different named entity recognizers (NERs) and then normalized against two atlases: the Allen Brain Atlas (ABA) and the atlas from the Brain Architecture Management System (BAMS). We then use three different extractors to assess inter-region connectivity. Results: NERs and connectivity extractors are evaluated against a manually annotated corpus. The complete in litero extraction models are also evaluated against in vivo connectivity data from ABA with an estimated precision of 78%. The resulting database contains over 4 million brain region mentions and over 100 000 (ABA) and 122 000 (BAMS) potential brain region connections. This database drastically accelerates connectivity literature review by providing a centralized repository of connectivity data to neuroscientists. Availability and implementation: The resulting models are publicly available at github.com/BlueBrain/bluima. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

RERO DOC Digital Library

Tool for robust stochastic parsing using optimal maximum coverage

Author: Chappelier Jean-Cédric
Kadlec Vladimir
Rajman Martin
Publication venue: Swiss Federal Institute of Technology (EPFL)
Publication date: 13/07/2005

This report presents a robust syntactic parser that is able to return a "correct" derivation tree even if the grammar cannot generate the input sentence. The following two-step solution is proposed: the finest corresponding most probable optimal maximum coverage is generated first, then the trees from this coverage are glued into one resulting tree. We discuss the implementation of this method with the SLP toolkit and the libkp library.

Infoscience - École polytechnique fédérale de Lausanne

Large-scale extraction of brain connectivity from the neuroscientific literature

Author: Chappelier Jean-Cédric
Hill Sean
Richardet Renaud
Telefont Martin
Publication venue: 'Oxford University Press (OUP)'
Publication date: 20/01/2015

In neuroscience, as in many other scientific domains, the primary form of knowledge dissemination is through published articles. One challenge for modern neuroinformatics is finding methods to make the knowledge from the tremendous backlog of publications accessible for search, analysis and the integration of such data into computational models. A key example of this is metascale brain connectivity, where results are not reported in a normalized repository. Instead, these experimental results are published in natural language, scattered among individual scientific publications. This lack of normalization and centralization hinders the large-scale integration of brain connectivity results. In this article, we present text-mining models to extract and aggregate brain connectivity results from 13.2 million PubMed abstracts and 630 216 full-text publications related to neuroscience. The brain regions are identified with three different named entity recognizers (NERs) and then normalized against two atlases: the Allen Brain Atlas (ABA) and the atlas from the Brain Architecture Management System (BAMS). We then use three different extractors to assess inter-region connectivity.

Infoscience - École polytechnique fédérale de Lausanne

PubMed Central

INtegrating SPEech acoustic and linguistic Constraints: Baseline System Development

Author: Bernardis Giulia
Bourlard Hervé
Chappelier Jean-Cédric
Rajman Martin
Publication venue: IDIAP
Publication date: 10/03/2006

In this report, we discuss the initial issues addressed in a research project aiming at the development of an advanced natural speech recognition system for the automatic processing of telephone directory requests. This multi-faceted project involves (1) text processing (labeling and tagging) of a large database of telephone-based natural voice requests (including all kinds of peculiarities), (2) development of robust acoustic models, (3) integrating advanced natural language (syntactic and semantic) constraints, (4) detecting and dealing with a large number of out-of-vocabulary words (proper names), and (5) testing of the resulting system on natural queries. All this work will be performed on the basis of a database containing prompted (read) speech and (simulated) natural requests to an information service. This report describes the initial steps that were required to set up a reasonable baseline system and a good research and evaluation framework. More specifically, a significant amount of time was devoted to proper text processing of speaker request transcriptions, in order to create the basis necessary for the lexical and linguistic modeling, as well as for the evaluation of recognition results.

Infoscience - École polytechnique fédérale de Lausanne

Finding instabilities in the community structure of complex networks

Author: A. Capocci
David Gfeller
Jean-Cédric Chappelier
P. Erdös
Paolo De Los Rios
V. Prigent
W. W. Zachary
Publication venue: 'American Physical Society (APS)'
Publication date: 24/03/2005

The problem of finding clusters in complex networks has been extensively studied by mathematicians, computer scientists and, more recently, by physicists. Many of the existing algorithms partition a network into clear clusters, without overlap. We here introduce a method to identify the nodes lying "between clusters" and that allows for a general measure of the stability of the clusters. This is done by adding noise over the weights of the edges of the network. Our method can in principle be applied with any clustering algorithm, provided that it works on weighted networks. We present several applications on real-world networks using the Markov Clustering Algorithm (MCL). Comment: 4 pages, 5 figures

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Modèles génératifs à base de thèmes pour l'accès à l'information textuelle (Topic-based generative models for access to textual information)

Author: Chappelier Jean-Cédric
Publication venue: Paris, Hermès Science Publications, Lavoisier
Publication date: 10/01/2012

Infoscience - École polytechnique fédérale de Lausanne

Offline grammar-based recognition of handwritten sentences

Author: Bunke Horst
Chappelier Jean-Cédric
Zimmermann Matthias
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

This paper proposes a sequential coupling of a Hidden Markov Model (HMM) recognizer for offline handwritten English sentences with a probabilistic bottom-up chart parser using Stochastic Context-Free Grammars (SCFG) extracted from a text corpus. Based on extensive experiments, we conclude that syntax analysis helps to improve recognition rates significantly.

Infoscience - École polytechnique fédérale de Lausanne

Bern Open Repository and Information System (BORIS)