35 research outputs found

A new method for learning Phrase Based Machine Translation with Multivariate Mutual Information

Author: Latiri Chiraz
Nasri Cyrine
Slimani Yahya
Smaïli Kamel
Publication venue: HAL CCSD
Publication date: 20/09/2012

Current statistical machine translation systems usually build an initial word-to-word alignment before learning phrase translation pairs. This operation requires many matchings between different single words of the two languages considered. We propose a new approach for phrase-based machine translation that does not need any word alignment; it is based on inter-lingual triggers determined by Multivariate Mutual Information. This algorithm segments sentences into phrases and finds their alignments simultaneously. The main objective is to directly build valid alignments between source and target phrases. Although the method is young, experiments showed that the results are competitive, but more effort is needed to surpass those of state-of-the-art methods.

INRIA a CCSD electronic archive server

Characterizing Health-Related Information Needs of Domain Experts (regular paper)

Author: Chouquet Cécile
Latiri Chiraz
Tamine Lynda
Znaidi Eya
Publication venue: Springer Science and Business Media LLC
Publication date: 29/05/2013

In the information retrieval literature, understanding the users' intents behind queries is critically important for gaining better insight into how to select relevant results. While many studies have investigated how users in general carry out exploratory health searches in digital environments, few have focused on how queries are formulated, specifically by domain expert users. This study intends to fill this gap by studying 173 health expert queries issued from 3 medical information retrieval tasks within 2 different evaluation campaigns. A statistical analysis has been carried out to study both the variation and the correlation of health-query attributes such as length, clarity and specificity, for both clinical and non-clinical queries. The knowledge gained from the study has an immediate impact on the design of future health information seeking systems.

Scientific Publications of the University of Toulouse II Le Mirail

HAL-INSA Toulouse

Analyse exploratoire des requêtes d'experts médicaux : cas des campagnes d'évaluation TREC et CLEF (regular paper)

Author: Chouquet Cécile
Latiri Chiraz
Tamine Lynda
Znaidi Eya
Publication venue: Université de Lille
Publication date: 01/01/2013

In this paper, we analyze the information needs expressed by medical experts, with the aim of characterizing them and then measuring the impact of their structure on search results. To this end, we conduct an exploratory study based on multidimensional statistical analyses of query collections drawn from standard international evaluation campaigns, namely TREC and CLEF. Our study reveals significant variability in query morphology as well as in needs and performance, which we interpret in light of the objectives and specificities of the associated medical tasks. The results of this study have an impact on the design of medical information retrieval systems.

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL-INSA Toulouse

STATISTICAL MACHINE TRANSLATION IMPROVEMENT BASED ON PHRASE SELECTION

Author: Latiri Chiraz
Nasri Cyrine
Smaïli Kamel
Publication venue: HAL CCSD
Publication date: 05/09/2015

This paper describes the importance of introducing a phrase-based language model into the machine translation process. Current SMT systems are based on phrases for translation, but their language models are based on classical n-grams. In this paper we introduce a phrase-based language model (PBLM) in the decoding process to try to match the phrases of a translation table with those predicted by a language model. Furthermore, we propose a new way to retrieve phrases and their corresponding translations by using the principle of conditional mutual information. The resulting SMT system is compared with the baseline in terms of BLEU, TER and METEOR. The experimental results show that introducing the PBLM into translation decoding improves the results.

INRIA a CCSD electronic archive server

Training phrase-based SMT without explicit word alignment

Author: Latiri Chiraz
Nasri Cyrine
Smaïli Kamel
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 12/04/2014

Machine translation systems usually build an initial word-to-word alignment before training the phrase translation pairs. This approach requires a lot of matching between different single words of both considered languages. In this paper, we propose a new approach for phrase-based machine translation which does not require any word alignment. This method is based on inter-lingual triggers retrieved by Multivariate Mutual Information. This algorithm segments sentences into phrases and finds their alignments simultaneously. The main objective of this work is to build directly valid alignments between source and target phrases. The achieved results, in terms of performance, are satisfactory and the obtained translation table is smaller than the reference one; this approach could be considered as an alternative to the classical methods.

INRIA a CCSD electronic archive server

Training Statistical Machine Translation with Multivariate Mutual Information

Author: Latiri Chiraz
Nasri Cyrine
Smaïli Kamel
Publication venue: HAL CCSD
Publication date: 27/11/2011

In this paper, we describe a new model for phrase-based statistical machine translation. Roughly speaking, the statistical approach uses a language model and a translation model. The latter can be viewed as a lexical model and an alignment model. The approach we propose does not need any alignment; it is based on inter-lingual triggers determined by multivariate mutual information (MMI). This measure depends on conditional mutual information, which means that a source phrase is directly linked to a target one. The conditional mutual information is used in both directions (source-target and target-source). We present an experimental evaluation conducted on the EUROPARL corpora (French and English) using the MOSES decoder. We then compare our results with those of a previous work in which we used inter-lingual triggers determined by simple mutual information (MI), as well as with those given by the baseline model (Koehn et al., 2003).

INRIA a CCSD electronic archive server

Tweet Contextualization Based on Wikipedia and Dbpedia

Author: Berrut Catherine
Latiri Chiraz
Mulhem Philippe
Slimani Yahya
Zingla Meriem Amina
Publication venue: HAL CCSD
Publication date: 09/03/2016

Limited to 140 characters, tweets are short and are not written with formal grammar or proper spelling. These spelling variations increase the likelihood of vocabulary mismatch and make tweets difficult to understand without context. This paper addresses the tweet contextualization task, which aims to automatically provide a summary that explains a given tweet, allowing a reader to understand it. We propose different tweet expansion approaches based on Wikipedia and DBpedia as external knowledge sources. The proposed approaches are divided into two steps. The first step consists of generating the candidate terms for a given tweet, while the second consists of ranking and selecting these candidate terms using a similarity measure. The effectiveness of our methods is demonstrated through an experimental study conducted on the INEX 2014 collection.

Hal - Université Grenoble Alpes

Phrase-based Machine Translation based on Text Mining and Statistical Language Modeling Techniques

Author: Langlois David
Latiri Chiraz
Lavecchia Caroline
Nasri Cyrine
Smaïli Kamel
Publication venue: Alexander Gelbukh
Publication date: 01/01/2011

In this paper, we introduce two new methods dedicated to phrase-based machine translation. Both are based on mining a parallel corpus in order to find the pairs of linguistic units that are translations of each other. The presented methods do not rely on any alignment, in contrast to what is usually done by the statistical machine translation community. Each of them proposes a complete translation table containing translations of single words and phrases. The first method is inspired by the well-known trigger language model, while the second is inspired by the association rules mining technique. All experiments are conducted on a large part of the EUROPARL corpus and highlight the utility of both proposed approaches.

INRIA a CCSD electronic archive server

Extraction de Connaissances à partir de Textes : Méthodes et Applications

Author: Latiri Chiraz
Publication venue: HAL CCSD
Publication date: 24/06/2013

Not available

Thèses en Ligne