25 research outputs found

    Using Risk-Tracing Snowball Approach to Increase HIV Case Detection Among High-Risk Populations in Cambodia: An Intervention Study

    Get PDF
    Background: Early HIV diagnosis and initiation of antiretroviral therapy may prevent the ongoing spread of HIV. The risk-tracing snowball approach (RTSA) has been shown to be effective in detecting new HIV cases in other settings. The main objective of this study was to evaluate the effectiveness of RTSA in increasing the rate of newly identified HIV cases among high-risk populations. Our second objective was to evaluate the effectiveness of RTSA, compared to the walk-in group, in increasing the number of HIV tests and in detecting cases earlier.
    Methods: This study was conducted from April 1 to September 30, 2016 at two NGO clinics in Phnom Penh, Cambodia. A respondent-driven sampling method was adapted to develop RTSA to reach high-risk populations, including key populations and members of the general population with social connections to key populations. Bivariate and multivariate logistic regression analyses were conducted.
    Results: During the implementation period, 721 clients walked in for HIV testing (walk-in group), and all were invited to be seeds. Of the invited clients, 36.6% agreed to serve as seeds. Throughout the implementation, 6195 coupons were distributed to seeds or recruiters, resulting in 1572 clients visiting the two clinics with coupons (RTSA group), for a coupon return rate of 25.3%. The rate of newly identified HIV cases in the RTSA group was significantly lower than in the walk-in group. However, the highest number of newly identified HIV cases was found during the implementation period, compared to both the pre- and post-implementation periods. Although not statistically significant, the mean CD4 count of newly identified HIV cases detected through RTSA was almost 200 cells/mm3 higher than in the walk-in group.
    Conclusions: Although the rate of newly identified HIV cases in the RTSA group was lower than in the walk-in group, adding RTSA to the traditional walk-in method boosted new HIV case detection in the two participating clinics. The higher mean CD4 count in the RTSA group suggests that RTSA may detect HIV cases earlier than the traditional walk-in approach. Further research is needed to determine whether RTSA is a cost-effective intervention to prevent the ongoing spread of HIV among high-risk populations in Cambodia.
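    As a quick sanity check, the headline figures reported in this abstract can be recomputed from its raw counts. The short script below is purely illustrative: the counts come from the text, but the variable names are ours.

    ```python
    # Illustrative only: recomputing the abstract's headline figures from
    # its raw counts. All numbers come from the text; variable names are ours.

    walk_in_clients = 721       # clients who walked in for HIV testing
    coupons_distributed = 6195  # coupons given to seeds or recruiters
    rtsa_clients = 1572         # clients who returned to a clinic with a coupon

    # Walk-in clients who agreed to serve as seeds (reported: 36.6%)
    seeds = round(walk_in_clients * 0.366)

    # Coupon return rate (reported as 25.3% in the abstract)
    coupon_return_rate = rtsa_clients / coupons_distributed

    print(f"seeds recruited:    ~{seeds}")
    print(f"coupon return rate: {coupon_return_rate:.2%}")
    ```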

    Vers une modélisation statistique multi-niveau du langage, application aux langues peu dotées (Towards multi-level statistical language modeling, with application to under-resourced languages)

    No full text
    This PhD thesis focuses on the problems encountered when developing automatic speech recognition for under-resourced languages whose writing systems have no explicit separation between words. The specificity of the languages covered in our work requires automatic segmentation of the text corpus into words in order to make n-gram language modeling applicable. While the lack of text data affects the performance of language models, the errors introduced by automatic segmentation can make these data even less usable. To address these problems, our research focuses primarily on language modeling, and in particular on the choice of the lexical and sub-lexical units used by the recognition systems. We experiment with the use of multiple units both in the language models and in the outputs of the recognition systems. We validate these multiple-unit modeling approaches in recognition systems for a group of under-resourced languages: Khmer, Vietnamese, Thai, and Laotian.
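    The preprocessing step the thesis starts from, segmenting text that has no word delimiters so that n-gram modeling becomes applicable, can be sketched with a toy greedy longest-match segmenter. The lexicon and input string below are invented for illustration; real systems use large lexicons and statistical segmenters.

    ```python
    from collections import Counter

    # Toy sketch: greedy longest-match segmentation of a string without
    # word delimiters, followed by bigram counting. Lexicon and input
    # are invented for illustration.

    LEXICON = {"i", "walk", "in", "the", "them", "theme", "me", "park"}
    MAX_WORD_LEN = max(len(w) for w in LEXICON)

    def segment(text: str) -> list[str]:
        """Split text greedily, always taking the longest lexicon match."""
        words, i = [], 0
        while i < len(text):
            for j in range(min(len(text), i + MAX_WORD_LEN), i, -1):
                if text[i:j] in LEXICON:
                    words.append(text[i:j])
                    i = j
                    break
            else:  # no lexicon word starts here: emit a single character
                words.append(text[i])
                i += 1
        return words

    tokens = segment("iwalkinthepark")
    print(tokens)  # → ['i', 'walk', 'in', 'the', 'park']
    print(Counter(zip(tokens, tokens[1:])).most_common(2))
    ```

    A greedy segmenter commits to a single segmentation ("themepark" always becomes "theme", "park", never "the", "me", "park"), and each wrong commitment corrupts the n-gram counts; this is the kind of error the thesis mitigates through its choice of lexical and sub-lexical units.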

    Multiple Text Segmentation for Statistical Language Modeling

    No full text
    In this article we address the text segmentation problem in statistical language modeling for under-resourced languages whose writing systems have no word-boundary delimiters. While the lack of text resources has a negative impact on the performance of language models, the errors introduced by automatic word segmentation make those data even less usable. To better exploit the text resources, we propose a method based on weighted finite-state transducers to estimate the N-gram language model from a training corpus in which each sentence is segmented in multiple ways instead of a unique segmentation. The multiple segmentation generates more N-grams from the training corpus and yields N-grams not found under a unique segmentation. We use this approach to train the language models for automatic speech recognition systems for the Khmer and Vietnamese languages, and the multiple segmentations lead to better performance than the unique-segmentation approach.
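    The core idea, counting N-grams over several segmentations rather than one, can be sketched without the WFST machinery. The lexicon below is invented, and the paper's actual method scores segmentations with weighted transducers rather than pooling them uniformly as this toy does.

    ```python
    from collections import Counter

    # Toy sketch (not the paper's WFST implementation): enumerate every
    # lexicon-compatible segmentation of a sentence and pool bigram counts
    # over all of them, so that N-grams absent from any single
    # segmentation still receive counts.

    LEXICON = {"the", "them", "theme", "me", "park"}

    def segmentations(text: str):
        """Yield every way to split text into words from LEXICON."""
        if not text:
            yield []
            return
        for j in range(1, len(text) + 1):
            if text[:j] in LEXICON:
                for rest in segmentations(text[j:]):
                    yield [text[:j], *rest]

    counts = Counter()
    for seg in segmentations("themepark"):
        counts.update(zip(seg, seg[1:]))

    # Both ('theme', 'park') and ('the', 'me') are now counted, although a
    # unique longest-match segmentation would only ever produce the former.
    print(sorted(counts.items()))
    ```

    Here the bigram ('the', 'me') receives a count even though a unique longest-match segmentation of "themepark" yields only "theme park", which illustrates the benefit the article reports: multiple segmentation recovers N-grams that a unique segmentation never produces.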