678 research outputs found

    Real-Time Identification of Parallel Texts from Bilingual Newsfeed

    Parallel texts are documents that present parallel translations. This paper describes a simple method that can be deployed on a real-time news feed to create a continually growing source of parallel texts in French and English. Our experiment was conducted on the Canada Newswire news feed. Given some of its intrinsic properties, it was possible to deploy relatively simple text-matching techniques that rely on language-independent cognates such as numbers, capitalized words, punctuation and newline characters. On three weeks of press releases, our system correctly identified the vast majority of parallel press releases. It committed only minor errors on repeated news items.
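
The cognate-based matching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenization rule, the function names `cognate_signature` and `cognate_similarity`, and the use of `difflib.SequenceMatcher` as the comparison measure are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumed tokenization: runs of word characters, single punctuation marks, newlines.
TOKEN_RE = re.compile(r"\w+|[^\w\s]|\n")


def cognate_signature(text):
    """Keep only tokens likely to survive translation unchanged:
    numbers, capitalized words, punctuation and newline characters."""
    kept = []
    for tok in TOKEN_RE.findall(text):
        first = tok[0]
        if first.isdigit() or first.isupper() or not first.isalnum():
            kept.append(tok)
    return kept


def cognate_similarity(text_a, text_b):
    """Similarity in [0, 1] between the cognate signatures of two texts."""
    matcher = SequenceMatcher(
        None, cognate_signature(text_a), cognate_signature(text_b), autojunk=False
    )
    return matcher.ratio()


# Hypothetical press-release snippets, invented for this example.
english = "Acme Corp. reported revenue of 120 million in 2004.\nToronto, March 3"
french = "Acme Corp. a déclaré des revenus de 120 millions en 2004.\nToronto, le 3 mars"
unrelated = "Weather will be sunny tomorrow."

print(cognate_similarity(english, french) > cognate_similarity(english, unrelated))
```

A real-time deployment would presumably compare each incoming release only against recent releases in the other language and accept the best match above some threshold; that threshold would have to be tuned on real data.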

    A Supervised Learning Approach to Acronym Identification

    This paper addresses the task of finding acronym-definition pairs in text. Most previous work on the topic concerns systems that rely on manually generated rules or regular expressions. In this paper, we present a supervised learning approach to the acronym identification task. Our approach reduces the search space of the supervised learning system by putting some weak constraints on the kinds of acronym-definition pairs that can be identified. We obtain results comparable to hand-crafted systems that use stronger constraints. We describe our method for reducing the search space, the features used by our supervised learning system, and our experiments with various learning schemes.
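
The idea of shrinking the search space with weak constraints before any learning takes place can be sketched as follows. The abstract does not state the actual constraints, so the ones used here (a parenthesized acronym, a bounded window of preceding words, a matching first letter) are assumptions, and `candidate_pairs` is a hypothetical helper.

```python
import re

# Assumed pattern: an acronym is a short parenthesized token starting with a capital.
ACRONYM_RE = re.compile(r"\(([A-Z][A-Za-z0-9]{1,9})\)")


def candidate_pairs(sentence, max_words_per_letter=2):
    """Generate (acronym, definition) candidates under weak, assumed constraints.

    A supervised classifier (not shown) would then accept or reject each candidate.
    """
    pairs = []
    for match in ACRONYM_RE.finditer(sentence):
        acronym = match.group(1)
        words = sentence[:match.start()].split()
        # Constraint 1: the definition lies in a bounded window before the acronym.
        window = words[-max_words_per_letter * len(acronym):]
        for start in range(len(window)):
            definition = window[start:]
            # Constraint 2: the definition starts with the acronym's first letter.
            if definition[0][0].lower() == acronym[0].lower():
                pairs.append((acronym, " ".join(definition)))
    return pairs


print(candidate_pairs("We evaluate named entity recognition (NER) on news text."))
```

Constraints this loose deliberately over-generate; the point of the approach described above is that the classifier, rather than hand-written rules, decides which candidates are genuine.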

    BIKE: Bilingual Keyphrase Experiments

    This paper presents a novel strategy for translating lists of keyphrases. Typical keyphrase lists appear in scientific articles, information retrieval systems and web page meta-data. Our system combines a statistical translation model trained on a bilingual corpus of scientific papers with sense-focused look-up in a large bilingual terminological resource. For the latter, we developed a novel technique that benefits from viewing the keyphrase list as contextual help for sense disambiguation. The optimal combination of modules was discovered by a genetic algorithm. Our work applies to the French/English language pair.

    Model and simulation of a solar kiln with energy storage

    A solar kiln with energy storage can be used for continuous drying. This kiln consisted of several units, which were modeled in order to simulate it in operation. A model was proposed for each unit, along with a further model, based on laboratory tests, for drying a wooden board by passing air across it. These models were combined to produce a global model. Simulation results were then analyzed and showed that the use of storage was justified to reduce drying time. Moreover, with the judicious use of storage and air renewal, drying schedules could be produced that yield better-quality dried wood.

    Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity

    In this paper, we propose a named-entity recognition (NER) system that addresses two major limitations frequently discussed in the field. First, the system requires no human intervention such as manually labeling training data or creating gazetteers. Second, the system can handle more than the three classical named-entity types (person, location, and organization). We describe the system’s architecture and compare its performance with a supervised system. We experimentally evaluate the system on a standard corpus, with the three classical named-entity types, and also on a new corpus, with a new named-entity type (car brands).

    Solar timber kilns: State of the art and foreseeable developments

    Analysis of the evolution of solar-heated drying kilns in recent decades shows that there has been a series of modifications to optimize their thermal and drying efficiency. Using an analysis method based on product design, we report on existing solar timber kilns. The dryers and their component units are studied and developments are noted, with a focus on changing trends in technological systems. As a result of this analysis, we suggest some future adaptations.

    An oriented-design simplified model for the efficiency of a flat plate solar air collector

    In systems design, suitably adapted physical models are required. Different modelling approaches for a solar air collector were studied in this paper. First, a classical model was produced, based on a linearization of the conservation of energy equations. It was solved using traditional matrix methods. In order to improve the possibilities for use in design, the behaviour of the collector was next expressed in terms of efficiency. Lastly, simplified models constructed from the results obtained with the classical linearized model, and explicitly including the design variables of the collector, were proposed. These reduced models were then evaluated in terms of Parsimony, Exactness, Precision and Specialisation (PEPS). It was concluded that one of them (D2), using a small number of variables and equations, is well suited to the design of solar air collectors coupled with other sub-systems in more complex devices such as a solar kiln with energy storage.

    Création de surcouche de documents hypertextes et traitement du langage naturel

    This article presents an extension to algorithms for creating wrappers for hypertext documents. The aim is to diversify the granularity of the information that can be captured by using natural language processing techniques. A web page wrapper is a view over the HTML nodes that contain a given, desired piece of information. For example, in a newspaper headline, a wrapper can tag the author's name, the date, or even every reference to a place or to a given company. We extended a wrapper-creation algorithm so that it goes beyond the boundary of HTML nodes and extracts information from the textual content found within them. We apply this technique to the automatic creation of lexicons (word lists).

    Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision

    Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper names, biological species, and temporal expressions. There has been growing interest in this field of research since the early 1990s. In this thesis, we document a trend moving away from handcrafted rules, and towards machine learning approaches. Still, recent machine learning approaches have a problem with annotated data availability, which is a serious shortcoming in building and maintaining large-scale NER systems. In this thesis, we present an NER system built with very little supervision. Human supervision is indeed limited to listing a few examples of each named entity (NE) type. First, we introduce a proof-of-concept semi-supervised system that can recognize four NE types. Then, we expand its capacities by improving key technologies, and we apply the system to an entire hierarchy comprised of 100 NE types. Our work makes the following contributions: the creation of a proof-of-concept semi-supervised NER system; the demonstration of an innovative noise filtering technique for generating NE lists; the validation of a strategy for learning disambiguation rules using automatically identified, unambiguous NEs; and finally, the development of an acronym detection algorithm, thus solving a rare but very difficult problem in alias resolution. We believe semi-supervised learning techniques are about to break new ground in the machine learning community. In this thesis, we show that limited supervision can build complete NER systems. On standard evaluation corpora, we report performances that compare to baseline supervised systems in the task of annotating NEs in texts.

    Le choix du lieu de résidence des jeunes familles : analyse multicritère appliquée au cas de la Ville de Sherbrooke

    Housing choice is usually influenced by a person's life cycle. This choice is based on criteria that vary with the different social and economic stages that occur in one's life. One of these stages, shared by most of the population, is starting a family. The goal of this study is to identify the areas in the city of Sherbrooke that present the housing characteristics desired by young families. This is achieved by performing a multicriteria analysis based on the residential choice criteria of young families found in a literature review. The results of the analysis show that nearly half of the territory of the city of Sherbrooke (44.8%) appears to be moderately favorable or better for young families. These areas are located in the urban or periurban sectors of the town. The least favorable areas (9.5%) are generally located close to expressways or railroads. A total of seven favorable areas for young families were identified. A common element of five of those areas is that they were located on the outskirts of the urban core at the time of their construction. It also appears that, generally speaking, recent developments contain more families with children at home than older ones. This seems to suggest that the older housing stock that is also favorable for young families is still occupied by families whose children have left home.
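
A multicriteria analysis of this kind is often implemented as a weighted combination of per-criterion scores, and the sketch below illustrates that general technique only. The abstract does not say which criteria, weights, or classification threshold were used, so every name and value in the example is hypothetical.

```python
def suitability(cell, weights):
    """Weighted mean of criterion scores (each assumed to lie in [0, 1])."""
    total_weight = sum(weights.values())
    return sum(weights[name] * cell[name] for name in weights) / total_weight


def favourable_share(cells, weights, threshold=0.5):
    """Fraction of cells whose suitability reaches the (assumed) threshold."""
    favourable = [c for c in cells if suitability(c, weights) >= threshold]
    return len(favourable) / len(cells)


# Hypothetical criteria and weights, invented for illustration.
weights = {"schools": 2, "parks": 1, "quiet": 1}
cells = [
    {"schools": 1.0, "parks": 0.5, "quiet": 0.5},  # e.g. a periurban cell
    {"schools": 0.0, "parks": 0.5, "quiet": 0.5},  # e.g. a cell beside an expressway
]

print([suitability(c, weights) for c in cells])
print(favourable_share(cells, weights))
```

In a geographic information system the same computation would run over raster cells or parcels, and the share of favourable cells plays the role of the percentages reported above.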