77 research outputs found

    Daniel@FinTOC-2019 Shared Task : TOC Extraction and Title Detection

    We present different methods for the two tasks of the 2019 FinTOC challenge: Title Detection and Table of Contents (TOC) Extraction. For the Title Detection task, we present different approaches using various features: visual characteristics, punctuation density, and character n-grams. Our best approach achieved an official F-measure score of 94.88%, ranking sixth on this task. For the TOC Extraction task, we present a method combining visual characteristics of the document layout. With this method we ranked first on this task with 42.72%.

    Improving Neural Question Answering with Retrieval and Generation

    Text-based Question Answering (QA) is a subject of interest both for its practical applications, and as a test-bed to measure the key Artificial Intelligence competencies of Natural Language Processing (NLP) and the representation and application of knowledge. QA has progressed a great deal in recent years by adopting neural networks, the construction of large training datasets, and unsupervised pretraining. Despite these successes, QA models require large amounts of hand-annotated data, struggle to apply supplied knowledge effectively, and can be computationally expensive to operate. In this thesis, we employ natural language generation and information retrieval techniques in order to explore and address these three issues. We first approach the task of Reading Comprehension (RC), with the aim of lifting the requirement for in-domain hand-annotated training data. We describe a method for inducing RC capabilities without requiring hand-annotated RC instances, and demonstrate performance on par with early supervised approaches. We then explore multi-lingual RC, and develop a dataset to evaluate methods which enable training RC models in one language, and testing them in another. Second, we explore open-domain QA (ODQA), and consider how to build models which best leverage the knowledge contained in a Wikipedia text corpus. We demonstrate that retrieval-augmentation greatly improves the factual predictions of large pretrained language models in unsupervised settings. We then introduce a class of retrieval-augmented generator model, and demonstrate its strength and flexibility across a range of knowledge-intensive NLP tasks, including ODQA. Lastly, we study the relationship between memorisation and generalisation in ODQA, developing a behavioural framework based on memorisation to contextualise the performance of ODQA models. Based on these insights, we introduce a class of ODQA model based on the concept of representing knowledge as question-answer pairs, and demonstrate how, by using question generation, such models can achieve high accuracy, fast inference, and well-calibrated predictions.

    More than pupils: Italian women in science at the turn of the twentieth century


    Book of Abstracts

    ICES Annual Science Conference, 19 – 23 September 2011, Gdańsk Music and Congress Center, Gdańsk, Poland. IMR contributors: Benjamin Planque, Torild Johansen, Tuula Skarstein, Jon‐Ivar Westgaard, Halvor Knutsen, Kristin Helle, Michael Pennington, Marek Ostrowski, Nils Olav Handegard, Mette Skern‐Mauritzen, Edda Johannesen, Ulf Lindstrøm, Harald Gjøsæter, Ken Drinkwater, Trond Kristiansen, Geir Ottersen, Esben Moland Olse

    Transition metal imido complexes: synthesis and applications to polymerisation catalysis

    This thesis describes studies into Group 5 and Group 6 transition metal imido complexes, with particular emphasis on the development of complexes which can be applied to catalytic processes. Chapter 1 highlights the electronic and structural aspects of the imido and alkylidene ligands. The isolobal analogy between Group 4 bent metallocene, Group 5 half-sandwich imido, and Group 6 bis(imido) metal fragments is outlined. In addition, Ziegler-Natta type α-olefin polymerisation and Ring Opening Metathesis Polymerisation (ROMP) are briefly reviewed. Chapter 2 describes initial screening of half-sandwich vanadium imido and chromium bis((^t)butylimido) dichloride complexes as catalyst precursors. Synthesis of the chromium bis(imido) dialkyl complex Cr(N(^t)Bu)(_2)(CH(_2)Ph)(_2) (1) is described, its conversion to a cationic alkyl species is probed, and the polymerisation activity associated with the resultant compound is addressed. Finally, this chapter details the synthesis and characterisation of a range of bis(adamantylimido) chromium complexes. Chapter 3 presents a synthetic entry point into the bis(arylimido) chemistry of chromium. The complex Cr(NAr)(_2)(NH(^t)Bu)Cl (12) is described (Ar = 2,6-(^i)Pr(_2)C(_6)H(_3)) and its conversion to the dichloride complex Cr(NAr)(_2)Cl(_2) (14) is examined. Complex 14 forms a stable monoadduct with pyridine, the X-ray crystallographic study of which reveals a distorted square-based pyramidal geometry about the chromium atom. The inclusion of the arylimido ligand at the metal centre allows stabilisation of the chromium bis-phosphine complexes Cr(NAr)(_2)(PMe(_3))(_2) (18) and Cr(NAr)(_2)(PMe(_2)Ph)(_2) (19). The reactivity of 18 towards unsaturated hydrocarbon substrates is briefly investigated. Chapter 4 focuses on the organometallic chemistry of the [Cr(NAr)(_2)] moiety. A range of dialkyl derivatives are isolated and the molecular structures of a selection are solved. The generation of the nascent species [Cr(NAr)(_2)(=CHCMe(_3))] is investigated and the conversion of Cr(NAr)(_2)(CH(_2)CMe(_3))(_2) (24) to Cr(NAr)(_2)(CHDCMe(_3))(C(_6)D(_5)) (25) is the subject of a kinetic study. In Chapter 5, the ROMP of a series of amino acid derived norbornene monomers is studied. The resultant polymers are fully characterised and a brief molecular modelling study is carried out on representative polymer chain lengths. Chapter 6 contains experimental details for Chapters 2-5.

    Socio-Life Science and the COVID-19 Outbreak

    This open access book presents the first step towards building socio-life science, a field of science investigating humans in such a way that both social and life-scientific factors are integrated. Because humans are both living and social creatures, a human action can never be understood fully without knowing both the biological traits of a person and the social scientific environments in which that person exists. With this consideration, the editors of this book have initiated a research project promoting a deeper and more integrated understanding of human behavior and human health. This book aims to show what can, and could be, achieved through our interdisciplinary project. One important product is the newly formed three-party collaboration between Institut Pasteur, Kyoto University, and the Research Institute of Economy, Trade and Industry. Covering many different fields, including medicine, epidemiology, anthropology, economics, sociology, demography, geography, and policy, researchers in these institutes, and many others, present their studies on the COVID-19 pandemic. Although based on different methodologies, the studies show the importance of behavioral change and governmental policy in the fight against a huge pandemic. The book explains the unique genome cohort–panel data that the project builds to study social and life scientific aspects of humans.

    Complete Issue of Volume 8


    Unmet goals of tracking: within-track heterogeneity of students' expectations for

    Educational systems are often characterized by some form(s) of ability grouping, like tracking. Although substantial variation in the implementation of these practices exists, the aim is always to improve teaching efficiency by creating homogeneous groups of students in terms of capabilities and performances as well as expected pathways. If students’ expected pathways (university, graduate school, or working) are in line with the goals of tracking, one might presume that these expectations are rather homogeneous within tracks and heterogeneous between tracks. In Flanders (the northern region of Belgium), the educational system consists of four tracks. Many students start out in the most prestigious, academic track. If they fail to gain the necessary credentials, they move to the less esteemed technical and vocational tracks. Therefore, the educational system has been called a 'cascade system'. We presume that this cascade system creates homogeneous expectations in the academic track, but heterogeneous expectations in the technical and vocational tracks. We use data from the International Study of City Youth (ISCY), gathered during the 2013-2014 school year from 2354 tenth-grade pupils across 30 secondary schools in the city of Ghent, Flanders. Preliminary results suggest that the technical and vocational tracks show more heterogeneity in students’ expectations than the academic track. If tracking does not fulfill the desired goals in some tracks, tracking practices should be questioned, as tracking occurs along social and ethnic lines, causing social inequality.

    The Book Structure Extraction Competition with the Resurgence software for part and chapter detection at Caen University

    ISBN: 978-3-642-23576-4. The GREYC Island team participated in the Structure Extraction Competition, part of the INEX Book track, for the second time, with the Resurgence software. We used a minimal strategy primarily based on a top-down document representation with two levels, part and chapter. The main idea is to use a model describing relationships between elements in the document structure. Boundaries between high-level units are detected, first parts and then chapters. The page is also used. The periphery-center relationship is calculated on the entire document and reflected on each page. The strong points of the approach are that it deals with the entire document; it handles books without ToCs, and titles that are not represented in the ToC (e.g., preface); it does not depend on a lexicon, hence it is tolerant of OCR errors and language independent; and it is simple and fast.