14 research outputs found

    Computer Aided Verification

    This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decision procedures, concurrency, and CPS, hardware, and industrial applications.

    Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

    Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard word2vec when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embeddings.

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernova.

    Decoding Legalese Without Borders: Multilingual Evaluation of Language Models on Long Legal Texts

    Pretrained transformers have sparked an explosion of research in the field of Natural Language Processing (NLP). Scaling up language models based on the transformer architecture in terms of size, compute, and data has led to impressive emergent capabilities that, a mere three years earlier, before the launch of GPT-3, were considered unattainable in such a brief span. These advances catapulted the previously niche field of legal NLP into the mainstream, at the latest with GPT-4 passing the bar exam. Many products based on GPT-4 and other large language models are entering the market at an increasing pace, many of them targeting the legal field. This dissertation makes contributions in two key areas of NLP focused on legal text: resource curation and detailed model analysis. First, we curate an extensive set of multilingual legal datasets, train a variety of language models on them, and establish comprehensive benchmarks for evaluating Large Language Models (LLMs) in the legal domain. Second, we conduct a multidimensional analysis of model performance, focusing on metrics such as explainability and calibration in the context of Legal Judgment Prediction. We introduce novel evaluation frameworks and find that while our trained models exhibit high performance and better calibration than human experts, they do not necessarily offer improved explainability. Furthermore, we investigate the feasibility of re-identification in anonymized legal texts, concluding that large-scale re-identification using LLMs is currently unfeasible. For future work, we propose exploring domain adaptation and instruction tuning to enhance language model performance on legal benchmarks, while also advocating for a detailed examination of dataset overlaps and model interpretability. Additionally, we emphasize the need to extend datasets to unexplored legal tasks and underrepresented jurisdictions, aiming for more comprehensive coverage of the global legal landscape in NLP resources.

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018: 10-12 December 2018, Torino

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.