
    Joint Dropout: Improving Generalizability in Low-Resource Neural Machine Translation through Phrase Pair Variables

    Despite the tremendous success of Neural Machine Translation (NMT), its performance on low-resource language pairs remains subpar, partly due to a limited ability to handle previously unseen inputs, i.e., to generalize. In this paper, we propose a method called Joint Dropout that addresses this challenge of low-resource neural machine translation by substituting phrases with variables, resulting in a significant enhancement of compositionality, a key aspect of generalization. We observe a substantial improvement in translation quality for language pairs with minimal resources, as measured by BLEU and Direct Assessment scores. Furthermore, we conduct an error analysis and find that Joint Dropout also enhances the generalizability of low-resource NMT in terms of robustness and adaptability across different domains. (Comment: Accepted at MT Summit 2023)
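    The mechanism of jointly replacing an aligned source–target phrase pair with the same variable token on both sides can be illustrated with a minimal sketch. Everything here (the function name, dropout probability, placeholder format, and toy alignment) is an illustrative assumption, not the paper's exact procedure:

```python
import random

def joint_dropout(src_tokens, tgt_tokens, phrase_pairs, p=0.3, seed=0):
    """Jointly substitute aligned phrase pairs with shared variables.

    phrase_pairs: list of ((src_start, src_end), (tgt_start, tgt_end))
    token-span pairs, assumed non-overlapping (e.g. from a word
    aligner). Each selected pair is replaced on BOTH sides by the
    same placeholder, e.g. <X0>, so the model sees variables in
    matching source/target positions.
    """
    rng = random.Random(seed)
    src, tgt = list(src_tokens), list(tgt_tokens)
    var_id = 0
    # Replace right-to-left so earlier span indices stay valid.
    for (ss, se), (ts, te) in sorted(phrase_pairs, reverse=True):
        if rng.random() < p:
            var = f"<X{var_id}>"
            src[ss:se] = [var]
            tgt[ts:te] = [var]
            var_id += 1
    return src, tgt

src = "the small cat sleeps".split()
tgt = "die kleine Katze schläft".split()
pairs = [((0, 3), (0, 3))]  # "the small cat" <-> "die kleine Katze"
print(joint_dropout(src, tgt, pairs, p=1.0))
# -> (['<X0>', 'sleeps'], ['<X0>', 'schläft'])
```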

    Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse

    This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, it critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective that mitigates such weaknesses. It develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as in statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). It argues that pervasive patterns in such features, which emerge through socialisation, are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and hence of groups. In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service, whereby patterns in users’ speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018; 6 November was the date of the midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent reveals regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena.
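    One plausible way to operationalise the clustering step, grouping users by their distributions over function words and grammatical markers rather than topical vocabulary, is sketched below. The feature list, vectoriser, and clustering algorithm are assumptions for illustration, not the thesis's actual statistical procedure:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy stand-in for per-user concatenated posts.
user_docs = {
    "user_a": "i reckon we should y'all know it",
    "user_b": "y'all come back now we reckon so",
    "user_c": "one observes that it is rather so",
    "user_d": "it is rather the case one observes",
}

# Function words and grammatical markers approximate the 'pervasive,
# unconscious' features of interest, unlike topical vocabulary.
FUNCTION_WORDS = ["i", "we", "it", "is", "that", "the", "y'all",
                  "reckon", "rather", "one", "so", "should"]

# Whitespace tokenisation so single-letter and apostrophe forms survive.
vec = TfidfVectorizer(vocabulary=FUNCTION_WORDS, token_pattern=r"[^\s]+")
X = vec.fit_transform(user_docs.values())

# Cluster users on these stylistic profiles; cluster labels can then
# be cross-tabulated against municipality-level sociodemographics.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(user_docs, labels)))
```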

    Big Data and Analytics: Issues and Challenges for the Past and Next Ten Years

    In this paper we continue the minitrack's series of papers recognizing issues and challenges in the field of Big Data and Analytics, both past and going forward. As the field has evolved, it has begun to encompass other analytical regimes, notably AI/ML systems. We focus on two areas: continuing main issues, for which some progress has been made, and new and emerging issues, which we believe form the basis for near-term and future research in Big Data and Analytics. The Bottom Line: Big Data and Analytics is healthy, is growing in scope and evolving in capability, and is finding applicability in more problem domains than ever before.

    Probabilistic coherence, logical consistency, and Bayesian learning: Neural language models as epistemic agents

    It is argued that suitably trained neural language models exhibit key properties of epistemic agency: they hold probabilistically coherent and logically consistent degrees of belief, which they can rationally revise in the face of novel evidence. To this end, we conduct computational experiments with rankers: T5 models [Raffel et al. 2020] that are pretrained on carefully designed synthetic corpora. Moreover, we introduce a procedure for eliciting a model’s degrees of belief, and define numerical metrics that measure the extent to which given degrees of belief violate (probabilistic, logical, and Bayesian) rationality constraints. While pretrained rankers are found to suffer from global inconsistency (in agreement with, e.g., [Jang et al. 2021]), we observe that subsequent self-training on auto-generated texts allows rankers to gradually obtain a probabilistically coherent belief system that is aligned with logical constraints. In addition, such self-training is found to play a pivotal role in rational evidential learning, for it seems to enable rankers to propagate a novel evidence item through their belief systems, successively re-adjusting individual degrees of belief. All this, we conclude, confirms the Rationality Hypothesis, i.e., the claim that suitably trained NLMs may exhibit advanced rational skills. We suggest that this hypothesis has empirical, but also normative and conceptual, ramifications far beyond the practical linguistic problems NLMs were originally designed to solve.
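    As a toy illustration of the kind of rationality metric involved, the sketch below scores how far elicited degrees of belief violate coherence for negation, i.e. the constraint P(A) + P(¬A) = 1. The function and the sample data are assumed simplifications, not the paper's elicitation procedure or exact metrics:

```python
def negation_coherence_violation(beliefs):
    """Mean |P(A) + P(not A) - 1| over statement pairs.

    beliefs: dict mapping a statement id to a pair
        (prob_assigned_to_A, prob_assigned_to_not_A),
    e.g. probabilities elicited from a language model.
    0.0 means perfectly coherent on negation; larger is worse.
    """
    gaps = [abs(p_a + p_not_a - 1.0)
            for p_a, p_not_a in beliefs.values()]
    return sum(gaps) / len(gaps)

elicited = {
    "s1": (0.8, 0.3),   # incoherent: probabilities sum to 1.1
    "s2": (0.6, 0.4),   # coherent: sums to 1.0
}
print(negation_coherence_violation(elicited))  # 0.05
```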

    Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing

    Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data. Previous work has primarily considered silver-standard data augmentation or zero-shot methods; exploiting few-shot gold data, however, is comparatively unexplored. We propose a new approach to cross-lingual semantic parsing that explicitly minimizes cross-lingual divergence between probabilistic latent variables using Optimal Transport. We demonstrate how this direct guidance improves parsing from natural languages using fewer examples and less training. We evaluate our method on two datasets, MTOP and MultiATIS++SQL, establishing state-of-the-art results under a few-shot cross-lingual regime. Ablation studies further reveal that our method improves performance even without parallel input translations. In addition, we show that our model better captures cross-lingual structure in the latent space, improving semantic representation similarity. (Comment: Accepted to TACL 2023. Pre-MIT Press publication. 17 pages, 3 figures, 6 tables)
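    A minimal sketch of the kind of alignment objective involved is an entropic optimal-transport cost between two batches of latent vectors, computed with log-domain Sinkhorn iterations. This is a generic illustration under assumed uniform marginals and a squared-Euclidean cost, not the paper's model:

```python
import math
import torch

def sinkhorn_loss(x, y, eps=0.1, n_iters=50):
    """Entropic optimal-transport cost between two point clouds.

    x: (n, d) latents from the source language, y: (m, d) from the
    target; uniform marginals are assumed. Returns <T, C>, the
    transport-weighted total cost, which can be minimised to pull
    the two latent distributions together.
    """
    cost = torch.cdist(x, y, p=2) ** 2           # (n, m) pairwise costs
    n, m = cost.shape
    log_a = torch.full((n,), -math.log(n))       # uniform source marginal (log)
    log_b = torch.full((m,), -math.log(m))       # uniform target marginal (log)
    f = torch.zeros(n)
    g = torch.zeros(m)
    for _ in range(n_iters):                     # Sinkhorn updates in log space
        f = -eps * torch.logsumexp((g - cost) / eps + log_b, dim=1)
        g = -eps * torch.logsumexp((f - cost.T) / eps + log_a, dim=1)
    plan = torch.exp((f[:, None] + g[None, :] - cost) / eps
                     + log_a[:, None] + log_b[None, :])
    return (plan * cost).sum()

src_latents = torch.randn(8, 16)
tgt_latents = torch.randn(8, 16)
print(sinkhorn_loss(src_latents, tgt_latents))
```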

    Development of a mixed reality application to perform feasibility studies on new robotic use cases

    Master's dissertation in Industrial Engineering and Management. Manufacturing companies are trying to affirm their position in the market by introducing new concepts and processes into their production systems. For this purpose, new technologies must be employed to ensure better performance and quality of their processes. Robotics has evolved considerably in recent years, creating new hardware and software technologies to answer the increasing demands of the markets. Collaborative robots are seen as one of the emerging and most promising technologies for meeting Industry 4.0 needs. However, the expertise needed to implement these robots is often lacking in the small and medium-sized enterprises that represent a large share of existing manufacturing companies. At the same time, mixed reality offers a new and immersive way to test new processes without physically deploying them. To tackle this problem, a mixed reality application is developed end to end, aiming to facilitate research and feasibility studies of new robotic use cases in the pre-study phase of implementation. This application serves as a proof of concept and is not developed for the end user. First, the application's requirements are set to answer manufacturing companies’ needs, providing two test robots, an intuitive robot placement method, a trajectory modeling and parameterization system, and a results framework. The development of the application’s functionalities is then explained, addressing the previously established requirements. A collision detection system was designed and developed to detect self-collisions and collisions with the environment. Furthermore, a novel process to configure the robot based on imitation learning was developed. In the end, a painting tool was integrated into the robot's 3D model and used for a use-case study of a painting task. The results were registered, and the application was assessed against the non-functional requirements. Finally, a qualitative analysis was made to evaluate the areas where this new concept can help manufacturing companies improve the implementation success of new robotic applications.
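    As an illustration of the kind of check such a collision detection system performs, the sketch below approximates robot links and obstacles as bounding spheres, testing non-adjacent link pairs against each other and every link against the environment. The sphere decomposition, radii, and API are assumptions for illustration; the application's actual engine-level implementation is not reproduced here:

```python
from dataclasses import dataclass
import math

@dataclass
class Sphere:
    """Bounding sphere approximating one robot link or obstacle."""
    x: float
    y: float
    z: float
    r: float

def collides(a: Sphere, b: Sphere) -> bool:
    """Two spheres collide when centre distance < sum of radii."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z)) < a.r + b.r

def check_collisions(links, obstacles):
    """Report self-collisions (skipping adjacent links, which always
    touch at the joint) and environment collisions."""
    hits = []
    for i in range(len(links)):
        for j in range(i + 2, len(links)):       # non-adjacent pairs only
            if collides(links[i], links[j]):
                hits.append(("self", i, j))
        for k, obs in enumerate(obstacles):
            if collides(links[i], obs):
                hits.append(("environment", i, k))
    return hits

links = [Sphere(0, 0, 0.2, 0.15), Sphere(0, 0, 0.5, 0.15),
         Sphere(0, 0.05, 0.25, 0.15)]
walls = [Sphere(0, 0.4, 0.3, 0.3)]
print(check_collisions(links, walls))
```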

    Knowledge extraction from unstructured data

    Data availability is becoming more essential, considering the current growth of web-based data. The data available on the web are represented as unstructured, semi-structured, or structured data. In order to make web-based data available for Natural Language Processing or Data Mining tasks, the data needs to be presented as machine-readable data in a structured format. Thus, techniques are needed for addressing the problem of capturing knowledge from unstructured data sources. Research communities address this problem with knowledge extraction methods: methods that are able to capture the knowledge in a natural language text and map the extracted knowledge to existing knowledge presented in knowledge graphs (KGs). These knowledge extraction methods include Named-Entity Recognition, Named-Entity Disambiguation, Relation Recognition, and Relation Linking. This thesis addresses the problem of extracting knowledge from unstructured data and discovering patterns in the extracted knowledge. We devise a rule-based approach for entity and relation recognition and linking. The defined approach effectively maps entities and relations within a text to their resources in a target KG. Additionally, it overcomes the challenges of recognizing and linking entities and relations to a specific KG by employing devised catalogs of linguistic and domain-specific rules, which state the criteria for recognizing entities in a sentence of a particular language, and a deductive database that encodes knowledge in community-maintained KGs. Moreover, we define a neuro-symbolic approach for knowledge extraction tasks in encyclopedic and domain-specific settings; it combines symbolic and sub-symbolic components to overcome the challenges of entity recognition and linking and the limited availability of training data while maintaining the accuracy of recognizing and linking entities. Additionally, we present a context-aware framework for unveiling semantically related posts in a corpus; it is a knowledge-driven framework that retrieves associated posts effectively. We cast the problem of unveiling semantically related posts in a corpus as the Vertex Coloring Problem. We evaluate the performance of our techniques on several benchmarks related to various domains for knowledge extraction tasks. Furthermore, we apply these methods in real-world scenarios from national and international projects. The outcomes show that our techniques are able to effectively extract knowledge encoded in unstructured data and discover patterns over the extracted knowledge presented as machine-readable data. More importantly, the evaluation results provide evidence of the effectiveness of combining the reasoning capacity of symbolic frameworks with the pattern recognition and classification power of sub-symbolic models.
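    One plausible encoding of the vertex coloring formulation mentioned above: make each post a vertex, add an edge between posts whose similarity falls below a threshold, and color the graph greedily, so that every color class contains only pairwise-related posts. The similarity values, threshold, and greedy strategy below are illustrative assumptions, not the thesis's exact construction:

```python
import itertools

def greedy_coloring(n, edges):
    """Greedy vertex coloring: visit vertices in order; each takes
    the smallest color unused by its already-colored neighbours."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    color = {}
    for v in range(n):
        taken = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in itertools.count() if c not in taken)
    return color

# Toy pairwise similarity matrix between 4 posts (1.0 = identical).
sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
# Edge between posts that are NOT similar enough, so every color
# class ends up containing only pairwise-related posts.
edges = [(i, j) for i in range(4) for j in range(i + 1, 4)
         if sim[i][j] < 0.5]
print(greedy_coloring(4, edges))  # {0: 0, 1: 0, 2: 1, 3: 1}
```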

    Application of Track Geometry Deterioration Modelling and Data Mining in Railway Asset Management

    In the management of a modern European railway system, spending is predominantly allocated to maintaining and renewing the existing rail network rather than constructing completely new lines. In addition to their major costs, the maintenance and renewal of the existing rail network often cause traffic restrictions or line closures, which decrease the usability of the rail network. Therefore, timely maintenance that achieves long-lasting improvements is imperative for competitive and punctual rail traffic. Such maintenance requires a strong knowledge base about the current condition of track structures to support decision making. Track owners commission several different measurements that depict the condition of track structures and maintain comprehensive asset management data repositories. Perhaps one of the most important data sources is the track recording car measurement history, which depicts the condition of track geometry at different times. These measurement results are important because they offer a reliable condition database: the measurements are done recurrently, two to six times a year in Finland depending on the track section; the same recording car is used for many years; the results are repeatable; and they provide a good overall picture of the condition of track structures. However, although high-quality data is available, practical asset management faces major challenges in analysing it, because few established methods for such analytics exist. In practice, asset management typically only monitors whether given threshold values are exceeded and subjectively assesses maintenance needs and the development of the condition of track structures. The lack of advanced analytics prevents the full utilisation of the available data in maintenance planning, which hinders decision making. The main goals of this dissertation study were to develop track geometry deterioration modelling methods, apply data mining in analysing currently available railway asset data, and implement the results of these studies in practical railway asset management. The development of track geometry deterioration modelling methods focused on utilising currently available data to produce novel information on the development of the condition of track structures, past maintenance effectiveness, and future maintenance needs. Data mining was applied to investigate the root causes of track geometry deterioration based on asset data. Finally, maturity models were applied as the basis for implementing track geometry deterioration modelling and track asset data analytics in practice. Based on the research findings, currently available Finnish measurement and asset data were sufficient for the desired analyses.
    For the Finnish track inspection data, robust linear optimisation was developed for track geometry deterioration modelling. The modelling provided key figures depicting the condition of structures, maintenance effectiveness, and future maintenance needs. Moreover, visualisations were created from the modelling to enable practical use of the modelling results. The exploratory data mining method applied, the General Unary Hypotheses Automaton (GUHA), could find interesting and hard-to-detect correlations within asset data. With these correlations, novel observations on problematic track structure types were made. The observations could be used to allocate further research to problematic track structures, which would not have been possible without first using data mining to identify these structures. The implementation of track geometry deterioration modelling and asset data analytics into practice was approached by applying maturity models. The use of maturity models offered a practical way of approaching future development, as the development could be divided into four maturity levels, which created clear incremental goals. The maturity model and the incremental goals enabled wide-scale development planning, in which progress can be segmented and monitored, enhancing successful project completion. The results from these studies demonstrate how currently available data can be used to provide completely new and meaningful information when advanced analytics are applied. In addition to novel solutions for data analytics, this dissertation research also provides methods for implementing the solutions, as the true benefits of knowledge-based decision making are only obtained in practical railway asset management.
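    A stripped-down illustration of robust linear deterioration modelling is fitting a linear deterioration rate to repeated geometry measurements while limiting the influence of outliers. The sketch below uses a Huber loss via scipy and invented toy data; it is not the dissertation's exact formulation:

```python
import numpy as np
from scipy.optimize import least_squares

# Toy track geometry quality index measured repeatedly over time;
# t = years since last tamping, y = e.g. standard deviation of
# longitudinal level in mm. The measurement at t=1.0 is an outlier.
t = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5])
y = np.array([0.80, 0.86, 0.93, 0.97, 2.40, 1.10, 1.18])

def residuals(params):
    intercept, rate = params
    return intercept + rate * t - y

# loss="huber" caps the influence of the outlier, giving a robust
# estimate of the deterioration rate (mm per year).
fit = least_squares(residuals, x0=[1.0, 0.1], loss="huber", f_scale=0.1)
intercept, rate = fit.x
print(f"deterioration rate ~ {rate:.3f} mm/year")

# The fitted line can be extrapolated to estimate when a maintenance
# threshold (say 1.4 mm) will be exceeded:
threshold = 1.4
print(f"threshold reached ~ {(threshold - intercept) / rate:.2f} years")
```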

    Boosting Reinforcement Learning and Planning with Demonstrations: A Survey

    Although reinforcement learning has seen tremendous success recently, this kind of trial-and-error learning can be impractical or inefficient in complex environments. The use of demonstrations, on the other hand, enables agents to benefit from expert knowledge rather than having to discover the best action to take through exploration. In this survey, we discuss the advantages of using demonstrations in sequential decision making, various ways of applying demonstrations in learning-based decision-making paradigms (for example, reinforcement learning and planning in learned models), and how to collect demonstrations in various scenarios. Additionally, we exemplify a practical pipeline for generating and utilizing demonstrations in the recently proposed ManiSkill robot learning benchmark.
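    One concrete way demonstrations enter such a pipeline is behaviour cloning: pretraining a policy by supervised regression on demonstration state-action pairs before any environment interaction. The sketch below is a generic illustration with placeholder data and network sizes, not the ManiSkill setup:

```python
import torch
import torch.nn as nn

# Placeholder demonstration buffer: (state, expert_action) pairs.
states = torch.randn(256, 8)          # 8-dim observations
actions = torch.randn(256, 2)         # 2-dim continuous actions

policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Behaviour cloning: supervised regression onto expert actions.
# The resulting policy can then initialise RL fine-tuning, so
# exploration starts near the demonstrated behaviour.
for epoch in range(200):
    loss = nn.functional.mse_loss(policy(states), actions)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final BC loss: {loss.item():.4f}")
```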