637 research outputs found

    Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models

    Large-scale clinical data is invaluable to driving many computational scientific advances today. However, understandable concerns regarding patient privacy hinder the open dissemination of such data and give rise to suboptimal siloed research. De-identification methods attempt to address these concerns but were shown to be susceptible to adversarial attacks. In this work, we focus on the vast amounts of unstructured natural language data stored in clinical notes and propose to automatically generate synthetic clinical notes that are more amenable to sharing using generative models trained on real de-identified records. To evaluate the merit of such notes, we measure both their privacy-preservation properties and their utility in training clinical NLP models. Experiments using neural language models yield notes whose utility is close to that of the real ones in some clinical NLP tasks, yet leave ample room for future improvements. Comment: Clinical NLP Workshop 201

    A Simple Language Model based on PMI Matrix Approximations

    In this study, we introduce a new approach for learning language models by training them to estimate word-context pointwise mutual information (PMI), and then deriving the desired conditional probabilities from PMI at test time. Specifically, we show that with minor modifications to word2vec's algorithm, we get principled language models that are closely related to the well-established Noise Contrastive Estimation (NCE) based language models. A compelling aspect of our approach is that our models are trained with the same simple negative sampling objective function that is commonly used in word2vec to learn word embeddings. Comment: Accepted to EMNLP 201
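
    The abstract does not spell out the test-time step. As a minimal sketch, assuming only the standard definition PMI(w, c) = log(p(w | c) / p(w)), a conditional distribution can be recovered as p(w | c) ∝ p(w) · exp(PMI(w, c)). The function name and the toy numbers below are hypothetical and are not taken from the paper.

```python
import math


def conditional_from_pmi(unigram, pmi_row):
    """Turn PMI scores for one fixed context into p(word | context).

    Relies on the identity PMI(w, c) = log(p(w | c) / p(w)), so
    p(w | c) is proportional to p(w) * exp(PMI(w, c)). Estimated PMI
    values are noisy, so the scores are renormalized to sum to 1.
    """
    scores = {w: unigram[w] * math.exp(pmi_row[w]) for w in unigram}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}


# Hypothetical toy values for illustration only (assumption).
unigram = {"cat": 0.5, "dog": 0.3, "fish": 0.2}
pmi_row = {"cat": 0.0, "dog": math.log(2.0), "fish": math.log(0.5)}
probs = conditional_from_pmi(unigram, pmi_row)
```

    The renormalization is a design choice in this sketch: exact PMI values would already yield a distribution that sums to 1, whereas estimated values generally do not.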

    Stochastic noise in splicing machinery (doi:10.1093/nar/gkp471)

    The number of known alternative human isoforms has been increasing steadily with the amount of available transcription data. To date, over 100 000 isoforms have been detected in EST libraries, and at least 75 % of human genes have at least one alternative isoform. In this paper, we propose that most alternative splicing events are the result of noise in the splicing process. We show that the number of isoforms and their abundance can be predicted by a simple stochastic noise model that takes into account two factors: the number of introns in a gene and the expression level of a gene. The results strongly support the hypothesis that most alternative splicing is a consequence o

    Social Public Expenditure Analysis in the 2010 National Budget

    The National Budget plays a key role as an instrument for allocating funds to different social priorities and for redistributing resources, with localized impact in the provinces. The main objective of Social Public Expenditure (GPS) is to promote access to quality basic services for the most vulnerable social groups. In this mission, the National Government has the essential function of guaranteeing minimum levels of interregional equity between provinces. This research analyzes the funds allocated to the various social programs, the priorities, and the criteria used to allocate funds to the provinces in the 2010 National Budget, comparing these figures in turn with the 2009 budget implementation. The study is part of a Siena Foundation project made possible by the support of the Konrad Adenauer Foundation, Argentina headquarters, which includes the elaboration of Public Social Spending during the parliamentary debate, as well as the monitoring of its implementation, in order to provide relevant and timely information on public finances in Argentina. The study shows the critical importance of Social Public Expenditure in the National Budget, especially as a tool to complement the social services that the provinces provide to their inhabitants, and the challenges involved in improving the social conditions of the population. Among the key findings and challenges are the following:
    1. 60% of the National Budget goes to Social Public Expenditure. In this way the National Budget becomes a key mechanism for prioritizing and reallocating resources.
    2. Increases in the 2010 GPS compared to 2009 indicate that priority was given to social security, education, and science and technology programs.
    3. The main criterion for distributing social program resources among the provinces is population size; that is, most resources are concentrated in provinces with larger populations. Secondarily, there is in general some relationship with more objective distribution criteria, such as the provinces' poverty rates, unemployment, or housing deficit. However, these indicators do not play an important role, and inequities can be seen in the distribution among the provinces.
    4. The Finance Act 2010 has weaknesses in information and in the geographic distribution of social programs.
    In summary, the work highlights the critical importance of Social Public Expenditure in the National Budget, especially social security spending, which becomes more relevant with the new Universal Child Allocation for Social Protection program, as well as programs for education, health, and advocacy and social assistance, among others, which together amount to 60% of the National Budget. As a result, there emerges a need to pay special attention to the priorities assigned each year to social spending and to how it is distributed among the provinces, in order to promote greater equity in the distribution of resources. It is therefore a priority to discuss the criteria for direct or indirect allocation of resources to the provinces, especially for programs with greater social impact. In these cases, greater weight should be given to objective criteria more closely related to the social situation of the provinces, together with population size, thus strengthening the role of the national government as a guarantor of minimum standards of interregional equity.
    Keywords: Social Public Expenditure; National Budget; funds; allocation; National Government; interregional equity; social programs; priorities; guarantee; criteria; distribution; minimum standards; social impact

    DIGITALIZAR A EDUCAÇÃO NA ERA DAS PANDEMIAS: UM ESTUDO DOS EFEITOS TRANSFORMADORES DA COVID-19

    The paper investigates the impact of the COVID-19 pandemic on the acceleration of digitalization processes in the field of education. Specifically, the study examines the development and improvement of online learning formats during this period, highlighting both the advantages and disadvantages, as well as identifying the bottlenecks and challenges that emerged. The paper presents the results of a survey conducted among Bachelor's and Master's degree students at the Faculty of Distance Learning at Plekhanov Russian University of Economics, categorized according to their areas of study, to determine the efficacy of remote learning processes and methods. Additionally, in-depth interviews were conducted to provide a more detailed analysis of the benefits and drawbacks of online learning, as well as potential solutions to improve remote learning processes. Overall, this paper offers valuable insights into the role of the pandemic as a catalyst for digital transformation in education, and provides recommendations for enhancing the effectiveness and efficiency of online learning.