597 research outputs found

    Enhancing Sensitivity Classification with Semantic Features using Word Embeddings

    Government documents must be reviewed to identify any sensitive information they may contain before they can be released to the public. However, traditional paper-based sensitivity review processes are not practical for reviewing born-digital documents. Therefore, there is a timely need for automatic sensitivity classification techniques to assist the digital sensitivity review process. Automatic sensitivity classification is a difficult task, because sensitivity is typically a product of the relations between combinations of terms, such as who said what about whom. Vector representations of terms, such as word embeddings, have been shown to be effective at encoding latent term features that preserve semantic relations between terms, which can also be beneficial to sensitivity classification. In this work, we present a thorough evaluation of the effectiveness of semantic word embedding features, along with term and grammatical features, for sensitivity classification. On a test collection of government documents containing real sensitivities, we show that extending text classification with semantic features and additional term n-grams results in significant improvements in classification effectiveness, correctly classifying 9.99% more sensitive documents compared to the text classification baseline.

    DYNAMO-MAS: a multi-agent system for ontology evolution from text

    Manual ontology development and evolution are complex and time-consuming tasks, even when textual documents are used as knowledge sources in addition to human expertise or existing ontologies. Processing natural language in text produces huge amounts of linguistic data that need to be filtered and structured. To support both of these tasks, we have developed DYNAMO-MAS, an interactive tool based on an adaptive multi-agent system (adaptive MAS or AMAS) that builds and evolves ontologies from text. DYNAMO-MAS is a partner system to build ontologies; the ontologist interacts with the system to validate or modify its outputs. This paper presents the architecture of DYNAMO-MAS, its operating principles and its evaluation on three case studies.

    Proliferation, bcl-2 expression and angiogenesis in pituitary adenomas: relationship to tumour behaviour

    The prediction of pituitary tumour behaviour in terms of response to treatment, from which optimal management strategies can be derived, is a challenge that has been approached using several different means. Angiogenesis in other tumour types has been shown to be correlated with poor response to treatment and tumour recurrence. The aim of this paper is to assess the role of measurements of cell proliferation and angiogenesis in predicting pituitary tumour behaviour. The proliferative capacity of the tumour was assessed using the Ki-67 labelling index (LI), while bcl-2 expression was used to assess anti-apoptotic pathways. The microvessel density (MVD) was assessed using antibodies to CD31 and factor VIII-related antigen, and with biotinylated Ulex europaeus agglutinin I. There was no difference between Ki-67 LI and MVD of functionless tumours that recurred and those that did not, but bcl-2 expression was significantly lower in tumours that subsequently regrew. Macroprolactinomas had significantly higher LI than microprolactinomas and all other tumours. Cell proliferation and angiogenesis were not related, showing that the two processes are under different control mechanisms in pituitary tumours. In contrast, there was a positive relationship between markers of angiogenesis and bcl-2 expression in prolactinomas, GH-secreting tumours and non-recurrent functionless tumours, with higher levels of bcl-2 expression being found in the more vascular tumours. These findings may suggest that angiogenesis is related to the ability of tumour cells to survive rather than their proliferative activity. © 2000 Cancer Research Campaign

    Comparing High Dimensional Word Embeddings Trained on Medical Text to Bag-of-Words For Predicting Medical Codes

    Word embeddings are a useful tool for extracting knowledge from the free-form text contained in electronic health records, but it has become commonplace to train such word embeddings on data that do not accurately reflect how language is used in a healthcare context. We use prediction of medical codes as an example application to compare the accuracy of word embeddings trained on health corpora to those trained on more general collections of text. It is shown that both an increase in embedding dimensionality and an increase in the volume of health-related training data improve prediction accuracy. We also present a comparison to the traditional bag-of-words feature representation, demonstrating that in many cases this conceptually simple method for representing text results in superior accuracy to that of word embeddings.
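
    The two feature representations compared in this abstract can be illustrated with a minimal sketch. The token list, the two-dimensional toy vectors, and the choice to average word vectors into a single document vector are all assumptions made for illustration; the abstract does not state how the authors built either representation.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Sparse-style count vector: one dimension per vocabulary term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def mean_embedding(tokens, embeddings):
    """Dense document vector: average of the vectors of known tokens.

    Averaging is an assumed pooling choice, not taken from the abstract.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return []
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical clinical note and hand-made toy embeddings.
tokens = "patient reports chest pain and chest tightness".split()
vocabulary = sorted(set(tokens))
toy_embeddings = {
    "chest": [1.0, 0.0],
    "pain": [0.0, 1.0],
    "tightness": [0.5, 0.5],
}

bow = bag_of_words(tokens, vocabulary)
dense = mean_embedding(tokens, toy_embeddings)
print(vocabulary, bow)
print(dense)
```

    The count vector grows with the vocabulary and keeps every term distinct, whereas the averaged vector has a fixed, small dimensionality and silently drops tokens that have no embedding.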

    Soccer Team Vectors

    In this work we present STEVE (Soccer TEam VEctors), a principled approach for learning real-valued vectors for soccer teams, where similar teams are close to each other in the resulting vector space. STEVE relies only on freely available information about the matches teams played in the past. These vectors can serve as input to various machine learning tasks. Evaluating on the task of team market value estimation, STEVE outperforms all its competitors. Moreover, we use STEVE for similarity search and to rank soccer teams. Comment: 11 pages, 1 figure; this paper was presented at the 6th Workshop on Machine Learning and Data Mining for Sports Analytics at ECML/PKDD 2019, Würzburg, Germany, 2019.

    Semantically Aware Text Categorisation for Metadata Annotation

    In this paper we illustrate a system aimed at solving a longstanding and challenging problem: acquiring a classifier to automatically annotate bibliographic records, starting from a huge set of unbalanced and unlabelled data. We illustrate the main features of the dataset, the learning algorithm adopted, and how it was used to discriminate philosophical documents from documents of other disciplines. One strength of our approach lies in the novel combination of a standard learning approach with a semantic one: the results of the acquired classifier are improved by accessing a semantic network containing conceptual information. We describe the rationale behind the construction of the training and test sets, report and discuss the results obtained, and conclude by outlining future work.

    SerpinB2 regulates stromal remodelling and local invasion in pancreatic cancer

    Pancreatic cancer has a devastating prognosis, with an overall 5-year survival rate of ~8%, restricted treatment options and characteristic molecular heterogeneity. SerpinB2 expression, particularly in the stromal compartment, is associated with reduced metastasis and prolonged survival in pancreatic ductal adenocarcinoma (PDAC), and our genomic analysis revealed that SERPINB2 is frequently deleted in PDAC. We show that SerpinB2 is required by stromal cells for normal collagen remodelling in vitro, regulating fibroblast interaction and engagement with collagen in the contracting matrix. In a pancreatic cancer allograft model, co-injection of PDAC cancer cells and SerpinB2(-/-) mouse embryonic fibroblasts (MEFs) resulted in increased tumour growth, aberrant remodelling of the extracellular matrix (ECM) and increased local invasion from the primary tumour. These tumours also displayed elevated proteolytic activity of the primary biochemical target of SerpinB2, urokinase plasminogen activator (uPA). In a large cohort of patients with resected PDAC, we show that increasing uPA mRNA expression was significantly associated with poorer survival following pancreatectomy. This study establishes a novel role for SerpinB2 in the stromal compartment in PDAC invasion through regulation of stromal remodelling and highlights the SerpinB2/uPA axis for further investigation as a potential therapeutic target in pancreatic cancer.

    Evaluation of pooling operations in convolutional architectures for drug-drug interaction extraction

    Background: Deep Neural Networks (DNN), in particular Convolutional Neural Networks (CNN), have recently achieved state-of-the-art results for the task of Drug-Drug Interaction (DDI) extraction. Most CNN architectures incorporate a pooling layer to reduce the dimensionality of the convolution layer output, preserving relevant features and removing irrelevant details. All previous CNN-based systems for DDI extraction used max-pooling layers. Results: In this paper, we evaluate the performance of various pooling methods (in particular max-pooling, average-pooling and attentive pooling), as well as their combination, for the task of DDI extraction. Our experiments show that max-pooling achieves a higher F1-score (64.56%) than attentive pooling (59.92%) and average-pooling (58.35%). Conclusions: Max-pooling outperforms the other alternatives because it is the only one that is invariant to the special pad tokens appended to shorter sentences (padding). Moreover, the combination of max-pooling and attentive pooling does not improve performance compared with max-pooling alone. Publication of this article was supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R) and the TEAM project (Erasmus Mundus Action 2-Strand 2 Programme) funded by the European Commission.
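
    The padding argument in the conclusions can be illustrated with a toy sketch. It assumes that padded positions yield all-zero feature rows and that real activations are non-negative (as after a ReLU); neither assumption is stated in the abstract, and the function names and feature values are hypothetical.

```python
def max_pool(rows):
    """Column-wise maximum over a list of equal-length feature rows."""
    return [max(column) for column in zip(*rows)]


def avg_pool(rows):
    """Column-wise mean over a list of equal-length feature rows."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def pad(rows, length):
    """Append all-zero rows until the sequence has `length` positions."""
    width = len(rows[0])
    return rows + [[0.0] * width for _ in range(length - len(rows))]


# Hypothetical non-negative convolution outputs for a 3-token sentence.
features = [[0.2, 1.5], [0.9, 0.1], [0.4, 0.7]]
padded = pad(features, 6)  # same sentence padded to 6 positions

print(max_pool(features), max_pool(padded))  # identical under the assumptions
print(avg_pool(features), avg_pool(padded))  # mean is diluted by the zero rows
```

    Under these assumptions the maximum is unchanged by padding, while the mean shrinks as more pad rows are added, so the averaged representation of a sentence depends on how much it was padded.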

    Cytogerontology since 1881: A reappraisal of August Weismann and a review of modern progress

    Cytogerontology, the science of cellular ageing, originated in 1881 with the prediction by August Weismann that the somatic cells of higher animals have limited division potential. Weismann's prediction was derived by considering the role of natural selection in regulating the duration of an organism's life. For various reasons, Weismann's ideas on ageing fell into neglect following his death in 1914, and cytogerontology has only reappeared as a major research area following the demonstration by Hayflick and Moorhead in the early 1960s that diploid human fibroblasts are restricted to a finite number of divisions in vitro. In this review we give a detailed account of Weismann's theory, and we reveal that his ideas were both more extensive in their scope and more pertinent to current research than is generally recognised. We also appraise the progress which has been made over the past hundred years in investigating the causes of ageing, with particular emphasis being given to (i) the evolution of ageing, and (ii) ageing at the cellular level. We critically assess the current state of knowledge in these areas and recommend a series of points as primary targets for future research.

    How are time-dependent childbearing intentions realized? Realization, postponement, abandonment, bringing forward

    Our study aims to identify factors that facilitate or inhibit the realization of fertility intentions. The analysis uses data collected in the first two waves of a Hungarian longitudinal survey. Fertility intentions recorded at the first wave pertain to the subsequent 3-year period, as does the behavior variable measuring the realization of intentions, i.e., a birth within the 3-year period in question. For this analysis, we used the respondents' demographic, socio-structural, and orientational traits recorded at the first interview. Our findings show that age, parity, and partnership play a determining role in the realization of fertility intentions, and that employment status, religious affiliation, and overall life satisfaction also exhibit significant effects. A marked gender difference was detected not only with regard to employment status but in the area of values and orientations as well.