5,784 research outputs found

    DUKweb, diachronic word representations from the UK Web Archive corpus

    Lexical semantic change (detecting shifts in the meaning and usage of words) is an important task for social and cultural studies as well as for Natural Language Processing applications. Diachronic word embeddings (time-sensitive vector representations of words that preserve their meaning) have become the standard resource for this task. However, given the significant computational resources needed for their generation, very few resources exist that make diachronic word embeddings available to the scientific community. In this paper we present DUKweb, a set of large-scale resources designed for the diachronic analysis of contemporary English. DUKweb was created from the JISC UK Web Domain Dataset (1996–2013), a very large archive which collects resources from the Internet Archive that were hosted on domains ending in ‘.uk’. DUKweb consists of a series of word co-occurrence matrices and two types of word embeddings for each year in the JISC UK Web Domain Dataset. We show the reuse potential of DUKweb and its quality standards via a case study on word meaning change detection.

    Mining the UK web archive for semantic change detection

    Semantic change detection (i.e., identifying words whose meaning has changed over time) has emerged as a growing area of research over the past decade, with important downstream applications in natural language processing, historical linguistics and computational social science. However, several obstacles make progress in the domain slow and difficult. These pertain primarily to the lack of well-established gold standard datasets, resources to study the problem at a fine-grained temporal resolution, and quantitative evaluation approaches. In this work, we aim to mitigate these issues by (a) releasing a new labelled dataset of more than 47K word vectors trained on the UK Web Archive over a short time-frame (2000-2013); (b) proposing a variant of Procrustes alignment to detect words that have undergone semantic shift; and (c) introducing a rank-based approach for evaluation purposes. Through extensive numerical experiments and validation, we illustrate the effectiveness of our approach against competitive baselines. Finally, we also make our resources publicly available to further enable research in the domain. This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1 and the seed funding grant SF099.
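
    The abstract above mentions Procrustes alignment as the basis for detecting semantic shift. As a rough illustration only, the sketch below uses the standard orthogonal Procrustes solution, not the variant proposed in the paper, which the abstract does not describe. The function names and the assumption that both embedding matrices share the same vocabulary order (one row per word) are hypothetical choices made for this example.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target` using standard orthogonal Procrustes.

    Both arguments are (n_words, dim) arrays whose rows refer to the same
    words in the same order (an assumption of this sketch). The rotation R
    minimises the Frobenius norm of (source @ R - target).
    """
    # The SVD of the cross-covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(source.T @ target)
    rotation = u @ vt
    return source @ rotation


def semantic_shift_scores(source, target):
    """Return one cosine distance per word after aligning the two spaces.

    Larger values suggest that a word moved more between the two periods.
    """
    aligned = procrustes_align(source, target)
    dots = np.sum(aligned * target, axis=1)
    norms = np.linalg.norm(aligned, axis=1) * np.linalg.norm(target, axis=1)
    return 1.0 - dots / norms
```

    Ranking words by these scores, highest first, gives a candidate list of shifted words; how such a ranking is evaluated in the paper is not specified in the abstract.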

    Diffractive Higgs Production from Intrinsic Heavy Flavors in the Proton

    We propose a novel mechanism for exclusive diffractive Higgs production $pp \to p H p$ in which the Higgs boson carries a significant fraction of the projectile proton momentum. This mechanism will provide a clear experimental signal for Higgs production due to the small background in this kinematic region. The key assumption underlying our analysis is the presence of intrinsic heavy flavor components of the proton bound state, whose existence at high light-cone momentum fraction $x$ has growing experimental and theoretical support. We also discuss the implications of this picture for exclusive diffractive quarkonium and other channels. Comment: 30 pages, 5 figures

    A Reference Architecture for Software Protection

    This paper describes the ASPIRE reference architecture designed to tackle one major problem in this domain: the lack of a clear process and an open software architecture for the composition and deployment of multiple software protections on software applications.

    Odorants of Capsicum spp. dried fruits as candidate attractants for Lasioderma serricorne F. (Coleoptera: Anobiidae)

    The cigarette beetle, Lasioderma serricorne F. (Coleoptera: Anobiidae), is an important food storage pest affecting the tobacco industry and is increasingly impacting museums and herbaria. Monitoring methods make use of pheromone traps, which can be implemented using chili fruit powder. The objective of this study was to assess the response of L. serricorne to the volatile organic compounds (VOCs) from different chili powders in order to identify the main semiochemicals involved in this attraction. Volatiles emitted by Capsicum annuum, C. frutescens, and C. chinense dried fruit powders were tested in an olfactometer, then collected and analyzed using SPME and GC-MS. Results indicated that C. annuum and C. frutescens VOCs attract L. serricorne adults in the olfactometer, while C. chinense VOCs elicit no attraction. Chemical analysis showed a higher presence of polar compounds in the VOCs of C. annuum and C. frutescens compared to C. chinense, with α-ionone and β-ionone being more abundant in the attractive species. Further olfactometer bioassays indicated that both α-ionone and β-ionone elicit attraction, suggesting that these compounds are candidates as synergistic attractants in pheromone monitoring traps for L. serricorne.

    A Novel Approach for an Integrated Straw tube-Microstrip Detector

    We report on a novel concept for a combined silicon microstrip and straw tube detector, where integration is accomplished by a straw module with straws not subjected to mechanical tension in a Rohacell® lattice and carbon fiber reinforced plastic shell. Results on mechanical and test beam performance are also reported. Comment: Accepted by Transactions on Nuclear Science (2005). 11 pages, 9 figures, uses lnfprep.st

    Studies of multiplicity in relativistic heavy-ion collisions

    In this talk I'll review the present status of charged particle multiplicity measurements from heavy-ion collisions. The characteristic features of multiplicity distributions obtained in Au+Au collisions will be discussed in terms of collision centrality and energy and compared to those of p+p collisions. Multiplicity measurements of d+Au collisions at 200 GeV nucleon-nucleon center-of-mass energy will also be discussed. The results will be compared to various theoretical models, and simple scaling properties of the data will be identified. Comment: "Focus on Multiplicity" International Workshop on Particle Multiplicity in Relativistic Heavy Ion Collisions, Bari, Italy, June 17-19, 2003, 16 pages, 15 figures