Search CORE

696 research outputs found

Jet Substructure at the Tevatron and LHC: New results, new tools, new benchmarks

Author: A Altheimer
A Davison
A E Cholakian
A Fregoso
A Gomes
A Haas
A Hook
A Hornig
A J Larkoski
A Lath
A Safonov
A Schwartzman
Appleby R
ATLAS Collaboration
ATLAS Collaboration Brooijmans G
B Chapleau
B Tweedie
Banfi A
Bauer C W Tackmann F J Walsh J R Zuberi S
Berger C F
Breitweg J
C K Vermilion
C Lee
Cacciari M
Cacciari M Dasgupta M Magnea L Salam G
Cacciari M Salam G P Soyez G
CDF Collaboration
CDF Collaboration Aaltonen T
Chekanov S
CMS Collaboration
D E Soper
D Krohn
D W Miller
D Walker
Dasgupta M
Delenda Y
Delsart P-A Geerlings K Huston J Martin B Vermilion C K
E Halkiadakis
E Izaguirre
E Strauss
Ellis S D
Falkowski A
G Brooijmans
G Kribs
G P Salam
Gallicchio J
Gleisberg T
Hoeche S
Hoecker A Speckmayer P Stelzer J Therhaag J von Toerne E Voss H
Hook A Jankowiak M Wacker J G
Hornig A
I W Stewart
J Butterworth
J Dolen
J Gallicchio
J J Fan
J P Chou
J R Walsh
J Shao
J Thaler
J Wacker
Jankowiak M
K Prokofiev
Krohn D
Krohn D Randall L Wang L-T
L Asquith
L-T Wang
Lampl W Laplace S Lelas D Loch P Ma H Menke S Rajagopalan S Rousseau D Snyder S Unal G
Lee C
M Campanelli
M D Schwartz
M Dasgupta
M Jankowiak
M Martinez
M Seymour
M Son
M Spannowsky
M Strassler
M Takeuchi
M Villaplana
M Vos
Miller D W
P Huang
P Loch
P Maksimovic
P Sinervo
Plehn T
Plehn T Spannowsky M
R Essig
R Field
R Rahmat
R Vasquez Sierra
Rappoccio S
S Arora
S D Ellis
S Hoeche
S J Lee
S Rappoccio
S Schumann
S Thomas
S Wilbur
Sapeta S
Schumann S
Seymour M
T Plehn
Thaler J
V Halyo
W Zhu
Walsh J R Zuberi S
Y Gershtein
Publication venue: 'IOP Publishing'
Publication date: 01/01/2012

In this report we review recent theoretical progress and the latest experimental results in jet substructure from the Tevatron and the LHC. We review the status of and outlook for calculation and simulation tools for studying jet substructure. Following up on the report of the Boost 2010 workshop, we present a new set of benchmark comparisons of substructure techniques, focusing on the set of variables and grooming methods that are collectively known as "top taggers". To facilitate further exploration, we have attempted to collect, harmonise, and publish software implementations of these techniques. Comment: 53 pages, 17 figures. L. Asquith, S. Rappoccio, C. K. Vermilion, editors; v2: minor edits from journal revision

arXiv.org e-Print Archive

Hong Kong University of Science and Technology Institutional Repository

The University of Manchester - Institutional Repository

CERN Document Server

Usefulness of social tagging in organizing and providing access to the web: An analysis of indexing consistency and quality

Author: Choi Yunseon
Publication date: 01/08/2011

This dissertation research points out major challenging problems with current Knowledge Organization (KO) systems, such as subject gateways or web directories: (1) the current systems use traditional knowledge organization systems based on controlled vocabulary which is not very well suited to web resources, and (2) information is organized by professionals not by users, which means it does not reflect intuitively and instantaneously expressed users’ current needs. In order to explore users’ needs, I examined social tags which are user-generated uncontrolled vocabulary. As investment in professionally-developed subject gateways and web directories diminishes (support for both BUBL and Intute, examined in this study, is being discontinued), understanding characteristics of social tagging becomes even more critical. Several researchers have discussed social tagging behavior and its usefulness for classification or retrieval; however, further research is needed to qualitatively and quantitatively investigate social tagging in order to verify its quality and benefit. This research particularly examined the indexing consistency of social tagging in comparison to professional indexing to examine the quality and efficacy of tagging. The data analysis was divided into three phases: analysis of indexing consistency, analysis of tagging effectiveness, and analysis of tag attributes. Most indexing consistency studies have been conducted with a small number of professional indexers, and they tended to exclude users. Furthermore, the studies mainly have focused on physical library collections. This dissertation research bridged these gaps by (1) extending the scope of resources to various web documents indexed by users and (2) employing the Information Retrieval (IR) Vector Space Model (VSM) - based indexing consistency method since it is suitable for dealing with a large number of indexers. 
As a second phase, an analysis of tagging effectiveness with tagging exhaustivity and tag specificity was conducted to ameliorate the drawbacks of consistency analysis based on only the quantitative measures of vocabulary matching. Finally, to investigate tagging pattern and behaviors, a content analysis on tag attributes was conducted based on the FRBR model. The findings revealed that there was greater consistency over all subjects among taggers compared to that for two groups of professionals. The analysis of tagging exhaustivity and tag specificity in relation to tagging effectiveness was conducted to ameliorate difficulties associated with limitations in the analysis of indexing consistency based on only the quantitative measures of vocabulary matching. Examination of exhaustivity and specificity of social tags provided insights into particular characteristics of tagging behavior and its variation across subjects. To further investigate the quality of tags, a Latent Semantic Analysis (LSA) was conducted to determine to what extent tags are conceptually related to professionals’ keywords and it was found that tags of higher specificity tended to have a higher semantic relatedness to professionals’ keywords. This leads to the conclusion that the term’s power as a differentiator is related to its semantic relatedness to documents. The findings on tag attributes identified the important bibliographic attributes of tags beyond describing subjects or topics of a document. The findings also showed that tags have essential attributes matching those defined in FRBR. Furthermore, in terms of specific subject areas, the findings originally identified that taggers exhibited different tagging behaviors representing distinctive features and tendencies on web documents characterizing digital heterogeneous media resources. 
These results have led to the conclusion that there should be an increased awareness of diverse user needs by subject in order to improve metadata in practical applications. This dissertation research is the first necessary step to utilize social tagging in digital information organization by verifying the quality and efficacy of social tagging. This dissertation research combined both quantitative (statistics) and qualitative (content analysis using FRBR) approaches to vocabulary analysis of tags, which provided a more complete examination of the quality of tags. Through the detailed analysis of tag properties undertaken in this dissertation, we have a clearer understanding of the extent to which social tagging can be used to replace (and in some cases to improve upon) professional indexing.
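
A vector-space view of indexing consistency, as mentioned in the abstract above, can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions: the abstract does not give the dissertation's exact formula, so plain term-frequency vectors and cosine similarity are used as a stand-in, and the function names `term_vector` and `cosine_consistency` as well as the sample terms are hypothetical.

```python
import math
from collections import Counter


def term_vector(term_lists):
    """Aggregate the terms assigned by many indexers into one frequency vector.

    Assumption: terms are compared case-insensitively and weighted by raw counts.
    """
    counts = Counter()
    for terms in term_lists:
        counts.update(term.lower() for term in terms)
    return counts


def cosine_consistency(vec_a, vec_b):
    """Cosine similarity between two term-frequency vectors (0.0 if either is empty)."""
    shared = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[t] * vec_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example: tags from two users vs. keywords from one professional.
tagger_terms = [["python", "tutorial"], ["Python", "programming"]]
professional_terms = [["python", "programming languages"]]

score = cosine_consistency(term_vector(tagger_terms), term_vector(professional_terms))
print(f"consistency between groups: {score:.3f}")
```

Because each group is collapsed into a single vector, the comparison cost does not grow with the number of pairs of indexers, which is one plausible reason such a method suits large groups.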

Illinois Digital Environment for Access to Learning and Scholarship Repository

Heavy quark jets at the LHC

Author: Voutilainen Mikko
Publication date: 16/09/2015

We summarize measurements of b and c jet production at the LHC, which are an important signature and background for decays of massive particles such as H-to-b-bbar. These include measurements of the inclusive and dijet production of heavy quark jets, b and c jets produced in association with vector bosons Z and W, and decays of boosted Z bosons into b-bbar pairs. The current status of b tagging and b jet energy scale is also reviewed. These measurements test perturbative QCD in the four- and five-flavor number schemes, and provide insight into the relative importance of heavy flavor production through flavor creation, flavor excitation and gluon splitting channels. The W+c measurement additionally provides a powerful way to probe the strange quark and antiquark sea in the proton. The recent studies looking separately at production of one and two b jets find generally good agreement with theory predictions for two b-jet production, while some discrepancies are observed for singly produced b jets, particularly at large b-jet pT, where gluon splitting becomes dominant. Comment: Article submitted to the International Journal of Modern Physics A (IJMPA) as part of the special issue on the "Jet Measurements at the LHC", editor G. Dissertori. 16 pages, 27 figures

arXiv.org e-Print Archive

CERN Document Server

Hadronic ′ search at the LHC with top and W taggers

Publication venue: Springer

Springer - Publisher Connector

Improving Robustness and Scalability of Available NER Systems

Author: McKenzie Amber
Publication venue: Scholar Commons
Publication date: 01/01/2013

The focus of this research is to study and develop techniques to adapt existing NER resources to serve the needs of a broad range of organizations without expert NLP manpower. My methods emphasize usability, robustness and scalability of existing NER systems to ensure maximum functionality to a broad range of organizations. Usability is facilitated by ensuring that the methodologies are compatible with any available open-source NER tagger or data set, thus allowing organizations to choose resources that are easy to deploy and maintain and fit their requirements. One way of making use of available tagged data would be to aggregate a number of different tagged sets in an effort to increase the coverage of the NER system. Though, generally, more tagged data can mean a more robust NER model, extra data also introduces a significant amount of noise and complexity into the model. Because adding training data to scale up an NER system presents a number of challenges in terms of scalability, this research aims to address these difficulties and provide a means for multiple available training sets to be aggregated while reducing noise, model complexity and training times. In an effort to maintain usability, increase robustness and improve scalability, I designed an approach to merge document clustering of the training data with open-source or available NER software packages and tagged data that can be easily acquired and implemented. Here, a tagged training set is clustered into smaller data sets, and models are then trained on these smaller clusters. This is designed not only to reduce noise by creating more focused models, but also to increase scalability and robustness. Document clustering is used extensively in information retrieval, but has never been used in conjunction with NER.

Scholar Commons - Institutional Repository of the University of South Carolina

Human-competitive automatic topic indexing

Author: Medelyan Olena
Publication venue: The University of Waikato
Publication date: 01/01/2009

Topic indexing is the task of identifying the main topics covered by a document. These are useful for many purposes: as subject headings in libraries, as keywords in academic publications and as tags on the web. Knowing a document's topics helps people judge its relevance quickly. However, assigning topics manually is labor intensive. This thesis shows how to generate them automatically in a way that competes with human performance. Three kinds of indexing are investigated: term assignment, a task commonly performed by librarians, who select topics from a controlled vocabulary; tagging, a popular activity of web users, who choose topics freely; and a new method of keyphrase extraction, where topics are equated to Wikipedia article names. A general two-stage algorithm is introduced that first selects candidate topics and then ranks them by significance based on their properties. These properties draw on statistical, semantic, domain-specific and encyclopedic knowledge. They are combined using a machine learning algorithm that models human indexing behavior from examples. This approach is evaluated by comparing automatically generated topics to those assigned by professional indexers, and by amateurs. We claim that the algorithm is human-competitive because it chooses topics that are as consistent with those assigned by humans as their topics are with each other. The approach is generalizable, requires little training data and applies across different domains and languages.
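
The two-stage idea described in the abstract above (select candidate topics, then rank them) can be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation: the abstract says the ranking is learned from examples, whereas the hand-written score here (occurrence count weighted by how early the phrase first appears) is a stand-in assumption, and the names `select_candidates`, `rank_candidates` and the sample vocabulary are hypothetical.

```python
def select_candidates(text, vocabulary, max_len=3):
    """Stage 1: collect word n-grams from the text that appear in a controlled vocabulary."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    words = [w for w in words if w]
    candidates = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in vocabulary:
                # Record the relative position of the first occurrence and a running count.
                info = candidates.setdefault(
                    phrase, {"count": 0, "first": i / len(words)}
                )
                info["count"] += 1
    return candidates


def rank_candidates(candidates, top_k=3):
    """Stage 2: rank candidates by a simple score (assumed stand-in for a learned model)."""
    def score(item):
        _, info = item
        return info["count"] * (1.0 - info["first"])

    ranked = sorted(candidates.items(), key=score, reverse=True)
    return [phrase for phrase, _ in ranked[:top_k]]


# Hypothetical vocabulary and document.
vocabulary = {"topic indexing", "machine learning", "librarian"}
text = "Topic indexing helps readers. Machine learning can automate topic indexing."

topics = rank_candidates(select_candidates(text, vocabulary))
print(topics)
```

Keeping the two stages separate means the scoring function could be swapped for a trained model without touching candidate selection.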

Research Commons@Waikato

CERN Document Server