Detecting a Boosted Diboson Resonance
New light scalar particles in the mass range of hundreds of GeV, decaying
into a pair of W/Z bosons, can appear in several extensions of the SM. The
focus of collider studies for such a scalar is often on its direct production,
where the scalar is typically only mildly boosted. The observed W/Z are
therefore well-separated, allowing analyses for the scalar resonance in a
standard fashion as a low-mass diboson resonance. In this work we instead focus
on the scenario where the direct production of the scalar is suppressed, and it
is rather produced via the decay of a significantly heavier (a few TeV mass)
new particle, in conjunction with SM particles. Such a process results in the
scalar being highly boosted, rendering the W/Z's from its decay merged. The
final state in such a decay is a "fat" jet, which can be either four-pronged
(for fully hadronic decays), or may be like a W/Z jet, but with leptons
buried inside (if one of the W/Z decays leptonically). In addition, this fat
jet has a jet mass that can be quite different from that of the W/Z/Higgs/top
quark-induced jet, and may be missed by existing searches. In this work, we
develop dedicated algorithms for tagging such multi-layered "boosted dibosons"
at the LHC. As a concrete application, we discuss an extension of the standard
warped extra-dimensional framework where such a light scalar can arise. We
demonstrate that the use of these algorithms gives sensitivity in mass ranges
that are otherwise poorly constrained. Comment: 33 pages, 13 figures
Automatic document classification of biological literature
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, a text-mining system for biological literature, which marks up full text according to a shallow ontology that includes terms of biological interest. This project investigates document classification in the context of biological literature, making use of the Textpresso markup of a corpus of Caenorhabditis elegans literature.
Results: We present a two-step text categorization algorithm to classify a corpus of C. elegans papers. Our classification method first uses a support vector machine-trained classifier, followed by a novel, phrase-based clustering algorithm. This clustering step autonomously creates cluster labels that are descriptive and understandable by humans. This clustering engine performed better on a standard test-set (Reuters 21578) compared to previously published results (F-value of 0.55 vs. 0.49), while producing cluster descriptions that appear more useful. A web interface allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept.
Conclusions: We have demonstrated a simple method to classify biological documents that embodies an improvement over current methods. While the classification results are currently optimized for Caenorhabditis elegans papers by human-created rules, the classification engine can be adapted to different types of documents. We have demonstrated this by presenting a web interface that allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept.
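The F-value quoted in the Results is commonly defined as the harmonic mean of precision and recall. The following is a minimal sketch assuming that standard definition; the abstract does not say which variant was used for the clustering evaluation. The function name `f_measure` and the sample counts are illustrative and not taken from the paper.

```python
def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score from raw counts (beta=1 is the harmonic mean of precision and recall)."""
    if true_pos == 0:
        # No correct assignments: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts chosen only to illustrate the calculation.
score = f_measure(true_pos=11, false_pos=9, false_neg=9)
```

When precision and recall are equal, the F-value equals both of them; when they differ, the harmonic mean is pulled toward the smaller of the two.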
Jet Trimming
Initial state radiation, multiple interactions, and event pileup can
contaminate jets and degrade event reconstruction. Here we introduce a
procedure, jet trimming, designed to mitigate these sources of contamination in
jets initiated by light partons. This procedure is complementary to existing
methods developed for boosted heavy particles. We find that jet trimming can
achieve significant improvements in event reconstruction, especially at high
energy/luminosity hadron colliders like the LHC. Comment: 20 pages, 11 figures, 3 tables - Minor changes to text/figures
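The abstract does not spell out the procedure. As trimming is usually described, a jet's constituents are reclustered into small subjets, and subjets carrying less than a fraction `f_cut` of a hard scale are discarded. The sketch below covers only that final selection step and assumes the subjets have already been found. The `Subjet` class, the default `f_cut` value, and the use of the scalar sum of subjet pT as the hard scale are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Subjet:
    pt: float   # transverse momentum in GeV
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle


def trim(subjets, f_cut=0.03, hard_scale=None):
    """Keep subjets whose pT is at least f_cut times the hard scale."""
    if hard_scale is None:
        # Assumption: approximate the jet pT by the scalar sum of subjet pT.
        hard_scale = sum(s.pt for s in subjets)
    return [s for s in subjets if s.pt >= f_cut * hard_scale]


# Hypothetical jet: two hard subjets plus two soft ones from contamination.
jet = [Subjet(200.0, 0.10, 1.00), Subjet(50.0, 0.20, 1.10),
       Subjet(3.0, 0.40, 0.80), Subjet(2.0, -0.10, 1.30)]
trimmed = trim(jet)
```

With these inputs the threshold is 0.03 × 255 GeV = 7.65 GeV, so only the 200 GeV and 50 GeV subjets survive.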
Resummation prediction on the jet mass spectrum in one-jet inclusive production at the LHC
We study the factorization and resummation prediction on the jet mass
spectrum in one-jet inclusive production at the LHC based on soft-collinear
effective theory. The soft function with anti-kT algorithm is calculated at
next-to-leading order and its validity is demonstrated by checking the
agreement between the expanded leading singular terms with the exact
fixed-order result. The large logarithms and the global
logarithms in the process are resummed to all orders at
next-to-leading logarithmic and next-to-next-to-leading logarithmic level,
respectively. The cross section is enhanced by about 23% from the
next-to-leading logarithmic level to next-to-next-to-leading logarithmic level.
Comparing our resummation predictions with those from the Monte Carlo tool PYTHIA
and ATLAS data at the 7 TeV LHC, we find that the peak positions of the jet
mass spectra agree with those from PYTHIA at parton level, and the predictions
of the jet mass spectra with non-perturbative effects agree with
the ATLAS data. We also show the predictions at the future 13 TeV LHC. Comment: 43 pages, 10 figures
Twin Learning for Similarity and Clustering: A Unified Kernel Approach
Many similarity-based clustering methods work in two separate steps including
similarity matrix computation and subsequent spectral clustering. However,
similarity measurement is challenging because it is usually impacted by many
factors, e.g., the choice of similarity metric, neighborhood size, scale of
data, noise and outliers. Thus the learned similarity matrix is often not
suitable, let alone optimal, for the subsequent clustering. In addition,
nonlinear similarity often exists in many real world data which, however, has
not been effectively considered by most existing methods. To tackle these two
challenges, we propose a model to simultaneously learn cluster indicator matrix
and similarity information in kernel spaces in a principled way. We show
theoretical relationships to kernel k-means, k-means, and spectral clustering
methods. Then, to address the practical issue of how to select the most
suitable kernel for a particular clustering task, we further extend our model
with a multiple kernel learning ability. With this joint model, we can
automatically accomplish three subtasks of finding the best cluster indicator
matrix, the most accurate similarity relations and the optimal combination of
multiple kernels. By leveraging the interactions between these three subtasks
in a joint framework, each subtask can be iteratively boosted by using the
results of the others towards an overall optimal solution. Extensive
experiments are performed to demonstrate the effectiveness of our method. Comment: Published in AAAI 201
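The idea of combining multiple kernels into one similarity matrix can be illustrated with a small sketch. It is not the paper's model: the weights here are fixed by hand rather than learned jointly with the cluster indicator matrix, and the convex (non-negative, normalized) combination is an assumption. The function names `gaussian_kernel` and `combine_kernels` and the sample points are illustrative.

```python
import math


def gaussian_kernel(points, bandwidth):
    """Gaussian (RBF) kernel matrix for a list of equal-length coordinate tuples."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [[math.exp(-sq_dist(p, q) / (2.0 * bandwidth ** 2)) for q in points]
            for p in points]


def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices; weights are normalized to sum to 1."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    n = len(kernels[0])
    return [[sum((w / total) * k[i][j] for w, k in zip(weights, kernels))
             for j in range(n)]
            for i in range(n)]


# Hypothetical data: two nearby points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
kernels = [gaussian_kernel(points, bandwidth=b) for b in (0.5, 2.0)]
similarity = combine_kernels(kernels, weights=[1.0, 1.0])
```

In the combined matrix the two nearby points have a higher similarity than either has with the distant point, which is the structure a downstream clustering step would exploit.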
Electron-hadron shower discrimination in a liquid argon time projection chamber
By exploiting structural differences between electromagnetic and hadronic showers in a multivariate analysis, we present an efficient electron-hadron discrimination algorithm for liquid argon time projection chambers, validated using Geant4 simulated data.