Reconfigurable Inverted Index
Existing approximate nearest neighbor search systems suffer from two
fundamental problems that are of practical importance but have not received
sufficient attention from the research community. First, although existing
systems perform well for the whole database, it is difficult to run a search
over a subset of the database. Second, little attention has been paid to the
performance degradation that occurs after many new items are added to a system.
We develop a reconfigurable inverted index (Rii) to resolve these two issues.
Based on the standard IVFADC system, we design a data layout such that items
are stored linearly. This enables us to efficiently run a subset search by
switching the search method to a linear PQ scan if the size of a subset is
small. Owing to the linear layout, the data structure can be dynamically
adjusted after new items are added, maintaining the fast speed of the system.
Extensive comparisons show that Rii achieves performance comparable to
state-of-the-art systems such as Faiss.
Comment: ACMMM 2018 (oral). Code: https://github.com/matsui528/ri
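In the small-subset case described above, the search reduces to a linear scan of PQ codes using an asymmetric distance table. A minimal NumPy sketch of such a scan (the names, shapes, and 8-bit sub-codebooks are illustrative assumptions, not Rii's actual API):

```python
import numpy as np

def pq_linear_scan(query, codes, codebooks, subset_ids, topk=3):
    """Asymmetric-distance linear scan over a subset of PQ codes.

    query:      (D,) float query vector
    codes:      (N, M) uint8 array, one M-byte PQ code per database item
    codebooks:  (M, 256, D // M) sub-codebooks
    subset_ids: integer ids of the items to search (the small-subset case)
    """
    M, _, d_sub = codebooks.shape
    # Distance lookup table: table[m, k] = ||query_m - codebooks[m, k]||^2
    q = query.reshape(M, d_sub)
    table = ((codebooks - q[:, None, :]) ** 2).sum(axis=2)   # (M, 256)
    # Each item's distance is the sum of M table lookups -- no decoding needed
    sub = codes[subset_ids]                                  # (S, M)
    dists = table[np.arange(M), sub].sum(axis=1)             # (S,)
    order = np.argsort(dists)[:topk]
    return subset_ids[order], dists[order]
```

Because the codes sit in one linear array, the scan touches a contiguous slice of memory; once the subset grows past a threshold, an IVFADC-style index would switch back to probing inverted lists instead.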
GGNN: Graph-based GPU Nearest Neighbor Search
Approximate nearest neighbor (ANN) search in high dimensions is an integral
part of several computer vision systems and gains importance in deep learning
with explicit memory representations. Since PQT and FAISS started to leverage
the massive parallelism offered by GPUs, GPU-based implementations are a
crucial resource for today's state-of-the-art ANN methods. While most of these
methods allow for faster queries, less emphasis is devoted to accelerating the
construction of the underlying index structures. In this paper, we propose a
novel search structure based on nearest neighbor graphs and information
propagation on graphs. Our method is designed to take advantage of GPU
architectures to accelerate the hierarchical building of the index structure
and for performing the query. Empirical evaluation shows that GGNN
significantly surpasses the state-of-the-art GPU- and CPU-based systems in
terms of build-time, accuracy, and search speed.
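The query-side primitive that graph indexes such as GGNN accelerate is best-first traversal of a nearest-neighbor graph. A CPU-only Python sketch of that traversal (illustrative; it omits the GPU batching and the hierarchical index construction that are the paper's actual contribution):

```python
import heapq
import numpy as np

def greedy_graph_search(query, vectors, neighbors, entry, topk=3, ef=10):
    """Best-first search over a nearest-neighbor graph.

    neighbors[i] lists the graph edges of node i; `ef` bounds the pool of
    best-so-far results and trades accuracy for speed.
    """
    dist = lambda i: float(np.sum((vectors[i] - query) ** 2))
    visited = {entry}
    candidates = [(dist(entry), entry)]       # min-heap: frontier to expand
    results = [(-candidates[0][0], entry)]    # max-heap (negated distances)
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0] and len(results) >= ef:
            break                             # frontier cannot improve results
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(nb)
            heapq.heappush(candidates, (dn, nb))
            heapq.heappush(results, (-dn, nb))
            if len(results) > ef:
                heapq.heappop(results)        # drop the current worst result
    return sorted((-d, i) for d, i in results)[:topk]
```

On a well-connected graph the walk descends monotonically toward the query; GPU implementations run many such walks in parallel and build the graph itself hierarchically.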
Survey of Vector Database Management Systems
There are now over 20 commercial vector database management systems (VDBMSs),
all produced within the past five years. But embedding-based retrieval has been
studied for over ten years, and similarity search for more than half a century.
Driving this shift from algorithms to systems are new data-intensive
applications, notably large language models, that demand vast stores of
unstructured data coupled with reliable, secure, fast, and scalable query
processing capability. A variety of new data management techniques now exist
for addressing these needs; however, there is no comprehensive survey to
thoroughly review these techniques and systems. We start by identifying five
main obstacles to vector data management, namely vagueness of semantic
similarity, large size of vectors, high cost of similarity comparison, lack of
natural partitioning that can be used for indexing, and difficulty of
efficiently answering hybrid queries that require both attributes and vectors.
Overcoming these obstacles has led to new approaches to query processing,
storage and indexing, and query optimization and execution. For query
processing, a variety of similarity scores and query types are now well
understood; for storage and indexing, techniques include vector compression,
namely quantization, and partitioning based on randomization, learning
partitioning, and navigable partitioning; for query optimization and execution,
we describe new operators for hybrid queries, as well as techniques for plan
enumeration, plan selection, and hardware accelerated execution. These
techniques lead to a variety of VDBMSs across a spectrum of design and runtime
characteristics, including native systems specialized for vectors and extended
systems that incorporate vector capabilities into existing systems. We then
discuss benchmarks, and finally we outline research challenges and point the
direction for future work.
Comment: 25 pages
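Of the compression techniques the survey lists under storage and indexing, per-dimension scalar quantization is the simplest to illustrate: each float32 dimension is mapped to a uint8 code, giving roughly 4x compression with a reconstruction error bounded by half a quantization step. A hedged sketch (not any particular VDBMS's implementation):

```python
import numpy as np

def scalar_quantize(vectors):
    """Per-dimension scalar quantization of float vectors to uint8 codes.

    Returns the codes plus the (lo, scale) parameters needed to decode.
    Product quantization and learned variants are the heavier-duty
    alternatives covered by the survey.
    """
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)  # guard constant dims
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Approximate reconstruction; error is at most scale / 2 per dimension."""
    return codes.astype(np.float32) * scale + lo
```

Distances can then be computed either on reconstructed vectors or, as many systems do, directly in the compressed domain with lookup tables.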
The IceCube Neutrino Observatory: Instrumentation and Online Systems
The IceCube Neutrino Observatory is a cubic-kilometer-scale high-energy
neutrino detector built into the ice at the South Pole. Construction of
IceCube, the largest neutrino detector built to date, was completed in 2011 and
enabled the discovery of high-energy astrophysical neutrinos. We describe here
the design, production, and calibration of the IceCube digital optical module
(DOM), the cable systems, computing hardware, and our methodology for drilling
and deployment. We also describe the online triggering and data filtering
systems that select candidate neutrino and cosmic ray events for analysis. Due
to a rigorous pre-deployment protocol, 98.4% of the DOMs in the deep ice are
operating and collecting data. IceCube routinely achieves a detector uptime of
99% by emphasizing software stability and monitoring. Detector operations have
been stable since construction was completed, and the detector is expected to
operate at least until the end of the next decade.
Comment: 83 pages, 50 figures; updated with minor changes from journal review
and proofing
Operational experience, improvements, and performance of the CDF Run II silicon vertex detector
The Collider Detector at Fermilab (CDF) pursues a broad physics program at
Fermilab's Tevatron collider. Between Run II commissioning in early 2001 and
the end of operations in September 2011, the Tevatron delivered 12 fb-1 of
integrated luminosity of p-pbar collisions at sqrt(s)=1.96 TeV. Many physics
analyses undertaken by CDF require heavy flavor tagging with large charged
particle tracking acceptance. To realize these goals, in 2001 CDF installed
eight layers of silicon microstrip detectors around its interaction region.
These detectors were designed for 2--5 years of operation, radiation doses up
to 2 Mrad (0.02 MGy), and were expected to be replaced in 2004. The sensors were
not replaced, and the Tevatron run was extended for several years beyond its
design, exposing the sensors and electronics to much higher radiation doses
than anticipated. In this paper we describe the operational challenges
encountered over the past 10 years of running the CDF silicon detectors, the
preventive measures undertaken, and the improvements made along the way to
ensure their optimal performance for collecting high quality physics data. In
addition, we describe the quantities and methods used to monitor radiation
damage in the sensors for optimal performance and summarize the detector
performance quantities important to CDF's physics program, including vertex
resolution, heavy flavor tagging, and silicon vertex trigger performance.
Comment: Preprint accepted for publication in Nuclear Instruments and Methods
A (07/31/2013)
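As a quick sanity check on the design dose quoted above (1 rad = 0.01 Gy by definition of the gray, so 2 Mrad is 0.02 MGy):

```python
# 1 rad = 0.01 Gy by definition of the gray.
RAD_TO_GY = 0.01

dose_mrad = 2.0                           # CDF design dose, in megarad
dose_gy = dose_mrad * 1e6 * RAD_TO_GY     # megarad -> rad -> gray
dose_mgy = dose_gy / 1e6                  # gray -> megagray: 0.02 MGy
```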
Advanced electronic structure theory: from molecules to crystals
In this dissertation, theories for the ab initio description of the states of perfect semiconducting and insulating crystals are derived and applied. Electron correlations are treated thoroughly based on the Hartree-Fock approximation formulated in terms of Wannier orbitals. In part I of the treatise, I study the ground state of hydrogen-bonded hydrogen fluoride and hydrogen chloride zig-zag chains. I analyse the long-range contributions of electron correlations.
Thereby, I employ basis set extrapolation techniques, which have originally been developed for small molecules, to also obtain highly accurate binding energies of crystals. In part II of the thesis, I devise an ab initio description of the electron attachment and electron removal states of crystals using methods of quantum field theory. I harness the well-established algebraic diagrammatic construction scheme (ADC) to approximate the self-energy, used in conjunction with the Dyson equation, to determine the many-particle Green's function for crystals. Thereby, the translational symmetry of the problem and the locality of electron correlations are fully exploited. The resulting scheme is termed crystal orbital ADC (CO-ADC). It is applied to obtain the quasiparticle band structure of a hydrogen fluoride chain and a lithium fluoride crystal. In both cases, a very good agreement of my results with those determined with other methods is observed.
Lycoris -- a large-area, high-resolution beam telescope
A high-resolution beam telescope is one of the most important and demanding
infrastructure components at any test beam facility. Its main purpose is to
provide reference particle tracks from the incoming test beam particles to the
test beam users, which allows measurement of the performance of the
device-under-test (DUT). Lycoris, a six-plane compact beam telescope with an
active area of 10 × 10 cm² (extensible to 10 × 20 cm²) was installed at the
DESY II Test Beam Facility in 2019, to provide a precise momentum measurement
in a 1 T solenoid magnet or to provide tracking over a large area. The overall
design of Lycoris will be described as well as the performance of the chosen
silicon sensor. The 25 μm pitch micro-strip sensor used for Lycoris was
originally designed for the SiD detector concept for the International Linear
Collider.
It adopts a second metallization layer to route signals from strips to the
bump-bonded KPiX ASIC and uses a wire-bonded flex cable for the connection to
the DAQ and the power supply system. This arrangement eliminates the need for a
dedicated hybrid PCB. Its performance was tested for the first time in this
project. The system has been evaluated at the DESY II Test Beam Facility in
several test-beam campaigns and has demonstrated an average single-point
resolution of 7.07 μm.
Comment: 43 pages, 37 figures
An Analysis of Muon Neutrino Disappearance from the NuMI Beam Using an Optimal Track Fitter
Thesis (Ph.D.) - Indiana University, Physics, 2015
The NOvA experiment is a long-baseline neutrino oscillation experiment based at
Fermi National Accelerator Laboratory (Fermilab) that uses two liquid
scintillator detectors, one at Fermilab (the "near" detector) and a second 14
kton detector in northern Minnesota (the "far" detector). The primary physics
goals of the NOvA experiment are to measure neutrino mixing parameters through
both the disappearance and appearance channels using neutrinos from the newly
upgraded NuMI beam line. The NOvA disappearance analysis can significantly
improve the world's best measurement of the atmospheric oscillation parameters.
This analysis proceeds by using the measured charged-current energy spectrum in
the near detector to predict the spectrum in the far detector, and comparing
this prediction to the measured spectrum to obtain a best fit for the
oscillation parameters. Since this fit is governed by the shape of the energy
spectrum, the best fit will be improved by obtaining the best possible energy
resolution for the individual neutrino events. This dissertation describes an
alternate disappearance analysis technique for the NOvA experiment, based on
the idea that estimating the energy resolution of the individual events will
allow them to be separated into different energy-resolution samples in order to
improve the final fit. This involves using an optimal tracker to reconstruct
particle tracks and momenta, and multivariate methods for estimating the event
energies and energy resolutions. The data used for this analysis were taken by
the NOvA experiment from February 2014 to May 2015 using protons on target from
the NuMI beam. The best-fit oscillation parameters obtained by this alternate
technique are consistent with the hypothesis of maximal mixing, and with the
results from T2K and MINOS+ published in 2015.