
    Scalable Nearest Neighbor Search with Compact Codes

    An important characteristic of the recent decade is the dramatic growth in the use and generation of data. From collections of images, documents and videos, to genetic data, and to network traffic statistics, modern technologies and cheap storage have made it possible to accumulate huge datasets. But how can we effectively use all this data? The growing size of modern datasets makes it crucial to develop new algorithms and tools capable of sifting through this data efficiently. A central computational primitive for analyzing large datasets is the Nearest Neighbor Search problem, in which the goal is to preprocess a set of objects so that later, given a query object, one can find the data object closest to the query. In most situations involving high-dimensional objects, exhaustive search, which compares the query with every item in the dataset, has a prohibitive cost in both runtime and memory. This thesis focuses on the design of algorithms and tools for fast and cost-efficient nearest neighbor search. The proposed techniques advocate the use of compressed and discrete codes for representing the neighborhood structure of data in a compact way. Transforming high-dimensional items, such as raw images, into similarity-preserving compact codes has both computational and storage advantages: compact codes can be stored efficiently using only a few bits per data item and, more importantly, they can be compared extremely fast using bit-wise or look-up table operators. Motivated by this view, the present work explores two main research directions: 1) finding mappings that better preserve the given notion of similarity while keeping the codes as compressed as possible, and 2) building efficient data structures that support non-exhaustive search among the compact codes. Our large-scale experimental results, reported on various benchmarks including datasets of up to one billion items, show a boost in retrieval performance in comparison to the state of the art.
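    The bit-wise comparison of compact codes mentioned in the abstract can be illustrated with a minimal sketch. It assumes codes are stored as plain Python integers and compared by Hamming distance (XOR followed by a bit count). The function names `hamming` and `nearest_code` and the 8-bit sample codes are hypothetical, chosen only for illustration; the linear scan shown here is the exhaustive baseline, not the non-exhaustive data structures the thesis proposes.

```python
from typing import Sequence, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes (XOR, then popcount)."""
    return bin(a ^ b).count("1")


def nearest_code(query: int, codes: Sequence[int]) -> Tuple[int, int]:
    """Exhaustive scan: return (index, distance) of the code closest to the query."""
    best_index = min(range(len(codes)), key=lambda i: hamming(query, codes[i]))
    return best_index, hamming(query, codes[best_index])


# Hypothetical 8-bit codes standing in for learned similarity-preserving codes.
codes = [0b10110010, 0b01101100, 0b10110110, 0b00001111]
query = 0b10110111

index, distance = nearest_code(query, codes)
print(index, distance)  # prints "2 1"
```

    Each comparison costs one XOR and one bit count regardless of the dimensionality of the original item, which is where the speed advantage described in the abstract comes from.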

    Advancements in seismic tomography with application to tunnel detection and volcano imaging

    Thesis (Ph.D.) University of Alaska Fairbanks, 1998. Practical geotomography is an inverse problem with no unique solution. A priori information must be imposed for a stable solution to exist. Commonly used types of a priori information smooth and attenuate anomalies, resulting in 'blurred' tomographic images. Small or discrete anomalies, such as tunnels, magma conduits, or buried channels, are extremely difficult imaging objectives. Composite distribution inversion (CDI) is introduced as a theory seeking physically simple, rather than distributionally simple, solutions of non-unique problems. Parameters are assumed to be members of a composite population, including both well-known and anomalous components. Discrete and large amplitude anomalies are allowed, while a well-conditioned inverse is maintained. Tunnel detection is demonstrated using CDI tomography and data collected near the northern border of South Korea. Accurate source and receiver location information is necessary. Borehole deviation corrections are estimated by minimizing the difference between empirical distributions of apparent parameter values as a function of location correction. Improved images result. Traveltime computation and raytracing are the most computationally intensive components of seismic tomography when imaging structurally complex media. Efficient, accurate, and robust raytracing is possible by first recovering approximate raypaths from traveltime fields, and then refining the raypaths to a desired accuracy level. Dynamically binned queuing is introduced. The approach optimizes graph-theoretic traveltime computation costs. Pseudo-bending is modified to efficiently refine raypaths in general media. Hypocentral location density functions and relative phase arrival population analysis are used to investigate the spring 1996 earthquake swarm at Akutan Volcano, Alaska. The main swarm is postulated to have been associated with a 0.2 km³ intrusion at a depth of less than four kilometers. Decay sequence seismicity is postulated to be a passive response to the stress transient caused by the intrusion. Tomograms are computed for Mt. Spurr, Augustine, and Redoubt Volcanoes, Alaska. Relatively large amplitude, shallow anomalies explain most of the traveltime residual. No large amplitude anomalies are found at depth, and no magma storage areas are imaged. A large amplitude low-velocity anomaly is coincident with a previously proposed geothermal region on the southeast flank of Mt. Spurr. Mt. St. Augustine is found to have a high-velocity core.

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    LIPIcs, Volume 244, ESA 2022, Complete Volume