201 research outputs found

    Efficient Querying from Weighted Binary Codes

    Binary codes are widely used to represent data due to their small storage footprint and efficient computation. However, there is an ambiguity problem: many binary codes share the same Hamming distance to a query. To alleviate the ambiguity problem, weighted binary codes assign a different weight to each bit and compare binary codes by the weighted Hamming distance. Until now, efficient querying from weighted binary codes has remained an open issue. In this paper, we propose a new method to rank the weighted binary codes and return the nearest weighted binary codes of the query efficiently. In our method, based on multi-index hash tables, two algorithms, the table bucket finding algorithm and the table merging algorithm, are proposed to select the nearest weighted binary codes of the query in a non-exhaustive and accurate way. The proposed algorithms are justified by proving their theoretical properties. Experiments on three large-scale datasets validate both the search efficiency and the search accuracy of our method. In particular, for up to one billion weighted binary codes, our method is more than 1000 times faster than a linear scan. Comment: 13 pages, accepted by AAAI202
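The ambiguity problem and the weighted Hamming distance described in this abstract can be illustrated with a minimal sketch. The codes, the weights, and the function names `hamming` and `weighted_hamming` are all hypothetical, and the ranking shown is the exhaustive linear-scan baseline, not the paper's multi-index hash table method.

```python
from typing import Sequence


def hamming(a: int, b: int) -> int:
    """Plain Hamming distance between two codes stored as integers."""
    return bin(a ^ b).count("1")


def weighted_hamming(a: int, b: int, weights: Sequence[float]) -> float:
    """Sum the weights of the bit positions where the two codes differ.

    weights[i] is the (illustrative) weight of bit i, counted from the
    least significant bit.
    """
    diff = a ^ b
    return sum(w for i, w in enumerate(weights) if (diff >> i) & 1)


# Made-up 4-bit example: two codes tie under the plain Hamming distance.
query = 0b0000
database = [0b0001, 0b1000, 0b0011]
weights = [0.5, 1.0, 2.0, 4.0]  # weight of bit 0 .. bit 3 (assumed values)

# Exhaustive (linear-scan) ranking by weighted Hamming distance.
ranked = sorted(database, key=lambda c: weighted_hamming(query, c, weights))
```

Here `0b0001` and `0b1000` are both at Hamming distance 1 from the query, but their weighted distances differ, so the weights break the tie. A linear scan like this touches every code, which is the cost a non-exhaustive method aims to avoid.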

    Constant Sequence Extension for Fast Search Using Weighted Hamming Distance

    Representing visual data using compact binary codes is attracting increasing attention, as binary codes can be used as direct indices into hash table(s) for fast non-exhaustive search. Recent methods show that ranking binary codes using weighted Hamming distance (WHD) rather than Hamming distance (HD), by generating query-adaptive weights for each bit, can better retrieve query-related items. However, search using WHD is slower than search using HD. One main challenge is that, for existing methods, the complexity of extending a monotone increasing sequence using WHD to probe buckets in hash table(s) is at least proportional to the square of the sequence length, while that using HD is proportional to the sequence length. To overcome this challenge, we propose a novel fast non-exhaustive search method using WHD. The key idea is to design a constant sequence extension algorithm that performs each sequence extension in constant computational complexity, so that the total complexity is proportional to the sequence length; this is justified by theoretical analysis. Experimental results show that our method is faster than other WHD-based search methods. Also, compared with the HD-based non-exhaustive search method, our method has comparable efficiency but retrieves more query-related items on datasets of up to one billion items.
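To make the "monotone increasing sequence" of bucket probes concrete, here is a hedged sketch of one generic way to enumerate bit-flip masks in nondecreasing weighted Hamming distance using a heap. It is not the paper's constant sequence extension algorithm: each step here costs a heap operation rather than constant time. The function name `probe_sequence` and the example weights are assumptions made for illustration.

```python
import heapq
from typing import Iterator, Sequence, Tuple


def probe_sequence(weights: Sequence[float]) -> Iterator[Tuple[float, int]]:
    """Yield (distance, flip_mask) pairs in nondecreasing weighted distance.

    Bit i of flip_mask is set when bit i of the query code is flipped to
    reach the probed bucket. Weights are assumed to be non-negative.
    """
    # Work on bit positions sorted by ascending weight.
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    w = [weights[i] for i in order]
    yield 0.0, 0  # the query's own bucket
    if not w:
        return
    # Heap entries: (distance, tuple of chosen positions in sorted order).
    heap = [(w[0], (0,))]
    while heap:
        dist, subset = heapq.heappop(heap)
        mask = 0
        for pos in subset:
            mask |= 1 << order[pos]
        yield dist, mask
        last = subset[-1]
        if last + 1 < len(w):
            nxt = last + 1
            # Extend: additionally flip the next heavier bit.
            heapq.heappush(heap, (dist + w[nxt], subset + (nxt,)))
            # Replace: swap the heaviest flipped bit for the next heavier one.
            heapq.heappush(heap, (dist - w[last] + w[nxt], subset[:-1] + (nxt,)))


# Illustrative 3-bit query-adaptive weights (made up).
example = list(probe_sequence([2.0, 0.5, 1.0]))
```

Each flip mask is generated exactly once, and every child pushed onto the heap is at least as far as its parent, which is what keeps the output order monotone.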

    ACIL: Analytic Class-Incremental Learning with Absolute Memorization and Privacy Protection

    Class-incremental learning (CIL) learns a classification model from training data whose classes arrive progressively. Existing CIL methods either suffer from serious accuracy loss due to catastrophic forgetting, or invade data privacy by revisiting used exemplars. Inspired by linear learning formulations, we propose analytic class-incremental learning (ACIL), which achieves absolute memorization of past knowledge while avoiding breaches of data privacy (i.e., without storing historical data). The absolute memorization is demonstrated in the sense that class-incremental learning using ACIL given only present data gives results identical to those of its joint-learning counterpart, which consumes both present and historical samples. This equality is theoretically validated. Data privacy is ensured since no historical data are involved during the learning process. Empirical validations demonstrate ACIL's competitive accuracy, with near-identical results for various incremental task settings (e.g., 5-50 phases). This also allows ACIL to outperform state-of-the-art methods in large-phase scenarios (e.g., 25 and 50 phases). Comment: published in NeurIPS 202

    Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

    A core-collapse supernova (CCSN) is one of the most energetic astrophysical events in the Universe. The early and prompt detection of neutrinos before (pre-SN) and during the SN burst is a unique opportunity to realize the multi-messenger observation of CCSN events. In this work, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton liquid scintillator detector under construction in South China. The real-time monitoring system is designed with both prompt monitors on the electronic board and online monitors at the data acquisition stage, in order to ensure both the alert speed and the alert coverage of progenitor stars. Assuming a false alert rate of 1 per year, this monitoring system can be sensitive to pre-SN neutrinos up to a distance of about 1.6 (0.9) kpc and to SN neutrinos up to about 370 (360) kpc for a progenitor mass of 30 $M_{\odot}$ in the case of normal (inverted) mass ordering. The pointing ability for the CCSN is evaluated using the accumulated event anisotropy of the inverse beta decay interactions from pre-SN or SN neutrinos, which, along with the early alert, can play an important role in follow-up multi-messenger observations of the next Galactic or nearby extragalactic CCSN. Comment: 24 pages, 9 figures

    Search for heavy resonances decaying to two Higgs bosons in final states containing four b quarks

    A search is presented for narrow heavy resonances X decaying into pairs of Higgs bosons (H) in proton-proton collisions collected by the CMS experiment at the LHC at $\sqrt{s} = 8$ TeV. The data correspond to an integrated luminosity of 19.7 fb$^{-1}$. The search considers HH resonances with masses between 1 and 3 TeV, having final states of two b quark pairs. Each Higgs boson is produced with large momentum, and the hadronization products of the pair of b quarks can usually be reconstructed as single large jets. The background from multijet and $t\bar{t}$ events is significantly reduced by applying requirements related to the flavor of the jet, its mass, and its substructure. The signal would be identified as a peak on top of the dijet invariant mass spectrum of the remaining background events. No evidence is observed for such a signal. Upper limits obtained at 95% confidence level for the product of the production cross section and branching fraction $\sigma(gg \to X)\,\mathcal{B}(X \to HH \to b\bar{b}b\bar{b})$ range from 10 to 1.5 fb for the mass of X from 1.15 to 2.0 TeV, significantly extending previous searches. For a warped extra dimension theory with a mass scale $\Lambda_R = 1$ TeV, the data exclude radion scalar masses between 1.15 and 1.55 TeV.

    Measurement of the top quark mass using charged particles in pp collisions at $\sqrt{s} = 8$ TeV

    Peer reviewed

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Online Hashing with Efficient Updating of Binary Codes

    Online hashing methods are efficient in learning hash functions from streaming data. However, when the hash functions change, the binary codes for the database have to be recomputed to guarantee retrieval accuracy. Recomputing the binary codes by accumulating the whole database poses a timeliness challenge for the online retrieval process. In this paper, we propose a novel online hashing framework that updates the binary codes efficiently without accumulating the whole database. In our framework, the hash functions are fixed and projection functions are introduced to learn online from the streaming data. As a result, the inefficient updating of binary codes by accumulating the whole database is replaced by efficient updating that projects the binary codes into another binary space. The queries and the binary code database are projected asymmetrically to further improve retrieval accuracy. Experiments on two multi-label image databases demonstrate the effectiveness and efficiency of our method for multi-label image retrieval.