350 research outputs found

    Balancing clusters to reduce response time variability in large scale image search

    Many algorithms for approximate nearest neighbor search in high-dimensional spaces partition the data into clusters. At query time, in order to avoid exhaustive search, an index selects the few (or a single) clusters nearest to the query point. Clusters are often produced by the well-known k-means approach since it has several desirable properties. On the downside, it tends to produce clusters having quite different cardinalities. Imbalanced clusters negatively impact both the variance and the expectation of query response times. This paper proposes to modify k-means centroids to produce clusters with more comparable sizes without sacrificing the desirable properties. Experiments with a large scale collection of image descriptors show that our algorithm significantly reduces the variance of response times without seriously impacting the search quality.
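
    To make the cost of imbalance concrete, here is a minimal sketch (not the paper's algorithm). It assumes that a query probes a single cluster and that queries fall into a cluster with probability proportional to its size; under those assumptions the expected number of vectors scanned is sum(n_i^2) / N. The function names are hypothetical.

```python
def expected_scan_cost(cluster_sizes):
    """Expected number of vectors scanned when one cluster is probed.

    Assumption: a query lands in cluster i with probability n_i / N,
    and scanning that cluster costs n_i distance computations.
    """
    total = sum(cluster_sizes)
    return sum(size * size for size in cluster_sizes) / total


def imbalance_factor(cluster_sizes):
    """k * sum(p_i^2): equals 1.0 for perfectly balanced clusters, larger otherwise."""
    k = len(cluster_sizes)
    total = sum(cluster_sizes)
    return k * sum(size * size for size in cluster_sizes) / (total * total)


balanced = [25, 25, 25, 25]
skewed = [70, 10, 10, 10]

print(expected_scan_cost(balanced), imbalance_factor(balanced))
print(expected_scan_cost(skewed), imbalance_factor(skewed))
```

    With the same 100 vectors and 4 clusters, the skewed partition roughly doubles the expected scan cost compared with the balanced one, which is the effect on expected response time that the abstract describes.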

    Integration of Exploration and Search: A Case Study of the M3 Model

    Effective support for multimedia analytics applications requires exploration and search to be integrated seamlessly into a single interaction model. Media metadata can be seen as defining a multidimensional media space, casting multimedia analytics tasks as exploration, manipulation and augmentation of that space. We present an initial case study of integrating exploration and search within this multidimensional media space. We extend the M3 model, initially proposed as a pure exploration tool, and show that it can be elegantly extended to allow searching within an exploration context and exploring within a search context. We then evaluate the suitability of relational database management systems, as representatives of today’s data management technologies, for implementing the extended M3 model. Based on our results, we finally propose some research directions for scalability of multimedia analytics.

    Privacy-Preserving Outsourced Media Search

    This work proposes a privacy-protection framework for an important application called outsourced media search. This scenario involves a data owner, a client, and an untrusted server, where the owner outsources a search service to the server. Due to lack of trust, the privacy of the client and the owner should be protected. The framework relies on multimedia hashing and symmetric encryption. It requires involved parties to participate in a privacy-enhancing protocol. Additional processing steps are carried out by the owner and the client: (i) before outsourcing low-level media features to the server, the owner has to one-way hash them, and partially encrypt each hash-value; (ii) the client completes the similarity search by re-ranking the most similar candidates received from the server. One-way hashing and encryption add ambiguity to data and make it difficult for the server to infer contents from database items and queries, so the privacy of both the owner and the client is enforced. The proposed framework realizes trade-offs among strength of privacy enforcement, quality of search, and complexity, because the information loss can be tuned during hashing and encryption. Extensive experiments demonstrate the effectiveness and the flexibility of the framework.

    Scalability of the NV-tree: Three Experiments

    The NV-tree is a scalable approximate high-dimensional indexing method specifically designed for large-scale visual instance search. In this paper, we report on three experiments designed to evaluate the performance of the NV-tree. Two of these experiments embed standard benchmarks within collections of up to 28.5 billion features, representing the largest single-server collection ever reported in the literature. The results show that indeed the NV-tree performs very well for visual instance search applications over large-scale collections.

    The role of local dimensionality measures in benchmarking nearest neighbor search

    This paper reconsiders common benchmarking approaches to nearest neighbor search. It is shown that the concepts of local intrinsic dimensionality (LID), local relative contrast (RC), and query expansion make it possible to choose query sets covering a wide range of difficulty for real-world datasets. Moreover, the effect of the distribution of these dimensionality measures on the running time performance of implementations is empirically studied. To this end, different visualization concepts are introduced that give a more fine-grained view of the inner workings of nearest neighbor search principles. Interactive visualizations are available on the companion website. The paper closes with remarks about the diversity of datasets commonly used for nearest neighbor search benchmarking. It is shown that such real-world datasets are not diverse: results on a single dataset predict results on all other datasets well.
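
    For readers unfamiliar with the two difficulty measures, the sketch below shows how they are commonly computed from a query's distances. It assumes the widely used maximum-likelihood estimator of LID, LID = -1 / mean(ln(r_i / r_k)) over the k nearest-neighbor distances, and a relative contrast defined as the mean distance to the dataset divided by the distance to the k-th neighbor. Whether the paper uses exactly these variants is an assumption, and the function names are hypothetical.

```python
import math


def lid_mle(knn_distances):
    """Maximum-likelihood LID estimate from a query's k-NN distances.

    Assumption: distances are positive; zero distances are dropped.
    """
    radii = sorted(d for d in knn_distances if d > 0)
    r_k = radii[-1]  # distance to the k-th (farthest) neighbor
    mean_log = sum(math.log(r / r_k) for r in radii) / len(radii)
    if mean_log == 0:
        return math.inf  # all neighbors at the same distance
    return -1.0 / mean_log


def relative_contrast(all_distances, k):
    """Mean distance to the dataset divided by the k-th smallest distance."""
    ordered = sorted(all_distances)
    return (sum(ordered) / len(ordered)) / ordered[k - 1]


# Synthetic check: in a d-dimensional uniform ball the i-th of k neighbors
# sits at a radius of roughly (i / k) ** (1 / d), so the estimate should be near d.
d, k = 5, 1000
radii = [(i / k) ** (1 / d) for i in range(1, k + 1)]
print(lid_mle(radii))
print(relative_contrast([1.0, 2.0, 3.0, 4.0], k=1))
```

    A high LID or a relative contrast close to 1 indicates a query whose neighbors are hard to separate from the rest of the data, which is how such measures can be used to grade query difficulty.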

    Do antibiotics have environmental side-effects? Impact of synthetic antibiotics on biogeochemical processes

    Antibiotic use in the early 1900s vastly improved human health but at the same time started an arms race of antibiotic resistance. The widespread use of antibiotics has resulted in ubiquitous trace concentrations of many antibiotics in most environments. Little is known about the impact of these antibiotics on microbial processes or “non-target” organisms. This mini-review summarizes our knowledge of the effect of synthetically produced antibiotics on microorganisms involved in biogeochemical cycling. We found only 31 articles that dealt with the effects of antibiotics on such processes in soil, sediment, or freshwater. We compare the processes, antibiotics, concentration range, source, environment, and experimental approach of these studies. Examining the effects of antibiotics on biogeochemical processes should involve environmentally relevant concentrations (instead of therapeutic), chronic exposure (versus acute), and monitoring of the administered antibiotics. Furthermore, the lack of standardized tests hinders generalizations regarding the effects of antibiotics on biogeochemical processes. We investigated the effects of antibiotics on biogeochemical N cycling, specifically nitrification, denitrification, and anammox. We found that environmentally relevant concentrations of fluoroquinolones and sulfonamides could partially inhibit denitrification. So far, the only documented inhibition of anammox activity by antibiotics was at therapeutic doses. The most studied and most inhibited process was nitrification (25–100 %), mainly at therapeutic doses and rarely at environmentally relevant ones. We caution that firm conclusions regarding inhibition by antibiotics at environmentally relevant concentrations remain difficult due to the lack of studies testing low concentrations at chronic exposure. There is thus a need to test the effects of these environmental concentrations on biogeochemical processes to further establish the possible effects on ecosystem functioning.

    Characterization of the pace-and-drive capacity of the human sinoatrial node: A 3D in silico study

    The sinoatrial node (SAN) is a complex structure that spontaneously depolarizes rhythmically (“pacing”) and excites the surrounding non-automatic cardiac cells (“drive”) to initiate each heart beat. However, the mechanisms by which the SAN cells can activate the large and hyperpolarized surrounding cardiac tissue are incompletely understood. Experimental studies demonstrated the presence of an insulating border that separates the SAN from the hyperpolarizing influence of the surrounding myocardium, except at a discrete number of sinoatrial exit pathways (SEPs). We propose a highly detailed 3D model of the human SAN, including 3D SEPs to study the requirements for successful electrical activation of the primary pacemaking structure of the human heart. A total of 788 simulations investigate the ability of the SAN to pace and drive with different heterogeneous characteristics of the nodal tissue (gradient and mosaic models) and myocyte orientation. A sigmoidal distribution of the tissue conductivity combined with a mosaic model of SAN and atrial cells in the SEP was able to drive the right atrium (RA) at varying rates induced by gradual I_f block. Additionally, we investigated the influence of the SEPs by varying their number, length, and width. SEPs created a transition zone of transmembrane voltage and ionic currents to enable successful pace and drive. Unsuccessful simulations showed a hyperpolarized transmembrane voltage (−66 mV), which blocked the L-type channels and attenuated the sodium-calcium exchanger. The fiber direction influenced the SEPs that preferentially activated the crista terminalis (CT). The location of the leading pacemaker site (LPS) shifted toward the SEP-free areas. LPSs were located closer to the SEP-free areas (3.46 ± 1.42 mm), where the hyperpolarizing influence of the CT was reduced, compared with a larger distance from the LPS to the areas where SEPs were located (7.17 ± 0.98 mm). This study identified the geometrical and electrophysiological aspects of the 3D SAN-SEP-CT structure required for successful pace and drive in silico.

    Understanding the Security and Robustness of SIFT

    Many content-based image retrieval systems (CBIRS) describe images using the SIFT local features because they provide very robust recognition capabilities. While SIFT features have proved to cope with a wide spectrum of general-purpose image distortions, their security has not yet been fully assessed. Hsu et al. show that very specific anti-SIFT attacks can jeopardize the keypoint detection. These attacks can delude systems that use SIFT in applications such as image authentication and (pirated) copy detection. Having some expertise in CBIRS, we were extremely concerned by their analysis. This paper presents our own investigations on the impact of these anti-SIFT attacks on a real CBIRS indexing a large collection of images. The attacks are indeed not able to break the system. A detailed analysis explains this assessment.

    Searching in one billion vectors: re-rank with source coding

    Recent indexing techniques inspired by source coding have proven successful at indexing billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculation on the short-list of hypotheses, the estimated distances are refined based on short quantization codes, to avoid reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and proposed an experimental setup to evaluate high dimensional indexing algorithms on a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses using little memory compared to the full vector representation. Comment: International Conference on Acoustics, Speech and Signal Processing, Prague, Czech Republic (2011)
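
    The two-stage idea can be sketched as follows. This is an illustration only: the paper's compressed-domain index is replaced here by simple scalar rounding quantizers, and every name and step size is a hypothetical stand-in. Each vector is stored as a coarse code plus a short code for its quantized residual; the short-list is built from the coarse codes and then re-ranked with the refined reconstruction, without reading the original vectors.

```python
def quantize(vec, step):
    """Stand-in quantizer: round each coordinate to the nearest multiple of step."""
    return [round(x / step) * step for x in vec]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(vec, coarse_step=1.0, fine_step=0.25):
    """Return (coarse reconstruction, quantized residual) for one vector."""
    coarse = quantize(vec, coarse_step)
    residual = [x - c for x, c in zip(vec, coarse)]
    return coarse, quantize(residual, fine_step)


def search(query, codes, shortlist_size, top_k):
    # Stage 1: rank every item by its distance estimate from the coarse code.
    shortlist = sorted(
        range(len(codes)), key=lambda i: sq_dist(query, codes[i][0])
    )[:shortlist_size]

    # Stage 2: re-rank the short-list with the refined reconstruction
    # (coarse + quantized residual); the full vectors are never read.
    def refined_dist(i):
        coarse, fine = codes[i]
        return sq_dist(query, [c + f for c, f in zip(coarse, fine)])

    return sorted(shortlist, key=refined_dist)[:top_k]


database = [[0.1, 0.1], [0.4, 0.4], [3.0, 3.0]]
codes = [encode(v) for v in database]
print(search([0.45, 0.45], codes, shortlist_size=2, top_k=2))
```

    In this toy example the coarse codes cannot distinguish the first two vectors, while the residual codes move the true nearest neighbor (index 1) to the top of the short-list.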