Fast Locality-Sensitive Hashing Frameworks for Approximate Near Neighbor Search
The Indyk-Motwani Locality-Sensitive Hashing (LSH) framework (STOC 1998) is a
general technique for constructing a data structure to answer approximate near
neighbor queries by using a distribution $\mathcal{H}$ over locality-sensitive
hash functions that partition space. For a collection of $n$ points, after
preprocessing, the query time is dominated by $O(n^{\rho} \log n)$ evaluations
of hash functions from $\mathcal{H}$ and $O(n^{\rho})$ hash table lookups and
distance computations, where $\rho \in (0,1)$ is determined by the
locality-sensitivity properties of $\mathcal{H}$. It follows from a recent
result by Dahlgaard et al. (FOCS 2017) that the number of locality-sensitive
hash functions can be reduced to $O(\log^2 n)$, leaving the query time to be
dominated by $O(n^{\rho})$ distance computations and $O(n^{\rho} \log n)$
additional word-RAM operations. We state this result as a general framework and
provide a simpler analysis showing that the number of lookups and distance
computations closely matches the Indyk-Motwani framework, making it a viable
replacement in practice. Using ideas from another locality-sensitive hashing
framework by Andoni and Indyk (SODA 2006), we are able to reduce the number of
additional word-RAM operations to $O(n^{\rho})$.
Comment: 15 pages, 3 figures
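As a rough illustration of the classic Indyk-Motwani scheme that the abstract takes as its baseline, the sketch below builds L hash tables, each keyed by k concatenated hash functions, and scans the colliding buckets at query time. It is not the faster framework the paper proposes. The choice of bit sampling over binary vectors (Hamming distance) and the names `build_index`, `query`, `k`, `L`, and `radius` are assumptions made for this example.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of coordinates in which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def build_index(points, k, L, seed=0):
    """Build L hash tables; each key concatenates k sampled coordinates.

    Bit sampling is an assumed choice of locality-sensitive family,
    suitable for Hamming distance over binary vectors.
    """
    rng = random.Random(seed)
    d = len(points[0])
    projections = [[rng.randrange(d) for _ in range(k)] for _ in range(L)]
    tables = [defaultdict(list) for _ in range(L)]
    for idx, p in enumerate(points):
        for proj, table in zip(projections, tables):
            table[tuple(p[i] for i in proj)].append(idx)
    return projections, tables


def query(q, points, projections, tables, radius):
    """Return the index of some point within `radius` of q, or None."""
    seen = set()
    for proj, table in zip(projections, tables):
        key = tuple(q[i] for i in proj)      # k hash evaluations per table
        for idx in table.get(key, ()):       # one table lookup
            if idx in seen:
                continue
            seen.add(idx)
            if hamming(q, points[idx]) <= radius:  # one distance computation
                return idx
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [tuple(rng.randrange(2) for _ in range(16)) for _ in range(50)]
    proj, tabs = build_index(pts, k=4, L=5)
    print(query(pts[3], pts, proj, tabs, radius=0))
```

The three costs named in the abstract appear as separate steps in `query`: hash evaluations to form each key, table lookups, and distance computations on the candidates.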
Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search
Retrieval pipelines commonly rely on a term-based search to obtain candidate
records, which are subsequently re-ranked. Some candidates are missed by this
approach, e.g., due to a vocabulary mismatch. We address this issue by
replacing the term-based search with a generic k-NN retrieval algorithm, where
a similarity function can take into account subtle term associations. While an
exact brute-force k-NN search using this similarity function is slow, we
demonstrate that an approximate algorithm can be nearly two orders of magnitude
faster at the expense of only a small loss in accuracy. A retrieval pipeline
using an approximate k-NN search can be more effective and efficient than the
term-based pipeline. This opens up new possibilities for designing effective
retrieval pipelines. Our software (including data-generating code) and
derivative data based on the Stack Overflow collection are available online.
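The exact brute-force k-NN baseline mentioned in this abstract can be sketched as scoring every record with a similarity function and keeping the top k. The function names and the toy `term_overlap` similarity are placeholders invented for this example; the paper's similarity function, which captures subtle term associations, is not reproduced here.

```python
import heapq


def term_overlap(a, b):
    """Toy similarity (an assumption): number of shared whitespace tokens."""
    return len(set(a.split()) & set(b.split()))


def brute_force_knn(query, records, similarity, k):
    """Exact k-NN: evaluate the similarity against every record, keep the best k."""
    return heapq.nlargest(k, records, key=lambda r: similarity(query, r))


if __name__ == "__main__":
    docs = ["sort list in python", "parse json", "reverse a list"]
    print(brute_force_knn("sort a list", docs, term_overlap, k=2))
```

Every query costs one similarity evaluation per record, which is the cost an approximate k-NN index is meant to avoid.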
Vertex Sparsifiers: New Results from Old Techniques
Given a capacitated graph $G = (V, E)$ and a set of terminals $K \subseteq V$,
how should we produce a graph $H$ only on the terminals $K$ so that every
(multicommodity) flow between the terminals in $G$ could be supported in $H$
with low congestion, and vice versa? (Such a graph $H$ is called a
flow-sparsifier for $G$.) What if we want $H$ to be a "simple" graph? What if
we allow $H$ to be a convex combination of simple graphs?
Improving on results of Moitra [FOCS 2009] and Leighton and Moitra [STOC
2010], we give efficient algorithms for constructing: (a) a flow-sparsifier $H$
that maintains congestion up to a factor of $O(\log k / \log \log k)$, where
$k = |K|$, (b) a convex combination of trees over the terminals $K$ that maintains
congestion up to a factor of $O(\log k)$, and (c) for a planar graph $G$, a
convex combination of planar graphs that maintains congestion up to a constant
factor. This requires us to give a new algorithm for the 0-extension problem,
the first one in which the preimages of each terminal are connected in $G$.
Moreover, this result extends to minor-closed families of graphs.
Our improved bounds immediately imply improved approximation guarantees for
several terminal-based cut and ordering problems.
Comment: An extended abstract appears in the 13th International Workshop on
Approximation Algorithms for Combinatorial Optimization Problems (APPROX),
2010. Final version to appear in SIAM J. Computing.
MultiPic: a standardized set of 750 drawings with norms for six European languages
Numerous studies in psychology, cognitive neuroscience and psycholinguistics have used pictures of objects as stimulus materials. Currently, authors engaged in cross-linguistic work or wishing to run parallel studies at multiple sites where different languages are spoken must rely on rather small sets of black-and-white or colored line drawings. These sets are increasingly experienced as being too limited. Therefore, we constructed a new set of 750 colored pictures of concrete concepts. This set, MultiPic, constitutes a new valuable tool for cognitive scientists investigating language, visual perception, memory and/or attention in monolingual or multilingual populations. Importantly, the MultiPic databank has been normed in six different European languages (British English, Spanish, French, Dutch, Italian and German). All stimuli and norms are freely available at http://www.bcbl.eu/databases/multipi
The prevalence of axial spondyloarthritis in the UK: a cross-sectional cohort study
Background: Accurate prevalence data are important when interpreting diagnostic tests and planning for the health needs of a population, yet no such data exist for axial spondyloarthritis (axSpA) in the UK. In this cross-sectional cohort study we aimed to estimate the prevalence of axSpA in a UK primary care population. Methods: A validated self-completed questionnaire was used to screen primary care patients with low back pain for inflammatory back pain (IBP). Patients with a verifiable pre-existing diagnosis of axSpA were included as positive cases. All other patients meeting the Assessment of SpondyloArthritis international Society (ASAS) IBP criteria were invited to undergo further assessment including MRI scanning, allowing classification according to the European Spondyloarthropathy Study Group (ESSG) and ASAS axSpA criteria, and the modified New York (mNY) criteria for ankylosing spondylitis (AS). Results: Of 978 questionnaires sent to potential participants 505 were returned (response rate 51.6 %). Six subjects had a prior diagnosis of axSpA, 4 of whom met mNY criteria. Thirty eight of 75 subjects meeting ASAS IBP criteria attended review (mean age 53.5 years, 37 % male). The number of subjects satisfying classification criteria was 23 for ESSG, 3 for ASAS (2 clinical, 1 radiological) and 1 for mNY criteria. This equates to a prevalence of 5.3 % (95 % CI 4.0, 6.8) using ESSG, 1.3 % (95 % CI 0.8, 2.3) using ASAS, 0.66 % (95 % CI 0.28, 1.3) using mNY criteria in chronic back pain patients, and 1.2 % (95 % CI 0.9, 1.4) using ESSG, 0.3 % (95 % CI 0.13, 0.48) using ASAS, 0.15 % (95 % CI 0.02, 0.27) using mNY criteria in the general adult primary care population. Conclusions: These are the first prevalence estimates for axSpA in the UK, and will be of importance in planning for the future healthcare needs of this population. Trial registration: Current Controlled Trials ISRCTN7687321
Quantization and Compressive Sensing
Quantization is an essential step in digitizing signals, and, therefore, an
indispensable component of any modern acquisition system. This book chapter
explores the interaction of quantization and compressive sensing and examines
practical quantization strategies for compressive acquisition systems.
Specifically, we first provide a brief overview of quantization and examine
fundamental performance bounds applicable to any quantization approach. Next,
we consider several forms of scalar quantizers, namely uniform, non-uniform,
and 1-bit. We provide performance bounds and fundamental analysis, as well as
practical quantizer designs and reconstruction algorithms that account for
quantization. Furthermore, we provide an overview of Sigma-Delta
($\Sigma\Delta$) quantization in the compressed sensing context, and also
discuss implementation issues, recovery algorithms and performance bounds. As
we demonstrate, proper accounting for quantization and careful quantizer design
have a significant impact on the performance of a compressive acquisition system.
Comment: 35 pages, 20 figures, to appear in Springer book "Compressed Sensing
and Its Applications", 201
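Two of the scalar quantizers named in this abstract, uniform and 1-bit, can be illustrated with a minimal sketch. The mid-rise convention, the step size, and the function names are assumptions chosen for the example; the chapter's own designs and reconstruction algorithms are not reproduced.

```python
import math


def uniform_quantize(x, step):
    """Mid-rise uniform scalar quantizer (an assumed convention).

    Maps x to the centre of its cell of width `step`, so the absolute
    quantization error is at most step / 2.
    """
    return step * (math.floor(x / step) + 0.5)


def one_bit_quantize(x):
    """1-bit quantizer: keep only the sign of the measurement."""
    return 1 if x >= 0 else -1


if __name__ == "__main__":
    measurements = [0.37, -1.2, 0.05]
    print([uniform_quantize(m, 0.25) for m in measurements])
    print([one_bit_quantize(m) for m in measurements])
```

The 1-bit quantizer discards all amplitude information, which is why recovery from 1-bit measurements needs algorithms that account for the quantization explicitly.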
Promoting the use of Motor Function Measure (MFM) as outcome measure in patients with Duchenne Muscular Dystrophy (DMD) treated by corticosteroids
Objectives: Assessing muscle function is a key step in measuring changes and evaluating the outcomes of therapeutic interventions in Duchenne Muscular Dystrophy (DMD). Given the widespread use of corticosteroids (CS) in this population to delay the loss of function, our goal was to monitor the evolution of motor function in patients with DMD treated by CS and to study the responsiveness of the Motor Function Measure (MFM) in this population in order to provide an estimation of the number of subjects needed for a clinical trial. Method: A total of 76 patients with DMD, aged 5.9 to 11.8 years, with at least 6 months of follow-up and 2 MFM assessments were enrolled, 30 in the CS treated group (8±1.62 y) and 46 in the untreated group (7.91±1.50 y). Results: The relationship between MFM scores and age was studied in CS treated patients and untreated patients. The evolution of these scores was compared between groups over 6-, 12- and 24-month periods by calculating slopes of change and the standardized response mean (SRM). At 6, 12 and 24 months, significant differences in the mean score change were found, for all MFM scores, between CS treated patients and untreated patients. For the D1 subscore specifically, at 6 months, the increase was significant in the treated group (11.3±14%/y; SRM 0.8) while a decrease was observed in the untreated group (–17.8±17.7%/y; SRM 1). At 12 and 24 months, the D1 subscore stabilized for treated patients but declined significantly for untreated boys (–15.5±15.1%/y; SRM 1 at 12 mo and –18.8±7.1%/y; SRM 2.6 at 24 mo). 21 patients lost the ability to walk during the study: 6 in the CS treated group (25% at 24 months, mean age: 10.74±1.28 y) and 15 in the untreated group (64.71% at 24 months, mean age: 9.20±1.78 y). Discussion and conclusion: Patients with DMD treated by CS present a different disease course, described in this paper using the MFM. Based on these results, an estimation of the number of patients needed for a clinical trial could be made.
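The standardized response mean values quoted in this abstract can be checked with a short calculation, taking the SRM as the magnitude of the mean change divided by the standard deviation of the change. Reading the "±" values as standard deviations, and reporting the SRM as a magnitude, are assumptions of this sketch; the function name is hypothetical.

```python
def standardized_response_mean(mean_change, sd_change):
    """SRM as |mean change| / SD of change (magnitude only, by assumption)."""
    return abs(mean_change) / sd_change


if __name__ == "__main__":
    # (mean change, SD) pairs for the D1 subscore, in %/y, as quoted in the abstract.
    reported = {
        "treated, 6 mo": (11.3, 14.0),
        "untreated, 6 mo": (-17.8, 17.7),
        "untreated, 12 mo": (-15.5, 15.1),
        "untreated, 24 mo": (-18.8, 7.1),
    }
    for label, (mean, sd) in reported.items():
        print(label, round(standardized_response_mean(mean, sd), 2))
```

Under these assumptions the computed ratios are consistent with the SRM figures the abstract reports (0.8, 1, 1 and 2.6).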
Sublinear time algorithms for earth mover's distance
We study the problem of estimating the Earth Mover’s Distance (EMD) between probability distributions
when given access only to samples of the distributions. We give closeness testers and additive-error
estimators over domains in $[0, 1]^d$, with sample complexities independent of domain size – permitting
the testability even of continuous distributions over infinite domains. Instead, our algorithms depend on
other parameters, such as the diameter of the domain space, which may be significantly smaller. We also
prove lower bounds showing the dependencies on these parameters to be essentially optimal. Additionally,
we consider whether natural classes of distributions exist for which there are algorithms with better
dependence on the dimension, and show that for highly clusterable data, this is indeed the case. Lastly,
we consider a variant of the EMD, defined over tree metrics instead of the usual $\ell_1$ metric, and give tight
upper and lower bounds.
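To make the quantity being estimated concrete, the sketch below computes the EMD between two empirical distributions in the simplest setting: one dimension, equally sized samples, where the optimal transport plan matches the sorted samples in order. This is a plain plug-in calculation, not the paper's closeness testers or estimators, and the function name `emd_1d` is an assumption.

```python
def emd_1d(xs, ys):
    """EMD between two equal-size empirical distributions on the real line.

    With n samples each, every sample carries mass 1/n and the optimal
    matching pairs the i-th smallest of xs with the i-th smallest of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


if __name__ == "__main__":
    p_samples = [0.0, 0.5]
    q_samples = [1.0, 0.5]
    print(emd_1d(p_samples, q_samples))
```

In higher dimensions no such sorting shortcut exists, which is part of why the dependence on the dimension d matters in the sample complexity bounds the abstract describes.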
Electrocatalytic Oxidation of Nitric Oxide at Carbon Paste Electrode Modified with Chromium (III) Oxide
Chromium (III) oxide was used as a bulk mediator in carbon paste electrodes to improve the performance of carbon electrodes for the detection of nitric oxide in comparison with unmodified electrodes. The reaction mechanism of the electrocatalytic oxidation of NO at the modified electrode was studied using cyclic voltammetry and differential pulse voltammetry. The chemical sensor could be operated under physiological conditions (pH 7.5, 0.1 M phosphate buffer), with an operating potential of 750 mV (vs. Ag/AgCl), in hydrodynamic amperometry. The amperometric response of the sensor showed good linearity up to 200 mmol/L with a detection limit (3σ) of 0.69 mmol/L. The effect of the interferent nitrite was not critical and could be eliminated by the use of the standard addition method. The new chemical sensor also seems promising for detecting NO in car exhaust fumes.