Exponential Space Improvement for minwise Based Algorithms
In this paper we introduce a general framework that exponentially improves the space, the degree of independence, and the time needed by min-wise based algorithms. In SODA 2011, we introduced an exponential time improvement for min-wise based algorithms by defining and constructing an almost k-min-wise independent family of hash functions. Here we develop an alternative approach that achieves both exponential time and exponential space improvement. The new approach relaxes the need for approximately min-wise hash functions, and hence gets around the Omega(log(1/epsilon)) independence lower bound in [Patrascu 2010]. This is done by defining and constructing a d-k-min-wise independent family of hash functions. Surprisingly, for most cases only 8-wise independence is needed for the additional improvement. Moreover, as the degree of independence is a small constant, our function can be implemented efficiently.
Informally, under this definition, all subsets of size d of any fixed set X have an equal probability of having hash values among the minimal k values in X, where the probability is over the random choice of hash function from the family. This property measures the randomness of the family, since a truly random function obviously satisfies the definition for d=k=|X|. We define and give a time- and space-efficient construction of an approximately d-k-min-wise independent family of hash functions for the case where d=2, as this is sufficient for the additional exponential improvement.
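One way to write the informal property as a formula is sketched below. The notation (n for |X|, the family H, and min_k for the set of the k smallest hash values) is introduced here for illustration and is not taken from the abstract. The right-hand side is the value obtained by a simple counting argument for a truly random injective function; an approximate family would presumably be allowed to deviate from it by a small multiplicative error.

```latex
% Sketch only: n = |X|, Y ranges over the size-d subsets of X, and
% \operatorname{min}_k(h(X)) is the set of the k smallest values in h(X).
\forall\, Y \subseteq X,\ |Y| = d:\qquad
\Pr_{h \in \mathcal{H}}\bigl[\, h(Y) \subseteq \operatorname{min}_k(h(X)) \,\bigr]
  \;=\; \frac{\binom{n-d}{k-d}}{\binom{n}{k}}
  \;=\; \frac{\binom{k}{d}}{\binom{n}{d}} .
```

For d = 1 this reduces to k/n, the probability that a single fixed element lands among the k minimal hash values.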
We discuss how this construction can be used to improve many min-wise based algorithms. To our knowledge such definitions, for hash functions, were never studied and no construction was given before.
As an example, we show how to apply it to similarity and rarity estimation over data streams. Other min-wise based algorithms can be adjusted in the same way.
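To make the similarity application concrete, here is a minimal sketch of the generic bottom-k (k-minimum-values) estimator for Jaccard similarity. It is not the construction from the paper: the helper names `hash64`, `bottom_k`, and `estimate_jaccard` are hypothetical, and a seeded BLAKE2b digest stands in for the d-k-min-wise independent family the abstract describes.

```python
import hashlib


def hash64(item: str, seed: int = 0) -> int:
    """64-bit hash of an item. ASSUMPTION: BLAKE2b is only a stand-in here,
    not the limited-independence family constructed in the paper."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def bottom_k(items, k: int, seed: int = 0):
    """Return the k smallest distinct hash values of the set, in sorted order."""
    return sorted({hash64(x, seed) for x in items})[:k]


def estimate_jaccard(sketch_a, sketch_b, k: int) -> float:
    """Estimate |A ∩ B| / |A ∪ B| from two bottom-k sketches.

    The k smallest values of the merged sketches form the bottom-k sketch
    of A ∪ B; a value in it belongs to A ∩ B exactly when both sketches
    contain it.
    """
    set_a, set_b = set(sketch_a), set(sketch_b)
    union_k = sorted(set_a | set_b)[:k]
    if not union_k:
        return 0.0
    shared = sum(1 for h in union_k if h in set_a and h in set_b)
    return shared / len(union_k)


if __name__ == "__main__":
    a = {str(i) for i in range(0, 6000)}
    b = {str(i) for i in range(3000, 9000)}  # true Jaccard similarity is 1/3
    k = 256
    print(estimate_jaccard(bottom_k(a, k), bottom_k(b, k), k))
```

Each sketch stores only k hash values regardless of the set size, which is why the quality and cost of the hash family dominate the space and time of such algorithms.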
Inflation and Dark Energy from spectroscopy at z > 2
The expansion of the Universe is understood to have accelerated during two
epochs: in its very first moments during a period of Inflation and much more
recently, at z < 1, when Dark Energy is hypothesized to drive cosmic
acceleration. The undiscovered mechanisms behind these two epochs represent
some of the most important open problems in fundamental physics. The large
cosmological volume at 2 < z < 5, together with the ability to efficiently
target high-z galaxies with known techniques, enables large gains in the
study of Inflation and Dark Energy. A future spectroscopic survey can test the
Gaussianity of the initial conditions up to a factor of ~50 better than our
current bounds, crossing the crucial theoretical threshold of local f_NL
of order unity that separates single-field and
multi-field models. Simultaneously, it can measure the fraction of Dark Energy
at the percent level up to z = 5, thus serving as an unprecedented test of
the standard model and opening up a tremendous discovery space.
Analysis of Blood Stem Cell Activity and Cystatin Gene Expression in a Mouse Model Presenting a Chromosomal Deletion Encompassing Csta and Stfa2l1
The cystatin protein superfamily is characterized by the presence of conserved sequences that display cysteine protease inhibitory activity (e.g., towards cathepsins). Type 1 and 2 cystatins are encoded by 25 genes, of which 23 are grouped in 2 clusters localized on mouse chromosomes 16 and 2. The expression and essential roles of most of these genes in mouse development and hematopoiesis remain poorly characterized. In this study, we describe a set of quantitative real-time PCR assays and a global expression profile of cystatin genes in normal mouse tissues. Benefiting from our collection of DelES embryonic stem cell clones harboring large chromosomal deletions (to be reported elsewhere), we selected a clone in which a 95-kb region of chromosome 16 is missing (Del16qB3Δ/+). In this particular clone, 2 cystatin genes, namely Csta and Stfa2l1, are absent along with 2 other genes (Fam162a, Ccdc58) and associated intergenic regions. From this line, we established a new homozygous mutant mouse model (Del16qB3Δ/16qB3Δ) to assess the in vivo biological functions of the 2 deleted cystatins. Stfa2l1 gene expression is high in wild-type fetal liver, bone marrow, and spleen, while Csta is ubiquitously expressed. Homozygous Del16qB3Δ/16qB3Δ animals are phenotypically normal, fertile, and not overtly susceptible to spontaneous or irradiation-induced tumor formation. The hematopoietic stem and progenitor cell activity in these mutant mice is also normal. Interestingly, quantitative real-time PCR expression profiling reveals a marked increase in the expression levels of Stfa2l1/Csta phylogenetically-related genes (Stfa1, Stfa2, and Stfa3) in Del16qB3Δ/16qB3Δ hematopoietic tissues, suggesting that these candidate genes might be contributing to compensatory mechanisms.
Overall, this study presents an optimized approach to globally monitor cystatin gene expression as well as a new mouse model deficient in Stfa2l1/Csta genes, expanding the available tools to dissect cystatin roles under normal and pathological conditions.
INSPIRE: A phase III study of the BLP25 liposome vaccine (L-BLP25) in Asian patients with unresectable stage III non-small cell lung cancer
Background: Previous research suggests the therapeutic cancer vaccine L-BLP25 potentially provides a survival benefit in patients with locally advanced unresectable stage III non-small cell lung carcinoma (NSCLC). These promising findings prompted the phase III study, INSPIRE, in patients of East-Asian ethnicity. East-Asian ethnicity is an independent favourable prognostic factor for survival in NSCLC. The favourable prognosis is most likely due to a higher incidence of EGFR mutations among this patient population.
Methods/design: The primary objective of the INSPIRE study is to assess the treatment effect of L-BLP25 plus best supportive care (BSC), as compared to placebo plus BSC, on overall survival time in East-Asian patients with unresectable stage III NSCLC and either documented stable disease or an objective response according to the Response Evaluation Criteria in Solid Tumors (RECIST) criteria following primary chemoradiotherapy. Those in the L-BLP25 arm will receive a single intravenous infusion of cyclophosphamide (300 mg/m²) 3 days before the first L-BLP25 vaccination, with a corresponding intravenous infusion of saline to be given in the control arm. A primary treatment phase of 8 subcutaneous vaccinations of L-BLP25 930 μg or placebo at weekly intervals will be followed by a maintenance treatment phase of 6-weekly vaccinations continued until disease progression or discontinuation from the study.
Discussion: The ongoing INSPIRE study is the first large study of a therapeutic cancer vaccine specifically in an East-Asian population. It evaluates the potential of maintenance therapy with L-BLP25 to prolong survival in East-Asian patients with stage III NSCLC where there are limited treatment options currently available.
Study number: EMR 63325-012
Trial Registration: Clinicaltrials.gov reference: NCT01015443 (http://www.clinicaltrials.gov/ct2/show/NCT01015443)
The science behind competition and winning in athletics: Using world-level competition data to explore pacing and tactics
The purpose of this study was to examine whether World Championship and Olympic medalist endurance athletes pace similarly to their race opponents, where and when critical differences in intra-race pacing occur, and the tactical strategies employed to optimally manage energy resources. We analyzed pacing and tactics across the 800, 1,500, 5,000, 10,000 m, marathon and racewalk events, providing a broad overview for optimal preparation for racing and pacing. Official electronic splits from men's (n = 275 performances) and women's (n = 232 performances) distance races between 2013 and 2017 were analyzed. Athletes were grouped for the purposes of analysis and comparison. For the 800 m, these groups were the medalists and those finishing 4th to 8th (“Top 8”). For the 1,500 m, the medalists and Top 8 were joined by those finishing 9th to 12th (“Top 12”), whereas for all other races, the third group was the “Top 15” (those finishing 9th to 15th). One-way repeated measures analysis of variance was conducted on the segment speeds (p < 0.05), with effect sizes for differences calculated using Cohen's d. Positive pacing profiles were common to most 800 m athletes, whereas negative pacing was more common over longer distances. In the 1,500 m, male medalists separated from their rivals in the last 100 m, whereas for women it was after 1,200 m. Similarly, over 5,000 m, male medalists separated from the slowest pack members later (4,200 m; 84% of duration) than women (2,500 m; 50% of duration). In the 10,000 m race, the effect was very pronounced, with men packing until 8,000 m and the Top 8 athletes only dropped at 9,600 m (96% of duration). For women, the slowest pack began to run slower at only 1,700 m, with the Top 8 finishers dropped at 5,300 m (53% of duration). Such profiles and patterns were seen across all events.
It is possible the earlier separation in pacing for women between the medalists and the other runners was because of tactical racing factors such as an early realization of being unable to sustain the required speed, or perhaps because of greater variation in performance abilities.
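The two calculations mentioned in this abstract, Cohen's d and the point of separation expressed as a percentage of the race, can be sketched as follows. The abstract does not say which variant of Cohen's d was used, so the pooled-standard-deviation form below is an assumption; the function names and the segment speeds in the usage example are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def cohens_d(group_a, group_b) -> float:
    """Cohen's d with a pooled sample standard deviation.
    ASSUMPTION: pooled-SD variant; the study may have used another form."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


def percent_of_race(split_m: float, race_m: float) -> float:
    """Distance at which a group separates, as a percentage of race distance."""
    return 100 * split_m / race_m


if __name__ == "__main__":
    # Hypothetical segment speeds in m/s, for illustration only.
    medalists = [6.9, 7.0, 7.1]
    top_eight = [6.7, 6.8, 6.9, 6.8, 6.7]
    print(round(cohens_d(medalists, top_eight), 2))

    # Percentages reported in the abstract, recomputed from the distances.
    print(percent_of_race(4200, 5000))   # men, 5,000 m
    print(percent_of_race(9600, 10000))  # men, 10,000 m
```

The percentages in the abstract match distance ratios (for example 4,200 m of 5,000 m is 84%), so "of duration" there appears to mean race distance covered rather than elapsed time.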
Disciplinary law in the civil services: "unification", "harmonisation" or "distancing". On the law of 26 April 2016 on ethics and the rights and obligations of civil servants
The production of tt̄, W+bb̄ and W+cc̄ is studied in the forward region of proton-proton collisions collected at a centre-of-mass energy of 8 TeV by the LHCb experiment, corresponding to an integrated luminosity of 1.98 ± 0.02 fb⁻¹. The W bosons are reconstructed in the decays W→ℓν, where ℓ denotes muon or electron, while the b and c quarks are reconstructed as jets. All measured cross-sections are in agreement with next-to-leading-order Standard Model predictions.
Multidifferential study of identified charged hadron distributions in Z-tagged jets in proton-proton collisions at 13 TeV
Jet fragmentation functions are measured for the first time in proton-proton
collisions for charged pions, kaons, and protons within jets recoiling against
a Z boson. The charged-hadron distributions are studied longitudinally and
transversely to the jet direction for jets with transverse momentum 20 GeV and in the pseudorapidity range . The
data sample was collected with the LHCb experiment at a center-of-mass energy
of 13 TeV, corresponding to an integrated luminosity of 1.64 fb⁻¹. Triple
differential distributions as a function of the hadron longitudinal momentum
fraction, hadron transverse momentum, and jet transverse momentum are also
measured for the first time. This helps constrain transverse-momentum-dependent
fragmentation functions. Differences in the shapes and magnitudes of the
measured distributions for the different hadron species provide insights into
the hadronization process for jets predominantly initiated by light quarks.
Comment: All figures and tables, along with machine-readable versions and any
supplementary material and additional information, are available at
https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-013.html (LHCb
public pages)
Study of the decay
The decay is studied
in proton-proton collisions at a center-of-mass energy of TeV
using data corresponding to an integrated luminosity of 5
collected by the LHCb experiment. In the system, the
state observed at the BaBar and Belle experiments is
resolved into two narrower states, and ,
whose masses and widths are measured to be where the first uncertainties are statistical and the second
systematic. The results are consistent with a previous LHCb measurement using a
prompt sample. Evidence of a new
state is found with a local significance of , whose mass and width
are measured to be and , respectively. In addition, evidence of a new decay mode
is found with a significance of
. The relative branching fraction of with respect to the
decay is measured to be , where the first
uncertainty is statistical, the second systematic and the third originates from
the branching fractions of charm hadron decays.
Comment: All figures and tables, along with any supplementary material and
additional information, are available at
https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-028.html (LHCb
public pages)