Local Decoding and Testing of Polynomials over Grids
The well-known DeMillo-Lipton-Schwartz-Zippel lemma says that n-variate polynomials of total degree at most d over grids, i.e. sets of the form A_1 × A_2 × ⋯ × A_n, form error-correcting codes (of distance at least 2^{-d}, provided min_i |A_i| ≥ 2). In this work we explore their local decodability and local testability. While these aspects have been studied extensively when A_1 = ⋯ = A_n = F_q are the same finite field, the setting when the A_i's are not the full field does not seem to have been explored before.
In this work we focus on the case A_i = {0,1} for every i. We show that for every field (finite or otherwise) there is a test whose query complexity depends only on the degree (and not on the number of variables). In contrast, we show that decodability is possible over fields of positive characteristic (with query complexity growing with the degree of the polynomial and the characteristic), but not over the reals, where the query complexity must grow with n. As a consequence we get a natural example of a code (one with a transitive group of symmetries) that is locally testable but not locally decodable.
Classical results on local decoding and testing of polynomials have relied on the 2-transitive symmetries of the space of low-degree polynomials (under affine transformations). Grids do not possess this symmetry, so we introduce some new techniques to overcome this handicap; in particular, we use the hypercontractivity of the (constant-weight) noise operator on the Hamming cube
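To make the distance guarantee concrete, here is a small self-contained check (ours, not the paper's; the polynomial generator and parameters are illustrative) that a non-zero multilinear polynomial of total degree at most d is non-zero on at least a 2^{-d} fraction of the cube {0,1}^n:

import itertools
import random

def random_multilinear_poly(n, d, num_terms=5):
    # a dict mapping frozenset-of-variable-indices -> coefficient,
    # each monomial of degree at most d, over the rationals
    terms = {}
    for _ in range(num_terms):
        k = random.randint(0, d)
        mono = frozenset(random.sample(range(n), k))
        terms[mono] = terms.get(mono, 0) + random.choice([-3, -1, 1, 2])
    return {m: c for m, c in terms.items() if c != 0}

def evaluate(poly, point):
    # a multilinear monomial is 1 at a 0/1 point iff all its variables are 1
    return sum(c for mono, c in poly.items() if all(point[i] for i in mono))

n, d = 8, 3
poly = random_multilinear_poly(n, d)
if poly:  # skip the identically-zero polynomial
    cube = list(itertools.product([0, 1], repeat=n))
    frac = sum(1 for p in cube if evaluate(poly, p) != 0) / len(cube)
    print(f"non-zero fraction = {frac:.4f}, DLSZ bound 2^-{d} = {2 ** -d:.4f}")
    assert frac >= 2 ** -d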
Robot Acting on Moving Bodies (RAMBO): Interaction with tumbling objects
Interaction with tumbling objects will become more common as human activities in space expand. When attempting to interact with a large complex object translating and rotating in space, a human operator using only visual and mental capacities may not be able to estimate the object's motion, plan actions, or control those actions. A robot system (RAMBO) equipped with a camera is being developed which, given a sequence of simple tasks, can perform these tasks on a tumbling object. RAMBO is given a complete geometric model of the object. A low-level vision module extracts and groups characteristic features in images of the object. The positions of the object are determined in a sequence of images, and a motion estimate of the object is obtained. This motion estimate is used to plan trajectories of the robot tool to locations near the object suitable for achieving the tasks. More specifically, low-level vision uses parallel algorithms for image enhancement by symmetric nearest neighbor filtering, edge detection by local gradient operators, and corner extraction by sector filtering. The object pose estimation is a Hough transform method accumulating position hypotheses obtained by matching triples of image features (corners) to triples of model features. To maximize computing speed, the estimate of the position in space of a triple of features is obtained by decomposing its perspective view into a product of rotations and a scaled orthographic projection. This allows use of 2-D lookup tables at each stage of the decomposition. The position hypotheses for each possible match of model feature triples and image feature triples are calculated in parallel. Trajectory planning combines heuristic and dynamic programming techniques. Trajectories are then created using dynamic interpolations between initial and goal trajectories. All the parallel algorithms run on a Connection Machine CM-2 with 16K processors
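For illustration, the following sketch (hypothetical code, not RAMBO's: the actual system works with 3-D perspective decompositions and 2-D lookup tables on a CM-2) shows the core Hough idea in two dimensions. Every match of a model corner triple to an image corner triple contributes one pose hypothesis to an accumulator, and the most-voted bin is taken as the pose estimate:

import itertools
import math
from collections import Counter

import numpy as np

def pose_from_triples(model_pts, image_pts):
    # least-squares rigid transform (2-D Kabsch) mapping model to image
    mc, ic = model_pts.mean(axis=0), image_pts.mean(axis=0)
    H = (model_pts - mc).T @ (image_pts - ic)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # reject reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, ic - R @ mc

model = np.array([[0., 0.], [2., 0.], [2., 1.], [0.5, 1.5]])  # model corners
theta, t_true = 0.5, np.array([3., -1.])                      # ground truth
R_true = np.array([[math.cos(theta), -math.sin(theta)],
                   [math.sin(theta),  math.cos(theta)]])
image = (model @ R_true.T) + t_true + 0.01 * np.random.randn(4, 2)

votes = Counter()
for m_idx in itertools.combinations(range(len(model)), 3):
    for i_idx in itertools.permutations(range(len(image)), 3):
        R, t = pose_from_triples(model[list(m_idx)], image[list(i_idx)])
        angle = math.atan2(R[1, 0], R[0, 0])
        # quantize each pose hypothesis into an accumulator bin
        votes[(round(angle, 1), round(t[0]), round(t[1]))] += 1

print("top pose hypothesis:", votes.most_common(1)[0])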
Cervicovaginal fluid acetate: a metabolite marker of preterm birth in symptomatic pregnant women
Changes in vaginal microbiota that are associated with preterm birth (PTB) leave specific metabolite fingerprints that can be detected in the cervicovaginal fluid (CVF) using metabolomics techniques. In this study, we characterize and validate the CVF metabolite profile of pregnant women presenting with symptoms of threatened preterm labor (PTL) by both 1H-nuclear magnetic resonance (NMR) spectroscopy and enzyme-based spectrophotometry. We also determine their predictive capacity for PTB, singly and in combination with current clinical screening tools – cervicovaginal fetal fibronectin (FFN) and ultrasound cervical length (CL). CVF was obtained by high-vaginal swabs from 82 pregnant women with intact fetal membranes presenting between 24 and 36 weeks of gestation with symptoms of threatened, but not established, PTL. Dissolved CVF samples were scanned with a 400 MHz NMR spectrometer. Acetate and other metabolites were identified in the NMR spectrum, integrated for peak area, and normalized to the total spectrum integral. To confirm and validate our observations, acetate concentrations (AceConc) were also determined from a randomly selected subset of the same samples (n = 57), by spectrophotometric absorption of NADH using an acetic acid assay kit. CVF FFN level, transvaginal ultrasound CL, and vaginal pH were also ascertained. Acetate normalized integral and AceConc were significantly higher in the women who delivered preterm compared to their term counterparts (P = 0.002 and P = 0.006, respectively). The 1H-NMR-derived acetate integrals were strongly correlated with the AceConc estimated by spectrophotometry (r = 0.69, P < 0.001). Both measures were predictive of PTB (optimal AceConc cut-off > 0.53 g/l), and of delivery within 2 weeks of the index assessment (acetate integral: AUC = 0.77, 95% CI = 0.58–0.96; AceConc: AUC = 0.68, 95% CI = 0.5–0.9). The predictive accuracy of CVF acetate was similar to that of CL and FFN. The combination of CVF acetate, FFN, and ultrasound CL in a binary logistic regression model improved the prediction of PTB compared to the three markers individually, but CVF acetate offered no predictive improvement over ultrasound CL combined with CVF FFN. Elevated CVF acetate in women with symptoms of PTL appears predictive of preterm delivery, as well as delivery within 2 weeks of presentation. An assay of acetate in CVF may prove of clinical utility for predicting PTB
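A minimal sketch of the analysis pipeline described above (synthetic data and illustrative variable names; the study's actual measurements and thresholds are not reproduced here): normalize the acetate peak to the total spectrum integral, assess discrimination by AUC, and combine markers in a binary logistic regression.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 82                                         # cohort size, as in the study
acetate_peak = rng.gamma(2.0, 1.0, n)          # raw acetate peak area (synthetic)
total_integral = rng.gamma(50.0, 1.0, n)       # total spectrum integral
acetate_norm = acetate_peak / total_integral   # normalized acetate integral
ffn = rng.gamma(2.0, 20.0, n)                  # fetal fibronectin (synthetic)
cl = rng.normal(30.0, 8.0, n)                  # cervical length, mm (synthetic)
preterm = rng.integers(0, 2, n)                # 1 = delivered preterm (synthetic)

print("acetate AUC:", roc_auc_score(preterm, acetate_norm))

X = np.column_stack([acetate_norm, ffn, cl])   # combined marker model
model = LogisticRegression(max_iter=1000).fit(X, preterm)
print("combined AUC (in-sample, illustration only):",
      roc_auc_score(preterm, model.predict_proba(X)[:, 1]))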
A Near-Optimal Polynomial Distance Lemma over Boolean Slices
The celebrated Ore-DeMillo-Lipton-Schwartz-Zippel (ODLSZ) lemma asserts that n-variate non-zero polynomial functions of degree d over a field 𝔽 are non-zero over any "grid" (points of the form Sⁿ for a finite subset S ⊆ 𝔽) with probability at least max{|S|^{-d/(|S|-1)}, 1-d/|S|} over the choice of a random point from the grid. In particular, over the Boolean cube (S = {0,1} ⊆ 𝔽), the lemma asserts non-zero polynomials are non-zero with probability at least 2^{-d}. In this work we extend the ODLSZ lemma optimally (up to lower-order terms) to "Boolean slices", i.e., points of Hamming weight exactly k. We show that non-zero polynomials on the slice are non-zero with probability (t/n)^{d}(1 - o_{n}(1)), where t = min{k, n-k}, for every d ≤ k ≤ (n-d). As with the ODLSZ lemma, our results extend to polynomials over Abelian groups. This bound is tight up to the error term, as evidenced by multilinear monomials of degree d, and it is also the case that some corrective term is necessary. A particularly interesting case is the "balanced slice" (k = n/2), where our lemma asserts that non-zero polynomials are non-zero with roughly the same probability on the slice as on the whole cube.
The behaviour of low-degree polynomials over Boolean slices has received much attention in recent years. However, the problem of proving a tight version of the ODLSZ lemma does not seem to have been considered before, except for a recent work of Amireddy, Behera, Paraashar, Srinivasan and Sudan (SODA 2025), who established a sub-optimal bound of approximately ((k/n)⋅(1-(k/n)))^d using a proof similar to that of the standard ODLSZ lemma.
While the statement of our result mimics that of the ODLSZ lemma, our proof is significantly more intricate and involves spectral reasoning, which is employed to show that a natural way of embedding a copy of the Boolean cube inside a balanced Boolean slice is a good sampler
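The tightness claim is easy to check numerically: for the degree-d multilinear monomial x_1⋯x_d, the fraction of weight-k points where it is non-zero is C(n-d, k-d)/C(n, k), which tends to (k/n)^d for k ≤ n/2. The short script below (ours, for illustration) compares this exact fraction with (t/n)^d:

from math import comb

n, d = 1000, 5
for k in [d, n // 4, n // 2]:
    frac = comb(n - d, k - d) / comb(n, k)  # points where x_1*...*x_d != 0
    t = min(k, n - k)
    print(f"k={k:4d}: exact={frac:.3e}  (t/n)^d={(t / n) ** d:.3e}")
# by the k <-> n-k symmetry, (1-x_1)*...*(1-x_d) witnesses weights k > n/2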
Local Correction of Linear Functions over the Boolean Cube
We consider the task of locally correcting, and locally list-correcting, multivariate linear functions over the domain {0,1}ⁿ over arbitrary fields and more generally Abelian groups. Such functions form error-correcting codes of relative distance 1/2, and we give local-correction algorithms correcting up to nearly 1/4-fraction errors making Õ(log n) queries. This query complexity is optimal up to poly(log log n) factors. We also give local list-correcting algorithms correcting (1/2 - ε)-fraction errors with Õ_ε(log n) queries.
These results may be viewed as natural generalizations of the classical work of Goldreich and Levin, whose work addresses the special case where the underlying group is ℤ₂. By extending to the case where the underlying group is, say, the reals, we give the first non-trivial locally correctable codes (LCCs) over the reals (with query complexity being sublinear in the dimension, also known as the message length).
The central challenge in constructing the local corrector is constructing "nearly balanced vectors" over {0,1}ⁿ that span the space; we show how to construct O(log n) vectors that do so, with entries in each vector summing to ±1. The challenge to the local-list-correction algorithms, given the local corrector, is principally combinatorial, i.e., in proving that the number of linear functions within any Hamming ball of radius (1/2 - ε) is O_ε(1). Getting this general result covering every Abelian group requires integrating a variety of known methods with some new combinatorial ingredients analyzing the structural properties of codewords that lie within small Hamming balls.
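As a point of reference, here is the classical Goldreich-Levin-style local corrector for the special case the abstract cites, where the underlying group is ℤ₂ (a sketch with an illustrative noise model, not the paper's general-group algorithm): for a linear f, f(x) = f(x⊕r) ⊕ f(r), so a majority vote over random shifts r recovers f(x) from a mildly corrupted table.

import random

n = 10
a = [random.randint(0, 1) for _ in range(n)]   # hidden linear function

def f(x):                                      # f(x) = <a, x> mod 2
    return sum(ai * xi for ai, xi in zip(a, x)) % 2

corrupted = {}                                 # g: noisy table of f
def g(x):
    key = tuple(x)
    if key not in corrupted:
        # flip ~10% of queried values, below the 1/4 unique-decoding radius
        corrupted[key] = f(x) ^ (random.random() < 0.10)
    return corrupted[key]

def locally_correct(x, trials=101):
    votes = 0
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        xr = [xi ^ ri for xi, ri in zip(x, r)]
        votes += g(xr) ^ g(r)                  # one noisy estimate of f(x)
    return int(votes > trials // 2)

x = [random.randint(0, 1) for _ in range(n)]
print("true:", f(x), "corrected:", locally_correct(x))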
Influence of cochlear implantation on sentence intelligibility and duration
Cochlear implants (CI) allow children with hearing loss (HL) to achieve speech perception and production outcomes that make their spoken speech understandable to normal-hearing adult listeners. This capability is characterized by wide variability of scores. In order to understand the factors that contribute to the overall variability, we investigated the effects of duration of cochlear implantation on speech intelligibility and sentence duration over time. Participants were 107 children implanted between the ages of 2 and 4 and tested at 2 time points, when they were 8 and 16 years old. Participants repeated McGarr sentences, which vary in length across 3, 5, and 7 syllables. Recordings were analyzed using acoustic software to designate the beginning and end of each sentence; intelligibility was judged by listeners who each heard only one sentence from one child. Speech intelligibility scores were related statistically to the duration of each sentence. Sentences whose durations approximated those of normal-hearing speakers were those with high intelligibility judgments. In addition, it appears that the children with the longest experience of CI use continue to improve their intelligibility. (Sponsored by NIH). © 2013 Acoustical Society of America
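A minimal sketch of the core statistical step (synthetic data and hypothetical variable names, not the study's code): relate per-sentence durations to intelligibility scores.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_sentences = 300
duration_s = rng.normal(2.5, 0.6, n_sentences)   # sentence durations (synthetic)
# assume, for illustration, that longer (slower) sentences are judged
# less intelligible
intelligibility = np.clip(
    100 - 20 * (duration_s - 2.0) + rng.normal(0, 10, n_sentences), 0, 100)

r, p = pearsonr(duration_s, intelligibility)
print(f"Pearson r = {r:.2f}, P = {p:.2g}")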
Demystifying Platform Requirements for Diverse LLM Inference Use Cases
Large language models (LLMs) have shown remarkable performance across a wide
range of applications, often outperforming human experts. However, deploying
these parameter-heavy models efficiently for diverse inference use cases
requires carefully designed hardware platforms with ample computing, memory,
and network resources. With LLM deployment scenarios and models evolving at
breakneck speed, the hardware requirements to meet SLOs remain an open
research question. In this work, we present an analytical tool, GenZ, to study
the relationship between LLM inference performance and various platform design
parameters. Our analysis provides insights into configuring platforms for
different LLM workloads and use cases. We quantify the platform requirements to
support SOTA LLMs like LLaMA and GPT-4 under diverse serving settings.
Furthermore, we project the hardware capabilities needed to enable future LLMs
potentially exceeding hundreds of trillions of parameters. The trends and
insights derived from GenZ can guide AI engineers deploying LLMs as well as
computer architects designing next-generation hardware accelerators and
platforms. Ultimately, this work sheds light on the platform design
considerations for unlocking the full potential of large language models across
a spectrum of applications. The source code is available at
https://github.com/abhibambhaniya/GenZ-LLM-Analyzer .
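To give a flavor of what such an analytical tool computes (a sketch of ours with illustrative parameters, not GenZ's actual model or API), a first-order roofline estimate derives prefill and per-token decode latencies from parameter count, platform FLOPS, and memory bandwidth:

def llm_latency(params_b, ctx_len, batch, tflops, mem_bw_gbs, bytes_per_param=2):
    # crude first-order estimate for a dense decoder-only model
    weight_bytes = params_b * 1e9 * bytes_per_param
    # prefill: ~2*P FLOPs per token, usually compute-bound
    prefill_s = (2 * params_b * 1e9 * ctx_len * batch) / (tflops * 1e12)
    # decode: every token reads all weights, usually bandwidth-bound
    decode_s_per_tok = max(
        weight_bytes / (mem_bw_gbs * 1e9),                 # memory time
        (2 * params_b * 1e9 * batch) / (tflops * 1e12),    # compute time
    )
    return prefill_s, decode_s_per_tok

# e.g. a 70B-parameter model on a device with 1000 TFLOPS and 3.35 TB/s
prefill, per_tok = llm_latency(params_b=70, ctx_len=2048, batch=1,
                               tflops=1000, mem_bw_gbs=3350)
print(f"prefill ≈ {prefill:.2f} s, decode ≈ {per_tok * 1000:.1f} ms/token")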
Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017
Background: The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2017 comparative risk assessment (CRA) is a comprehensive approach to risk factor quantification that offers a useful tool for synthesising evidence on risks and risk outcome associations. With each annual GBD study, we update the GBD CRA to incorporate improved methods, new risks and risk outcome pairs, and new data on risk exposure levels and risk outcome associations.
Methods: We used the CRA framework developed for previous iterations of GBD to estimate levels and trends in exposure, attributable deaths, and attributable disability-adjusted life-years (DALYs), by age group, sex, year, and location for 84 behavioural, environmental and occupational, and metabolic risks or groups of risks from 1990 to 2017. This study included 476 risk outcome pairs that met the GBD study criteria for convincing or probable evidence of causation. We extracted relative risk and exposure estimates from 46 749 randomised controlled trials, cohort studies, household surveys, census data, satellite data, and other sources. We used statistical models to pool data, adjust for bias, and incorporate covariates. Using the counterfactual scenario of theoretical minimum risk exposure level (TMREL), we estimated the portion of deaths and DALYs that could be attributed to a given risk. We explored the relationship between development and risk exposure by modelling the relationship between the Socio-demographic Index (SDI) and risk-weighted exposure prevalence and estimated expected levels of exposure and risk-attributable burden by SDI. Finally, we explored temporal changes in risk-attributable DALYs by decomposing those changes into six main component drivers of change as follows: (1) population growth; (2) changes in population age structures; (3) changes in exposure to environmental and occupational risks; (4) changes in exposure to behavioural risks; (5) changes in exposure to metabolic risks; and (6) changes due to all other factors, approximated as the risk-deleted death and DALY rates, where the risk-deleted rate is the rate that would be observed had we reduced the exposure levels to the TMREL for all risk factors included in GBD 2017.
Findings: In 2017, 34.1 million (95% uncertainty interval [UI] 33.3-35.0) deaths and 1.21 billion (1.14-1.28) DALYs were attributable to GBD risk factors. Globally, 61.0% (59.6-62.4) of deaths and 48.3% (46.3-50.2) of DALYs were attributed to the GBD 2017 risk factors. When ranked by risk-attributable DALYs, high systolic blood pressure (SBP) was the leading risk factor, accounting for 10.4 million (9.39-11.5) deaths and 218 million (198-237) DALYs, followed by smoking (7.10 million [6.83-7.37] deaths and 182 million [173-193] DALYs), high fasting plasma glucose (6.53 million [5.23-8.23] deaths and 171 million [144-201] DALYs), high body-mass index (BMI; 4.72 million [2.99-6.70] deaths and 148 million [98.6-202] DALYs), and short gestation for birthweight (1.43 million [1.36-1.51] deaths and 139 million [131-147] DALYs). In total, risk-attributable DALYs declined by 4.9% (3.3-6.5) between 2007 and 2017. In the absence of demographic changes (ie, population growth and ageing), changes in risk exposure and risk-deleted DALYs would have led to a 23.5% decline in DALYs during that period. Conversely, in the absence of changes in risk exposure and risk-deleted DALYs, demographic changes would have led to an 18.6% increase in DALYs during that period. The ratios of observed risk exposure levels to exposure levels expected based on SDI (O/E ratios) increased globally for unsafe drinking water and household air pollution between 1990 and 2017. This result suggests that development is occurring more rapidly than are changes in the underlying risk structure in a population. Conversely, nearly universal declines in O/E ratios for smoking and alcohol use indicate that, for a given SDI, exposure to these risks is declining. In 2017, the leading Level 4 risk factor for age-standardised DALY rates was high SBP in four super-regions: central Europe, eastern Europe, and central Asia; north Africa and Middle East; south Asia; and southeast Asia, east Asia, and Oceania. The leading risk factor in the high-income super-region was smoking, in Latin America and Caribbean was high BMI, and in sub-Saharan Africa was unsafe sex. O/E ratios for unsafe sex in sub-Saharan Africa were notably high, and those for alcohol use in north Africa and the Middle East were notably low.
Interpretation: By quantifying levels and trends in exposures to risk factors and the resulting disease burden, this assessment offers insight into where past policy and programme efforts might have been successful and highlights current priorities for public health action. Decreases in behavioural, environmental, and occupational risks have largely offset the effects of population growth and ageing, in relation to trends in absolute burden. Conversely, the combination of increasing metabolic risks and population ageing will probably continue to drive the increasing trends in non-communicable diseases at the global level, which presents both a public health challenge and opportunity. We see considerable spatiotemporal heterogeneity in levels of risk exposure and risk-attributable burden. Although levels of development underlie some of this heterogeneity, O/E ratios show risks for which countries are overperforming or underperforming relative to their level of development. As such, these ratios provide a benchmarking tool to help to focus local decision making. Our findings reinforce the importance of both risk exposure monitoring and epidemiological research to assess causal connections between risks and health outcomes, and they highlight the usefulness of the GBD study in synthesising data to draw comprehensive and robust conclusions that help to inform good policy and strategic health planning
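For readers unfamiliar with the CRA machinery, the attribution step can be sketched as follows (illustrative numbers, not GBD estimates): the population attributable fraction (PAF) compares the risk implied by the observed exposure distribution against the TMREL counterfactual.

import numpy as np

# exposure categories for one hypothetical risk (e.g. low/medium/high)
prevalence = np.array([0.5, 0.3, 0.2])   # observed exposure distribution
rr = np.array([1.0, 1.5, 2.5])           # relative risk per category
tmrel = np.array([1.0, 0.0, 0.0])        # counterfactual: everyone at minimum

observed = (prevalence * rr).sum()
counterfactual = (tmrel * rr).sum()
paf = (observed - counterfactual) / observed
print(f"PAF = {paf:.1%}")                # share of burden attributable

deaths = 10_000                          # cause-specific deaths (synthetic)
print(f"attributable deaths ≈ {paf * deaths:,.0f}")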
