Platform-level protection for interacting mobile apps
In a modern mobile platform, apps are mutually distrustful, but they share the same device and frequently interact with each other. This dissertation shows how existing platforms, like Android and iOS, often fail to support important data protection scenarios, and describes two systems to improve platform-level security.
First, many data leaks in existing platforms are due to the lack of information flow control for inter-app data exchanges. For example, a document viewer that opens an attachment from an email client often further discloses the attachment to other apps or to the network. To prevent such leaks, we need strict information flow confinement, but a challenge to enforce such confinement in existing platforms is the potential disruptions to confined apps. We present Maxoid, a system that uses context-aware custom views of apps' storage state to make information flow enforcement backward compatible.
Second, apps' abstraction of data has diverged from platforms' abstraction of data. Modern mobile apps rely heavily on structured data, and relational databases have become the hub of apps' internal data management. However, protection mechanisms in existing platforms are coarse-grained and have no visibility into the structure of apps' data. In these platforms, access control is a mixture of coarse-grained mechanisms and many ad hoc user-level checks, making data protection unprincipled and error-prone. We present Earp, a new mobile platform that combines simple object-level permissions with capability relationships among objects to naturally protect structured data for mobile apps. It achieves a uniform abstraction for storing, sharing, and efficiently protecting structured data, for both storage and inter-app services.
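The combination of object-level permissions and capability relationships can be sketched as a toy model; this is an illustrative sketch only, not Earp's actual API, and all names (`Record`, `reachable`, the app identifiers) are hypothetical:

```python
# Toy model (hypothetical names, not Earp's API): each object carries a
# per-app permission set, and a grant on one object extends, via
# capability links, to the objects reachable from it.

class Record:
    def __init__(self, owner, data, children=()):
        self.owner = owner            # app that created the object
        self.data = data
        self.shared_with = set()      # apps granted direct permission
        self.children = list(children)  # capability links to sub-objects

def reachable(app, root):
    """Objects `app` may read: a direct grant on `root`, plus everything
    reachable from it through capability links."""
    if app != root.owner and app not in root.shared_with:
        return []                     # no object-level permission
    seen, stack, out = set(), [root], []
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        out.append(obj)
        stack.extend(obj.children)
    return out

# Example: an email thread owned by app "mail", holding two messages.
msg1 = Record("mail", "hello")
msg2 = Record("mail", "attachment")
thread = Record("mail", "thread", children=[msg1, msg2])
thread.shared_with.add("viewer")      # share only the thread object

assert len(reachable("viewer", thread)) == 3  # grant covers the messages
assert reachable("editor", thread) == []      # no grant, no access
```

The point of the sketch is that sharing decisions are expressed once, at the object level, and the capability structure propagates them, rather than scattering ad hoc checks through app code.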
Learning to Skip for Language Modeling
Overparameterized large-scale language models show impressive generalization performance in in-context few-shot learning. However, most language models allocate the same amount of parameters and computation to every token, regardless of the complexity or importance of the input. We argue that in language model pretraining a variable amount of computation should be assigned to different tokens, and that this can be achieved efficiently via a simple routing mechanism. Unlike conventional early-exit techniques, in which tokens can exit only at early layers, we propose a more general method that dynamically skips the execution of a layer (or module) for any input token using a binary router. In an extensive evaluation across 24 NLP tasks, we demonstrate that the proposed method significantly improves 1-shot performance over competitive baselines at only a mild extra inference cost.
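The routing idea above can be illustrated with a minimal NumPy sketch; this is an assumed toy implementation, not the paper's actual architecture, and the layer, router weights, and threshold are all illustrative:

```python
import numpy as np

# Minimal sketch of per-token layer skipping with a binary router
# (illustrative only). For each token, a router scores the hidden
# state; tokens scoring below a threshold skip the layer entirely
# and pass through unchanged.

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d)) * 0.1   # toy layer weights
r = rng.normal(size=d)              # toy router weights

def layer(x):
    """The module that may be skipped (a residual transform)."""
    return x + np.tanh(x @ W)

def route(x, threshold=0.5):
    """Apply `layer` only to tokens the binary router selects."""
    scores = 1 / (1 + np.exp(-(x @ r)))   # sigmoid router, per token
    keep = scores > threshold             # binary routing decision
    out = x.copy()
    out[keep] = layer(x[keep])            # compute only for kept tokens
    return out, keep

tokens = rng.normal(size=(5, d))    # 5 tokens, hidden size d
out, keep = route(tokens)
# Skipped tokens are exactly identity-mapped:
assert np.allclose(out[~keep], tokens[~keep])
```

Because skipped tokens bypass the layer's matrix multiplications entirely, the saved computation scales with the fraction of tokens the router rejects.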
An Obligate Role of Oxytocin Neurons in Diet-Induced Energy Expenditure
Oxytocin neurons represent one of the major subsets of neurons in the paraventricular hypothalamus (PVH), a critical brain region for energy homeostasis. Despite substantial evidence supporting a role of oxytocin in body weight regulation, it remains controversial whether oxytocin neurons directly regulate body weight homeostasis, feeding, or energy expenditure. Pharmacologic doses of oxytocin suppress feeding through a proposed melanocortin-responsive projection from the PVH to the hindbrain. In contrast, deficiency in oxytocin or its receptor leads to reduced energy expenditure without feeding abnormalities. To test the physiological function of oxytocin neurons, we specifically ablated oxytocin neurons in adult mice. Our results show that oxytocin neuron ablation in adult animals has no effect on body weight, food intake, or energy expenditure on a regular diet. Interestingly, male mice lacking oxytocin neurons are more sensitive to high-fat-diet-induced obesity, due solely to reduced energy expenditure. In addition, despite a normal food intake, these mice exhibit a blunted food-intake response to leptin administration. Thus, our study suggests that oxytocin neurons are required to resist the obesity associated with a high-fat diet, but that their role in feeding is permissive and can be compensated for by redundant pathways.
Genomic insights into local adaptation and future climate-induced vulnerability of a keystone forest tree in East Asia
Assessment of population vulnerability and adaptive capacity under climate change is crucial for informing conservation strategies. Sang et al. assemble a reference genome for Populus koreana and combine population genomics and modelling to predict spatiotemporal responses to climate change.
Rapid global climate change poses a substantial threat to biodiversity. Assessing population vulnerability and adaptive capacity under climate change is crucial for informing conservation and mitigation strategies. Here we generate a chromosome-scale genome assembly and re-sequence the genomes of 230 individuals collected from 24 populations of Populus koreana, a pioneer and keystone tree species in the temperate forests of East Asia. We integrate population genomics and environmental variables to reveal a set of climate-associated single-nucleotide polymorphisms, insertions/deletions, and structural variations, in particular numerous adaptive non-coding variants distributed across the genome. We incorporate these variants into an environmental modelling scheme to predict a pronounced spatiotemporal shift of this species in response to future climate change. We further identify the populations most vulnerable to climate change, which should be prioritized for conservation, as well as many candidate genes and variants that may be useful for targeted forest tree breeding. Our findings highlight the importance of integrating genomic and environmental data to predict the adaptive capacity of a keystone forest tree to rapid future climate change.
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
We summarize the results of a host of efforts to pre-train giant automatic speech recognition (ASR) models on large, diverse unlabeled datasets containing approximately a million hours of audio. We find that the combination of pre-training, self-training, and scaling up model size greatly increases data efficiency, even for extremely large tasks with tens of thousands of hours of labeled data. In particular, on an ASR task with 34k hours of labeled data, by fine-tuning an 8-billion-parameter pre-trained Conformer model we can match state-of-the-art (SoTA) performance with only 3% of the training data and significantly improve on SoTA with the full training set. We also report on the universal benefits of big pre-trained and self-trained models across a large set of downstream tasks that cover a wide range of speech domains and span multiple orders of magnitude of dataset size, including SoTA performance on many public benchmarks. In addition, we utilize the learned representations of pre-trained networks to achieve SoTA results on non-ASR tasks.
Comment: 14 pages, 7 figures, 13 tables; v2: minor corrections, reference baselines and bibliography updated; v3: corrections based on reviewer feedback, bibliography update
PaLI-X: On Scaling up a Multilingual Vision and Language Model
We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of the size of its components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-shot (in-context) learning, as well as object detection, video question answering, and video captioning. PaLI-X advances the state of the art on most of the vision-and-language benchmarks considered (25+ of them). Finally, we observe emerging capabilities, such as complex counting and multilingual object detection, tasks that are not explicitly present in the training mix.