114 research outputs found
Towards Adversarial Robustness of Deep Vision Algorithms
Deep learning methods have achieved great success in solving computer vision
tasks, and they have been widely utilized in artificially intelligent systems
for image processing, analysis, and understanding. However, deep neural
networks have been shown to be vulnerable to adversarial perturbations in input
data. The security issues of deep neural networks have thus come to the fore.
It is imperative to study the adversarial robustness of deep vision algorithms
comprehensively. This talk focuses on the adversarial robustness of image
classification models and image denoisers. We will discuss the robustness of
deep vision algorithms from three perspectives: 1) robustness evaluation (we
propose the ObsAtk to evaluate the robustness of denoisers), 2) robustness
improvement (HAT, TisODE, and CIFS are developed to robustify vision models),
and 3) the connection between adversarial robustness and generalization
capability to new domains (we find that adversarially robust denoisers can deal
with unseen types of real-world noise). Comment: PhD thesis.
Operational Semantics for Featherweight Lua
Lua is a small, embeddable language that provides scripting for other languages. Despite its clean, minimal syntax, it is still too complex for formal reasoning because of syntactic sugar and certain specialized syntactic structures in Lua.
This thesis develops Featherweight Lua (FWLua), following the tradition of languages like Featherweight Java[1] and Featherweight JavaScript[2]. The goal is to develop a core of language features that, while remaining simple enough for formal reasoning, remains faithful to the central characteristics of the language. Specifically for Lua, the core features essential to our modeling include:
- First-class functions
- Tables as the central data construct
- Metatables that provide various "hooks" to change the behavior of tables
To further validate this approach, we show how an extensive set of features from the full Lua programming language can be reduced to FWLua. Finally, we include a reference implementation written in Haskell as a tool for further testing and experimenting with the language. With this research, we provide a basis for future research into the Lua programming language.
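For readers unfamiliar with the metatable "hooks" that FWLua treats as a core feature, the table-plus-`__index` lookup rule can be sketched in Python. This is a hypothetical illustration of the semantics only, not code from the thesis's Haskell implementation, and the class and method names are invented for the sketch:

```python
class Table:
    """Minimal model of a Lua table with an __index metatable hook."""

    def __init__(self, metatable=None):
        self.fields = {}          # the table's own key/value entries
        self.metatable = metatable

    def get(self, key):
        # 1. A key present in the table itself wins.
        if key in self.fields:
            return self.fields[key]
        # 2. Otherwise the __index hook (if any) redirects the lookup.
        if self.metatable is not None:
            handler = self.metatable.fields.get("__index")
            if handler is not None:
                return handler.get(key)   # delegate, as Lua does
        return None                       # models Lua's nil

    def set(self, key, value):
        self.fields[key] = value
```

For example, giving a table a metatable whose `__index` points at a second table makes missing keys fall through to that second table, while keys set directly on the first table shadow it:

```python
base = Table()
base.set("greet", "hello")
meta = Table()
meta.set("__index", base)
t = Table(metatable=meta)
t.get("greet")   # falls through __index to base
```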
Teachers' knowledge: Teachers' perceptions and their sources of knowledge in vocabulary instruction
The study investigated EFL teachers' perceptions and their sources of knowledge in vocabulary instruction at the secondary school level in Addis Ababa, Ethiopia. To fulfill this purpose, an explanatory research design and mixed data analysis methods were employed. The study involved thirty-six English teachers from three representative secondary schools. Data were collected from the participant teachers through a questionnaire and a semi-structured interview. The findings show that participants generally hold positive perceptions of vocabulary teaching and learning. According to the participants' perspectives, vocabulary is central to language and important to learners in their language learning. This view was affirmed by participants in both the quantitative and qualitative strands of the study. The findings also revealed teachers' sources of knowledge in vocabulary instruction, including teaching experience, disciplinary background, apprenticeship of observation, and others. The discussion of these findings suggests implications for practice and recommendations for future research to improve vocabulary instruction in secondary schools.
Towards Enhancing Time Series Contrastive Learning: A Dynamic Bad Pair Mining Approach
Not all positive pairs are beneficial to time series contrastive learning. In
this paper, we study two types of bad positive pairs that can impair the
quality of time series representation learned through contrastive learning: the
noisy positive pair and the faulty positive pair. We observe that, with the
presence of noisy positive pairs, the model tends to simply learn the pattern
of noise (Noisy Alignment). Meanwhile, when faulty positive pairs arise, the
model wastes a considerable amount of effort aligning non-representative patterns
(Faulty Alignment). To address this problem, we propose a Dynamic Bad Pair
Mining (DBPM) algorithm, which reliably identifies and suppresses bad positive
pairs in time series contrastive learning. Specifically, DBPM utilizes a memory
module to dynamically track the training behavior of each positive pair over
the course of training. This allows us to identify potential bad positive pairs at
each epoch based on their historical training behaviors. The identified bad
pairs are subsequently down-weighted through a transformation module, thereby
mitigating their negative impact on the representation learning process. DBPM
is a simple algorithm designed as a lightweight plug-in without learnable
parameters to enhance the performance of existing state-of-the-art methods.
Through extensive experiments conducted on four large-scale, real-world time
series datasets, we demonstrate DBPM's efficacy in mitigating the adverse
effects of bad positive pairs. Comment: ICLR 2024 Camera Ready (https://openreview.net/pdf?id=K2c04ulKXn)
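The mining-and-down-weighting idea described above can be sketched as follows. This is a simplified stand-in, not the paper's exact DBPM transformation: here the "memory module" is just an array of per-pair losses recorded each epoch, and pairs whose historical mean loss deviates sharply from the population are down-weighted by a simple z-score rule (the `beta` threshold and the rescaling formula are illustrative assumptions):

```python
import numpy as np

def dbpm_weights(loss_memory, epoch, beta=3.0):
    """Down-weight suspected bad positive pairs.

    loss_memory: (num_pairs, num_epochs) array of per-pair contrastive
    losses recorded so far (a simple stand-in for the memory module).
    Pairs whose mean historical loss deviates from the population mean
    by more than beta standard deviations are treated as potential bad
    pairs and down-weighted; all other pairs keep weight 1.0.
    """
    hist = loss_memory[:, :epoch + 1]            # behavior up to this epoch
    pair_mean = hist.mean(axis=1)                # per-pair historical mean loss
    mu, sigma = pair_mean.mean(), pair_mean.std()
    weights = np.ones(len(pair_mean))
    bad = np.abs(pair_mean - mu) > beta * sigma  # unusually low or high loss
    # Illustrative transformation: scale a bad pair's weight down in
    # proportion to how far its loss sits from the population mean.
    weights[bad] = mu / (pair_mean[bad] + 1e-8)
    return np.clip(weights, 0.0, 1.0)
```

The returned weights would multiply each pair's loss term, so typical pairs train as usual while outlier pairs contribute much less gradient.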
Information-Theoretic Characterization of the Generalization Error for Iterative Semi-Supervised Learning
Using information-theoretic principles, we consider the generalization error
(gen-error) of iterative semi-supervised learning (SSL) algorithms that
iteratively generate pseudo-labels for a large amount of unlabelled data to
progressively refine the model parameters. In contrast to most previous works
that {\em bound} the gen-error, we provide an {\em exact} expression for the
gen-error and particularize it to the binary Gaussian mixture model. Our
theoretical results suggest that when the class conditional variances are not
too large, the gen-error decreases with the number of iterations, but quickly
saturates. On the flip side, if the class conditional variances (and so amount
of overlap between the classes) are large, the gen-error increases with the
number of iterations. To mitigate this undesirable effect, we show that
regularization can reduce the gen-error. The theoretical results are
corroborated by extensive experiments on the MNIST and CIFAR datasets in which
we notice that for easy-to-distinguish classes, the gen-error improves after
several pseudo-labelling iterations, but saturates afterwards, and for more
difficult-to-distinguish classes, regularization improves the generalization
performance. Comment: 52 pages, 17 figures.
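The iterative pseudo-labelling loop analysed above can be sketched for a 1-D binary Gaussian mixture. This is a minimal toy version under stated assumptions: the classifier is just a midpoint threshold between estimated class means, which is not the paper's exact estimator, and the constants are chosen so the classes are easy to distinguish:

```python
import numpy as np

rng = np.random.default_rng(0)

# Binary Gaussian mixture: class means -mu and +mu, shared variance sigma^2.
mu, sigma = 2.0, 1.0            # small variance vs. mu: well-separated classes
n_lab, n_unl = 20, 2000

X_lab = np.concatenate([rng.normal(-mu, sigma, n_lab // 2),
                        rng.normal(+mu, sigma, n_lab // 2)])
y_lab = np.array([0] * (n_lab // 2) + [1] * (n_lab // 2))
X_unl = np.concatenate([rng.normal(-mu, sigma, n_unl // 2),
                        rng.normal(+mu, sigma, n_unl // 2)])

# Iterative SSL: threshold at the midpoint of the estimated class means,
# pseudo-label the unlabelled pool, then refit on labels + pseudo-labels.
theta = (X_lab[y_lab == 0].mean() + X_lab[y_lab == 1].mean()) / 2
for t in range(5):
    y_pseudo = (X_unl > theta).astype(int)
    m0 = np.concatenate([X_lab[y_lab == 0], X_unl[y_pseudo == 0]]).mean()
    m1 = np.concatenate([X_lab[y_lab == 1], X_unl[y_pseudo == 1]]).mean()
    theta = (m0 + m1) / 2
```

With well-separated classes, refitting on pseudo-labels quickly pulls the threshold toward the optimal decision boundary at 0 and then saturates, mirroring the regime the paper describes; with large class-conditional variances, the mislabelled overlap region would instead bias the refitted means.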
Differences in Species Composition of the Soil Seed Banks among Degraded Patches in an Agro-Pastoral Transition Zone in Inner Mongolian Steppe
Degraded grasslands at an agro-pastoral transition zone of the south Inner Mongolian steppe were distributed in patches characterized by fringed sagebrush (Artemisia frigida), narrowleaf stellera (Stellera chamaejasme), shining speargrass (Achnatherum splendens), or white swordflag (Iris lactea), all of which had retrogressed from a Leymus chinensis steppe. A control (undegraded) patch was located close to the four degraded patches. We investigated the size, composition, and species richness of the soil seed banks, and their relation to the aboveground vegetation. The density of soil seed banks was highest in the white swordflag patch, intermediate in the shining speargrass and undegraded patches, and lowest in the fringed sagebrush and narrowleaf stellera patches. The percentage of the persistent seed bank in the undegraded patch was higher than in the four degraded patches. Similarities between the soil seed bank of the undegraded patch and those of the degraded patches, and between the soil seed bank and standing vegetation of the undegraded patch, were all low. The potential for in situ regeneration of the established vegetation of the undegraded patch from the soil seed bank is low in all four patches. We can assume that restoration of these habitats cannot rely on seed banks alone.
MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation
This paper addresses the issue of modifying the visual appearance of videos
while preserving their motion. A novel framework, named MagicProp, is proposed,
which disentangles the video editing process into two stages: appearance
editing and motion-aware appearance propagation. In the first stage, MagicProp
selects a single frame from the input video and applies image-editing
techniques to modify the content and/or style of the frame. The flexibility of
these techniques enables the editing of arbitrary regions within the frame. In
the second stage, MagicProp employs the edited frame as an appearance reference
and generates the remaining frames using an autoregressive rendering approach.
To achieve this, a diffusion-based conditional generation model, called
PropDPM, is developed, which synthesizes the target frame by conditioning on
the reference appearance, the target motion, and its previous appearance. The
autoregressive editing approach ensures temporal consistency in the resulting
videos. Overall, MagicProp combines the flexibility of image-editing techniques
with the superior temporal consistency of autoregressive modeling, enabling
flexible editing of object types and aesthetic styles in arbitrary regions of
input videos while maintaining good temporal consistency across frames.
Extensive experiments in various video editing scenarios demonstrate the
effectiveness of MagicProp.
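The second-stage control flow described above, in which each frame is generated conditioned on the edited reference, the target motion, and the previously rendered frame, can be sketched as a plain autoregressive loop. The function and parameter names here are illustrative assumptions, with `render_step` standing in for the PropDPM conditional diffusion model:

```python
def propagate(edited_frame, motions, render_step):
    """Sketch of MagicProp's stage-two autoregressive propagation.

    edited_frame: the single frame edited in stage one (appearance reference).
    motions: per-frame target motion signals for the remaining frames.
    render_step: stand-in for PropDPM; maps (reference appearance,
    target motion, previous frame) to the next edited frame.
    """
    frames = [edited_frame]
    for motion in motions:
        # Each new frame conditions on the fixed reference, the target
        # motion, and the frame just rendered, so appearance stays
        # anchored while changes accumulate coherently over time.
        frames.append(render_step(edited_frame, motion, frames[-1]))
    return frames
```

Conditioning every step on both the fixed reference and the immediately preceding frame is what gives the autoregressive scheme its temporal consistency: drift away from the reference appearance is corrected at each step.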
MagicVideo: Efficient Video Generation With Latent Diffusion Models
We present an efficient text-to-video generation framework based on latent
diffusion models, termed MagicVideo. Given a text description, MagicVideo can
generate photo-realistic video clips with high relevance to the text content.
With the proposed efficient latent 3D U-Net design, MagicVideo can generate
video clips with 256x256 spatial resolution on a single GPU card, which is 64x
faster than the recent video diffusion model (VDM). Unlike previous works that
train video generation from scratch in the RGB space, we propose to generate
video clips in a low-dimensional latent space. We further utilize all the
convolution operator weights of pre-trained text-to-image generative U-Net
models for faster training. To achieve this, we introduce two new designs to
adapt the U-Net decoder to video data: a framewise lightweight adaptor for the
image-to-video distribution adjustment and a directed temporal attention module
to capture frame temporal dependencies. The whole generation process is within
the low-dimensional latent space of a pre-trained variational auto-encoder. We
demonstrate that MagicVideo can generate both realistic video content and
imaginary content in a photo-realistic style with a trade-off in terms of
quality and computational cost. Refer to https://magicvideo.github.io/# for
more examples.
- …