Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression
There have been significant advancements made by large language models (LLMs)
in various aspects of our daily lives. LLMs serve as a transformative force in
natural language processing, finding applications in text generation,
translation, sentiment analysis, and question-answering. The accomplishments of
LLMs have led to a substantial increase in research efforts in this domain. One
specific two-layer regression problem has been well-studied in prior works,
where the first layer is activated by a ReLU unit, and the second layer is
activated by a softmax unit. While previous works provide a solid analysis of
building a two-layer regression, there is still a gap in the analysis of
constructing regression problems with more than two layers.
In this paper, we take a crucial step toward addressing this problem: we
provide an analysis of a two-layer regression problem. In contrast to previous
works, our first layer is activated by a softmax unit. This sets the stage for
future analyses of creating more activation functions based on the softmax
function. Rearranging the softmax function leads to significantly different
analyses. Our main results involve analyzing the convergence properties of an
approximate Newton method used to minimize the regularized training loss. We
prove that the Hessian matrix of the loss function is positive definite and
Lipschitz continuous under certain assumptions. This enables us to establish
local convergence guarantees for the proposed training algorithm. Specifically,
with an appropriate initialization and after iterations,
our algorithm can find an -approximate minimizer of the training loss
with high probability. Each iteration requires approximately time, where is the model size, is the input matrix, and
is the matrix multiplication exponent
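The idea of an approximate Newton iteration on a regularized loss can be illustrated with a minimal sketch. It does not reproduce the paper's softmax regression loss or its Hessian approximation. As a loudly labeled stand-in, it uses an L2-regularized logistic loss and mimics an inexact Hessian by scaling the exact one by a hypothetical factor `1 + eps`. The names `approx_newton`, `grad_norm`, `lam`, and `eps` are illustrative assumptions.

```python
import numpy as np

def approx_newton(A, y, lam=1.0, eps=0.1, iters=30):
    """Approximate Newton iterations on an L2-regularized logistic loss.

    Toy stand-in: the loss and the Hessian approximation are assumptions,
    not the construction analyzed in the paper.
    """
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(A @ x)))            # predicted probabilities
        grad = A.T @ (p - y) + lam * x                # gradient of the regularized loss
        w = p * (1.0 - p)                             # per-sample curvature weights
        H = A.T @ (A * w[:, None]) + lam * np.eye(d)  # exact Hessian, at least lam * I
        H_approx = (1.0 + eps) * H                    # stand-in for an inexact Hessian
        x = x - np.linalg.solve(H_approx, grad)       # Newton-type update
    return x

def grad_norm(x, A, y, lam=1.0):
    """Euclidean norm of the gradient, used to check stationarity."""
    p = 1.0 / (1.0 + np.exp(-(A @ x)))
    return float(np.linalg.norm(A.T @ (p - y) + lam * x))
```

The regularizer keeps the Hessian positive definite, so each linear solve is well posed. With an inexact Hessian the iteration is expected to converge linearly near the minimizer rather than quadratically.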
Gender Bias in Large Language Models across Multiple Languages
With the growing deployment of large language models (LLMs) across various
applications, assessing the influence of gender biases embedded in LLMs becomes
crucial. The topic of gender bias within the realm of natural language
processing (NLP) has gained considerable focus, particularly in the context of
English. Nonetheless, the investigation of gender bias in languages other than
English is still relatively under-explored and insufficiently analyzed. In this
work, we examine gender bias in LLM-generated outputs for different languages.
We use three measurements: 1) gender bias in selecting descriptive words given
the gender-related context; 2) gender bias in selecting gender-related pronouns
(she/he) given the descriptive words; and 3) gender bias in the topics of
LLM-generated dialogues. We investigate the outputs of the GPT series of LLMs
in various languages using our three measurement methods. Our findings reveal
significant gender biases across all the languages we examined.
Comment: 20 pages, 27 tables, 7 figures, submitted to ACL202
Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator
3D-aware image synthesis aims at learning a generative model that can render
photo-realistic 2D images while capturing decent underlying 3D shapes. A
popular solution is to adopt the generative adversarial network (GAN) and
replace the generator with a 3D renderer, where volume rendering with neural
radiance field (NeRF) is commonly used. Despite the advancement of synthesis
quality, existing methods fail to obtain reasonable 3D shapes. We argue that,
considering the two-player game in the formulation of GANs, only making the
generator 3D-aware is not enough. In other words, replacing the generative
mechanism only offers the capability, but not the guarantee, of producing
3D-aware images, because the supervision of the generator primarily comes from
the discriminator. To address this issue, we propose GeoD through learning a
geometry-aware discriminator to improve 3D-aware GANs. Concretely, besides
differentiating real and fake samples from the 2D image space, the
discriminator is additionally asked to derive the geometry information from the
inputs, which is then applied as the guidance of the generator. Such a simple
yet effective design facilitates learning substantially more accurate 3D
shapes. Extensive experiments on various generator architectures and training
datasets verify the superiority of GeoD over state-of-the-art alternatives.
Moreover, our approach is formulated as a general framework such that a more
capable discriminator (i.e., with a third task of novel view synthesis beyond
domain classification and geometry extraction) can further assist the generator
with better multi-view consistency.
Comment: Accepted by NeurIPS 2022. Project page:
https://vivianszf.github.io/geo
Cross Entropy versus Label Smoothing: A Neural Collapse Perspective
Label smoothing loss is a widely adopted technique to mitigate overfitting in
deep neural networks. This paper studies label smoothing from the perspective
of Neural Collapse (NC), a powerful empirical and theoretical framework which
characterizes model behavior during the terminal phase of training. We first
show empirically that models trained with label smoothing converge faster to
neural collapse solutions and attain a stronger level of neural collapse.
Additionally, we show that at the same level of NC1, models under label
smoothing loss exhibit intensified NC2. These findings provide valuable
insights into the performance benefits and enhanced model calibration under
label smoothing loss. We then leverage the unconstrained feature model to
derive closed-form solutions for the global minimizers for both loss functions
and further demonstrate that models under label smoothing have a lower
condition number and, therefore, theoretically converge faster. Our study,
combining empirical evidence and theoretical results, not only provides nuanced
insights into the differences between label smoothing and cross-entropy losses,
but also serves as an example of how the powerful neural collapse framework can
be used to improve our understanding of DNNs.
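For readers unfamiliar with the two losses being compared, the following is a minimal sketch of label-smoothed cross-entropy. It assumes the common formulation in which the one-hot target is mixed with a uniform distribution over classes. The parameter name `delta` and both function names are illustrative, not taken from the paper.

```python
import numpy as np

def smoothed_targets(labels, num_classes, delta):
    """Mix one-hot targets with a uniform distribution (delta = 0 gives plain one-hot)."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - delta) * one_hot + delta / num_classes

def cross_entropy(logits, targets):
    """Mean cross-entropy between soft targets and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=1).mean())
```

Setting `delta` to zero recovers ordinary cross-entropy, so the same function can evaluate both losses on identical logits.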
LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis
This work presents an easy-to-use regularizer for GAN training, which helps
explicitly link some axes of the latent space to a set of pixels in the
synthesized image. Establishing such a connection facilitates a more convenient
local control of GAN generation, where users can alter the image content only
within a spatial area simply by partially resampling the latent code.
Experimental results confirm four appealing properties of our regularizer,
which we call LinkGAN. (1) The latent-pixel linkage is applicable to either a
fixed region (i.e., same for all instances) or a particular semantic
category (i.e., varying across instances), like the sky. (2) Two or multiple
regions can be independently linked to different latent axes, which further
supports joint control. (3) Our regularizer can improve the spatial
controllability of both 2D and 3D-aware GAN models, barely sacrificing the
synthesis performance. (4) The models trained with our regularizer are
compatible with GAN inversion techniques and maintain editability on real
images.
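The local control described above, partially resampling the latent code, can be sketched as follows. This is only an illustration: it assumes a standard Gaussian latent prior, and the choice of which axes are linked to a region (`linked_axes`) is hypothetical. The generator itself is not modeled.

```python
import numpy as np

def partially_resample(z, linked_axes, rng):
    """Return a copy of latent code z with only the linked axes redrawn.

    Assumes a standard normal prior; every other axis is left unchanged.
    """
    z_new = z.copy()
    z_new[linked_axes] = rng.standard_normal(len(linked_axes))
    return z_new
```

Feeding `z` and the resampled code to the same generator would then be expected to change only the image region tied to the linked axes, provided the regularizer has established that linkage.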
PropNet: Propagating 2D Annotation to 3D Segmentation for Gastric Tumors on CT Scans
**Background:** Accurate 3D CT scan segmentation of gastric tumors is pivotal
for diagnosis and treatment. The challenges lie in the irregular shapes,
blurred boundaries of tumors, and the inefficiency of existing methods.
**Purpose:** We conducted a study to introduce a model, utilizing
human-guided knowledge and unique modules, to address the challenges of 3D
tumor segmentation.
**Methods:** We developed the PropNet framework, propagating radiologists'
knowledge from 2D annotations to the entire 3D space. This model consists of a
proposing stage for coarse segmentation and a refining stage for improved
segmentation, using two-way branches for enhanced performance and an up-down
strategy for efficiency.
**Results:** With 98 patient scans for training and 30 for validation, our
method achieves a significant agreement with manual annotation (Dice of 0.803)
and improves efficiency. The performance is comparable in different scenarios
and with various radiologists' annotations (Dice between 0.785 and 0.803).
Moreover, the model shows improved prognostic prediction performance (C-index
of 0.620 vs. 0.576) on an independent validation set of 42 patients with
advanced gastric cancer.
**Conclusions:** Our model generates accurate tumor segmentation efficiently
and stably, improving prognostic performance and reducing high-throughput image
reading workload. This model can accelerate the quantitative analysis of
gastric tumors and enhance downstream task performance.
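The Dice values reported above measure the overlap between a predicted and a manually annotated mask. A minimal sketch of the standard Dice coefficient for binary 3D masks follows. The function name and the convention for two empty masks are assumptions, not details from the paper.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, truth).sum() / total)
```

A value of 1 means the masks coincide exactly and 0 means they share no voxels.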
Effects of the interactions between platelets with other cells in tumor growth and progression
It has been confirmed that platelets play a key role in tumorigenesis. Tumor-activated platelets can recruit blood cells and immune cells to migrate and establish an inflammatory tumor microenvironment at the sites of primary and metastatic tumors. On the other hand, they can also promote the differentiation of mesenchymal cells, which can accelerate the proliferation, genesis and migration of blood vessels. The role of platelets in tumors has been well studied. However, a growing number of studies suggest that interactions between platelets and immune cells (e.g., dendritic cells, natural killer cells, monocytes, and red blood cells) also play an important role in tumorigenesis and tumor development. In this review, we summarize the major cells that are closely associated with platelets and discuss the essential role of the interactions between platelets and these cells in tumorigenesis and tumor development.
Small-molecule activation of lysosomal TRP channels ameliorates Duchenne muscular dystrophy in mouse models
Duchenne muscular dystrophy (DMD) is a devastating disease caused by mutations in dystrophin that compromise sarcolemma integrity. Currently, there is no treatment for DMD. Mutations in transient receptor potential mucolipin 1 (ML1), a lysosomal Ca²⁺ channel required for lysosomal exocytosis, produce a DMD-like phenotype. Here, we show that transgenic overexpression or pharmacological activation of ML1 in vivo facilitates sarcolemma repair and alleviates the dystrophic phenotypes in both skeletal and cardiac muscles of mdx mice (a mouse model of DMD). Hallmark dystrophic features of DMD, including myofiber necrosis, central nucleation, fibrosis, elevated serum creatine kinase levels, reduced muscle force, impaired motor ability, and dilated cardiomyopathies, were all ameliorated by increasing ML1 activity. ML1-dependent activation of transcription factor EB (TFEB) corrects lysosomal insufficiency to diminish muscle damage. Hence, targeting lysosomal Ca²⁺ channels may represent a promising approach to treat DMD and related muscle diseases.
Arm-Constrained Curriculum Learning for Loco-Manipulation of the Wheel-Legged Robot
Incorporating a robotic manipulator into a wheel-legged robot enhances its
agility and expands its potential for practical applications. However, the
presence of potential instability and uncertainties presents additional
challenges for control objectives. In this paper, we introduce an
arm-constrained curriculum learning architecture to tackle the issues
introduced by adding the manipulator. Firstly, we develop an arm-constrained
reinforcement learning algorithm to ensure safety and stability in control
performance. Additionally, to address discrepancies in reward settings between
the arm and the base, we propose a reward-aware curriculum learning method. The
policy is first trained in Isaac Gym and transferred to the physical robot to
perform dynamic grasping tasks, including the door-opening task, fan-twitching task,
and the relay-baton-picking and following task. The results demonstrate that
our proposed approach effectively controls the arm-equipped wheel-legged robot
to master dynamic grasping skills, allowing it to chase and catch a moving
object while in motion. Please refer to our website
(https://acodedog.github.io/wheel-legged-loco-manipulation) for the code and
supplemental videos.
Evolution of LysM-RLK Gene Family in Wild and Cultivated Peanut Species
In legumes, LysM-RLK perception of rhizobial lipo-chitooligosaccharides (LCOs), known as Nod factors (NFs), triggers a signaling pathway related to the onset of symbiosis development. On the other hand, activation of LysM-RLKs upon recognition of chitin-derived short-chitooligosaccharides initiates defense responses. In this work, we identified the members of the LysM-RLK family in cultivated (Arachis hypogaea L.) and wild (A. duranensis and A. ipaensis) peanut genomes, and reconstructed the evolutionary history of the family. Phylogenetic analyses allowed the building of a framework to reinterpret the functional data reported on peanut LysM-RLKs. In addition, the potential involvement of two identified proteins in NF perception and immunity was assessed by gene expression analyses. Results indicated that peanut LysM-RLK is a highly diverse family. Digital expression analyses indicated that some A. hypogaea LysM-RLK receptors were upregulated during the early and late stages of symbiosis. In addition, expression profiles of selected LysM-RLK proteins suggest participation in the receptor network mediating NF and/or chitosan perception. The analyses of LysM-RLK in the non-model legume peanut can contribute to gaining insight into the molecular basis of legume–microbe interactions and to the understanding of the evolutionary history of this gene family within the Fabaceae.
Affiliation: Rodriguez Melo, Johan Stiben. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Agrobiotecnología del Litoral. Universidad Nacional del Litoral. Instituto de Agrobiotecnología del Litoral; Argentina
Affiliation: Tonelli, Maria Laura. Universidad Nacional de Río Cuarto. Facultad de Ciencias Exactas Fisicoquímicas y Naturales. Instituto de Investigaciones Agrobiotecnológicas. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigaciones Agrobiotecnológicas; Argentina
Affiliation: Barbosa, María Carolina. Universidad Nacional de Río Cuarto. Facultad de Ciencias Exactas Fisicoquímicas y Naturales. Instituto de Investigaciones Agrobiotecnológicas. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigaciones Agrobiotecnológicas; Argentina
Affiliation: Ariel, Federico. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Agrobiotecnología del Litoral. Universidad Nacional del Litoral. Instituto de Agrobiotecnología del Litoral; Argentina
Affiliation: Zhao, Zifan. University of Florida; United States
Affiliation: Wang, Jianping. University of Florida; United States
Affiliation: Fabra, Adriana Isidora. Universidad Nacional de Río Cuarto. Facultad de Ciencias Exactas Fisicoquímicas y Naturales. Instituto de Investigaciones Agrobiotecnológicas. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigaciones Agrobiotecnológicas; Argentina
Affiliation: Ibañez, Fernando Julio. Universidad Nacional de Río Cuarto. Facultad de Ciencias Exactas Fisicoquímicas y Naturales. Instituto de Investigaciones Agrobiotecnológicas. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigaciones Agrobiotecnológicas; Argentina