115 research outputs found
Phase equilibrium simulation and its application in crystallization processes
Solid-liquid phase equilibrium information is essential to the research and development of crystallization processes. Computer simulation of multicomponent solid-liquid equilibrium avoids the tedious experimental determination traditionally required. Phase equilibrium simulation requires an accurate thermodynamic model to describe the solution chemistry and a usable mathematical procedure to obtain reliable solutions. In this work, a modified activity coefficient model is presented. The modification makes the model more practical to use. A new numerical algorithm, based on a large-scale optimization technique, is used for the phase equilibrium calculation. This new method takes advantage of the thermodynamic properties of solid-liquid equilibrium and unifies thermodynamics and mathematics; the numerical procedures have real physical meanings. The phase diagram of the industrially important system Na-K-Mg-Cl-NO3-H2O is calculated at various temperatures using the new method. The results compare well with the available experimental data.
A Unified Scheme of ResNet and Softmax
Large language models (LLMs) have brought significant changes to human
society. Softmax regression and residual neural networks (ResNet) are two
important techniques in deep learning: they not only serve as significant
theoretical components supporting the functionality of LLMs but also are
related to many other machine learning and theoretical computer science fields,
including but not limited to image classification, object detection, semantic
segmentation, and tensors.
Previous research works studied these two concepts separately. In this paper,
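we provide a theoretical analysis of the regression problem: $\| \langle \exp(Ax) + Ax, \mathbf{1}_n \rangle^{-1} (\exp(Ax) + Ax) - b \|_2^2$, where
$A$ is a matrix in $\mathbb{R}^{n \times d}$, $b$ is a vector in
$\mathbb{R}^n$, and $\mathbf{1}_n$ is the $n$-dimensional vector whose entries are
all $1$. This regression problem is a unified scheme that combines softmax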
regression and ResNet, which has never been done before. We derive the
gradient, Hessian, and Lipschitz properties of the loss function. The Hessian
is shown to be positive semidefinite, and its structure is characterized as the
sum of a low-rank matrix and a diagonal matrix. This enables an efficient
approximate Newton method.
As a result, this unified scheme helps to connect two fields previously thought
to be unrelated and provides novel insight into the loss landscape and
optimization for emerging over-parameterized neural networks, which is
meaningful for future research in deep learning models.
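As a rough illustration of the regression objective analysed in this paper, the sketch below evaluates the loss for small inputs in plain Python. It assumes the objective takes the form || <exp(Ax) + Ax, 1_n>^{-1} (exp(Ax) + Ax) - b ||_2^2, with A an n x d matrix and b a length-n vector. The function name `unified_loss` and the list-of-rows layout are illustrative choices, not taken from the paper.

```python
import math


def unified_loss(A, x, b):
    """Evaluate || <u, 1_n>^{-1} u - b ||_2^2 with u = exp(Ax) + Ax.

    A is a list of n rows of length d, x has length d, b has length n.
    The form of the objective is an assumption stated in the lead-in.
    """
    # Matrix-vector product Ax (length n).
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    # Softmax-style term exp(Ax) plus the residual (skip) term Ax.
    u = [math.exp(v) + v for v in Ax]
    # Normaliser <u, 1_n>, i.e. the sum of the entries of u.
    alpha = sum(u)
    # Squared Euclidean distance between the normalised vector and b.
    return sum((u_i / alpha - b_i) ** 2 for u_i, b_i in zip(u, b))


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(unified_loss(A, [0.1, -0.2], [0.3, 0.3, 0.4]))
```

Setting x to zero gives u = 1_n, so the normalised vector is uniform with entries 1/n. The sketch does not guard against the normaliser being close to zero, which can happen because exp(v) + v is negative for sufficiently negative v.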
TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
Diffusion models have achieved great success in a range of tasks, such as
image synthesis and molecule design. As such successes hinge on large-scale
training data collected from diverse sources, the trustworthiness of these
collected data is hard to control or audit. In this work, we aim to explore the
vulnerabilities of diffusion models under potential training data manipulations
and try to answer: How hard is it to perform Trojan attacks on well-trained
diffusion models? What are the adversarial targets that such Trojan attacks can
achieve? To answer these questions, we propose an effective Trojan attack
against diffusion models, TrojDiff, which optimizes the Trojan diffusion and
generative processes during training. In particular, we design novel
transitions during the Trojan diffusion process to diffuse adversarial targets
into a biased Gaussian distribution and propose a new parameterization of the
Trojan generative process that leads to an effective training objective for the
attack. In addition, we consider three types of adversarial targets: the
Trojaned diffusion models will always output instances belonging to a certain
class from the in-domain distribution (In-D2D attack), out-of-domain
distribution (Out-D2D attack), and one specific instance (D2I attack). We
evaluate TrojDiff on CIFAR-10 and CelebA datasets against both DDPM and DDIM
diffusion models. We show that TrojDiff always achieves high attack performance
under different adversarial targets using different types of triggers, while
the performance in benign environments is preserved. The code is available at
https://github.com/chenweixin107/TrojDiff.
Comment: CVPR202
DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration
We present DeepICP - a novel end-to-end learning-based 3D point cloud
registration framework that achieves comparable registration accuracy to prior
state-of-the-art geometric methods. Unlike other keypoint-based methods,
where a RANSAC procedure is usually needed, we use various
deep neural network structures to establish an end-to-end trainable network.
Our keypoint detector is trained through this end-to-end structure, which enables
the system to avoid interference from dynamic objects and to leverage
sufficiently salient features on stationary objects, and as a result achieves
high robustness. Rather than searching for the corresponding points among existing
points, the key contribution is that we generate them based on
learned matching probabilities among a group of candidates, which can boost the
registration accuracy. Our loss function incorporates both the local similarity
and the global geometric constraints to ensure that all of the above network designs
converge in the right direction. We comprehensively validate the
effectiveness of our approach using both the KITTI dataset and the
Apollo-SouthBay dataset. Results demonstrate that our method achieves
comparable or better performance than the state-of-the-art geometry-based
methods. Detailed ablation and visualization analysis are included to further
illustrate the behavior and insights of our network. The low registration error
and high robustness of our method make it attractive for substantial
applications relying on the point cloud registration task.
Comment: 10 pages, 6 figures, 3 tables, typos corrected, experimental results
updated, accepted by ICCV 201
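The abstract describes generating each corresponding point from learned matching probabilities over a group of candidates rather than picking an existing point. One common way to realise such a step is a softmax-weighted average of candidate coordinates, and the sketch below shows that idea only. The scores are hypothetical stand-ins for whatever similarity the network produces, and `soft_correspondence` is an illustrative name, not the authors' code.

```python
import math


def soft_correspondence(candidates, scores):
    """Return a generated 3D point as a probability-weighted mean of candidates.

    candidates: list of (x, y, z) tuples.
    scores: one hypothetical matching score per candidate (higher = better).
    """
    # Softmax over the scores, shifted by the maximum for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Weighted average of the candidate coordinates, axis by axis.
    return tuple(
        sum(p * point[axis] for p, point in zip(probs, candidates))
        for axis in range(3)
    )


if __name__ == "__main__":
    grid = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(soft_correspondence(grid, [0.2, 2.0, 0.1]))
```

Because the output is a differentiable function of the scores, a point produced this way need not coincide with any existing candidate, which is the property the abstract highlights.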
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Large language models (LLMs) have played a pivotal role in revolutionizing
various facets of our daily existence. Solving attention regression is a
fundamental task in optimizing LLMs. In this work, we focus on giving a
provable guarantee for the one-layer attention network objective function
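$L(X,Y) = \sum_{j_0=1}^{n} \sum_{i_0=1}^{d} ( \langle \langle \exp(\mathsf{A}_{j_0} x), \mathbf{1}_n \rangle^{-1} \exp(\mathsf{A}_{j_0} x), A_3 Y_{*,i_0} \rangle - b_{j_0,i_0} )^2$.
Here $\mathsf{A} \in \mathbb{R}^{n^2 \times d^2}$ is the Kronecker product between $A_1 \in \mathbb{R}^{n \times d}$ and
$A_2 \in \mathbb{R}^{n \times d}$. $A_3$ is a matrix in $\mathbb{R}^{n \times d}$, and $\mathsf{A}_{j_0} \in \mathbb{R}^{n \times d^2}$ is the $j_0$-th block of
$\mathsf{A}$. The $X, Y \in \mathbb{R}^{d \times d}$ are variables we want to
learn. $B \in \mathbb{R}^{n \times d}$ and $b_{j_0,i_0} \in \mathbb{R}$ is the
entry at the $j_0$-th row and $i_0$-th column of $B$, $Y_{*,i_0} \in \mathbb{R}^d$
is the $i_0$-th column vector of $Y$, and $x \in \mathbb{R}^{d^2}$ is the
vectorization of $X$.
In a multi-layer LLM network, the matrix $B \in \mathbb{R}^{n \times d}$ can
be viewed as the output of a layer, and $A_1 = A_2 = A_3 \in \mathbb{R}^{n \times d}$ can be viewed as the input of a layer. The matrix version of $x$ can
be viewed as $QK^\top$ and $Y$ can be viewed as $V$. We provide an iterative
greedy algorithm to train the loss function $L(X,Y)$ up to $\epsilon$ that runs in
$\widetilde{O}( (\mathcal{T}_{\mathrm{mat}}(n,n,d) + \mathcal{T}_{\mathrm{mat}}(n,d,d) + d^{2\omega}) \log(1/\epsilon) )$
time. Here $\mathcal{T}_{\mathrm{mat}}(a,b,c)$ denotes the time of multiplying an $a \times b$ matrix by
a $b \times c$ matrix, and $\omega \approx 2.37$ denotes the exponent of
matrix multiplication.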
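To make the one-layer attention objective more concrete, the sketch below evaluates a loss of this kind in its matrix form for tiny inputs. It assumes the standard single-layer softmax attention structure, namely the squared Frobenius distance between rowsoftmax(A1 X A2^T) (A3 Y) and B, which corresponds to reading X as the product of query and key weights and Y as the value weights. The helper names (`matmul`, `transpose`, `attention_loss`) are illustrative, and this is only a loss evaluation, not the paper's training algorithm.

```python
import math


def matmul(P, Q):
    """Plain-Python matrix product of two list-of-rows matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def transpose(M):
    """Transpose a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]


def attention_loss(A1, A2, A3, X, Y, B):
    """Sum over rows j0 and columns i0 of (softmax row j0 . (A3 Y)[:, i0] - B[j0][i0])^2.

    A1, A2, A3, B are n x d; X, Y are d x d. The form is an assumption
    stated in the lead-in.
    """
    logits = matmul(matmul(A1, X), transpose(A2))  # n x n attention logits
    values = matmul(A3, Y)                         # n x d value matrix
    loss = 0.0
    for j0, row in enumerate(logits):
        # Row-wise softmax, shifted by the row maximum for stability.
        top = max(row)
        exps = [math.exp(v - top) for v in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        for i0 in range(len(values[0])):
            pred = sum(p * values[k][i0] for k, p in enumerate(probs))
            loss += (pred - B[j0][i0]) ** 2
    return loss


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    I2 = [[1.0, 0.0], [0.0, 1.0]]
    print(attention_loss(A, A, A, I2, I2, [[0.0, 0.0]] * 3))
```

With X set to zero every softmax row is uniform, so each prediction is the column mean of A3 Y, which gives a quick sanity check on the implementation.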
Visual Persuasion: Inferring Communicative Intents of Images
In this paper we introduce the novel problem of understanding visual persuasion. Modern mass media make extensive use of images to persuade people to make commercial and political decisions. These effects and techniques are widely studied in the social sciences, but behavioral studies do not scale to massive datasets. Computer vision has made great strides in building syntactical representations of images, such as detection and identification of objects. However, the pervasive use of images for communicative purposes has been largely ignored. We extend the significant advances in syntactic analysis in computer vision to the higher-level challenge of understanding the underlying communicative intent implied in images. We begin by identifying nine dimensions of persuasive intent latent in images of politicians, such as “socially dominant,” “energetic,” and “trustworthy,” and propose a hierarchical model that builds on the layer of syntactical attributes, such as “smile” and “waving hand,” to predict the intents presented in the images. To facilitate progress, we introduce a new dataset of 1,124 images of politicians labeled with ground-truth intents in the form of rankings. This study demonstrates that a systematic focus on visual persuasion opens up the field of computer vision to a new class of investigations around mediated images, intersecting with media analysis, psychology, and political communication.
Learning Point-Language Hierarchical Alignment for 3D Visual Grounding
This paper presents a novel hierarchical alignment model (HAM) that learns
multi-granularity visual and linguistic representations in an end-to-end
manner. We extract key points and proposal points to model 3D contexts and
instances, and propose a point-language alignment with context modulation (PLACM)
mechanism, which learns to gradually align word-level and sentence-level
linguistic embeddings with visual representations, while the modulation with
the visual context captures latent informative relationships. To further
capture both global and local relationships, we propose a spatially
multi-granular modeling scheme that applies PLACM to both global and local
fields. Experimental results demonstrate the superiority of HAM, with
visualized results showing that it can dynamically model fine-grained visual
and linguistic representations. HAM outperforms existing methods by a
significant margin and achieves state-of-the-art performance on two publicly
available datasets, and won the championship in the ECCV 2022 ScanRefer challenge.
Code is available at https://github.com/PPjmchen/HAM.
Comment: Champion on ECCV 2022 ScanRefer Challenge
A promising Na3V2(PO4)3 cathode for use in the construction of high energy batteries
High-energy batteries need cathodes which can simultaneously provide large specific
capacities and high discharge plateaus. NASICON-structured Na3V2(PO4)3 (NVP) has been utilised as a
promising cathode to meet this requirement and be used in the construction of high energy batteries. In
a hybrid-ion battery employing metallic lithium as the anode, NVP exhibits an initial specific capacity of
170 mA h g−1 in the voltage range of 1.6–4.8 V with a long discharge plateau around 3.7 V. Three Na(2)
sites in NVP are found to be usable through the application of a wide voltage window, but only
two of them are able to undergo ion exchange to produce a NaLi2V2(PO4)3 phase. However, a hybrid-ion
migration mechanism is suggested to describe the whole ion transport, in which the effect of a
Na-ion “barrier” results in a lowered ion diffusion rate and observed specific capacity.
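For context on the reported 170 mA h g−1, the theoretical specific capacity of an electrode material follows from Faraday's law, Q = nF / (3.6 M), with n the number of electrons transferred per formula unit and M the molar mass in g/mol. The short sketch below applies this to Na3V2(PO4)3 using rounded standard atomic masses. The choice of n is left as a parameter, since the abstract itself distinguishes between two and three usable sodium sites, and the function names are illustrative.

```python
FARADAY = 96485.33212  # Faraday constant in C/mol

# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"Na": 22.990, "V": 50.942, "P": 30.974, "O": 15.999}

# Composition of one formula unit of Na3V2(PO4)3.
NVP = {"Na": 3, "V": 2, "P": 3, "O": 12}


def molar_mass(composition):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def theoretical_capacity(n_electrons, molar_mass_g_mol):
    """Theoretical specific capacity in mA h per gram via Faraday's law.

    Dividing by 3.6 converts coulombs per gram to milliampere-hours per gram.
    """
    return n_electrons * FARADAY / (3.6 * molar_mass_g_mol)


if __name__ == "__main__":
    m = molar_mass(NVP)
    for n in (2, 3):
        print(n, round(theoretical_capacity(n, m), 1))
```

Under these assumptions, two electrons per formula unit correspond to roughly 118 mA h g−1 and three to roughly 176 mA h g−1, so the reported initial capacity lies close to the three-electron value.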
1. Introduction
Lithium-ion battery (LIB) technology is critically needed for many
applications in a plethora of industries and is an important energy-storage
solution which can potentially be applied, for instance,
in electric vehicles (EVs).1,2 However, LIB has continued to be
primarily relegated to the electronics market, mainly due to its
cost and material issues,3 and the lack of high-performance
cathode materials has become a technological bottleneck for
the commercial development of advanced LIB.4 Particularly for
the entrance of LIB into high-energy fields, such as EVs and
renewable energy storage in smart grids, the demand for high-capacity
and high-voltage cathodes is starting to become a key focus
of research.
In the search for new positive-electrode materials for LIB,
recent research has focused upon nano-structured lithium
transition-metal phosphates that exhibit desirable properties
such as high energy storage capacities combined with electrochemical
stability.5,6 Olivine LiFePO4,7 as one member of this
class, has risen to prominence so far due to other characteristics
involving low cost, low environmental impact and safety,
which ar
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Generative Pre-trained Transformer (GPT) models have exhibited exciting
progress in capabilities, capturing the interest of practitioners and the
public alike. Yet, while the literature on the trustworthiness of GPT models
remains limited, practitioners have proposed employing capable GPT models for
sensitive applications such as healthcare and finance, where mistakes can be
costly. To this end, this work proposes a comprehensive trustworthiness
evaluation for large language models with a focus on GPT-4 and GPT-3.5,
considering diverse perspectives - including toxicity, stereotype bias,
adversarial robustness, out-of-distribution robustness, robustness on
adversarial demonstrations, privacy, machine ethics, and fairness. Based on our
evaluations, we discover previously unpublished vulnerabilities to
trustworthiness threats. For instance, we find that GPT models can be easily
misled to generate toxic and biased outputs and leak private information in
both training data and conversation history. We also find that although GPT-4
is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more
vulnerable given jailbreaking system or user prompts, potentially because
GPT-4 follows (misleading) instructions more precisely. Our
work illustrates a comprehensive trustworthiness evaluation of GPT models and
sheds light on the trustworthiness gaps. Our benchmark is publicly available at
https://decodingtrust.github.io/