Search CORE

115 research outputs found

Phase equilibrium simulation and its application in crystallization processes

Author: Song Weixin
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/1991

Solid-liquid phase equilibrium information is essential to the research and development of crystallization processes. Computer simulation of multicomponent solid-liquid equilibrium avoids the traditional, tedious experimental determination. The phase equilibrium simulation requires an accurate thermodynamic model to describe the solution chemistry and a usable mathematical procedure to obtain reliable solutions. In this work, a modified activity coefficient model is presented. The modification makes the model more practical to use. A new numerical algorithm, based on a large-scale optimization technique, is used for phase equilibrium calculation. This new method takes advantage of the thermodynamic properties of the solid-liquid equilibrium and unifies thermodynamics and mathematics; the numerical procedures have real physical meanings. The phase diagram at various temperatures of the industrially important system Na-K-Mg-Cl-NO3-H2O is calculated using the new method. The results compare well with the available experimental data.

Digital Repository @ Iowa State University (ISU)

A Unified Scheme of ResNet and Softmax

Author: Song Zhao
Wang Weixin
Yin Junze
Publication date: 23/09/2023

Large language models (LLMs) have brought significant changes to human society. Softmax regression and residual neural networks (ResNet) are two important techniques in deep learning: they not only serve as significant theoretical components supporting the functionality of LLMs but also are related to many other machine learning and theoretical computer science fields, including but not limited to image classification, object detection, semantic segmentation, and tensors. Previous research works studied these two concepts separately. In this paper, we provide a theoretical analysis of the regression problem $\| \langle \exp(Ax) + Ax , {\bf 1}_n \rangle^{-1} ( \exp(Ax) + Ax ) - b \|_2^2$, where $A$ is a matrix in $\mathbb{R}^{n \times d}$, $b$ is a vector in $\mathbb{R}^n$, and ${\bf 1}_n$ is the $n$-dimensional vector whose entries are all $1$. This regression problem is a unified scheme that combines softmax regression and ResNet, which has never been done before. We derive the gradient, Hessian, and Lipschitz properties of the loss function. The Hessian is shown to be positive semidefinite, and its structure is characterized as the sum of a low-rank matrix and a diagonal matrix. This enables an efficient approximate Newton method. As a result, this unified scheme helps to connect two previously thought unrelated fields and provides novel insight into loss landscape and optimization for emerging over-parameterized neural networks, which is meaningful for future research in deep learning models.
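As a reading aid for the loss stated in the abstract above, the following is a minimal sketch that only evaluates the objective for small inputs. The function name `unified_loss` and the sample values of `A`, `x`, and `b` are illustrative assumptions, not taken from the paper; the gradient, Hessian, and approximate Newton method mentioned in the abstract are not reproduced here.

```python
import math

def unified_loss(A, x, b):
    """Evaluate || <u, 1_n>^{-1} u - b ||_2^2 with u = exp(Ax) + Ax (entrywise exp).

    A is a list of n rows of length d, x has length d, b has length n.
    Hypothetical helper for illustration only.
    """
    # Ax: one inner product per row of A
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    # u combines the softmax-style term exp(Ax) with the residual-style term Ax
    u = [math.exp(v) + v for v in Ax]
    # <u, 1_n> is the sum of the entries of u; this sketch assumes it is nonzero
    alpha = sum(u)
    return sum((u_i / alpha - b_i) ** 2 for u_i, b_i in zip(u, b))

# Assumed toy data with n = 3 and d = 2
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.2, 0.3, 0.5]
loss_at_zero = unified_loss(A, [0.0, 0.0], b)
```

At `x = 0` every entry of `u` equals 1, so the normalized vector is uniform and the loss reduces to the squared distance between `b` and the uniform vector.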

arXiv.org e-Print Archive

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

Author: Chen Weixin
Li Bo
Song Dawn
Publication date: 10/03/2023

Diffusion models have achieved great success in a range of tasks, such as image synthesis and molecule design. As such successes hinge on large-scale training data collected from diverse sources, the trustworthiness of these collected data is hard to control or audit. In this work, we aim to explore the vulnerabilities of diffusion models under potential training data manipulations and try to answer: How hard is it to perform Trojan attacks on well-trained diffusion models? What are the adversarial targets that such Trojan attacks can achieve? To answer these questions, we propose an effective Trojan attack against diffusion models, TrojDiff, which optimizes the Trojan diffusion and generative processes during training. In particular, we design novel transitions during the Trojan diffusion process to diffuse adversarial targets into a biased Gaussian distribution and propose a new parameterization of the Trojan generative process that leads to an effective training objective for the attack. In addition, we consider three types of adversarial targets: the Trojaned diffusion models will always output instances belonging to a certain class from the in-domain distribution (In-D2D attack), out-of-domain distribution (Out-D2D attack), and one specific instance (D2I attack). We evaluate TrojDiff on CIFAR-10 and CelebA datasets against both DDPM and DDIM diffusion models. We show that TrojDiff always achieves high attack performance under different adversarial targets using different types of triggers, while the performance in benign environments is preserved. The code is available at https://github.com/chenweixin107/TrojDiff. Comment: CVPR202

arXiv.org e-Print Archive

DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration

Author: Fu Xiangyu
Lu Weixin
Song Shiyu
Wan Guowei
Yuan Pengfei
Zhou Yao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 16/09/2019

We present DeepICP - a novel end-to-end learning-based 3D point cloud registration framework that achieves comparable registration accuracy to prior state-of-the-art geometric methods. Different from other keypoint based methods where a RANSAC procedure is usually needed, we implement the use of various deep neural network structures to establish an end-to-end trainable network. Our keypoint detector is trained through this end-to-end structure and enables the system to avoid the interference of dynamic objects, leverages the help of sufficiently salient features on stationary objects, and as a result, achieves high robustness. Rather than searching the corresponding points among existing points, the key contribution is that we innovatively generate them based on learned matching probabilities among a group of candidates, which can boost the registration accuracy. Our loss function incorporates both the local similarity and the global geometric constraints to ensure all above network designs can converge towards the right direction. We comprehensively validate the effectiveness of our approach using both the KITTI dataset and the Apollo-SouthBay dataset. Results demonstrate that our method achieves comparable or better performance than the state-of-the-art geometry-based methods. Detailed ablation and visualization analysis are included to further illustrate the behavior and insights of our network. The low registration error and high robustness of our method make it attractive for substantial applications relying on the point cloud registration task. Comment: 10 pages, 6 figures, 3 tables, typos corrected, experimental results updated, accepted by ICCV 201
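The idea of generating a corresponding point from matching probabilities over a group of candidates, rather than picking an existing point, can be illustrated with a small sketch. This is not the DeepICP network: the function name `soft_corresponding_point`, the use of a plain softmax over scores, and the sample candidates are assumptions made for illustration, and in the paper the probabilities are learned by the network.

```python
import math

def soft_corresponding_point(candidates, scores):
    """Return the probability-weighted average of candidate 3D points.

    candidates: list of (x, y, z) tuples; scores: one matching score per candidate.
    A softmax turns the scores into matching probabilities (assumed form).
    """
    # Subtract the maximum score before exponentiating for numerical stability
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    probs = [w / z for w in weights]
    # The generated point is a convex combination of the candidates,
    # so it need not coincide with any existing point
    return tuple(sum(p * c[axis] for p, c in zip(probs, candidates)) for axis in range(3))

# Assumed toy candidates around a keypoint
cands = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
uniform_point = soft_corresponding_point(cands, [0.0, 0.0, 0.0, 0.0])
peaked_point = soft_corresponding_point(cands, [0.0, 50.0, 0.0, 0.0])
```

With equal scores the generated point is the centroid of the candidates; with one dominant score it approaches that candidate, which is how a soft assignment can interpolate between existing points.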

arXiv.org e-Print Archive

Crossref

A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time

Author: Gao Yeqi
Song Zhao
Wang Weixin
Yin Junze
Publication date: 14/09/2023

Large language models (LLMs) have played a pivotal role in revolutionizing various facets of our daily existence. Solving attention regression is a fundamental task in optimizing LLMs. In this work, we focus on giving a provable guarantee for the one-layer attention network objective function $L(X,Y) = \sum_{j_0 = 1}^n \sum_{i_0 = 1}^d ( \langle \langle \exp( \mathsf{A}_{j_0} x ) , {\bf 1}_n \rangle^{-1} \exp( \mathsf{A}_{j_0} x ), A_{3} Y_{*,i_0} \rangle - b_{j_0,i_0} )^2$. Here $\mathsf{A} \in \mathbb{R}^{n^2 \times d^2}$ is the Kronecker product between $A_1 \in \mathbb{R}^{n \times d}$ and $A_2 \in \mathbb{R}^{n \times d}$, $A_3$ is a matrix in $\mathbb{R}^{n \times d}$, and $\mathsf{A}_{j_0} \in \mathbb{R}^{n \times d^2}$ is the $j_0$-th block of $\mathsf{A}$. The matrices $X, Y \in \mathbb{R}^{d \times d}$ are the variables we want to learn. $B \in \mathbb{R}^{n \times d}$, and $b_{j_0,i_0} \in \mathbb{R}$ is the entry at the $j_0$-th row and $i_0$-th column of $B$; $Y_{*,i_0} \in \mathbb{R}^d$ is the $i_0$-th column vector of $Y$, and $x \in \mathbb{R}^{d^2}$ is the vectorization of $X$. In a multi-layer LLM network, the matrix $B \in \mathbb{R}^{n \times d}$ can be viewed as the output of a layer, and $A_1= A_2 = A_3 \in \mathbb{R}^{n \times d}$ can be viewed as the input of a layer. The matrix version of $x$ can be viewed as $QK^\top$ and $Y$ can be viewed as $V$. We provide an iterative greedy algorithm to train the loss function $L(X,Y)$ up to $\epsilon$ that runs in $\widetilde{O}( ({\cal T}_{\mathrm{mat}}(n,n,d) + {\cal T}_{\mathrm{mat}}(n,d,d) + d^{2\omega}) \log(1/\epsilon) )$ time. Here ${\cal T}_{\mathrm{mat}}(a,b,c)$ denotes the time of multiplying an $a \times b$ matrix by another $b \times c$ matrix, and $\omega\approx 2.37$ denotes the exponent of matrix multiplication.
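To make the notation in the abstract above concrete, the sketch below evaluates $L(X,Y)$ in two ways for tiny inputs: block by block through the Kronecker product, and in the familiar matrix form where each row of $\exp(A_1 X A_2^\top)$ is normalized and multiplied by $A_3 Y$. The helper names, the sample data, and the choice of row-major vectorization for $x$ are assumptions of this sketch (the abstract only says "vectorization"); it does not implement the paper's training algorithm.

```python
import math

def matmul(P, Q):
    """Plain-Python matrix product of P (p x q) and Q (q x r)."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def attention_loss_matrix(A1, A2, A3, X, Y, B):
    """Matrix form: sum of squared entries of rownorm(exp(A1 X A2^T)) A3 Y - B."""
    S = matmul(matmul(A1, X), transpose(A2))   # n x n
    V = matmul(A3, Y)                          # n x d
    total = 0.0
    for j0, row in enumerate(S):
        e = [math.exp(s) for s in row]
        z = sum(e)                             # <exp(.), 1_n>
        for i0 in range(len(V[0])):
            pred = sum(e[k] * V[k][i0] for k in range(len(e))) / z
            total += (pred - B[j0][i0]) ** 2
    return total

def attention_loss_kron(A1, A2, A3, X, Y, B):
    """Kronecker form, following the abstract's definition block by block."""
    n, d = len(A1), len(A1[0])
    x = [X[a][c] for a in range(d) for c in range(d)]   # row-major vec(X) (assumed)
    V = matmul(A3, Y)                                   # column i0 of V is A3 Y_{*,i0}
    total = 0.0
    for j0 in range(n):
        # j0-th block of A1 (x) A2: row k is the Kronecker product A1[j0, :] (x) A2[k, :]
        block = [[A1[j0][a] * A2[k][c] for a in range(d) for c in range(d)]
                 for k in range(n)]
        e = [math.exp(sum(r * xv for r, xv in zip(row, x))) for row in block]
        z = sum(e)
        for i0 in range(d):
            pred = sum(e[k] * V[k][i0] for k in range(n)) / z
            total += (pred - B[j0][i0]) ** 2
    return total

# Assumed toy data with n = d = 2 and A1 = A2 = A3
A = [[1.0, 0.0], [0.5, -1.0]]
X = [[0.1, 0.2], [0.0, -0.3]]
Y = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.0, 0.0], [0.0, 0.0]]
loss_matrix = attention_loss_matrix(A, A, A, X, Y, B)
loss_kron = attention_loss_kron(A, A, A, X, Y, B)
```

Under the row-major convention the two evaluations agree, which is one way to read the abstract's remark that the matrix version of $x$ plays the role of $QK^\top$ and $Y$ the role of $V$.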

arXiv.org e-Print Archive

Visual Persuasion: Inferring Communicative Intents of Images

Author: Francis F. Steen
Jungseock Joo
Song-chun Zhu
Weixin Li
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

In this paper we introduce the novel problem of understanding visual persuasion. Modern mass media make extensive use of images to persuade people to make commercial and political decisions. These effects and techniques are widely studied in the social sciences, but behavioral studies do not scale to massive datasets. Computer vision has made great strides in building syntactical representations of images, such as detection and identification of objects. However, the pervasive use of images for communicative purposes has been largely ignored. We extend the significant advances in syntactic analysis in computer vision to the higher-level challenge of understanding the underlying communicative intent implied in images. We begin by identifying nine dimensions of persuasive intent latent in images of politicians, such as “socially dominant,” “energetic,” and “trustworthy,” and propose a hierarchical model that builds on the layer of syntactical attributes, such as “smile” and “waving hand,” to predict the intents presented in the images. To facilitate progress, we introduce a new dataset of 1,124 images of politicians labeled with ground-truth intents in the form of rankings. This study demonstrates that a systematic focus on visual persuasion opens up the field of computer vision to a new class of investigations around mediated images, intersecting with media analysis, psychology, and political communication.

CiteSeerX

Crossref

Learning Point-Language Hierarchical Alignment for 3D Visual Grounding

Author: Chen Jiaming
Luo Weixin
Ma Lin
Song Ran
Wei Xiaolin
Zhang Wei
Publication date: 05/06/2023

This paper presents a novel hierarchical alignment model (HAM) that learns multi-granularity visual and linguistic representations in an end-to-end manner. We extract key points and proposal points to model 3D contexts and instances, and propose a point-language alignment with context modulation (PLACM) mechanism, which learns to gradually align word-level and sentence-level linguistic embeddings with visual representations, while the modulation with the visual context captures latent informative relationships. To further capture both global and local relationships, we propose a spatially multi-granular modeling scheme that applies PLACM to both global and local fields. Experimental results demonstrate the superiority of HAM, with visualized results showing that it can dynamically model fine-grained visual and linguistic representations. HAM outperforms existing methods by a significant margin, achieves state-of-the-art performance on two publicly available datasets, and won the championship in the ECCV 2022 ScanRefer challenge. Code is available at https://github.com/PPjmchen/HAM. Comment: Champion on ECCV 2022 ScanRefer Challenge

arXiv.org e-Print Archive

A promising Na3V2(PO4)3 cathode for use in the construction of high energy batteries

Author: Craig E. Banks
Hanjun Zhu
Qinqin Sun
Qiyuan Chen
Weixin Song
Xiaobo Ji
Yinpeng Yao
Publication venue: Royal Society of Chemistry (RSC)
Publication date: 01/01/2014

High-energy batteries need significant cathodes which can simultaneously provide large specific capacities and high discharge plateaus. NASICON-structured Na3V2(PO4)3 (NVP) has been utilised as a promising cathode to meet this requirement and be used in the construction of high energy batteries. For a hybrid-ion battery employing metallic lithium as an anode, NVP exhibits an initial specific capacity of 170 mA h g⁻¹ in the voltage range of 1.6–4.8 V with a long discharge plateau around 3.7 V. Three Na(2) sites for NVP are found capable of being utilised through the application of a wide voltage window, but only two of them are able to undergo ion exchange to produce a NaLi2V2(PO4)3 phase. However, a hybrid-ion migration mechanism is suggested to exist to describe the whole ion transport, in which the effects of a Na-ion “barrier” result in a lowered ion diffusion rate and observed specific capacity.

1. Introduction

Lithium-ion battery (LIB) technology is critically needed for many applications in a plethora of industries and is an important energy storage solution which can be potentially applied, for instance, in electric vehicles (EVs) [1,2]. However, LIB has continued to be primarily relegated by the electronics market mainly due to its cost and material issues [3], and the lack of high-performance cathode materials has become a technological bottleneck for the commercial development of advanced LIB [4]. Particularly for the entrance of LIB into high energy fields, such as EVs and renewable energy storage in smart grids, the demand for high-capacity and high-voltage cathodes is starting to become a key focus of research.

In the search for new positive-electrode materials for LIB, recent research has focused upon nano-structured lithium transitional-metal phosphates that exhibit desirable properties such as high energy storage capacities combined with electrochemical stability [5,6]. Olivine LiFePO4 [7], as one member of this class, has risen to prominence so far due to other characteristics involving low cost, low environmental impact and safety, which ar

Crossref

E-space: Manchester Metropolitan University's Research Repository

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Author: Arora Simran
Chen Weixin
Cheng Yu
Dutta Ritik
Hendrycks Dan
Kang Mintong
Koyejo Sanmi
Li Bo
Lin Zinan
Mazeika Mantas
Pei Hengzhi
Schaeffer Rylan
Song Dawn
Truong Sang T.
Wang Boxin
Xie Chulin
Xiong Zidi
Xu Chejian
Zhang Chenhui
Publication date: 20/06/2023

Generative Pre-trained Transformer (GPT) models have exhibited exciting progress in capabilities, capturing the interest of practitioners and the public alike. Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications in healthcare and finance, where mistakes can be costly. To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives, including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. Based on our evaluations, we discover previously unpublished vulnerabilities to trustworthiness threats. For instance, we find that GPT models can be easily misled to generate toxic and biased outputs and leak private information in both training data and conversation history. We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, potentially because GPT-4 follows the (misleading) instructions more precisely. Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps. Our benchmark is publicly available at https://decodingtrust.github.io/

arXiv.org e-Print Archive