115 research outputs found

    Phase equilibrium simulation and its application in crystallization processes

    Solid-liquid phase equilibrium information is essential to the research and development of crystallization processes. Computer simulation of multicomponent solid-liquid equilibrium avoids the tedious traditional experimental determination. The phase equilibrium simulation requires an accurate thermodynamic model to describe the solution chemistry and a usable mathematical procedure to obtain reliable solutions. In this work, a modified activity coefficient model is presented. The modification makes the model more practical to use. A new numerical algorithm, based on a large-scale optimization technique, is used for the phase equilibrium calculation. This new method takes advantage of the thermodynamic properties of the solid-liquid equilibrium and unifies thermodynamics and mathematics, so that the numerical procedures have real physical meanings. The phase diagram of the industrially important system Na-K-Mg-Cl-NO₃-H₂O at various temperatures is calculated using the new method. The results compare well with the available experimental data.

    A Unified Scheme of ResNet and Softmax

    Large language models (LLMs) have brought significant changes to human society. Softmax regression and residual neural networks (ResNet) are two important techniques in deep learning: they not only serve as significant theoretical components supporting the functionality of LLMs but also are related to many other machine learning and theoretical computer science fields, including but not limited to image classification, object detection, semantic segmentation, and tensors. Previous research works studied these two concepts separately. In this paper, we provide a theoretical analysis of the regression problem $\| \langle \exp(Ax) + Ax, {\bf 1}_n \rangle^{-1} ( \exp(Ax) + Ax ) - b \|_2^2$, where $A$ is a matrix in $\mathbb{R}^{n \times d}$, $b$ is a vector in $\mathbb{R}^n$, and ${\bf 1}_n$ is the $n$-dimensional vector whose entries are all $1$. This regression problem is a unified scheme that combines softmax regression and ResNet, which has never been done before. We derive the gradient, Hessian, and Lipschitz properties of the loss function. The Hessian is shown to be positive semidefinite, and its structure is characterized as the sum of a low-rank matrix and a diagonal matrix. This enables an efficient approximate Newton method. As a result, this unified scheme helps to connect two previously thought unrelated fields and provides novel insight into loss landscape and optimization for emerging over-parameterized neural networks, which is meaningful for future research in deep learning models.
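    As an illustration only, the loss stated in this abstract can be evaluated directly from its definition. The sketch below is not the authors' code: the function name `unified_loss` and the plain-list representation of the matrix and vectors are assumptions made for the example, and it only evaluates the objective (it does not implement the approximate Newton method).

```python
import math


def unified_loss(A, x, b):
    """Evaluate || <u, 1_n>^{-1} u - b ||_2^2 with u = exp(Ax) + Ax.

    A is an n x d matrix (list of rows), x has length d, b has length n.
    The name and signature are hypothetical, chosen for this sketch.
    """
    # Matrix-vector product Ax, one entry per row of A.
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    # u = exp(Ax) + Ax, applied entrywise (softmax term plus residual term).
    u = [math.exp(v) + v for v in Ax]
    # Normaliser <u, 1_n>, i.e. the sum of the entries of u.
    s = sum(u)
    if s == 0:
        raise ValueError("normaliser <u, 1_n> is zero; the loss is undefined")
    # Squared Euclidean distance between the normalised vector and b.
    return sum((u_i / s - b_i) ** 2 for u_i, b_i in zip(u, b))


if __name__ == "__main__":
    # With A = 0, u = (1, 1), so the normalised vector is (0.5, 0.5).
    print(unified_loss([[0.0], [0.0]], [3.0], [1.0, 0.0]))
```

    With a zero matrix the normalised vector is uniform, which gives a quick sanity check on the implementation.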

    TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

    Diffusion models have achieved great success in a range of tasks, such as image synthesis and molecule design. As such successes hinge on large-scale training data collected from diverse sources, the trustworthiness of these collected data is hard to control or audit. In this work, we aim to explore the vulnerabilities of diffusion models under potential training data manipulations and try to answer: How hard is it to perform Trojan attacks on well-trained diffusion models? What are the adversarial targets that such Trojan attacks can achieve? To answer these questions, we propose an effective Trojan attack against diffusion models, TrojDiff, which optimizes the Trojan diffusion and generative processes during training. In particular, we design novel transitions during the Trojan diffusion process to diffuse adversarial targets into a biased Gaussian distribution and propose a new parameterization of the Trojan generative process that leads to an effective training objective for the attack. In addition, we consider three types of adversarial targets: the Trojaned diffusion models will always output instances belonging to a certain class from the in-domain distribution (In-D2D attack), out-of-domain distribution (Out-D2D attack), and one specific instance (D2I attack). We evaluate TrojDiff on CIFAR-10 and CelebA datasets against both DDPM and DDIM diffusion models. We show that TrojDiff always achieves high attack performance under different adversarial targets using different types of triggers, while the performance in benign environments is preserved. The code is available at https://github.com/chenweixin107/TrojDiff. Comment: CVPR202

    DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration

    We present DeepICP, a novel end-to-end learning-based 3D point cloud registration framework that achieves comparable registration accuracy to prior state-of-the-art geometric methods. Different from other keypoint-based methods where a RANSAC procedure is usually needed, we implement the use of various deep neural network structures to establish an end-to-end trainable network. Our keypoint detector is trained through this end-to-end structure and enables the system to avoid the inference of dynamic objects, leverages the help of sufficiently salient features on stationary objects, and as a result, achieves high robustness. Rather than searching the corresponding points among existing points, the key contribution is that we innovatively generate them based on learned matching probabilities among a group of candidates, which can boost the registration accuracy. Our loss function incorporates both the local similarity and the global geometric constraints to ensure all of the above network designs can converge towards the right direction. We comprehensively validate the effectiveness of our approach using both the KITTI dataset and the Apollo-SouthBay dataset. Results demonstrate that our method achieves comparable or better performance than the state-of-the-art geometry-based methods. Detailed ablation and visualization analysis are included to further illustrate the behavior and insights of our network. The low registration error and high robustness of our method make it attractive for substantial applications relying on the point cloud registration task. Comment: 10 pages, 6 figures, 3 tables, typos corrected, experimental results updated, accepted by ICCV 201

    A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time

    Large language models (LLMs) have played a pivotal role in revolutionizing various facets of our daily existence. Solving attention regression is a fundamental task in optimizing LLMs. In this work, we focus on giving a provable guarantee for the one-layer attention network objective function $L(X,Y) = \sum_{j_0 = 1}^n \sum_{i_0 = 1}^d ( \langle \langle \exp( \mathsf{A}_{j_0} x ) , {\bf 1}_n \rangle^{-1} \exp( \mathsf{A}_{j_0} x ), A_{3} Y_{*,i_0} \rangle - b_{j_0,i_0} )^2$. Here $\mathsf{A} \in \mathbb{R}^{n^2 \times d^2}$ is the Kronecker product of $A_1 \in \mathbb{R}^{n \times d}$ and $A_2 \in \mathbb{R}^{n \times d}$, $A_3$ is a matrix in $\mathbb{R}^{n \times d}$, and $\mathsf{A}_{j_0} \in \mathbb{R}^{n \times d^2}$ is the $j_0$-th block of $\mathsf{A}$. The matrices $X, Y \in \mathbb{R}^{d \times d}$ are the variables we want to learn. $B \in \mathbb{R}^{n \times d}$, $b_{j_0,i_0} \in \mathbb{R}$ is the entry in the $j_0$-th row and $i_0$-th column of $B$, $Y_{*,i_0} \in \mathbb{R}^d$ is the $i_0$-th column vector of $Y$, and $x \in \mathbb{R}^{d^2}$ is the vectorization of $X$. In a multi-layer LLM network, the matrix $B \in \mathbb{R}^{n \times d}$ can be viewed as the output of a layer, and $A_1 = A_2 = A_3 \in \mathbb{R}^{n \times d}$ can be viewed as the input of a layer. The matrix version of $x$ can be viewed as $QK^\top$ and $Y$ can be viewed as $V$. We provide an iterative greedy algorithm to train the loss function $L(X,Y)$ up to $\epsilon$ that runs in $\widetilde{O}( ({\cal T}_{\mathrm{mat}}(n,n,d) + {\cal T}_{\mathrm{mat}}(n,d,d) + d^{2\omega}) \log(1/\epsilon) )$ time. Here ${\cal T}_{\mathrm{mat}}(a,b,c)$ denotes the time of multiplying an $a \times b$ matrix by a $b \times c$ matrix, and $\omega \approx 2.37$ denotes the exponent of matrix multiplication.
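    The objective in this abstract can be evaluated without forming the Kronecker product explicitly. The sketch below is illustrative and not the authors' algorithm. It assumes a row-major vectorization of X, under which the j0-th block of the Kronecker product applied to x equals the j0-th row of A1 X A2^T, so the loss is the squared Frobenius distance between the row-wise softmax of A1 X A2^T times A3 Y and B. The helper names `matmul`, `transpose` and `attention_loss` are hypothetical, and the code only evaluates the loss rather than implementing the iterative greedy training procedure.

```python
import math


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def transpose(P):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*P)]


def attention_loss(A1, A2, A3, X, Y, B):
    """Evaluate L(X, Y) for n x d inputs A1, A2, A3, B and d x d variables X, Y.

    Assumes row-major vectorization of X (an assumption of this sketch),
    so block j0 of (A1 kron A2) vec(X) is row j0 of A1 X A2^T.
    """
    scores = matmul(matmul(A1, X), transpose(A2))  # n x n attention scores
    values = matmul(A3, Y)                         # n x d, plays the role of V
    loss = 0.0
    for j0, row in enumerate(scores):
        # Row-wise softmax; subtracting the max leaves the result unchanged
        # and avoids overflow in exp.
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        probs = [v / s for v in e]
        for i0 in range(len(values[0])):
            # <softmax row j0, column i0 of A3 Y> compared with b_{j0, i0}.
            pred = sum(p * values[k][i0] for k, p in enumerate(probs))
            loss += (pred - B[j0][i0]) ** 2
    return loss


if __name__ == "__main__":
    A0 = [[0.0], [0.0]]
    # Zero scores give uniform attention, so every prediction is the mean of A3 Y.
    print(attention_loss(A0, A0, [[1.0], [3.0]], [[5.0]], [[1.0]], [[1.0], [2.0]]))
```

    Working with A1 X A2^T keeps the intermediate matrices at size n x n and n x d instead of the n^2 x d^2 Kronecker product.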

    Visual Persuasion: Inferring Communicative Intents of Images

    In this paper we introduce the novel problem of understanding visual persuasion. Modern mass media make extensive use of images to persuade people to make commercial and political decisions. These effects and techniques are widely studied in the social sciences, but behavioral studies do not scale to massive datasets. Computer vision has made great strides in building syntactical representations of images, such as detection and identification of objects. However, the pervasive use of images for communicative purposes has been largely ignored. We extend the significant advances in syntactic analysis in computer vision to the higher-level challenge of understanding the underlying communicative intent implied in images. We begin by identifying nine dimensions of persuasive intent latent in images of politicians, such as “socially dominant,” “energetic,” and “trustworthy,” and propose a hierarchical model that builds on the layer of syntactical attributes, such as “smile” and “waving hand,” to predict the intents presented in the images. To facilitate progress, we introduce a new dataset of 1,124 images of politicians labeled with ground-truth intents in the form of rankings. This study demonstrates that a systematic focus on visual persuasion opens up the field of computer vision to a new class of investigations around mediated images, intersecting with media analysis, psychology, and political communication.

    Learning Point-Language Hierarchical Alignment for 3D Visual Grounding

    This paper presents a novel hierarchical alignment model (HAM) that learns multi-granularity visual and linguistic representations in an end-to-end manner. We extract key points and proposal points to model 3D contexts and instances, and propose a point-language alignment with context modulation (PLACM) mechanism, which learns to gradually align word-level and sentence-level linguistic embeddings with visual representations, while the modulation with the visual context captures latent informative relationships. To further capture both global and local relationships, we propose a spatially multi-granular modeling scheme that applies PLACM to both global and local fields. Experimental results demonstrate the superiority of HAM, with visualized results showing that it can dynamically model fine-grained visual and linguistic representations. HAM outperforms existing methods by a significant margin, achieves state-of-the-art performance on two publicly available datasets, and won the championship in the ECCV 2022 ScanRefer challenge. Code is available at https://github.com/PPjmchen/HAM. Comment: Champion on ECCV 2022 ScanRefer Challeng

    A promising Na3V2(PO4)3 cathode for use in the construction of high energy batteries

    High-energy batteries need significant cathodes which can simultaneously provide large specific capacities and high discharge plateaus. NASICON-structured Na3V2(PO4)3 (NVP) has been utilised as a promising cathode to meet this requirement and be used in the construction of high energy batteries. For a hybrid-ion battery employing metallic lithium as an anode, NVP exhibits an initial specific capacity of 170 mA h g⁻¹ in the voltage range of 1.6–4.8 V with a long discharge plateau around 3.7 V. Three Na(2) sites in NVP are found to be usable through the application of a wide voltage window, but only two of them are able to undergo ion exchange to produce a NaLi2V2(PO4)3 phase. However, a hybrid-ion migration mechanism is suggested to describe the whole ion transport, in which the effect of a Na-ion “barrier” results in a lowered ion diffusion rate and observed specific capacity. 1. Introduction Lithium-ion battery (LIB) technology is critically needed for many applications in a plethora of industries and is an important energy storage solution which can potentially be applied, for instance, in electric vehicles (EVs).1,2 However, LIB has continued to be primarily relegated by the electronics market mainly due to its cost and material issues,3 and the lack of high-performance cathode materials has become a technological bottleneck for the commercial development of advanced LIB.4 Particularly for the entrance of LIB into high energy fields, such as EVs and renewable energy storage in smart grids, the demand for high-capacity and high-voltage cathodes is starting to become a key focus of research.
In the search for new positive-electrode materials for LIB, recent research has focused upon nano-structured lithium transition-metal phosphates that exhibit desirable properties such as high energy storage capacities combined with electrochemical stability.5,6 Olivine LiFePO4,7 as one member of this class, has risen to prominence so far due to other characteristics involving low cost, low environmental impact and safety, which ar

    DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

    Generative Pre-trained Transformer (GPT) models have exhibited exciting progress in capabilities, capturing the interest of practitioners and the public alike. Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications to healthcare and finance - where mistakes can be costly. To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives - including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. Based on our evaluations, we discover previously unpublished vulnerabilities to trustworthiness threats. For instance, we find that GPT models can be easily misled to generate toxic and biased outputs and leak private information in both training data and conversation history. We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, potentially due to the reason that GPT-4 follows the (misleading) instructions more precisely. Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps. Our benchmark is publicly available at https://decodingtrust.github.io/