
181 research outputs found

PATROL: Privacy-Oriented Pruning for Collaborative Inference Against Model Inversion Attacks

Author: Ding Shiwei
Pan Miao
Yuan Xiaoyong
Zhang Lan
Publication date: 20/07/2023

Collaborative inference has been a promising solution to enable resource-constrained edge devices to perform inference using state-of-the-art deep neural networks (DNNs). In collaborative inference, the edge device first feeds the input to a partial DNN locally and then uploads the intermediate result to the cloud to complete the inference. However, recent research indicates model inversion attacks (MIAs) can reconstruct input data from intermediate results, posing serious privacy concerns for collaborative inference. Existing perturbation and cryptography techniques are inefficient and unreliable in defending against MIAs while performing accurate inference. This paper provides a viable solution, named PATROL, which develops privacy-oriented pruning to balance privacy, efficiency, and utility of collaborative inference. PATROL takes advantage of the fact that later layers in a DNN can extract more task-specific features. Given limited local resources for collaborative inference, PATROL intends to deploy more layers at the edge based on pruning techniques to enforce task-specific features for inference and reduce task-irrelevant but sensitive features for privacy preservation. To achieve privacy-oriented pruning, PATROL introduces two key components: Lipschitz regularization and adversarial reconstruction training, which increase the reconstruction errors by reducing the stability of MIAs and enhance the target inference model by adversarial training, respectively.

arXiv.org e-Print Archive

Optimized Path Planning for USVs under Ocean Currents

Author: Akbari Behzad
Liu Shiwei
Pan Ya-Jun
Wang Tianye
Publication date: 06/07/2023

The proposed work focuses on path planning for Unmanned Surface Vehicles (USVs) in the ocean environment, taking into account various spatiotemporal factors such as ocean currents and other energy consumption factors. The paper proposes the use of Gaussian Process Motion Planning (GPMP2), a Bayesian optimization method that has shown promising results in continuous and nonlinear path planning algorithms. The proposed work improves GPMP2 by incorporating a new spatiotemporal factor for tracking and predicting ocean currents using spatiotemporal Bayesian inference. The algorithm is applied to USV path planning and is shown to optimize for smoothness, obstacle avoidance, and ocean currents in a challenging environment. The work is relevant for practical applications in ocean scenarios where optimal path planning for USVs is essential for minimizing costs and optimizing performance.

Comment: 9 pages and 7 figures, submitted to IEEE Transactions on Systems, Man, and Cybernetics

arXiv.org e-Print Archive

Effects of Surface Modification of Nanotube Arrays on the Performance of CdS Quantum-Dot-Sensitized Solar Cells

Author: Danhong Li
Jianjun Liao
Nengqian Pan
Shiwei Lin
Xiankun Cao
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2013

CdS-sensitized TiO2 nanotube arrays have been fabricated using the method of successive ionic layer adsorption and reaction and used as a photoanode for quantum-dot-sensitized solar cells. Before being coated with CdS, the surface of the TiO2 nanotube arrays was treated separately with TiCl4, nitric acid (HNO3), potassium hydroxide (KOH), and methyltrimethoxysilane (MTMS) to reduce the interface transfer resistance of the quantum-dot-sensitized solar cells. The surfaces of the modified samples exhibited superhydrophilic or hydrophobic characteristics, which directly affect the power conversion efficiency of the solar cells. The results showed that surface modification reduced the surface tension, which played a significant role in the connectivity of CdS and the TiO2 nanotube arrays. In addition, the solar cells based on the CdS/TiO2 electrode treated with HNO3 achieved a maximum power conversion efficiency of 0.17%, which was 42% higher than that of the reference sample without any modification.

Crossref

Directory of Open Access Journals

Towards Real-World Visual Tracking with Temporal Contexts

Author: Cao Ziang
Fu Changhong
Huang Ziyuan
Liu Ziwei
Pan Liang
Zhang Shiwei
Publication date: 20/08/2023

Visual tracking has made significant improvements in the past few decades. Most existing state-of-the-art trackers 1) merely aim for performance in ideal conditions while overlooking real-world conditions; 2) adopt the tracking-by-detection paradigm, neglecting rich temporal contexts; 3) only integrate the temporal information into the template, where temporal contexts among consecutive frames are far from being fully utilized. To handle those problems, we propose a two-level framework (TCTrack) that can exploit temporal contexts efficiently. Based on it, we propose a stronger version for real-world visual tracking, i.e., TCTrack++. It boils down to two levels: features and similarity maps. Specifically, for feature extraction, we propose an attention-based temporally adaptive convolution to enhance the spatial features using temporal information, which is achieved by dynamically calibrating the convolution weights. For similarity map refinement, we introduce an adaptive temporal transformer to encode the temporal knowledge efficiently and decode it for the accurate refinement of the similarity map. To further improve the performance, we additionally introduce a curriculum learning strategy. Also, we adopt online evaluation to measure performance in real-world conditions. Exhaustive experiments on 8 well-known benchmarks demonstrate the superiority of TCTrack++. Real-world tests directly verify that TCTrack++ can be readily used in real-world applications.

Comment: Accepted by IEEE TPAMI, Code: https://github.com/vision4robotics/TCTrac

arXiv.org e-Print Archive

Synthesis and Characterization of Hierarchical Structured TiO2

Author: Jianjun Liao
Kai Liu
Min Zeng
Nengqian Pan
Shiwei Lin
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Hierarchical structured TiO2 nanotubes were prepared by mechanical ball milling of highly ordered TiO2 nanotube arrays grown by electrochemical anodization of titanium foil. Scanning electron microscopy, transmission electron microscopy, X-ray diffraction, specific surface area analysis, UV-visible absorption spectroscopy, photocurrent measurement, photoluminescence spectra, electrochemical impedance spectra, and a photocatalytic degradation test were applied to characterize the nanocomposites. Surface area increased as the milling time was extended. After 5 h of ball milling, the TiO2 hierarchical nanotubes had a corn-like shape and exhibited enhanced photoelectrochemical activity in comparison to commercial P25. The superior photocatalytic activity is suggested to be due to the combined advantages of the high surface area of the nanoparticles and the rapid electron transfer and collection of the nanotubes in the hierarchical structure. The hierarchical structured TiO2 nanotubes could be applied in flexible solar cells, sensors, and other photoelectrochemical devices.

Crossref

Directory of Open Access Journals

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

Author: Albanie Samuel
Feng Tao
Jiang Jianwen
Ni Dong
Pan Yining
Wang Xiang
Yuan Hangjie
Zhang Shiwei
Zhang Yingya
Zhao Deli
Publication date: 18/08/2023

Relational Language-Image Pre-training (RLIP) aims to align vision representations with relational texts, thereby advancing the capability of relational reasoning in computer vision tasks. However, hindered by the slow convergence of the RLIPv1 architecture and the limited availability of existing scene graph data, scaling RLIPv1 is challenging. In this paper, we propose RLIPv2, a fast-converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data. To enable fast scaling, RLIPv2 introduces Asymmetric Language-Image Fusion (ALIF), a mechanism that facilitates earlier and deeper gated cross-modal fusion with sparsified language encoding layers. ALIF leads to comparable or better performance than RLIPv1 in a fraction of the time for pre-training and fine-tuning. To obtain scene graph data at scale, we extend object detection datasets with free-form relation labels by introducing a captioner (e.g., BLIP) and a designed Relation Tagger. The Relation Tagger assigns BLIP-generated relation texts to region pairs, thus enabling larger-scale relational pre-training. Through extensive experiments conducted on Human-Object Interaction Detection and Scene Graph Generation, RLIPv2 shows state-of-the-art performance on three benchmarks under fully fine-tuned, few-shot, and zero-shot settings. Notably, the largest RLIPv2 achieves 23.29 mAP on HICO-DET without any fine-tuning, yields 32.22 mAP with just 1% data, and yields 45.09 mAP with 100% data. Code and models are publicly available at https://github.com/JacobYuan7/RLIPv2.

Comment: Accepted to ICCV 2023. Code and models: https://github.com/JacobYuan7/RLIPv

arXiv.org e-Print Archive

InstructVideo: Instructing Video Diffusion Models with Human Feedback

Author: Albanie Samuel
Feng Tao
Liu Ziwei
Ni Dong
Pan Yining
Wang Xiang
Wei Yujie
Yuan Hangjie
Zhang Shiwei
Zhang Yingya
Publication date: 19/12/2023

Diffusion models have emerged as the de facto paradigm for video generation. However, their reliance on web-scale data of varied quality often yields results that are visually unappealing and misaligned with the textual prompts. To tackle this problem, we propose InstructVideo to instruct text-to-video diffusion models with human feedback by reward fine-tuning. InstructVideo has two key ingredients: 1) To ameliorate the cost of reward fine-tuning induced by generating through the full DDIM sampling chain, we recast reward fine-tuning as editing. By leveraging the diffusion process to corrupt a sampled video, InstructVideo requires only partial inference of the DDIM sampling chain, reducing fine-tuning cost while improving fine-tuning efficiency. 2) To mitigate the absence of a dedicated video reward model for human preferences, we repurpose established image reward models, e.g., HPSv2. To this end, we propose Segmental Video Reward, a mechanism to provide reward signals based on segmental sparse sampling, and Temporally Attenuated Reward, a method that mitigates temporal modeling degradation during fine-tuning. Extensive experiments, both qualitative and quantitative, validate the practicality and efficacy of using image reward models in InstructVideo, significantly enhancing the visual quality of generated videos without compromising generalization capabilities. Code and models will be made publicly available.

Comment: Project page: https://instructvideo.github.io

arXiv.org e-Print Archive

ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters

Author: Chow Derek
Fan Zichen
Kielian Gregory
Lei Kauna
Liu Shiwei
Pan Bangfei
Saligane Mehdi
Sylvester Dennis
Tao Guanchen
Zou Yifei
Publication date: 20/02/2024

The self-attention mechanism sets transformer-based large language models (LLMs) apart from convolutional and recurrent neural networks. Despite the performance improvement, achieving real-time LLM inference on silicon is challenging due to the extensively used Softmax in self-attention. Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which becomes the bottleneck, especially when dealing with a longer context. To address this challenge, we propose Constant Softmax (ConSmax), a software-hardware co-design, as an efficient Softmax alternative. ConSmax employs differentiable normalization parameters to remove the maximum searching and denominator summation in Softmax. It allows for massive parallelization while performing the critical tasks of Softmax. In addition, scalable ConSmax hardware utilizing a bitwidth-split look-up table (LUT) can produce lossless non-linear operation and support mixed-precision computing. It further facilitates efficient LLM inference. Experimental results show that ConSmax achieves a minuscule power consumption of 0.43 mW and an area of 0.001 mm2 at 1-GHz working frequency and 22-nm CMOS technology. Compared to state-of-the-art Softmax hardware, ConSmax results in 14.5x energy and 14.0x area savings with comparable accuracy on a GPT-2 model and the WikiText103 dataset.

arXiv.org e-Print Archive
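As a rough illustration of the idea in the ConSmax abstract above, the following Python sketch contrasts standard Softmax with a normalization that uses two constants in place of the per-row maximum search and denominator sum. The function names, the parameter names `beta` and `gamma`, and the formula `exp(s - beta) / gamma` are assumptions made for illustration; the abstract does not state the formula, and it describes the parameters as learnable rather than set by hand.

```python
import math


def softmax(scores):
    """Standard Softmax: needs the row maximum and the row sum."""
    row_max = max(scores)  # maximum search over the whole row
    exps = [math.exp(s - row_max) for s in scores]
    total = sum(exps)  # denominator summation over the whole row
    return [e / total for e in exps]


def constant_normalized(scores, beta, gamma):
    """Hypothetical constant-parameter variant (an assumption, not the paper's code).

    beta stands in for the row maximum and gamma for the denominator, so each
    element is computed without looking at any other element of the row.
    """
    return [math.exp(s - beta) / gamma for s in scores]


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0]
    # Choosing the constants to match this particular row reproduces Softmax.
    beta = max(scores)
    gamma = sum(math.exp(s - beta) for s in scores)
    print(softmax(scores))
    print(constant_normalized(scores, beta, gamma))
```

With `beta` equal to the row maximum and `gamma` equal to the row's exponential sum, the constant form matches Softmax; with other constants the outputs need not sum to 1, which is the trade-off such a scheme accepts in exchange for element-wise independence.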

Multi-State Memristors and Their Applications: An Overview

Author: Jiang Xiongfei
Malik Adil
Pan Yihan
Papavassiliou Christos
Prodromakis Themis
Serb Alex
Si Zhaoguang
Stathopoulos Spyros
Wang Chaohan
Wang Shiwei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/11/2022

Edinburgh Research Explorer

YOLO SSPD: a small target cotton boll detection model during the boll-spitting period based on space-to-depth convolution

Author: Fei Tan
Li Guo
Mengli Zhang
Pan Gao
Peng Xing
Shiwei Ruan
Wei Chen
Yongquan Li
Yuan Zhang
Publication venue: Frontiers Media S.A.
Publication date: 01/06/2024

Introduction: Cotton yield estimation is crucial in the agricultural process, where the accuracy of boll detection during the flocculation period significantly influences yield estimations in cotton fields. Unmanned Aerial Vehicles (UAVs) are frequently employed for plant detection and counting due to their cost-effectiveness and adaptability. Methods: Addressing the challenges of small target cotton bolls and low resolution of UAVs, this paper introduces a method based on the YOLO v8 framework for transfer learning, named YOLO small-scale pyramid depth-aware detection (SSPD). The method combines space-to-depth and non-strided convolution (SPD-Conv) and a small target detector head, and also integrates a simple, parameter-free attentional mechanism (SimAM) that significantly improves target boll detection accuracy. Results: The YOLO SSPD achieved a boll detection accuracy of 0.874 on UAV-scale imagery. It also recorded a coefficient of determination (R2) of 0.86, with a root mean square error (RMSE) of 12.38 and a relative root mean square error (RRMSE) of 11.19% for boll counts. Discussion: The findings indicate that YOLO SSPD can significantly improve the accuracy of cotton boll detection on UAV imagery, thereby supporting the cotton production process. This method offers a robust solution for high-precision cotton monitoring, enhancing the reliability of cotton yield estimates.

Directory of Open Access Journals
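The YOLO SSPD abstract above reports RMSE and RRMSE for boll counts. A minimal Python sketch of how these two metrics are commonly computed follows, assuming RRMSE is defined as RMSE divided by the mean observed count and expressed as a percentage; the abstract does not spell out its definition. The function names and the example counts are made up for illustration and are not data from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted counts."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def rrmse_percent(observed, predicted):
    """RMSE relative to the mean observed count, in percent (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


if __name__ == "__main__":
    # Hypothetical boll counts per plot, for illustration only.
    observed = [100, 120, 110]
    predicted = [90, 130, 110]
    print(f"RMSE:  {rmse(observed, predicted):.2f}")
    print(f"RRMSE: {rrmse_percent(observed, predicted):.2f}%")
```

Because RRMSE scales the error by the typical count, it allows comparison across fields with different boll densities, whereas RMSE stays in the units of the counts themselves.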