
1,105 research outputs found

Automatic CNN channel selection and effective detection on face and rotated aerial objects

Author: Fang Zhenyu
Publication date: 01/01/2020

Balancing accuracy and computational cost is a challenging task in computer vision. This is especially true for convolutional neural networks (CNNs), which require far more processing power than traditional learning algorithms. This thesis aims to develop new CNN structures and loss functions to tackle the imbalance between accuracy and efficiency in image classification and object detection, two fundamental yet challenging tasks in computer vision. For a CNN-based object detector, the main computational cost comes from the feature extractor (backbone), which was originally applied to image classification.

Optimising the structure of a CNN for image classification therefore also brings benefits when that CNN is applied to object detection. Although the outputs of detectors may vary across detection tasks, the challenges and design principles are similar. This thesis therefore starts with face detection (i.e. a single object detection task), a significant branch of object detection that is widely used in real life. It then investigates object detection in aerial images, which is a more challenging detection task.

Specifically, the objectives of this thesis are: 1. Optimising the CNN structures for image classification; 2. Developing a face detector which enables a trade-off between computational cost and accuracy; and 3. Proposing an object detector for aerial images which suppresses the background noise without damaging the inference efficiency.

For the first objective, this thesis aims to automatically optimise the topology of CNNs to generate the structure of fixed-length models, in which unnecessary convolutional kernels are removed. Experimental results demonstrate that the optimised model achieves accuracy comparable to state-of-the-art models across a broad range of datasets, whilst significantly reducing the number of parameters.

To tackle the accuracy-efficiency imbalance in face detection, a novel context-enhanced approach is proposed which improves the face detector in terms of both loss function and structure. For loss function optimisation, a hierarchical loss, referred to as 'triple loss' in this thesis, is introduced to optimise the feature pyramid network (FPN) based face detector. For structural optimisation, this thesis proposes a context-sensitive structure to increase the capacity of the network prediction. Experimental results indicate that the proposed method achieves a good balance between the accuracy and computational cost of face detection.

To suppress the background noise in aerial image object detection, this thesis presents a two-stage detector named 'SAFDet'. More specifically, a rotation anchor-free branch (RAFB) is proposed to regress the precise rectangle boundary. As the RAFB is anchor free, its computational cost is negligible during training. Meanwhile, a centre prediction module (CPM) is introduced to enhance target localisation and the suppression of background noise. As the CPM is only deployed during training, it does not increase the computational cost of inference. Experimental results indicate that the proposed method achieves a good balance between accuracy and computational cost while effectively suppressing the background noise.

STAX (Strathclyde Repository)

Resource management on Cloud systems with Machine Learning

Author: Fang Zhenyu
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2010

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Activated Vibrational Modes and Fermi Resonance in Tip-Enhanced Raman Spectroscopy

Author: Fang Yurui
Sun Mengtao
Xu Hongxing
Zhang Zhenyu
Publication venue: American Physical Society (APS)
Publication date: 18/12/2011

Using p-aminothiophenol (PATP) molecules on a gold substrate as prototypical examples and high vacuum tip-enhanced Raman spectroscopy (HV-TERS), we show that the vibrational spectra of those molecules are distinctly different from those in typical surface-enhanced Raman spectroscopy. Detailed first-principles calculations help to assign the Raman peaks in the TERS measurements as Raman active and infrared (IR) active vibrational modes of dimercaptoazobenzene (DMAB), thus providing strong spectroscopic evidence for the conversion of PATP to DMAB by dimerization. The activation of the IR active modes is due to enhanced electromagnetic field gradient effects within the gap region of the highly asymmetric tip-surface geometry. Our TERS measurements also reveal the splitting of certain vibrational modes due to Fermi resonance between a fundamental mode and the overtone of a different mode or a combination mode. These findings help to broaden the versatility of TERS as a promising technique for ultrasensitive molecular spectroscopy.

arXiv.org e-Print Archive

Lund University Publications

Crossref

Preliminary study on ductile fracture of imperfect lattice materials

Author: Cui Xiaodong
Fang Daining
Pei Yongmao
Xue Zhenyu
Publication venue: Elsevier Ltd.
Publication date: 15/12/2011

The ductile fracture behavior of two-dimensional imperfect lattice materials under dynamic stretching is studied by the finite element method using the ABAQUS/Explicit code. The simulations are performed with three isotropic lattice materials: the regular hexagonal honeycomb, the Kagome lattice and the regular triangular lattice. All three lattices are made of an elastic/visco-plastic metal material. Two typical imperfections, a vacancy defect and a rigid inclusion, are introduced separately. The numerical results reveal novel deformation modes and crack growth patterns in the ductile fracture of lattice materials. Various crack growth patterns, defined according to their profiles as “X”-type, “Butterfly”-type and “Petal”-type, are observed for different combinations of imperfection type and lattice topology. Crack propagation can induce severe material softening and reduce the plastic dissipation of the lattices. Subsequently, the effects of the strain rate, relative density, microstructure topology and defect type on the crack growth pattern, the associated macroscopic material softening and the knock-down of total plastic dissipation are investigated.

Elsevier - Publisher Connector

TRED: a Transcriptional Regulatory Element Database and a platform for in silico gene regulation studies

Author: Liu Lihua
Xuan Zhenyu
Zhang Michael Q.
Zhao Fang
Publication venue: Oxford University Press
Publication date: 17/12/2004

In order to understand gene regulation, accurate and comprehensive knowledge of transcriptional regulatory elements is essential. Here, we report our efforts in building a mammalian Transcriptional Regulatory Element Database (TRED) with associated data analysis functions. It collects cis- and trans-regulatory elements and is dedicated to easy data access and analysis for both single-gene-based and genome-scale studies. Distinguishing features of TRED include: (i) relatively complete genome-wide promoter annotation for human, mouse and rat; (ii) availability of gene transcriptional regulation information including transcription factor binding sites and experimental evidence; (iii) data accuracy ensured by hand curation; (iv) an efficient user interface for easy and flexible data retrieval; and (v) implementation of on-the-fly sequence analysis tools. TRED can provide good training datasets for further genome-wide cis-regulatory element prediction and annotation, assist detailed functional studies and facilitate the deciphering of gene regulatory networks (http://rulai.cshl.edu/TRED).

CiteSeerX

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central