405 research outputs found
The pre-stack migration imaging technique for damages identification in concrete structures
Abstract: The pre-stack migration imaging (PMI) method, used in geophysical exploration for single-sided detection and visual display, can be applied to identify the location, orientation, and severity of damage in concrete structures. In particular, this letter focuses on an experimental study using a finite number of sensors, with a view to further practical applications. A concrete structure with a surface-mounted linear array of PZT transducers is illustrated. Three types of damage, namely horizontal, dipping, and V-shaped cracks, have been studied. A pre-stack reverse time migration technique is used to back-propagate the scattered waves and to image damage in the concrete structure. The migration results from the scattered waves of an artificial damage are presented. It is shown that the existence of the damage in the concrete structure is correctly revealed through the migration process.
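The imaging principle behind migration can be illustrated with a toy delay-and-sum (Kirchhoff-style) sketch rather than the paper's full reverse-time wavefield extrapolation: each recorded trace is sampled at the pulse-echo travel time to every candidate image point, so scattered energy stacks coherently only at true damage locations. All names and parameter choices below are illustrative assumptions, not the paper's setup.

```python
import math

def migrate(traces, sensor_x, dt, c, grid):
    """Delay-and-sum migration image: for each image point, sample every
    recorded pulse-echo trace at the two-way travel time sensor -> point ->
    sensor, so scattered energy stacks coherently at true scatterer locations."""
    image = []
    for gx, gz in grid:
        stack = 0.0
        for trace, sx in zip(traces, sensor_x):
            t = 2.0 * math.hypot(gx - sx, gz) / c  # two-way travel time
            i = int(round(t / dt))
            if 0 <= i < len(trace):
                stack += trace[i]
        image.append(stack)
    return image
```

With a few sensors and a synthetic point scatterer, the stacked amplitude peaks at the scatterer position and stays near zero elsewhere, which is the behavior the migration results in the letter rely on.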
Focusing Modeling of OPFC Linear Array Transducer by Using Distributed Point Source Method
Improving ultrasonic phased array detection technology is a major concern of the engineering community. Orthotropic piezoelectric fiber composite (OPFC) can be constructed into multielement linear arrays that are conveniently applied as actuators and sensors. Thanks to their orthotropic performance, these phased array transducers generate strongly directional actuation power and high sensitivity. A focused beam from the linear phased array transducer is obtained simply by applying a parabolic time delay across the elements. In this work, the distributed point source method (DPSM), a recently developed mesh-free numerical technique for solving a variety of engineering problems, is used to model the ultrasonic field. This work presents the basic theory of the method and applies it to the new OPFC phased array transducer. The interaction of two OPFC linear phased array transducers in the same medium is also modeled, showing that the pressure beam produced by the new transducer is narrower, or more collimated, than that produced by a conventional transducer at different steering angles. DPSM can therefore be used to analyze and optimally design OPFC linear phased array transducers.
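The parabolic focal law mentioned above can be sketched in a few lines: each element is delayed so that its wavefront reaches the on-axis focal point at the same instant as the wavefronts from the outermost elements. The exact delay follows from the path lengths; for elements close to the axis (x much smaller than the focal depth F) it reduces to the parabolic form x²/(2cF). The element layout and wave speed below are illustrative assumptions, not values from the paper.

```python
import math

def focusing_delays(element_x, c, focal_depth):
    """Focal-law delays for a linear array focusing on axis at depth F:
    the outermost elements (longest path to the focus) fire first and the
    center element fires last, so all wavefronts arrive at the focal point
    simultaneously. For x << F the center-vs-edge delay reduces to the
    parabolic law x**2 / (2 * c * F)."""
    path = [math.hypot(x, focal_depth) for x in element_x]
    t_max = max(path) / c
    return [t_max - p / c for p in path]
```

For a symmetric array the edge elements get zero delay, the center element the largest, and the exact center delay agrees with the parabolic approximation to within about one percent at typical element pitches.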
Attention-enhanced connectionist temporal classification for discrete speech emotion recognition
Discrete speech emotion recognition (SER), the assignment of a single emotion label to an entire speech utterance, is typically performed as a sequence-to-label task. This approach, however, is limited in that it can result in models that do not capture temporal changes in the speech signal, including those indicative of a particular emotion. One potential solution to overcome this limitation is to model SER as a sequence-to-sequence task instead. In this regard, we have developed an attention-based bidirectional long short-term memory (BLSTM) neural network in combination with a connectionist temporal classification (CTC) objective function (Attention-BLSTM-CTC) for SER. We also assessed the benefits of incorporating two contemporary attention mechanisms, namely component attention and quantum attention, into the CTC framework. To the best of the authors' knowledge, this is the first time that such a hybrid architecture has been employed for SER. We demonstrated the effectiveness of our approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) and FAU-Aibo Emotion corpora. The experimental results demonstrate that our proposed model outperforms current state-of-the-art approaches. The work presented in this paper was substantially supported by the National Natural Science Foundation of China (Grant No. 61702370), the Key Program of the Natural Science Foundation of Tianjin (Grant No. 18JCZDJC36300), the Open Projects Program of the National Laboratory of Pattern Recognition, and the Senior Visiting Scholar Program of Tianjin Normal University.
Interspeech 2019
ISSN: 1990-977
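The core idea shared by attention mechanisms such as those above is weighting frame-level features by their learned relevance before pooling. The following is a minimal generic softmax attention pooling sketch, not the paper's exact component or quantum attention; the frame features and scores are assumed inputs produced upstream (e.g., by a BLSTM).

```python
import math

def attention_pool(frames, scores):
    """Softmax the per-frame relevance scores into attention weights, then
    pool the frame features as their weighted sum so salient frames dominate
    the utterance-level representation."""
    m = max(scores)                          # subtract max for numerical stability
    exp = [math.exp(s - m) for s in scores]
    z = sum(exp)
    weights = [e / z for e in exp]
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
```

With uniform scores this degenerates to mean pooling; a strongly scored frame pulls the pooled vector toward its own features.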
YOLO-FaceV2: A Scale and Occlusion Aware Face Detector
In recent years, face detection algorithms based on deep learning have made
great progress. These algorithms can generally be divided into two categories,
i.e., two-stage detectors like Faster R-CNN and one-stage detectors like YOLO.
Because of their better balance between accuracy and speed, one-stage detectors
have been widely used in many applications. In this paper, we propose a
real-time face detector based on the one-stage detector YOLOv5, named
YOLO-FaceV2. We design a Receptive Field Enhancement module, called RFE, to
enlarge the receptive field for small faces, and use an NWD loss to compensate
for the sensitivity of IoU to the location deviation of tiny objects. For face
occlusion, we present an attention module named SEAM and introduce a Repulsion
Loss to address it. Moreover, we use a weighting function, Slide, to address the
imbalance between easy and hard samples, and use the information of the
effective receptive field to design the anchors. Experimental results on the
WiderFace dataset show that our face detector outperforms YOLO and its variants
in all of the easy, medium, and hard subsets. Source code is available at
https://github.com/Krasjet-Yu/YOLO-FaceV
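The NWD loss mentioned above builds on the Normalized Wasserstein Distance between bounding boxes, which stays smooth for tiny objects where IoU collapses to zero under small location deviations. A minimal sketch, assuming boxes in (cx, cy, w, h) form and treating the normalizing constant C as a dataset-dependent hyperparameter (the default value below is an assumption, not taken from the paper):

```python
import math

def nwd(box_a, box_b, c_norm=12.8):
    """Normalized Wasserstein Distance between boxes (cx, cy, w, h): model
    each box as a 2-D Gaussian; the squared 2-Wasserstein distance between
    the two Gaussians has the closed form below. Unlike IoU, the result
    decays smoothly with offset even when the boxes no longer overlap."""
    (ax, ay, aw, ah), (bx, by, bw, bh) = box_a, box_b
    w2 = (ax - bx) ** 2 + (ay - by) ** 2 + ((aw - bw) / 2) ** 2 + ((ah - bh) / 2) ** 2
    return math.exp(-math.sqrt(w2) / c_norm)
```

A one-pixel shift of a 4x4 box leaves IoU near zero but only slightly lowers NWD, which is the property a loss term of the form 1 - NWD exploits for tiny faces.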
Exploring Spatio-Temporal Representations by Integrating Attention-based Bidirectional-LSTM-RNNs and FCNs for Speech Emotion Recognition
Automatic emotion recognition from speech, an important and challenging task in the field of affective computing, heavily relies on the effectiveness of the speech features used for classification. Previous approaches to emotion recognition have mostly focused on the extraction of carefully hand-crafted features, and how to model spatio-temporal dynamics for speech emotion recognition effectively is still under active investigation. In this paper, we propose a method to tackle the problem of emotion-relevant feature extraction from speech by combining attention-based bidirectional long short-term memory recurrent neural networks with fully convolutional networks in order to automatically learn the best spatio-temporal representations of speech signals. The learned high-level features are then fed into a deep neural network (DNN) to predict the final emotion. Experimental results on the Chinese Natural Audio-Visual Emotion Database (CHEAVD) and the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpora show that our method provides more accurate predictions than other existing emotion recognition algorithms.
M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining
Vision-language foundation models like CLIP have revolutionized the field of
artificial intelligence. Nevertheless, VLMs supporting multiple languages,
e.g., both Chinese and English, have lagged behind due to the relative scarcity
of large-scale pretraining datasets. Toward this end, we introduce BM-6B, a
comprehensive bilingual (Chinese-English) dataset with over 6 billion
image-text pairs, aimed at enhancing multimodal foundation models so that they
understand images well in both languages. To handle a dataset of this scale, we
propose a novel grouped aggregation approach for image-text contrastive loss
computation, which significantly reduces communication overhead and GPU memory
demands, facilitating a 60% increase in training speed. We pretrain a series of
bilingual image-text foundation models with enhanced fine-grained understanding
on BM-6B. The resulting models, dubbed M2-Encoders (pronounced "M-Square"),
set new benchmarks in both languages for multimodal retrieval and
classification tasks. Notably, our largest M2-Encoder-10B model achieves top-1
accuracies of 88.5% on ImageNet and 80.7% on ImageNet-CN under a zero-shot
classification setting, surpassing previously reported SoTA methods by 2.2% and
21.1%, respectively. The M2-Encoder series represents one of the most
comprehensive bilingual image-text foundation models to date, so we are making
it available to the research community for further exploration and development.
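The image-text contrastive objective at the heart of such pretraining is the symmetric InfoNCE loss over an in-batch similarity matrix. The grouped aggregation described above is a systems-level optimization of how this loss is computed across GPUs; the minimal single-process sketch below shows only the base objective, not that optimization.

```python
import math

def clip_contrastive_loss(sim):
    """Symmetric InfoNCE over an n x n image-text similarity matrix: each
    image should match its own caption (softmax over rows) and each caption
    its own image (softmax over columns); the loss averages the two
    cross-entropies against the diagonal targets."""
    n = len(sim)

    def cross_entropy(rows):
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)                                   # stable log-sum-exp
            log_z = m + math.log(sum(math.exp(s - m) for s in row))
            total += log_z - row[i]                        # -log softmax[i]
        return total / n

    cols = [[sim[j][i] for j in range(n)] for i in range(n)]
    return 0.5 * (cross_entropy(sim) + cross_entropy(cols))
```

With an uninformative all-zero similarity matrix the loss is log(n), the entropy of a uniform guess; a strongly diagonal matrix drives it toward zero.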