An Interpretability Framework for Similar Case Matching
Similar Case Matching (SCM) plays a pivotal role in the legal system by
facilitating the efficient identification of similar cases for legal
professionals. While previous research has primarily concentrated on enhancing
the performance of SCM models, the aspect of interpretability has been
neglected. To bridge this gap, this study proposes an integrated pipeline
framework for interpretable SCM. The framework comprises four modules: judicial
feature sentence identification, case matching, feature sentence alignment, and
conflict resolution. In contrast to current SCM methods, our framework first
extracts feature sentences within a legal case that contain essential
information. Then it conducts case matching based on these extracted features.
Subsequently, our framework aligns the corresponding sentences in two legal
cases to provide evidence of similarity. In instances where the results of case
matching and feature sentence alignment exhibit conflicts, the conflict
resolution module resolves these inconsistencies. The experimental results show
the effectiveness of our proposed framework, establishing a new benchmark for
interpretable SCM.
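The pipeline above decomposes interpretable SCM into four sequential stages. Below is a minimal Python sketch of that control flow, with a toy lexical-overlap heuristic standing in for the learned modules; every function name, threshold, and heuristic here is an illustrative assumption, not the authors' released system.

```python
def extract_feature_sentences(case_text):
    """Stage 1: judicial feature sentence identification. A real system
    would use a trained classifier; this stand-in keeps every sentence."""
    return [s.strip() for s in case_text.split(".") if s.strip()]

def jaccard(a, b):
    """Word-overlap similarity, a toy stand-in for learned scoring."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def match_cases(feats_a, feats_b, threshold=0.2):
    """Stage 2: case matching over the extracted feature sentences."""
    return jaccard(" ".join(feats_a), " ".join(feats_b)) >= threshold

def align_sentences(feats_a, feats_b, min_sim=0.3):
    """Stage 3: align corresponding sentences as evidence of similarity."""
    pairs = []
    for sa in feats_a:
        best = max(feats_b, key=lambda sb: jaccard(sa, sb), default=None)
        if best is not None and jaccard(sa, best) >= min_sim:
            pairs.append((sa, best, round(jaccard(sa, best), 2)))
    return pairs

def resolve_conflict(matched, pairs):
    """Stage 4: reconcile the matching verdict with the alignment
    evidence; in this toy rule, the alignment evidence wins."""
    return bool(pairs), pairs

case_a = "The defendant stole a car. The court found theft."
case_b = "The accused stole a bicycle. The court found theft."
feats_a = extract_feature_sentences(case_a)
feats_b = extract_feature_sentences(case_b)
print(resolve_conflict(match_cases(feats_a, feats_b),
                       align_sentences(feats_a, feats_b)))
```

In the authors' framework each stage would be a trained model; the sketch only shows how the stages hand off: extracted feature sentences feed both the matcher and the aligner, and the final verdict reconciles the two.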
Adaptive Communications in Collaborative Perception with Domain Alignment for Autonomous Driving
Collaborative perception among multiple connected and autonomous vehicles can
greatly enhance perception capabilities by allowing vehicles to exchange
supplementary information via communications. Despite advances in previous
approaches, challenges still remain due to channel variations and data
heterogeneity among collaborative vehicles. To address these issues, we propose
ACC-DA, a channel-aware collaborative perception framework to dynamically
adjust the communication graph and minimize the average transmission delay
while mitigating the side effects of data heterogeneity. Our contributions are
threefold. First, we design a transmission delay minimization method, which
constructs the communication graph and minimizes the transmission delay
according to the channel state information. We then propose an adaptive
data reconstruction mechanism, which can dynamically adjust the rate-distortion
trade-off to enhance perception efficiency. Moreover, it minimizes the temporal
redundancy during data transmissions. Finally, we conceive a domain alignment
scheme that aligns the data distributions of different vehicles, mitigating the
domain gap between them and improving performance on the target task.
Comprehensive experiments demonstrate the effectiveness of
our method compared with existing state-of-the-art approaches.
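As a rough illustration of the first contribution, the sketch below builds a communication graph by keeping only the links whose estimated transmission delay (payload size divided by achievable channel rate) fits a latency budget. The greedy rule, rates, and variable names are assumptions for illustration, not the ACC-DA optimization itself.

```python
def build_comm_graph(rates_mbps, msg_bits, delay_budget_s):
    """rates_mbps: {(tx, rx): achievable rate in Mbit/s};
    msg_bits: {tx: payload size in bits}.
    Keeps links whose delay msg_bits / rate fits the budget."""
    graph = set()
    for (tx, rx), rate in rates_mbps.items():
        delay = msg_bits[tx] / (rate * 1e6)   # seconds to send tx's payload
        if delay <= delay_budget_s:
            graph.add((tx, rx))
    return graph

# Toy example: v1 has a good channel to v2, v3 a poor one.
rates = {("v1", "v2"): 20.0, ("v3", "v2"): 2.0}   # Mbit/s
sizes = {"v1": 4_000_000, "v3": 4_000_000}        # 4 Mbit of features each
print(build_comm_graph(rates, sizes, delay_budget_s=0.5))
# -> {('v1', 'v2')}: 0.2 s fits the budget, 2.0 s does not
```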
FSD: An Initial Chinese Dataset for Fake Song Detection
Singing voice synthesis and singing voice conversion have significantly
advanced, revolutionizing musical experiences. However, the rise of "Deepfake
Songs" generated by these technologies raises concerns about authenticity.
Unlike Audio Deepfake Detection (ADD), the field of song deepfake detection
lacks specialized datasets or methods for song authenticity verification. In
this paper, we initially construct a Chinese Fake Song Detection (FSD) dataset
to investigate the field of song deepfake detection. The fake songs in the FSD
dataset are generated by five state-of-the-art singing voice synthesis and
singing voice conversion methods. Our initial experiments on FSD reveal that
existing speech-trained ADD models are ineffective for song deepfake
detection. We therefore employ the FSD dataset to train ADD
models. We subsequently evaluate these models under two scenarios: one with the
original songs and another with separated vocal tracks. Experimental results
show that song-trained ADD models exhibit a 38.58% reduction in average equal
error rate compared to speech-trained ADD models on the FSD test set.
ReliTalk: Relightable Talking Portrait Generation from a Single Video
Recent years have witnessed great progress in creating vivid audio-driven
portraits from monocular videos. However, how to seamlessly adapt the created
video avatars to other scenarios with different backgrounds and lighting
conditions remains unsolved. Meanwhile, existing relighting studies mostly
rely on dynamically lit or multi-view data, which are too expensive
for creating video portraits. To bridge this gap, we propose ReliTalk, a novel
framework for relightable audio-driven talking portrait generation from
monocular videos. Our key insight is to decompose the portrait's reflectance
from implicitly learned audio-driven facial normals and images. Specifically,
we involve 3D facial priors derived from audio features to predict delicate
normal maps through implicit functions. These initially predicted normals then
play a crucial part in reflectance decomposition by dynamically estimating the
lighting condition of the given video. Moreover, the stereoscopic face
representation is refined using the identity-consistent loss under simulated
multiple lighting conditions, addressing the ill-posed problem caused by
limited views available from a single monocular video. Extensive experiments
validate the superiority of our proposed framework on both real and synthetic
datasets. Our code is released at https://github.com/arthur-qiu/ReliTalk.
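To make the decomposition concrete: once reflectance (albedo) and normals are separated, the portrait can be re-rendered under a new light. The sketch below uses a single directional Lambertian light as a simplified stand-in for the paper's lighting model; the array names and the ambient term are assumptions for illustration.

```python
import numpy as np

def relight(albedo, normals, light_dir, ambient=0.1):
    """albedo: (H, W, 3) reflectance in [0, 1].
    normals: (H, W, 3) unit surface normals.
    light_dir: (3,) direction toward the light source."""
    l = np.asarray(light_dir, dtype=np.float64)
    l /= np.linalg.norm(l)
    shading = np.clip(normals @ l, 0.0, None)        # (H, W) Lambert term
    return np.clip(albedo * (ambient + shading[..., None]), 0.0, 1.0)

# Toy frame: flat gray albedo, normals facing the camera (+z).
albedo = np.full((4, 4, 3), 0.5)
normals = np.zeros((4, 4, 3)); normals[..., 2] = 1.0
frame = relight(albedo, normals, light_dir=(0.0, 0.0, 1.0))
print(frame[0, 0])   # -> [0.55 0.55 0.55]
```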