Formal verification of a concurrent binary search tree

Author: Chen Xiwen

In this thesis, we formally verify a simplified version of the non-blocking linearizable binary search tree of Ellen et al., which appeared in the Proceedings of the 29th Annual ACM Symposium on Principles of Distributed Computing (pages 131-140), using the PVS specification and verification system. The algorithm and its specification are both modelled as I/O automata. To formally verify that the algorithm implements the specification, we show that the algorithm's I/O automaton simulates the specification's. An intermediate I/O automaton is constructed to simplify the simulation proof of linearizability. By showing that there is a forward simulation from the algorithm's I/O automaton to the intermediate automaton and a backward simulation from the intermediate automaton to the specification's automaton, we formally verify that the algorithm implements its specification. While formalizing the proof, we found small errors in the original proof.

YorkSpace

Diversity Maximized Scheduling in RoadSide Units for Traffic Monitoring Applications

Author: Amin Rahul
Chen Xiwen
Razi Abolfazl
Sarlak Ahmad
Publication date: 28/06/2023

This paper develops an optimal data aggregation policy for learning-based traffic control systems based on imagery collected from Road Side Units (RSUs) under imperfect communications. Our focus is optimizing semantic information flow from RSUs to a nearby edge server or cloud-based processing units by maximizing data diversity based on the target machine learning application while taking into account heterogeneous channel conditions (e.g., delay, error rate) and constrained total transmission rate. As a proof-of-concept, we enforce fairness among class labels to increase data diversity for classification problems. The developed constrained optimization problem is non-convex. Hence it does not admit a closed-form solution, and the exhaustive search is NP-hard in the number of RSUs. To this end, we propose an approximate algorithm that applies a greedy interval-by-interval scheduling policy by selecting RSUs to transmit. We use a coalition game formulation to maximize the overall added fairness by the selected RSUs in each transmission interval. Once RSUs are selected, we employ a maximum uncertainty method to handpick data samples that contribute the most to the learning performance. Our method outperforms random selection, uniform selection, and pure network-based optimization methods (e.g., FedCS) in terms of the ultimate accuracy of the target learning application.

arXiv.org e-Print Archive
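
The "maximum uncertainty" sample selection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that uncertainty is measured by the entropy of a classifier's predicted class probabilities; the abstract does not specify the measure, and the function name `top_uncertain` and the toy probabilities are hypothetical.

```python
import numpy as np

def top_uncertain(probs, m):
    """Return indices of the m samples with the highest predictive entropy.

    probs: array of shape (n_samples, n_classes), rows sum to 1.
    Entropy is an assumed uncertainty measure, not taken from the paper.
    """
    p = np.clip(probs, 1e-12, 1.0)          # avoid log(0)
    entropy = -(p * np.log(p)).sum(axis=1)  # Shannon entropy per sample
    return np.argsort(-entropy)[:m].tolist()

# Toy predictions: sample 0 is confident, sample 1 is nearly uniform.
probs = np.array([
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
])
picked = top_uncertain(probs, 2)
print(picked)
```

With these toy values the near-uniform sample is ranked first, followed by the moderately uncertain one.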

Learning on Bandwidth Constrained Multi-Source Data with MIMO-inspired DPP MAP Inference

Author: Amin Rahul
Chen Xiwen
Li Huayu
Razi Abolfazl
Publication date: 17/11/2023

This paper proposes a distributed version of Determinantal Point Process (DPP) inference to enhance multi-source data diversification under limited communication bandwidth. DPP is a popular probabilistic approach that improves data diversity by enforcing the repulsion of elements in the selected subsets. The well-studied Maximum A Posteriori (MAP) inference in DPP aims to identify the subset with the highest diversity quantified by DPP. However, this approach is limited by the presumption that all data samples are available at one point, which hinders its applicability to real-world applications such as traffic datasets where data samples are distributed across sources and communication between them is band-limited. Inspired by the techniques used in Multiple-Input Multiple-Output (MIMO) communication systems, we propose a strategy for performing MAP inference among distributed sources. Specifically, we show that a lower bound of the diversity-maximized distributed sample selection problem can be treated as a power allocation problem in MIMO systems. A determinant-preserved sparse representation of selected samples is used to perform sample precoding in local sources to be processed by DPP. Our method does not require raw data exchange among sources, but rather a band-limited feedback channel to send lightweight diversity measures, analogous to the CSI message in MIMO systems, from the center to data sources. The experiments show that our scalable approach can outperform baseline methods, including random selection, uninformed individual DPP with no feedback, and DPP with SVD-based feedback, in both i.i.d. and non-i.i.d. setups. Specifically, it achieves 1 to 6 log-difference diversity gain in the latent representation of CIFAR-10, CIFAR-100, StanfordCars, and GTSRB datasets.

arXiv.org e-Print Archive
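
For orientation, the centralized DPP MAP inference that the abstract above takes as its starting point is commonly approximated greedily. The sketch below is a naive greedy version on a small kernel matrix. It is not the distributed, MIMO-inspired method of the paper; the function name `greedy_dpp_map` and the toy feature matrix are hypothetical.

```python
import numpy as np

def greedy_dpp_map(L, k):
    """Naive greedy MAP for a DPP with kernel L.

    Repeatedly adds the item that maximizes log det(L_S) for the
    enlarged subset S, stopping after k items or when no candidate
    yields a positive determinant.
    """
    n = L.shape[0]
    selected = []
    for _ in range(k):
        best_item, best_logdet = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_item, best_logdet = i, logdet
        if best_item is None:
            break
        selected.append(best_item)
    return selected

# Toy features: items 0 and 1 are near-duplicates, item 2 is orthogonal to item 0.
X = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]])
L = X @ X.T  # similarity (Gram) kernel
chosen = greedy_dpp_map(L, 2)
print(chosen)
```

The determinant penalizes similar items, so the near-duplicate pair is never selected together; this is the "repulsion" the abstract refers to.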

Unlocking the Power of Multi-institutional Data: Integrating and Harmonizing Genomic Data Across Institutions

Author: Chen Yuan
Feng Xiwen
Panageas Katherine
Shen Ronglai
Publication date: 30/01/2024

Cancer is a complex disease driven by genomic alterations, and tumor sequencing is becoming a mainstay of clinical care for cancer patients. The emergence of multi-institution sequencing data presents a powerful resource for learning real-world evidence to enhance precision oncology. GENIE BPC, led by the American Association for Cancer Research, establishes a unique database linking genomic data with clinical information for patients treated at multiple cancer centers. However, leveraging such multi-institutional sequencing data presents significant challenges. Variations in gene panels result in loss of information when the analysis is conducted on common gene sets. Additionally, differences in sequencing techniques and patient heterogeneity across institutions add complexity. High data dimensionality, sparse gene mutation patterns, and weak signals at the individual gene level further complicate matters. Motivated by these real-world challenges, we introduce the Bridge model. It uses a quantile-matched latent variable approach to derive integrated features that preserve information beyond common genes and maximize the utilization of all available data, while leveraging information sharing to enhance both learning efficiency and the model's capacity to generalize. By extracting harmonized and noise-reduced lower-dimensional latent variables, the true mutation pattern unique to each individual is captured. We assess the model's performance and parameter estimation through extensive simulation studies. The extracted latent features from the Bridge model consistently excel in predicting patient survival across six cancer types in GENIE BPC data.

arXiv.org e-Print Archive

RD-DPP: Rate-Distortion Theory Meets Determinantal Point Process to Diversify Learning Data Samples

Author: Amin Rahul
Chen Xiwen
Li Huayu
Razi Abolfazl
Publication date: 08/04/2023

In some practical learning tasks, such as traffic video analysis, the number of available training samples is restricted by different factors, such as limited communication bandwidth and computation power; therefore, it is imperative to select diverse data samples that contribute the most to the quality of the learning system. One popular approach to selecting diverse samples is the Determinantal Point Process (DPP). However, it suffers from a few known drawbacks, such as the restriction of the number of samples to the rank of the similarity matrix, and not being customizable for specific learning tasks (e.g., multi-level classification tasks). In this paper, we propose a new way of measuring task-oriented diversity based on Rate-Distortion (RD) theory, appropriate for multi-level classification. To this end, we establish a fundamental relationship between DPP and RD theory, which leads to the design of RD-DPP, an RD-based value function to evaluate the diversity gain of data samples. We also observe that the upper bound of the diversity of data selected by DPP has a universal trend of phase transition: it quickly approaches its maximum point, then slowly converges to its final limit, meaning that DPP is beneficial only at the beginning of sample accumulation. We use this fact to design a bi-modal approach for sequential data selection.

arXiv.org e-Print Archive

Progressively Dual Prior Guided Few-shot Semantic Segmentation

Author: Cao Qinglong
Chen Yuntian
Han Junwei
Yao Xiwen
Publication date: 20/11/2022

The few-shot semantic segmentation task aims at performing segmentation in query images with a few annotated support samples. Currently, few-shot segmentation methods mainly focus on leveraging foreground information without fully utilizing the rich background information, which can result in wrong activation of foreground-like background regions and poor adaptability to dramatic scene changes between support and query images. Meanwhile, the lack of a detail mining mechanism can cause coarse parsing results that miss some semantic components or edge areas, since prototypes have limited ability to cope with large variance in object appearance. To tackle these problems, we propose a progressively dual prior guided few-shot semantic segmentation network. Specifically, a dual prior mask generation (DPMG) module is first designed to suppress the wrong activation in a foreground-background comparison manner by regarding the background as assisted refinement information. With dual prior masks refining the location of the foreground area, we further propose a progressive semantic detail enrichment (PSDE) module, which forces the parsing model to capture the hidden semantic details by iteratively erasing the high-confidence foreground region and activating details in the remaining region with a hierarchical structure. The collaboration of DPMG and PSDE forms a novel few-shot segmentation network that can be learned in an end-to-end manner. Comprehensive experiments on PASCAL-5i and MS COCO demonstrate that our proposed algorithm achieves strong performance.

arXiv.org e-Print Archive

Analysis and differentiation of seminal plasma via polarized SERS spectroscopy

Author: Chen
Feng
Haishan Zeng
JH Chen
Lu
Wang
Xiwen Chen
Zufang Huang
Publication venue: 'Dove Medical Press Ltd.'

Crossref

Evidences for interaction-induced Haldane fractional exclusion statistics in one and higher dimensions

Author: Chen Yang-Yang
Deng Youjin
Guan Xiwen
Liu Longxiang
Zhang Xibo
Publication date: 24/10/2019

Haldane fractional exclusion statistics (FES) has a long history of intense study, but its realization in physical systems is rare. Here we study repulsively interacting Bose gases at and near a quantum critical point, and find evidence that such strongly correlated gases obey simple non-mutual FES over a wide range of interaction strengths in both one and two dimensions. Based on exact solutions in one dimension, quantum Monte Carlo simulations and experiments in both dimensions, we show that the thermodynamic properties of these interacting gases, including entropy per particle, density and pressure, are essentially equivalent to those of non-interacting particles with FES. Accordingly, we establish a simple interaction-to-FES mapping that reveals the statistical nature of particle-hole symmetry breaking induced by interaction in such quantum many-body systems. Whereas strongly interacting Bose gases reach full fermionization in one dimension, they exhibit incomplete fermionization in two dimensions. Our results open a route to understanding correlated interacting systems via non-interacting particles with FES in arbitrary dimensions.

arXiv.org e-Print Archive
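
For readers unfamiliar with FES, the standard occupation distribution for ideal particles with non-mutual exclusion statistics (Wu's form) is sketched below. It is textbook background rather than a result quoted from the abstract above; the symbols $g$ (statistical parameter), $w$, $\epsilon$, $\mu$, and $T$ are introduced here for illustration.

```latex
% Mean occupation of a state with energy \epsilon for statistical parameter g
n(\epsilon) = \frac{1}{w(\epsilon) + g},
\qquad
w^{g}\,(1 + w)^{1-g} = e^{(\epsilon - \mu)/k_{B}T}.

% Limiting cases:
% g = 0:  w = e^{(\epsilon-\mu)/k_B T} - 1  =>  n = 1/(e^{(\epsilon-\mu)/k_B T} - 1)   (Bose-Einstein)
% g = 1:  w = e^{(\epsilon-\mu)/k_B T}      =>  n = 1/(e^{(\epsilon-\mu)/k_B T} + 1)   (Fermi-Dirac)
```

Intermediate values $0 < g < 1$ interpolate between the two limits, which is the sense in which an interacting Bose gas can look partially "fermionized".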