
    Personalized Treatment-Response Trajectories: Errors-in-variables, Interpretability, and Causality

    One fundamental problem in many applications is to estimate treatment-response trajectories given multidimensional treatment variables. In reality, however, estimation suffers severely from measurement error in both treatment timing and covariates, for example when the treatment data are self-reported by users. We introduce a novel data-driven method to tackle this challenging problem, which models personalized treatment-response trajectories as the sum of a parametric response function, based on restored true treatment timing and covariates and sharing information across individuals under a hierarchical structure, and a counterfactual trend fitted by a sparse Gaussian process. On a real-life dataset where the impact of diet on continuous blood glucose is estimated, our model achieves superior performance in estimation accuracy and prediction.

    RFNet: Riemannian Fusion Network for EEG-based Brain-Computer Interfaces

    This paper presents the novel Riemannian Fusion Network (RFNet), a deep neural architecture for learning spatial and temporal information from electroencephalogram (EEG) signals across a number of EEG-based brain-computer interface (BCI) tasks and applications. The spatial information relies on spatial covariance matrices (SCMs) of multi-channel EEG, which form a Riemannian manifold due to their symmetric positive definite (SPD) structure. We exploit a Riemannian approach to map the spatial information onto feature vectors in Euclidean space. The temporal information, characterized by features based on differential entropy and logarithmic power spectral density, is extracted from different windows through time. Our network then learns the temporal information by employing a deep long short-term memory network with a soft attention mechanism, whose output is used as the temporal feature vector. To fuse spatial and temporal information effectively, we use a fusion strategy that learns attention weights applied to embedding-specific features for decision making. We evaluate our proposed framework on four public datasets from three popular fields of BCI, namely emotion recognition, vigilance estimation, and motor imagery classification, covering binary classification, multi-class classification, and regression tasks. RFNet approaches the state of the art on one dataset (SEED) and outperforms other methods on the other three (SEED-VIG, BCI-IV 2A, and BCI-IV 2B), setting new state-of-the-art values and showing the robustness of our framework in EEG representation learning.
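    A common way to map SPD covariance matrices to Euclidean feature vectors, as the abstract describes, is the log-Euclidean approach: take the matrix logarithm and half-vectorize it with a sqrt(2) weighting so the Euclidean norm of the vector matches the Riemannian (log-Euclidean) norm. The sketch below assumes this standard mapping; the paper may use a different reference point or metric.

```python
import numpy as np

def spd_log(S):
    # Matrix logarithm of an SPD matrix via its eigendecomposition.
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def upper_vec(M):
    # Half-vectorization; sqrt(2) on off-diagonals preserves the Frobenius norm.
    i, j = np.triu_indices(M.shape[0])
    return np.where(i == j, 1.0, np.sqrt(2.0)) * M[i, j]

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 256))       # toy 4-channel EEG segment
S = X @ X.T / X.shape[1]                # spatial covariance matrix (SPD)
feat = upper_vec(spd_log(S))            # Euclidean feature vector, length 10
```

    The resulting vector can be fed to any ordinary Euclidean classifier, which is the point of mapping off the manifold.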

    Sparse Convolution for Approximate Sparse Instance

    Computing the convolution $A \star B$ of two vectors of dimension $n$ is one of the most important computational primitives in many fields. For the non-negative convolution scenario, the classical solution is to leverage the Fast Fourier Transform, whose time complexity is $O(n \log n)$. However, the vectors $A$ and $B$ could be very sparse, and we can exploit this property to accelerate the computation. In this paper, we show that when $\|A \star B\|_{\geq c_1} = k$ and $\|A \star B\|_{\leq c_2} = n-k$ hold, we can approximately recover all indices in $\mathrm{supp}_{\geq c_1}(A \star B)$ with point-wise error of $o(1)$ in $O(k \log(n) \log(k) \log(k/\delta))$ time. We further show that we can iteratively correct the error and recover all indices in $\mathrm{supp}_{\geq c_1}(A \star B)$ correctly in $O(k \log(n) \log^2(k) (\log(1/\delta) + \log\log(k)))$ time.
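    To make the sparsity intuition concrete, here is a minimal baseline that convolves two sparse non-negative vectors directly over their supports in $O(|A| \cdot |B|)$ operations rather than $O(n \log n)$. This is only the naive exploitation of sparsity; the paper's algorithm achieves near-linear-in-$k$ time via hashing/sketching techniques not shown here.

```python
import numpy as np

def sparse_convolve(a, b):
    """Linear convolution of sparse non-negative vectors, each given as an
    {index: value} dict. Cost is O(|a| * |b|), independent of the ambient n."""
    out = {}
    for i, va in a.items():
        for j, vb in b.items():
            out[i + j] = out.get(i + j, 0.0) + va * vb
    return out

a = {1: 2.0, 7: 1.0}
b = {0: 3.0, 5: 4.0}
c = sparse_convolve(a, b)   # {1: 6.0, 6: 8.0, 7: 3.0, 12: 4.0}
```

    When the output support is small, only those few indices are ever touched, which is the regime the paper's $O(k \log n \cdot \mathrm{polylog})$ bounds target.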

    Finding Favourite Tuples on Data Streams with Provably Few Comparisons

    One of the most fundamental tasks in data science is to assist a user with unknown preferences in finding high-utility tuples within a large database. To accurately elicit the unknown user preferences, a widely adopted approach is to ask the user to compare pairs of tuples. In this paper, we study the problem of identifying one or more high-utility tuples by adaptively receiving user input on a minimum number of pairwise comparisons. We devise a single-pass streaming algorithm, which processes each tuple in the stream at most once, while ensuring that the memory size and the number of requested comparisons are in the worst case logarithmic in $n$, where $n$ is the number of all tuples. An important variant of the problem, which can help to reduce human error in comparisons, is to allow users to declare ties when confronted with pairs of tuples of nearly equal utility. We show that the theoretical guarantees of our method can be maintained for this important variant. In addition, we show how to enhance existing pruning techniques in the literature by leveraging powerful tools from mathematical programming. Finally, we systematically evaluate all proposed algorithms over both synthetic and real-life datasets, examine their scalability, and demonstrate their superior performance over existing methods. Comment: To appear in KDD 202
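    The single-pass, small-memory setting with a tie-aware comparison oracle can be illustrated with a deliberately simplified champion-keeping sketch. Note the hedge: this baseline uses $O(1)$ memory but up to $n-1$ comparisons; the paper's logarithmic comparison bound relies on geometric pruning machinery that is not reproduced here, and the `oracle` utility below is an invented example.

```python
def stream_best(stream, compare):
    """Single pass over tuples with O(1) memory: keep only a current champion.
    compare(a, b) returns 1 if a is preferred, -1 if b is, 0 for a declared tie."""
    best, comparisons = None, 0
    for t in stream:
        if best is None:
            best = t
            continue
        comparisons += 1
        if compare(t, best) > 0:     # on a tie (0), keep the incumbent
            best = t
    return best, comparisons

def oracle(a, b, eps=0.05):
    # Hypothetical user: larger first coordinate wins; near-equal is a tie.
    d = a[0] - b[0]
    return 0 if abs(d) < eps else (1 if d > 0 else -1)

tuples = [(0.2, "a"), (0.9, "b"), (0.5, "c"), (0.91, "d"), (0.1, "e")]
best, m = stream_best(tuples, oracle)
```

    The tie branch shows why ties are benign for correctness here: a declared tie means the two tuples have nearly equal utility, so keeping either preserves the near-optimality guarantee.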

    Ranking with submodular functions on a budget

    Submodular maximization has been the backbone of many important machine-learning problems, with applications to viral marketing, diversification, sensor placement, and more. However, the study of maximizing submodular functions has mainly been restricted to the context of selecting a set of items. On the other hand, many real-world applications require a solution that is a ranking over a set of items. The problem of ranking in the context of submodular function maximization has been considered before, but to a much lesser extent than item-selection formulations. In this paper, we explore a novel formulation for ranking items with submodular valuations and budget constraints, which we refer to as max-submodular ranking (MSR). In more detail, given a set of items and a set of non-decreasing submodular functions, where each function is associated with a budget, we aim to find a ranking of the items that maximizes the sum of values achieved by all functions under the budget constraints. For the MSR problem with cardinality- and knapsack-type budget constraints we propose practical algorithms with approximation guarantees. In addition, we perform an empirical evaluation that demonstrates the superior performance of the proposed algorithms against strong baselines. Peer reviewed
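    A natural heuristic for the MSR setting with cardinality budgets is position-wise greedy: at each position, append the item with the largest total marginal gain over all functions whose budget has not yet run out. The sketch below instantiates this with coverage functions (a classic submodular family) as an assumed example; it illustrates the objective's structure, not the paper's specific approximation algorithms.

```python
def greedy_msr(items, covers, funcs):
    """Greedy sketch for max-submodular ranking with coverage functions.
    covers[x] is the set of elements item x covers; each entry of `funcs` is a
    (budget, targets) pair valuing the first `budget` ranked items by how many
    of its `targets` they jointly cover."""
    ranking, remaining = [], set(items)
    covered = [set() for _ in funcs]          # per-function covered elements
    while remaining:
        pos = len(ranking)
        def gain(x):
            # Total marginal gain over functions whose budget is not exhausted.
            return sum(len((covers[x] & tgt) - cov)
                       for (budget, tgt), cov in zip(funcs, covered)
                       if pos < budget)
        best = max(sorted(remaining), key=gain)   # sorted: deterministic ties
        ranking.append(best)
        remaining.remove(best)
        for (budget, tgt), cov in zip(funcs, covered):
            if pos < budget:
                cov |= covers[best] & tgt
    return ranking

items = ["a", "b", "c"]
covers = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
funcs = [(1, {1, 2, 3}), (2, {3, 4})]     # (budget, target set) per function
order = greedy_msr(items, covers, funcs)
```

    Note how the budgets couple the functions: an item placed after a function's budget is exhausted contributes nothing to that function, which is what makes the ranking (not just the set) matter.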

    A Convergence Theory for Federated Average: Beyond Smoothness

    Federated learning enables a large number of edge computing devices to jointly learn a model without sharing data. As a leading algorithm in this setting, Federated Averaging (FedAvg), which runs Stochastic Gradient Descent (SGD) in parallel on local devices and averages the resulting iterates only once in a while, has been widely used due to its simplicity and low communication cost. However, despite recent research efforts, it lacks theoretical analysis under assumptions beyond smoothness. In this paper, we analyze the convergence of FedAvg. Unlike existing work, we relax the assumption of strong smoothness: we assume semi-smoothness and semi-Lipschitz properties for the loss function, which include an additional first-order term in their definitions. In addition, we assume a bound on the gradient that is weaker than the commonly used bounded-gradient assumption in convergence analyses. Together, these yield a theoretical convergence study of Federated Learning under weaker assumptions. Comment: BigData 202
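    The FedAvg scheme the abstract analyzes, local SGD steps followed by periodic server-side averaging, can be sketched in a few lines. The scalar least-squares objective and all constants below are toy assumptions chosen so the behavior is easy to check, not anything from the paper.

```python
import numpy as np

def fedavg(local_data, rounds=50, local_steps=5, lr=0.1):
    """Minimal FedAvg sketch for a shared scalar least-squares model w."""
    w = 0.0
    for _ in range(rounds):
        local_ws = []
        for X, y in local_data:              # each device runs SGD locally...
            wl = w
            for _ in range(local_steps):
                grad = 2.0 * np.mean(X * (X * wl - y))  # d/dw of mean (Xw - y)^2
                wl -= lr * grad
            local_ws.append(wl)
        w = float(np.mean(local_ws))         # ...and the server averages iterates
    return w

rng = np.random.default_rng(2)
true_w = 3.0
devices = []
for _ in range(4):
    X = rng.standard_normal(100)
    devices.append((X, true_w * X + 0.1 * rng.standard_normal(100)))
w_hat = fedavg(devices)
```

    Communication happens only once per round rather than once per gradient step, which is the source of FedAvg's low communication cost and also of the averaging drift that convergence analyses must control.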