134 research outputs found
Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices
This paper is concerned with the interplay between statistical asymmetry and
spectral methods. Suppose we are interested in estimating a rank-1 and
symmetric matrix $\mathbf{M}^{\star}\in \mathbb{R}^{n\times n}$, yet only a
randomly perturbed version $\mathbf{M}$ is observed. The noise matrix
$\mathbf{M}-\mathbf{M}^{\star}$ is composed of zero-mean independent (but not
necessarily homoscedastic) entries and is, therefore, not symmetric in general.
This might arise, for example, when we have two independent samples for each
entry of $\mathbf{M}^{\star}$ and arrange them into an {\em asymmetric} data
matrix $\mathbf{M}$. The aim is to estimate the leading eigenvalue and
eigenvector of $\mathbf{M}^{\star}$. We demonstrate that the leading eigenvalue
of the data matrix $\mathbf{M}$ can be $O(\sqrt{n})$ times more accurate (up
to some log factor) than its (unadjusted) leading singular value in
eigenvalue estimation. Further, the perturbation of any linear form of the
leading eigenvector of $\mathbf{M}$ (say, entrywise eigenvector perturbation)
is provably well-controlled. This eigen-decomposition approach is fully
adaptive to heteroscedasticity of noise without the need for careful bias
correction or any prior knowledge about the noise variance. We also provide
partial theory for the more general rank-$r$ case. The takeaway message is
this: arranging the data samples in an asymmetric manner and performing
eigen-decomposition could sometimes be beneficial.
Comment: accepted to Annals of Statistics, 2020. 37 page
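The comparison described in the abstract above can be illustrated numerically. The following is a minimal sketch, not the authors' experiment: the matrix size, the Gaussian homoscedastic noise, the noise level, and the function name `compare_eigenvalue_and_singular_value` are all assumptions chosen for illustration. It builds a rank-1 symmetric matrix with leading eigenvalue 1, adds an asymmetric noise matrix with independent entries, and compares the leading eigenvalue of the data matrix with its leading singular value as estimates of the true eigenvalue.

```python
import numpy as np


def compare_eigenvalue_and_singular_value(n=300, noise_scale=0.5, trials=3, seed=0):
    """Return mean absolute errors (eigenvalue estimate, singular value estimate).

    All parameter choices are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    lam_star = 1.0                       # true leading eigenvalue
    u = np.ones(n) / np.sqrt(n)          # unit-norm leading eigenvector
    m_star = lam_star * np.outer(u, u)   # rank-1 symmetric ground truth
    sigma = noise_scale / np.sqrt(n)     # per-entry noise standard deviation

    eig_errors, sv_errors = [], []
    for _ in range(trials):
        # Independent zero-mean entries, so the noise matrix is not symmetric.
        noise = sigma * rng.standard_normal((n, n))
        m = m_star + noise

        # Leading eigenvalue: the eigenvalue of largest modulus.
        eigvals = np.linalg.eigvals(m)
        lam_eig = eigvals[np.argmax(np.abs(eigvals))].real

        # Leading singular value, used without any bias adjustment.
        lam_sv = np.linalg.svd(m, compute_uv=False)[0]

        eig_errors.append(abs(lam_eig - lam_star))
        sv_errors.append(abs(lam_sv - lam_star))

    return float(np.mean(eig_errors)), float(np.mean(sv_errors))


if __name__ == "__main__":
    eig_err, sv_err = compare_eigenvalue_and_singular_value()
    print(f"eigenvalue error:     {eig_err:.4f}")
    print(f"singular value error: {sv_err:.4f}")
```

In this setting the unadjusted singular value is expected to carry an upward bias that grows with the noise level, while the eigenvalue of the asymmetric matrix does not, which is the effect the abstract describes.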
Dynamic nonparametric filtering with application to finance
Problems of nonparametric filtering arise frequently in engineering and financial economics. Nonparametric filters often involve filtering parameters that must be chosen. These parameters can be chosen to optimize the performance locally at each time point or globally over a time interval. In this article, the filtering parameters are chosen by minimizing the prediction error for a large class of filters. Under a general martingale setting, with mild conditions on the time series structure and virtually no assumption on filters, we show that the adaptive filter with filtering parameter chosen by historical data performs nearly as well as the one with the ideal filter in the class, in terms of filtering errors. The theoretical result is also verified via intensive simulations. Our approach is also useful for choosing the orders of parametric models such as AR or GARCH processes. It can also be applied to volatility estimation in financial economics. We illustrate the proposed methods by estimating the volatility of the returns of the S&P500 index and the yields of the three-month Treasury bills.
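The idea of choosing a filtering parameter by minimizing historical prediction error can be sketched in a few lines. This is only an illustration under stated assumptions: exponential smoothing stands in for the "large class of filters", the candidate grid is arbitrary, and the function names `one_step_prediction_error` and `choose_alpha` are hypothetical, not from the paper.

```python
def one_step_prediction_error(series, alpha):
    """Sum of squared one-step-ahead prediction errors of exponential smoothing.

    The filtered level after seeing x is alpha * x + (1 - alpha) * level,
    and the current level is used as the prediction of the next observation.
    """
    level = series[0]  # initialize the filter at the first observation
    sse = 0.0
    for x in series[1:]:
        sse += (x - level) ** 2                   # error of predicting x from the past
        level = alpha * x + (1 - alpha) * level   # update the filter with x
    return sse


def choose_alpha(series, grid):
    """Pick the smoothing parameter in `grid` with the smallest historical error."""
    return min(grid, key=lambda a: one_step_prediction_error(series, a))


if __name__ == "__main__":
    grid = [0.1, 0.5, 0.9]              # assumed candidate parameters
    trend = [float(t) for t in range(20)]
    noisy = [1.0, -1.0] * 20            # oscillation around a constant level
    print(choose_alpha(trend, grid))    # a steadily trending series favours fast adaptation
    print(choose_alpha(noisy, grid))    # an oscillating series favours heavy smoothing
```

The same selection rule could be applied with any parametrized filter by swapping out the update line; the grid search is the simplest choice and is used here only to keep the sketch short.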
Can Process Facilitation Improve Globally Distributed Collaboration? An Action Design Research
Distributed collaborators still have trouble organizing, coordinating, and building consensus. Collaboration tools remain difficult to configure and use, and offer limited help in facilitating collaboration management. In this study, we conducted action design research at Company A, which relies on distributed collaboration for its business activities. Based on the design theory of collaboration engineering, we designed a process facilitation support application to address the real organizational problems identified at Company A. After several rounds of iteration, we proposed two artifacts: a facilitated collaboration process and collaborative tools for applications of process guidance. Findings suggest that facilitated process guidance benefits globally distributed collaboration. The survey results show consistently high employee satisfaction with the tool and the process guidance. Our research serves as an exploratory investigation in the field of distributed collaboration, and provides evidence regarding the organizational challenges in a business context.
A Causal Intervention Scheme for Semantic Segmentation of Quasi-periodic Cardiovascular Signals
Precise segmentation is a vital first step in analyzing the semantic information of
the cardiac cycle and capturing anomalies in cardiovascular signals. However, in the
field of deep semantic segmentation, inference is often unilaterally confounded
by the individual attribute of data. Towards cardiovascular signals,
quasi-periodicity is the essential characteristic to be learned, regarded as
the synthesis of the attributes of morphology (Am) and rhythm (Ar). Our key
insight is to suppress the over-dependence on Am or Ar during the generation
process of deep representations. To address this issue, we establish a
structural causal model as the foundation to customize the intervention
approaches on Am and Ar, respectively. In this paper, we propose contrastive
causal intervention (CCI) to form a novel training paradigm under a frame-level
contrastive framework. The intervention can eliminate the implicit statistical
bias brought by the single attribute and lead to more objective
representations. We conduct comprehensive experiments with the controlled
condition for QRS location and heart sound segmentation. The final results
indicate that our approach can evidently improve the performance by up to 0.41%
for QRS location and 2.73% for heart sound segmentation. The efficiency of the
proposed method is generalized to multiple databases and noisy signals.
Comment: submitted to IEEE Journal of Biomedical and Health Informatics
(J-BHI)
An experimental study of satisfaction response: Evaluation of online collaborative learning
On the one hand, a growing body of research discusses support for improving the quality of online collaborative learning, and many indicators have been proposed to assess its success. On the other hand, thinkLets for designing reputable and valuable collaborative processes have been under development for more than ten years. However, few studies have tried to apply thinkLets to online collaborative learning. This paper introduces thinkLets to online collaborative learning and experimentally tests their effectiveness using participants' satisfaction responses. Yield Shift Theory (YST), a causal theory explaining inner satisfaction, is adopted. In the experiment, 113 students from universities in Beijing, China were chosen as a sample. They were divided into two groups, collaborating online in a simulated class. YST is then validated in student groups under online collaborative learning, online collaborative learning with and without thinkLets is compared, and the satisfaction responses of participants are analyzed. The comparison shows that YST is applicable in this context, and that satisfaction is higher in online collaborative learning with thinkLets.
AceGPT, Localizing Large Language Models in Arabic
This paper explores the imperative need and methodology for developing a
localized Large Language Model (LLM) tailored for Arabic, a language with
unique cultural characteristics that are not adequately addressed by current
mainstream models like ChatGPT. Key concerns additionally arise when
considering cultural sensitivity and local values. To this end, the paper
outlines a packaged solution, including further pre-training with Arabic texts,
supervised fine-tuning (SFT) using native Arabic instructions and GPT-4
responses in Arabic, and reinforcement learning with AI feedback (RLAIF) using
a reward model that is sensitive to local culture and values. The objective is
to train culturally aware and value-aligned Arabic LLMs that can serve the
diverse application-specific needs of Arabic-speaking communities.
Extensive evaluations demonstrated that the resulting LLM called `AceGPT' is
the SOTA open Arabic LLM in various benchmarks, including instruction-following
benchmark (i.e., Arabic Vicuna-80 and Arabic AlpacaEval), knowledge benchmark
(i.e., Arabic MMLU and EXAMs), as well as the newly-proposed Arabic cultural &
value alignment benchmark. Notably, AceGPT outperforms ChatGPT in the popular
Vicuna-80 benchmark when evaluated with GPT-4, despite the benchmark's limited
scale.
Codes, data, and models are in https://github.com/FreedomIntelligence/AceGPT.