132 research outputs found

An Efficient Sum Query Algorithm for Distance-based Locally Dominating Functions

Author: Huang Ziyun
Xu Jinhui
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 28th International Symposium on Algorithms and Computation (ISAAC 2017)
Publication date: 01/01/2017

In this paper, we consider the following sum query problem: given a point set P in R^d and a distance-based function f(p,q) (i.e., a function of the distance between p and q) satisfying some general properties, the goal is to develop a data structure and a query algorithm for efficiently computing a (1+epsilon)-approximate solution to the sum sum_{p in P} f(p,q) for any query point q in R^d and any small constant epsilon>0. Existing techniques for this problem are mainly based on core-set techniques, which often have difficulty dealing with functions that have the local domination property. Based on several new insights into this problem, we develop a novel technique to overcome these difficulties. Our algorithm answers queries with high success probability in time no more than ~O_{epsilon,d}(n^{0.5 + c}), and the underlying data structure can be constructed in ~O_{epsilon,d}(n^{1+c}) time for any c>0, where the hidden constant has only polynomial dependence on 1/epsilon and d. Our technique is simple and can be easily implemented for practical purposes.

Dagstuhl Research Online Publication Server

The career paths of graduates in chinese interpreting studies: a scientometric exploration

Author: Xu Ziyun
Publication venue: 'Universitat Rovira I Virgili'
Publication date: 01/01/2015

Over the past 30 years, the growth of Chinese Interpreting Studies has been, to say the least, spectacular. Increasing economic and political collaboration between China and the West has driven the demand for interpreters to bridge the linguistic and cultural divide. Since master’s and bachelor’s degree courses in interpreting and translation were created all over China, hundreds of university graduates have embarked on widely differing career paths. This study begins with an overview of the discipline: its growth trajectory, dominant theoretical and thematic trends, research methodologies and collaborations, and major players. Working from an exhaustive corpus of master’s theses, Propensity Score Matching (PSM) and Variable Importance Evaluation (VIE) are used to examine which structural determinants may have a causal impact on the decisions students make about their careers after graduation. The research reveals that writers of empirical theses are much more inclined to enter the academic sphere than those who conduct theoretical studies. Graduation year and geographical location of university also contribute to the choice between one career path and another. Contrary to common expectation, thesis content and the prestige of a student’s academic affiliation or thesis advisor have little impact on the decision. As the discipline continues to evolve and mature, the factors affecting graduates’ career choices are likely to develop in parallel, becoming ever more complex and diverse.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation

Author: Han Xu
Liu Zhiyuan
Sun Maosong
Wang Ziyun
Yao Yuan
Yu Pengfei
Zhu Hao
Publication date: 01/01/2018

We present a Few-Shot Relation Classification Dataset (FewRel), consisting of 70,000 sentences on 100 relations derived from Wikipedia and annotated by crowdworkers. The relation of each sentence is first recognized by distant supervision methods, and then filtered by crowdworkers. We adapt the most recent state-of-the-art few-shot learning methods for relation classification and conduct a thorough evaluation of these methods. Empirical results show that even the most competitive few-shot learning models struggle on this task, especially as compared with humans. We also show that a range of different reasoning skills are needed to solve our task. These results indicate that few-shot relation classification remains an open problem and still requires further research. Our detailed analysis points to multiple directions for future research. All details and resources about the dataset and baselines are released on http://zhuhao.me/fewrel.

Comment: EMNLP 2018. The first four authors contribute equally. The order is determined by dice rolling. Visit our website http://zhuhao.me/fewre

arXiv.org e-Print Archive

Crossref

Distributed and Robust Support Vector Machine

Author: Ding Hu
Huang Ziyun
Liu Yangwei
Xu Jinhui
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th International Symposium on Algorithms and Computation (ISAAC 2016)
Publication date: 01/01/2016

In this paper, we consider the distributed version of Support Vector Machine (SVM) under the coordinator model, where all input data (i.e., points in R^d space) of SVM are arbitrarily distributed among k nodes in some network with a coordinator which can communicate with all nodes. We investigate two variants of this problem, with and without outliers. For distributed SVM without outliers, we prove a lower bound on the communication complexity and give a distributed (1-epsilon)-approximation algorithm that matches this lower bound, where epsilon is a user-specified small constant. For distributed SVM with outliers, we present a (1-epsilon)-approximation algorithm to explicitly remove the influence of outliers. Our algorithm is based on a deterministic distributed top-t selection algorithm with communication complexity of O(k log (t)) in the coordinator model. Experimental results on benchmark datasets confirm the theoretical guarantees of our algorithms.

Dagstuhl Research Online Publication Server

Small Candidate Set for Translational Pattern Search

Author: Feng Qilong
Huang Ziyun
Wang Jianxin
Xu Jinhui
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th International Symposium on Algorithms and Computation (ISAAC 2019)
Publication date: 01/01/2019

In this paper, we study the following pattern search problem: Given a pair of point sets A and B in fixed dimensional space R^d, with |B| = n, |A| = m and n >= m, the pattern search problem is to find the translations T's of A such that each of the identified translations induces a matching between T(A) and a subset B' of B with cost no more than some given threshold, where the cost is defined as the minimum bipartite matching cost of T(A) and B'. We present a novel algorithm to produce a small set of candidate translations for the pattern search problem. For any B' subseteq B with |B'| = |A|, there exists at least one translation T in the candidate set such that the minimum bipartite matching cost between T(A) and B' is no larger than (1+epsilon) times the minimum bipartite matching cost between A and B' under any translation (i.e., the optimal translational matching cost). We also show that there exists an alternative solution to this problem, which constructs a candidate set of size O(n log^2 n) in O(n log^2 n) time with high probability of success. As a by-product of our construction, we obtain a weak epsilon-net for hypercube ranges, which significantly improves the construction time and the size of the candidate set. Our technique can be applied to a number of applications, including the translational pattern matching problem.

Dagstuhl Research Online Publication Server

Improved Algorithms for Clustering with Outliers

Author: Feng Qilong
Huang Ziyun
Wang Jianxin
Xu Jinhui
Zhang Zhen
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th International Symposium on Algorithms and Computation (ISAAC 2019)
Publication date: 01/01/2019

Clustering is a fundamental problem in unsupervised learning. In many real-world applications, the to-be-clustered data often contains various types of noise, which needs to be removed from the learning process. To address this issue, we consider in this paper two variants of such clustering problems, called k-median with m outliers and k-means with m outliers. Existing techniques for both problems either incur relatively large approximation ratios or can only efficiently deal with a small number of outliers. In this paper, we present an improved solution to each of them for the case where k is a fixed number and m could be quite large. In particular, we give the first PTAS for the k-median problem with outliers in Euclidean space R^d for possibly high m and d. Our algorithm runs in O(nd((1/epsilon)(k+m))^{(k/epsilon)^{O(1)}}) time, which considerably improves the previous result (with running time O(nd(m+k)^{O(m+k)} + ((1/epsilon) k log n)^{O(1)})) given by [Feldman and Schulman, SODA 2012]. For the k-means with outliers problem, we introduce a (6+epsilon)-approximation algorithm for general metric space with running time O(n(beta (1/epsilon)(k+m))^k) for some constant beta>1. Our algorithm first uses the k-means++ technique to sample O((1/epsilon)(k+m)) points from the input and then selects the k centers from them. Compared to the more involved existing techniques, our algorithms are much simpler, using only random sampling, and achieve better performance ratios.

Dagstuhl Research Online Publication Server
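
The abstract of "Improved Algorithms for Clustering with Outliers" above mentions sampling candidate points with the k-means++ technique. As a rough illustration only, the following Python sketch shows standard k-means++ style D^2 sampling, in which each new point is drawn with probability proportional to its squared distance from the nearest point already chosen. The function names, the use of plain tuples for points, and the choice of sample size are assumptions made for this example; the sketch does not reproduce the paper's algorithm, its handling of outliers, or its step of selecting the k centers from the sample.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def d2_sample(points, num_samples, rng):
    """Draw up to num_samples points by k-means++ style D^2 sampling.

    Hypothetical helper for illustration: the first point is uniform,
    each later point is drawn with probability proportional to its
    squared distance from the closest point chosen so far.
    """
    chosen = [rng.choice(points)]
    # nearest[i] is the squared distance from points[i] to the closest chosen point.
    nearest = [squared_distance(p, chosen[0]) for p in points]
    while len(chosen) < num_samples:
        total = sum(nearest)
        if total == 0:
            break  # every point coincides with an already chosen point
        threshold = rng.random() * total
        cumulative = 0.0
        pick = None
        for i, weight in enumerate(nearest):
            cumulative += weight
            if weight > 0 and cumulative >= threshold:
                pick = i
                break
        if pick is None:
            break  # defensive guard; not expected when total > 0
        chosen.append(points[pick])
        nearest = [
            min(d, squared_distance(p, points[pick]))
            for d, p in zip(nearest, points)
        ]
    return chosen


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
    print(d2_sample(data, 3, random.Random(0)))
```

Because far-away points carry large weights, the sample tends to spread across well-separated groups, which is the property that makes it useful as a pool of candidate centers.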

A Unified Framework of FPT Approximation Algorithms for Clustering Problems

Author: Feng Qilong
Huang Ziyun
Wang Jianxin
Xu Jinhui
Zhang Zhen
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Algorithms and Computation (ISAAC 2020)
Publication date: 01/01/2020

In this paper, we present a framework for designing FPT approximation algorithms for many k-clustering problems. Our results are based on a new technique for reducing search spaces. A reduced search space is a small subset of the input data that has the guarantee of containing k clients close to the facilities opened in an optimal solution for any clustering problem we consider. We show, somewhat surprisingly, that greedily sampling O(k) clients yields the desired reduced search space, based on which we obtain FPT(k)-time algorithms with improved approximation guarantees for problems such as capacitated clustering, lower-bounded clustering, clustering with service installation costs, fault-tolerant clustering, and priority clustering.

Dagstuhl Research Online Publication Server

The final step effect

Author: Jianmin Zeng
Jie Xu
Tao Wang
Ying He
Yujie Yuan
Ziyun Gao
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2023

Suppose you need to complete a task of 5 steps, each of which has equal difficulty and pass rate. You have a privilege that guarantees you pass one of the steps, but you need to decide which step to privilege before you start the task. Which step do you want to privilege? Mathematically speaking, the effect of each step on the final outcome is identical, and so there seems to be no prima facie reason for a preference. Five studies were conducted to explore this issue. In Study 1, participants could place the privilege on any of steps 1–5. Participants were most inclined to privilege step 5. In Study 2, participants needed to pay some money to purchase the privilege for steps 1–5, respectively. Participants were willing to pay the most money for step 5. Study 3 directly reminded participants that the probability of success of the whole task is mathematically the same, no matter on which step the privilege is placed, but most of the participants still preferred to privilege the final step. Study 4 supposed that the outcomes of all steps were not announced until all steps were finished, and asked how much pain participants would feel if they passed all steps but one. People expected to feel the most pain when they failed at the final step. In Study 5, an implicit association test showed that people associated the first step with easy and the final step with hard. These results demonstrated the phenomenon of the final step effect and suggested that both anticipated painfulness and stereotypes may play a role in this phenomenon.

Directory of Open Access Journals
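
The abstract of "The final step effect" above states that the overall probability of success is mathematically the same whichever step is privileged. A small Python check can make that arithmetic concrete. The pass rate of 0.8 and the function name are assumptions chosen for this example and do not come from the paper, and the steps are assumed to be independent.

```python
def success_probability(pass_rate, num_steps, privileged_step):
    """Probability of passing every step when one step is guaranteed.

    Hypothetical helper: steps are assumed independent with the same
    pass rate, and the privileged step is passed with probability 1.
    """
    probability = 1.0
    for step in range(1, num_steps + 1):
        probability *= 1.0 if step == privileged_step else pass_rate
    return probability


# Assumed pass rate of 0.8 for each of the 5 steps.
probabilities = [success_probability(0.8, 5, step) for step in range(1, 6)]

if __name__ == "__main__":
    for step, probability in enumerate(probabilities, start=1):
        print(f"privilege on step {step}: {probability:.4f}")
```

Under these assumptions every choice leaves four unprivileged steps, so the result is the pass rate raised to the fourth power regardless of where the privilege is placed.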

Cortical hierarchy disorganization in major depressive disorder and its association with suicidality

Author: Chen Shengli
Hou Gangqiang
Lin Shiwei
Lin Xiaoshan
Qiu Yingwei
Xu Ziyun
Zhang Xiaojing
Zhang Yingli
Publication venue: 'Frontiers Media SA'
Publication date: 01/04/2023

Objectives: To explore the suicide risk-specific disruption of cortical hierarchy in major depressive disorder (MDD) patients with diverse suicide risks. Methods: Ninety-two MDD patients with diverse suicide risks and 38 matched controls underwent resting-state functional MRI. Connectome gradient analysis and stepwise functional connectivity (SFC) analysis were used to characterize the suicide risk-specific alterations of cortical hierarchy in MDD patients. Results: Relative to controls, patients with suicide attempts (SA) had a prominent compression from the sensorimotor system; patients with suicide ideations (SI) had a prominent compression from the higher-level systems; non-suicide patients had a compression from both the sensorimotor system and higher-level systems, although it was less prominent relative to SA and SI patients. SFC analysis further validated this depolarization phenomenon. Conclusion: This study revealed that MDD patients had suicide risk-specific disruptions of cortical hierarchy, which advances our understanding of the neuromechanisms of suicidality in MDD patients.

Directory of Open Access Journals