132 research outputs found

    An Efficient Sum Query Algorithm for Distance-based Locally Dominating Functions

    In this paper, we consider the following sum query problem: given a point set P in R^d and a distance-based function f(p,q) (i.e., a function of the distance between p and q) satisfying some general properties, the goal is to develop a data structure and a query algorithm for efficiently computing a (1+epsilon)-approximation of the sum sum_{p in P} f(p,q) for any query point q in R^d and any small constant epsilon>0. Existing techniques for this problem are mainly based on core-sets, which often have difficulty handling functions with the local domination property. Based on several new insights into this problem, we develop a novel technique to overcome these difficulties. Our algorithm answers queries with high success probability in time no more than ~O_{epsilon,d}(n^{0.5 + c}), and the underlying data structure can be constructed in ~O_{epsilon,d}(n^{1+c}) time for any c>0, where the hidden constant has only polynomial dependence on 1/epsilon and d. Our technique is simple and can be easily implemented for practical purposes.
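
To make the problem statement concrete, the following is a minimal brute-force baseline, not the paper's data structure. It evaluates the sum exactly in O(n d) time per query and checks whether an estimate is within a (1+epsilon) factor. The function `inverse_square` is a hypothetical distance-based function chosen only for illustration, and the two-sided reading of "(1+epsilon)-approximate" is an assumption.

```python
import math


def exact_sum_query(points, q, f):
    """Brute-force sum of f(dist(p, q)) over all p in points: O(n * d) per query."""
    return sum(f(math.dist(p, q)) for p in points)


def is_within_factor(estimate, exact, epsilon):
    """Assumed two-sided test: exact/(1+eps) <= estimate <= (1+eps)*exact."""
    return exact / (1 + epsilon) <= estimate <= (1 + epsilon) * exact


def inverse_square(dist):
    """Hypothetical distance-based function, used only as an example."""
    return 1.0 / (1.0 + dist * dist)


if __name__ == "__main__":
    P = [(0.0, 0.0), (3.0, 4.0)]
    print(exact_sum_query(P, (0.0, 0.0), inverse_square))
```

A baseline like this is mainly useful for testing a faster approximate structure against ground truth on small inputs.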

    The career paths of graduates in Chinese interpreting studies: a scientometric exploration

    Over the past 30 years, the growth of Chinese Interpreting Studies has been spectacular, to say the least. Increasing economic and political collaboration between China and the West has driven the demand for interpreters to bridge the linguistic and cultural divide. Since master's and bachelor's degree courses in interpreting and translation were created all over China, hundreds of university graduates have embarked on widely differing career paths. This study begins with an overview of the discipline: its growth trajectory, dominant theoretical and thematic trends, research methodologies and collaborations, and major players. Working from an exhaustive corpus of master's theses, Propensity Score Matching (PSM) and Variable Importance Evaluation (VIE) are used to examine which structural determinants may have a causal impact on the decisions students make about their careers after graduation. The research reveals that writers of empirical theses are much more inclined to enter the academic sphere than those who conduct theoretical studies. Graduation year and geographical location of university also contribute to the choice between one career path and another. Contrary to common expectation, thesis content and the prestige of a student's academic affiliation or thesis advisor have little impact on the decision. As the discipline continues to evolve and mature, the factors affecting graduates' career choices are likely to develop in parallel, becoming ever more complex and diverse.

    FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation

    We present a Few-Shot Relation Classification Dataset (FewRel), consisting of 70,000 sentences on 100 relations derived from Wikipedia and annotated by crowdworkers. The relation of each sentence is first recognized by distant supervision methods, and then filtered by crowdworkers. We adapt the most recent state-of-the-art few-shot learning methods for relation classification and conduct a thorough evaluation of these methods. Empirical results show that even the most competitive few-shot learning models struggle on this task, especially as compared with humans. We also show that a range of different reasoning skills are needed to solve our task. These results indicate that few-shot relation classification remains an open problem and still requires further research. Our detailed analysis points to multiple directions for future research. All details and resources about the dataset and baselines are released on http://zhuhao.me/fewrel. Comment: EMNLP 2018. The first four authors contribute equally. The order is determined by dice rolling. Visit our website http://zhuhao.me/fewre

    Distributed and Robust Support Vector Machine

    In this paper, we consider the distributed version of Support Vector Machine (SVM) under the coordinator model, where all input data (i.e., points in R^d space) of SVM are arbitrarily distributed among k nodes in some network with a coordinator which can communicate with all nodes. We investigate two variants of this problem, with and without outliers. For distributed SVM without outliers, we prove a lower bound on the communication complexity and give a distributed (1-epsilon)-approximation algorithm that matches this lower bound, where epsilon is a user-specified small constant. For distributed SVM with outliers, we present a (1-epsilon)-approximation algorithm to explicitly remove the influence of outliers. Our algorithm is based on a deterministic distributed top-t selection algorithm with communication complexity of O(k log (t)) in the coordinator model. Experimental results on benchmark datasets confirm the theoretical guarantees of our algorithms.

    Small Candidate Set for Translational Pattern Search

    In this paper, we study the following pattern search problem: given a pair of point sets A and B in fixed dimensional space R^d, with |B| = n, |A| = m and n >= m, the pattern search problem is to find the translations T of A such that each of the identified translations induces a matching between T(A) and a subset B' of B with cost no more than some given threshold, where the cost is defined as the minimum bipartite matching cost of T(A) and B'. We present a novel algorithm to produce a small set of candidate translations for the pattern search problem. For any subset B' of B with |B'| = |A|, there exists at least one translation T in the candidate set such that the minimum bipartite matching cost between T(A) and B' is no larger than (1+epsilon) times the minimum bipartite matching cost between A and B' under any translation (i.e., the optimal translational matching cost). We also show that there exists an alternative solution to this problem, which constructs a candidate set of size O(n log^2 n) in O(n log^2 n) time with high probability of success. As a by-product of our construction, we obtain a weak epsilon-net for hypercube ranges, which significantly improves the construction time and the size of the candidate set. Our technique can be applied to a number of applications, including the translational pattern matching problem.

    Improved Algorithms for Clustering with Outliers

    Clustering is a fundamental problem in unsupervised learning. In many real-world applications, the data to be clustered often contain various types of noise, which need to be removed from the learning process. To address this issue, we consider in this paper two variants of such clustering problems, called k-median with m outliers and k-means with m outliers. Existing techniques for both problems either incur relatively large approximation ratios or can only efficiently deal with a small number of outliers. In this paper, we present an improved solution to each of them for the case where k is a fixed number and m could be quite large. In particular, we give the first PTAS for the k-median problem with outliers in Euclidean space R^d for possibly high m and d. Our algorithm runs in O(nd((1/epsilon)(k+m))^{(k/epsilon)^{O(1)}}) time, which considerably improves the previous result (with running time O(nd(m+k)^{O(m+k)} + ((1/epsilon) k log n)^{O(1)})) given by [Feldman and Schulman, SODA 2012]. For the k-means with outliers problem, we introduce a (6+epsilon)-approximation algorithm for general metric space with running time O(n(beta (1/epsilon)(k+m))^k) for some constant beta>1. Our algorithm first uses the k-means++ technique to sample O((1/epsilon)(k+m)) points from the input and then selects the k centers from them. Compared to the more involved existing techniques, our algorithms are much simpler, using only random sampling, and achieve better performance ratios.
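
The sampling step mentioned in this abstract can be sketched with standard k-means++ seeding (D^2 sampling): each new point is drawn with probability proportional to its squared distance from the nearest point already chosen. This is a generic sketch of that well-known technique, not the authors' implementation. The function name `d2_sample`, the use of Euclidean distance, and the choice of `num_samples` are assumptions, and the subsequent step of selecting k centers from the sample is not shown.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def d2_sample(points, num_samples, rng=None):
    """k-means++ style seeding: draw points with probability proportional
    to squared distance from the nearest already-chosen point."""
    if rng is None:
        rng = random.Random()
    first = rng.choice(points)
    chosen = [first]
    # d2[i] holds the squared distance from points[i] to its nearest chosen point.
    d2 = [squared_distance(p, first) for p in points]
    while len(chosen) < num_samples:
        if sum(d2) == 0:
            # Every input point coincides with a chosen point; nothing left to draw.
            break
        nxt = rng.choices(points, weights=d2, k=1)[0]
        chosen.append(nxt)
        d2 = [min(old, squared_distance(p, nxt)) for old, p in zip(d2, points)]
    return chosen


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
    print(d2_sample(data, 3, random.Random(0)))
```

Because far-away points receive large weights, isolated points (including potential outliers) tend to appear in the sample, which is presumably why the sample size in the abstract grows with k+m rather than k alone.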

    A Unified Framework of FPT Approximation Algorithms for Clustering Problems

    In this paper, we present a framework for designing FPT approximation algorithms for many k-clustering problems. Our results are based on a new technique for reducing search spaces. A reduced search space is a small subset of the input data that is guaranteed to contain k clients close to the facilities opened in an optimal solution for any clustering problem we consider. We show, somewhat surprisingly, that greedily sampling O(k) clients yields the desired reduced search space, based on which we obtain FPT(k)-time algorithms with improved approximation guarantees for problems such as capacitated clustering, lower-bounded clustering, clustering with service installation costs, fault-tolerant clustering, and priority clustering.

    The final step effect

    Suppose you need to complete a task of 5 steps, each of which has equal difficulty and pass rate. You have a privilege that guarantees you pass one of the steps, but you must decide which step to privilege before you start the task. Which step do you want to privilege? Mathematically speaking, the effect of each step on the final outcome is identical, and so there seems to be no prima facie reason for a preference. Five studies were conducted to explore this issue. In Study 1, participants could place the privilege on any of steps 1–5. Participants were most inclined to privilege step 5. In Study 2, participants needed to pay some money to purchase the privilege for steps 1–5, respectively. Participants would pay the most money for step 5. Study 3 directly reminded participants that the probability of success of the whole task is mathematically the same, no matter on which step the privilege is placed, but most of the participants still preferred to privilege the final step. Study 4 supposed that the outcomes of all steps were not announced until all steps were finished, and asked how much pain participants would feel if they passed all steps but one. People thought they would feel the most pain when they failed at the final step. In Study 5, an implicit association test showed that people associated the first step with easy and the final step with hard. These results demonstrated the phenomenon of the final step effect and suggested that both anticipated painfulness and stereotype may play a role in this phenomenon.
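
The claim that the privileged step does not matter mathematically can be checked with a short calculation. The sketch below assumes the steps succeed independently (an assumption not stated in the abstract) and uses a hypothetical pass rate of 4/5. With one step guaranteed, the overall success probability is the product of the other four pass rates, whichever step is chosen.

```python
from fractions import Fraction


def success_probability(pass_rate, num_steps, privileged_step):
    """Probability of passing every step when one step is guaranteed,
    assuming steps succeed independently with the same pass rate."""
    prob = Fraction(1)
    for step in range(1, num_steps + 1):
        # The privileged step contributes a factor of 1; all others contribute pass_rate.
        prob *= 1 if step == privileged_step else pass_rate
    return prob


# Hypothetical pass rate, chosen only for illustration.
p = Fraction(4, 5)
probabilities = [success_probability(p, 5, s) for s in range(1, 6)]

if __name__ == "__main__":
    print(probabilities)
```

Every entry equals p^4, so any preference for the final step has to come from something other than the success probability itself.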

    Cortical hierarchy disorganization in major depressive disorder and its association with suicidality

    Objectives: To explore the suicide risk-specific disruption of cortical hierarchy in major depressive disorder (MDD) patients with diverse suicide risks. Methods: Ninety-two MDD patients with diverse suicide risks and 38 matched controls underwent resting-state functional MRI. Connectome gradient analysis and stepwise functional connectivity (SFC) analysis were used to characterize the suicide risk-specific alterations of cortical hierarchy in MDD patients. Results: Relative to controls, patients with suicide attempts (SA) had a prominent compression from the sensorimotor system; patients with suicide ideations (SI) had a prominent compression from the higher-level systems; non-suicide patients had a compression from both the sensorimotor system and higher-level systems, although it was less prominent relative to SA and SI patients. SFC analysis further validated this depolarization phenomenon. Conclusion: This study revealed that MDD patients had suicide risk-specific disruptions of cortical hierarchy, which advances our understanding of the neuromechanisms of suicidality in MDD patients.