
    Termhood-based Comparability Metrics of Comparable Corpus in Special Domain

    Cross-Language Information Retrieval (CLIR) and machine translation (MT) resources, such as dictionaries and parallel corpora, are scarce and hard to come by for special domains. Moreover, these resources are limited to a few languages, such as English, French, and Spanish. Automatically obtaining comparable corpora for such domains could therefore be an effective answer to this problem. Comparable corpora, whose subcorpora are not translations of each other, can be easily obtained from the web, so building and using comparable corpora is often a more feasible option in multilingual information processing. Comparability measurement is one of the key issues in building and using comparable corpora; currently, there is no widely accepted definition of corpus comparability or method for measuring it. In fact, different definitions or metrics of comparability might be adopted to suit various natural language processing tasks. This paper proposes a new comparability metric, a termhood-based metric, oriented to the task of bilingual terminology extraction. In this method, words are ranked by termhood rather than frequency, and the cosine similarity calculated on the termhood ranking lists is used as the comparability score. Experimental results show that the termhood-based metric performs better than traditional frequency-based metrics.
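The termhood-ranked cosine step described above might be sketched as follows. This is a minimal illustration, not the paper's implementation: the termhood scores themselves are taken as given (the paper's termhood formula is not reproduced), and `termhood_a` / `termhood_b` are hypothetical per-subcorpus word-score dictionaries.

```python
import math

def cosine_comparability(termhood_a, termhood_b):
    """Cosine similarity between termhood-rank vectors of two subcorpora."""
    vocab = sorted(set(termhood_a) | set(termhood_b))

    def rank_vector(scores):
        # Sort words by descending termhood; a higher termhood gets a
        # larger rank weight. Words absent from this subcorpus get 0.
        ranked = sorted(scores, key=scores.get, reverse=True)
        weights = {w: len(ranked) - i for i, w in enumerate(ranked)}
        return [weights.get(w, 0) for w in vocab]

    a = rank_vector(termhood_a)
    b = rank_vector(termhood_b)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Identical ranking lists give a comparability of 1.0; subcorpora with disjoint top terms score near 0.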

    Detecting early signs of the 2007-2008 crisis in the world trade

    Since 2007, several contributions have tried to identify early-warning signals of the financial crisis. However, the vast majority of analyses have focused on financial systems, and little theoretical work has been done on the economic counterpart. In the present paper we fill this gap and employ the theoretical tools of network theory to shed light on the response of world trade to the financial crisis of 2007 and the economic recession of 2008-2009. We have explored the evolution of the bipartite World Trade Web (WTW) across the years 1995-2010, monitoring the behavior of the system both before and after 2007. Our analysis shows early structural changes in the WTW topology: since 2003, the WTW has become increasingly compatible with the picture of a network where correlations between countries and products are progressively lost. Moreover, the WTW structural modification can be considered as concluded in 2010, after a seemingly stationary phase of three years. We have also refined our analysis by considering specific subsets of countries and products: the most statistically significant early-warning signals are provided by the most volatile macrosectors, especially when measured on developing countries, suggesting that emerging economies are the most sensitive to global economic cycles.
    Comment: 18 pages, 9 figures

    A Benchmark for Image Retrieval using Distributed Systems over the Internet: BIRDS-I

    The performance of CBIR algorithms is usually measured on an isolated workstation. In a real-world environment the algorithms would only constitute a minor component among the many interacting components. The Internet dramatically changes many of the usual assumptions about measuring CBIR performance, so any CBIR benchmark should be designed from a networked-systems standpoint. Such benchmarks typically introduce communication overhead because the real systems they model are distributed applications. We present our implementation of a client/server benchmark called BIRDS-I to measure image retrieval performance over the Internet. It has been designed with the trend toward the use of small personalized wireless systems in mind. Web-based CBIR implies the use of heterogeneous image sets, imposing certain constraints on how the images are organized and the type of performance metrics applicable. BIRDS-I only requires controlled human intervention for the compilation of the image collection and none for the generation of ground truth in the measurement of retrieval accuracy. Benchmark image collections need to evolve incrementally toward the storage of millions of images, and that scale-up can only be achieved through the use of computer-aided compilation. Finally, our scoring metric introduces a tightly optimized image-ranking window.
    Comment: 24 pages, to appear in the Proc. SPIE Internet Imaging Conference 200

    Spectroscopy of blue horizontal branch stars in NGC 6656 (M22)

    Recent investigations revealed very peculiar properties of blue horizontal branch (HB) stars in \omega Centauri, which show anomalously low surface gravity and mass compared to other clusters and to theoretical models. \omega Centauri, however, is a very unusual object, hosting a complex mix of multiple stellar populations with different metallicity and chemical abundances. We measured the fundamental parameters (temperature, gravity, and surface helium abundance) of a sample of 71 blue HB stars in M22, with the aim of clarifying whether the peculiar results found in \omega Cen are unique to that cluster. M22 also hosts multiple sub-populations of stars with a spread in metallicity, analogous to \omega Cen. The stellar parameters were measured on low-resolution spectra by fitting the Balmer and helium lines with a grid of synthetic spectra. From these parameters, the mass and reddening were estimated. Our results on the gravities and masses agree well with theoretical expectations, matching previous measurements in three "normal" clusters. The anomalies found in \omega Cen are not observed among our stars. A mild mass underestimate is found for stars hotter than 14\,000 K, but an exact analogy with \omega Cen cannot be drawn. We measured the reddening in the direction of M22 with two independent methods, finding E(B-V)=0.35 \pm 0.02 mag, with a semi-amplitude of the maximum variation \Delta(E(B-V))=0.06 mag and an rms intrinsic dispersion of \sigma(E(B-V))=0.03 mag.
    Comment: 11 pages, 9 Postscript figures
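The grid-fitting step mentioned above (matching observed lines against a grid of synthetic spectra) can be illustrated by a simple chi-square minimization. This is only a generic sketch, not the authors' pipeline: the parameter tuple `(teff, logg, log_he)`, the grid, and the common wavelength sampling are all assumptions.

```python
import numpy as np

def best_grid_fit(observed_flux, flux_errors, grid):
    """Return the grid point whose model spectrum minimizes chi-square.

    grid: dict mapping (teff, logg, log_he) -> model flux array sampled
    on the same wavelengths as observed_flux.
    """
    best_params, best_chi2 = None, np.inf
    for params, model in grid.items():
        chi2 = np.sum(((observed_flux - model) / flux_errors) ** 2)
        if chi2 < best_chi2:
            best_params, best_chi2 = params, chi2
    return best_params, best_chi2
```

In practice one would interpolate between grid points rather than pick the nearest one, but the selection criterion is the same.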

    Patterns of dominant flows in the world trade web

    The large-scale organization of the world economies is exhibiting increasing levels of local heterogeneity and global interdependency. Understanding the relation between local and global features calls for analytical tools able to uncover the global emerging organization of the international trade network. Here we analyze the world network of bilateral trade imbalances and characterize its overall flux organization, unraveling local and global high-flux pathways that define the backbone of the trade system. We develop a general procedure capable of progressively filtering out, in a consistent and quantitative way, the dominant trade channels. This procedure is completely general and can be applied to any weighted network to detect the underlying structure of transport flows. The trade-flux properties of the world trade web determine a ranking of trade partnerships that highlights global interdependencies, providing information not accessible by simple local analysis. The present work provides new quantitative tools for a dynamical approach to the propagation of economic crises.
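A backbone-extraction procedure of this kind can be sketched with a disparity-style filter: for each node, keep the edges whose weight share is statistically incompatible with a uniform split of the node's flux. This is a generic illustration under that null model, not the paper's exact statistical procedure; the significance level `alpha` is an assumed parameter.

```python
def backbone(edges, alpha=0.05):
    """Keep statistically dominant edges of a weighted network.

    edges: dict mapping node -> {neighbor: weight}.
    Returns the set of (node, neighbor) pairs that survive the filter.
    """
    kept = set()
    for node, nbrs in edges.items():
        k = len(nbrs)
        if k <= 1:  # a single edge is trivially dominant
            kept.update((node, n) for n in nbrs)
            continue
        total = sum(nbrs.values())
        for nbr, w in nbrs.items():
            p = w / total
            # Null model: the node's flux split uniformly at random
            # among its k edges; small p-values mean the observed
            # share is too concentrated to be a random fluctuation.
            p_value = (1 - p) ** (k - 1)
            if p_value < alpha:
                kept.add((node, nbr))
    return kept
```

Applied to a trade network, a country's few dominant partnerships survive while its many minor channels are filtered out.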

    Measuring the understandability of WSDL specifications, web service understanding degree approach and system

    Web Services (WS) are fundamental software artifacts for building service-oriented applications, and they are usually reused by others. Therefore they must be analyzed and comprehended for maintenance tasks: identification of critical parts, bug fixing, adaptation, and improvement. In this article we present WSDLUD, a method aimed at measuring a priori the understanding degree (UD) of WSDL (Web Service Description Language) descriptions. In order to compute UD, several criteria useful for measuring the complexity of understanding WSDL descriptions must be defined. These criteria are used by LSP (Logic Scoring of Preference), a multicriteria evaluation method, to produce a Global Preference value that indicates the satisfaction level of the WSDL description with respect to the evaluation focus, in this case the understanding degree. All the criteria information required by LSP is extracted from WSDL descriptions using static analysis techniques and processed by specific algorithms that gather semantic information. This process yields a priori information about comprehension difficulty, which supports our research hypothesis that it is possible to compute the understanding degree of a WSDL description.
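The LSP aggregation step can be illustrated with a weighted power mean, the operator LSP uses to blend criterion scores into a global preference. The criteria, weights, and exponent below are illustrative placeholders, not the ones used by WSDLUD.

```python
def lsp_aggregate(scores, weights, r):
    """Weighted power mean of criterion scores in [0, 1].

    scores, weights: equal-length sequences; weights should sum to 1.
    r < 1 models simultaneity (andness: all criteria must be satisfied),
    r > 1 models replaceability (orness: one good criterion suffices).
    """
    if r == 0:  # geometric-mean limit of the power mean
        prod = 1.0
        for s, w in zip(scores, weights):
            prod *= s ** w
        return prod
    return sum(w * s ** r for s, w in zip(scores, weights)) ** (1.0 / r)
```

With r = 1 this reduces to a weighted arithmetic mean; choosing r per aggregation block is how LSP encodes how mandatory each group of criteria is.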

    Artificial Sequences and Complexity Measures

    In this paper we exploit concepts of information theory to address the fundamental problem of identifying and defining the most suitable tools to extract, in an automatic and agnostic way, information from a generic string of characters. We introduce in particular a class of methods which use data compression techniques in a crucial way in order to define a measure of remoteness and distance between pairs of character sequences (e.g. texts) based on their relative information content. We also discuss in detail how specific features of data compression techniques could be used to introduce the notions of the dictionary of a given sequence and of an Artificial Text, and we show how these new tools can be used for information extraction purposes. We point out the versatility and generality of our method, which applies to any kind of corpus of character strings independently of the type of coding behind them. As a case study we consider linguistically motivated problems and present results for automatic language recognition, authorship attribution, and self-consistent classification.
    Comment: Revised version, with major changes, of the previous "Data Compression approach to Information Extraction and Classification" by A. Baronchelli and V. Loreto. 15 pages; 5 figures
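A compression-based distance of the kind discussed above can be sketched with the normalized compression distance (NCD), using zlib as an off-the-shelf compressor. This is a standard stand-in for such relative-information-content measures; the paper's own dictionary-based construction differs.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data at maximum compression."""
    return len(zlib.compress(data, 9))

def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Small values mean the compressor finds much of x's structure
    reusable when compressing y appended to x (similar texts);
    values near 1 mean the texts share little information.
    """
    bx, by = x.encode(), y.encode()
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Ranking candidate reference texts by NCD against an unknown sample is the basic mechanism behind compression-based language recognition and authorship attribution.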

    Gamma-distribution and wealth inequality

    We discuss the equivalence between kinetic wealth-exchange models, in which agents exchange wealth during trades, and mechanical models of particles exchanging energy during collisions. The universality of the underlying dynamics is shown both through a variational approach based on the minimization of the Boltzmann entropy and through a complementary microscopic analysis of the collision dynamics of molecules in a gas. In various relevant cases the equilibrium distribution is the same for all these models, namely a gamma distribution with suitably defined temperature and number of dimensions. This in turn allows one to quantify the inequalities observed in wealth distributions and suggests that their origin should be traced back to very general underlying mechanisms: for instance, it follows that the smaller the fraction of the relevant quantity (e.g. wealth or energy) that agents can exchange during an interaction, the closer the corresponding equilibrium distribution is to a fair distribution.
    Comment: Presented at the International Workshop and Conference on Statistical Physics Approaches to Multi-disciplinary Problems, January 07-13, 2008, IIT Guwahati, India
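A minimal simulation makes the stated trend concrete: in a generic kinetic exchange model where each trade randomly re-splits a fraction `f` of the two partners' pooled wealth, total wealth is conserved and smaller `f` yields a narrower, more "fair" equilibrium distribution. This is an illustrative model in the spirit of the paper, with all parameter values assumed.

```python
import random

def simulate(n_agents=500, n_trades=100_000, f=0.5, seed=0):
    """Kinetic wealth-exchange simulation.

    Each agent starts with unit wealth. Per trade, two random agents
    put a fraction f of their wealth into a pot that is re-split at
    random; f = 1 gives the fully random (exponential) case, while
    smaller f concentrates the distribution around the mean.
    """
    rng = random.Random(seed)
    w = [1.0] * n_agents
    for _ in range(n_trades):
        i, j = rng.randrange(n_agents), rng.randrange(n_agents)
        if i == j:
            continue
        pot = f * (w[i] + w[j])
        eps = rng.random()
        # Each update conserves w[i] + w[j] exactly.
        w[i] = w[i] * (1 - f) + eps * pot
        w[j] = w[j] * (1 - f) + (1 - eps) * pot
    return w
```

Comparing the variance of the equilibrium wealths for small and large `f` reproduces the qualitative statement above: less wealth at stake per trade means less inequality.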