Termhood-based Comparability Metrics of Comparable Corpus in Special Domain
Cross-Language Information Retrieval (CLIR) and machine translation (MT)
resources, such as dictionaries and parallel corpora, are scarce and hard to
come by for special domains. Moreover, these resources are limited to a few
languages, such as English, French, and Spanish. Obtaining comparable corpora
automatically for such domains could therefore be an effective answer to this
problem. Comparable corpora, whose subcorpora are not
translations of each other, can be easily obtained from the web. Therefore,
building and using comparable corpora is often a more feasible option in
multilingual information processing. Comparability metrics are one of the key
issues in building and using comparable corpora. Currently, there is no
widely accepted definition or measure of corpus comparability. In fact,
different definitions or measures of comparability might be given to
suit various natural language processing tasks. A new comparability metric,
namely the termhood-based metric, oriented to the task of bilingual terminology
extraction, is proposed in this paper. In this method, words are ranked by
termhood rather than frequency, and the cosine similarity calculated from
the ranked lists of word termhood is used as the comparability. Experimental
results show that the termhood-based metric performs better than traditional
frequency-based metrics.
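The ranking-and-cosine step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how termhood is scored or how terms are aligned across languages, so the sketch assumes each corpus is already summarized as a hypothetical `{term: termhood_score}` mapping over a shared vocabulary (for example, after dictionary lookup). The names `top_terms`, `cosine`, and `comparability`, and the cutoff `k`, are illustrative.

```python
import math


def top_terms(termhood, k):
    """Keep the k highest-scoring terms of a {term: termhood} mapping."""
    ranked = sorted(termhood.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty or all-zero list is treated as incomparable
    return dot / (norm_u * norm_v)


def comparability(termhood_a, termhood_b, k=100):
    """Assumed metric: cosine between the top-k termhood lists of two corpora."""
    return cosine(top_terms(termhood_a, k), top_terms(termhood_b, k))
```

Swapping the termhood scores for raw frequencies in the same functions would give a frequency-based baseline of the kind the abstract compares against.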
Detecting early signs of the 2007-2008 crisis in the world trade
Since 2007, several contributions have tried to identify early-warning
signals of the financial crisis. However, the vast majority of analyses have
focused on financial systems and little theoretical work has been done on the
economic counterpart. In the present paper we fill this gap and employ the
theoretical tools of network theory to shed light on the response of world
trade to the financial crisis of 2007 and the economic recession of 2008-2009.
We have explored the evolution of the bipartite World Trade Web (WTW) across
the years 1995-2010, monitoring the behavior of the system both before and
after 2007. Our analysis shows early structural changes in the WTW topology:
since 2003, the WTW becomes increasingly compatible with the picture of a
network where correlations between countries and products are progressively
lost. Moreover, the WTW structural modification can be considered as concluded
in 2010, after a seemingly stationary phase of three years. We have also
refined our analysis by considering specific subsets of countries and products:
the most statistically significant early-warning signals are provided by the
most volatile macrosectors, especially when measured on developing countries,
suggesting that emerging economies are the most sensitive to
global economic cycles. Comment: 18 pages, 9 figures
A Benchmark for Image Retrieval using Distributed Systems over the Internet: BIRDS-I
The performance of CBIR algorithms is usually measured on an isolated
workstation. In a real-world environment the algorithms would only constitute a
minor component among the many interacting components. The Internet
dramatically changes many of the usual assumptions about measuring CBIR
performance. Any CBIR benchmark should be designed from a networked systems
standpoint. These benchmarks typically introduce communication overhead because
the real systems they model are distributed applications. We present our
implementation of a client/server benchmark called BIRDS-I to measure image
retrieval performance over the Internet. It has been designed with the trend
toward the use of small personalized wireless systems in mind. Web-based CBIR
implies the use of heterogeneous image sets, imposing certain constraints on
how the images are organized and the type of performance metrics applicable.
BIRDS-I only requires controlled human intervention for the compilation of the
image collection and none for the generation of ground truth in the measurement
of retrieval accuracy. Benchmark image collections need to be evolved
incrementally toward the storage of millions of images and that scaleup can
only be achieved through the use of computer-aided compilation. Finally, our
scoring metric introduces a tightly optimized image-ranking window. Comment: 24 pages, to appear in the Proc. SPIE Internet Imaging Conference
200
Spectroscopy of blue horizontal branch stars in NGC 6656 (M22)
Recent investigations revealed very peculiar properties of blue horizontal
branch (HB) stars in \omega Centauri, which show anomalously low surface
gravity and mass compared to other clusters and to theoretical models. \omega
Centauri, however, is a very unusual object, hosting a complex mix of multiple
stellar populations with different metallicity and chemical abundances. We
measured the fundamental parameters (temperature, gravity, and surface helium
abundance) of a sample of 71 blue HB stars in M22, with the aim of clarifying
if the peculiar results found in \omega Cen are unique to this cluster. M22
also hosts multiple sub-populations of stars with a spread in metallicity,
analogous to \omega Cen. The stellar parameters were measured on low-resolution
spectra fitting the Balmer and helium lines with a grid of synthetic spectra.
From these parameters, the mass and reddening were estimated. Our results on
the gravities and masses agree well with theoretical expectations, matching the
previous measurements in three "normal" clusters. The anomalies found in \omega
Cen are not observed among our stars. A mild mass underestimate is found for
stars hotter than 14\,000 K, but an exact analogy with \omega Cen cannot be
drawn. We measured the reddening in the direction of M22 with two independent
methods, finding E(B-V)=0.35 \pm 0.02 mag, with semi-amplitude of the maximum
variation \Delta(E(B-V))=0.06 mag, and an rms intrinsic dispersion of
\sigma(E(B-V))=0.03 mag. Comment: 11 pages, 9 Postscript figures
Patterns of dominant flows in the world trade web
The large-scale organization of the world economies is exhibiting
increasing levels of local heterogeneity and global interdependency.
Understanding the relation between local and global features calls for
analytical tools able to uncover the global emerging organization of the
international trade network. Here we analyze the world network of bilateral
trade imbalances and characterize its overall flux organization, unraveling
local and global high-flux pathways that define the backbone of the trade
system. We develop a general procedure capable of progressively filtering out, in a
consistent and quantitative way, the dominant trade channels. This procedure is
completely general and can be applied to any weighted network to detect the
underlying structure of transport flows. The trade flux properties of the
world trade web determine a ranking of trade partnerships that highlights
global interdependencies, providing information not accessible by simple local
analysis. The present work provides new quantitative tools for a dynamical
approach to the propagation of economic crises.
Measuring the understandability of WSDL specifications, web service understanding degree approach and system
Web Services (WS) are fundamental software artifacts for building service-oriented applications, and they are usually reused by others. They must therefore be analyzed and comprehended for maintenance tasks: identification of critical parts, bug fixing, adaptation, and improvement. In this article, WSDLUD, a method aimed at measuring a priori the understanding degree (UD) of WSDL (Web Service Description Language) descriptions, is presented. To compute UD, several criteria useful for measuring the complexity of understanding WSDL descriptions must be defined. These criteria are used by LSP (Logic Scoring of Preference), a multicriteria evaluation method, to produce a Global Preference value that indicates the satisfaction level of the WSDL description regarding the evaluation focus, in this case the understanding degree. All the criteria information required by LSP is extracted from WSDL descriptions using static analysis techniques and processed by specific algorithms that gather semantic information. This process yields a priori information about the comprehension difficulty, which proves our research hypothesis that it is possible to compute the understanding degree of a WSDL description.
Artificial Sequences and Complexity Measures
In this paper we exploit concepts of information theory to address the
fundamental problem of identifying and defining the most suitable tools to
extract, in an automatic and agnostic way, information from a generic string of
characters. We introduce in particular a class of methods which use in a
crucial way data compression techniques in order to define a measure of
remoteness and distance between pairs of sequences of characters (e.g. texts)
based on their relative information content. We also discuss in detail how
specific features of data compression techniques could be used to introduce the
notion of dictionary of a given sequence and of Artificial Text and we show how
these new tools can be used for information extraction purposes. We point out
the versatility and generality of our method that applies to any kind of
corpora of character strings independently of the type of coding behind them.
We consider as a case study linguistically motivated problems and we present
results for automatic language recognition, authorship attribution and
self-consistent classification. Comment: Revised version, with major changes, of previous "Data Compression
approach to Information Extraction and Classification" by A. Baronchelli and
V. Loreto. 15 pages; 5 figures
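To make the idea of a compression-based distance between character sequences concrete, here is a small sketch. The abstract does not state the authors' formula, so this uses the normalized compression distance (NCD), a different but well-known compression-based measure, with Python's standard `zlib` compressor standing in for whatever compressor the paper uses. The function names are illustrative.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # cost of describing both sequences together
    return (cxy - min(cx, cy)) / max(cx, cy)


# Usage: a text should be closer to itself than to an unrelated byte sequence.
english = b"the quick brown fox jumps over the lazy dog " * 50
other = bytes(range(256)) * 10
print(ncd(english, english), ncd(english, other))
```

For tasks such as language recognition or authorship attribution, one would compute such a distance from an unknown text to each reference text and pick the nearest reference; zlib's 32 KB window limits how well this works on long inputs.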
Gamma-distribution and wealth inequality
We discuss the equivalence between kinetic wealth-exchange models, in which
agents exchange wealth during trades, and mechanical models of particles,
exchanging energy during collisions. The universality of the underlying
dynamics is shown both through a variational approach based on the minimization
of the Boltzmann entropy and a complementary microscopic analysis of the
collision dynamics of molecules in a gas. In various relevant cases the
equilibrium distribution is the same for all these models, namely a
gamma-distribution with suitably defined temperature and number of dimensions.
This in turn allows one to quantify the inequalities observed in the wealth
distributions and suggests that their origin should be traced back to very
general underlying mechanisms: for instance, it follows that the smaller the
fraction of the relevant quantity (e.g. wealth or energy) that agents can
exchange during an interaction, the closer the corresponding equilibrium
distribution is to a fair distribution. Comment: Presented to the International Workshop and Conference on:
Statistical Physics Approaches to Multi-disciplinary Problems, January 07-13,
2008, IIT Guwahati, India
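The closing claim, that a smaller exchangeable fraction yields a fairer equilibrium, can be checked numerically with a toy simulation. The abstract does not spell out the update rule, so this sketch assumes a common kinetic exchange rule with a saving fraction: in each trade, two random agents keep a fraction `saving` of their wealth and split the pooled remainder at random. The Gini coefficient is used here as an assumed measure of inequality; all names and parameter values are illustrative.

```python
import random


def simulate(n_agents=500, n_trades=100_000, saving=0.0, seed=0):
    """Pairwise wealth exchange; each agent keeps a fraction `saving` per trade."""
    rng = random.Random(seed)
    wealth = [1.0] * n_agents
    for _ in range(n_trades):
        i, j = rng.sample(range(n_agents), 2)
        # Only the non-saved part of both agents' wealth is put at stake.
        pool = (1.0 - saving) * (wealth[i] + wealth[j])
        eps = rng.random()
        wealth[i] = saving * wealth[i] + eps * pool
        wealth[j] = saving * wealth[j] + (1.0 - eps) * pool
    return wealth


def gini(values):
    """Gini coefficient: 0 for perfect equality, approaching 1 for extreme inequality."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1) / n


# Usage: compare inequality when everything vs. only 10% of wealth is exchangeable.
print(gini(simulate(saving=0.0)), gini(simulate(saving=0.9)))
```

Total wealth is conserved in every trade, which mirrors energy conservation in the particle-collision picture the abstract draws on.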