Name Disambiguation from link data in a collaboration graph using temporal and topological features
In a social community, multiple persons may share the same name, phone number
or some other identifying attributes. This, along with other phenomena, such as
name abbreviation, name misspelling, and human error leads to erroneous
aggregation of records of multiple persons under a single reference. Such
mistakes affect the performance of document retrieval, web search, database
integration, and more importantly, improper attribution of credit (or blame).
The task of entity disambiguation partitions the records belonging to multiple
persons with the objective that each decomposed partition is composed of
records of a unique person. Existing solutions to this task use either
biographical attributes, or auxiliary features that are collected from external
sources, such as Wikipedia. However, for many scenarios, such auxiliary
features are not available, or they are costly to obtain. Moreover, collecting
biographical or external data carries a risk of privacy violation. In this work,
we propose a method for solving the entity disambiguation task using link
information obtained from a collaboration network. Our method does not intrude
on privacy, as it uses only the time-stamped graph topology of an
anonymized network. Experimental results on two real-life academic
collaboration networks show that the proposed method has satisfactory
performance.
Comment: The short version of this paper has been accepted to ASONAM 201
Analyzing sparse dictionaries for online learning with kernels
Many signal processing and machine learning methods share essentially the
same linear-in-the-parameter model, with as many parameters as available
samples, as in kernel-based machines. Sparse approximation is essential in many
disciplines, with new challenges emerging in online learning with kernels. To
this end, several sparsity measures have been proposed in the literature to
quantify sparse dictionaries and construct relevant ones, the most prolific
being the distance, the approximation, the coherence, and the Babel
measures. In this paper, we analyze sparse dictionaries based on these
measures. By conducting an eigenvalue analysis, we show that these sparsity
measures share many properties, including the linear independence condition and
inducing a well-posed optimization problem. Furthermore, we prove that there
exists a quasi-isometry between the parameter (i.e., dual) space and the
dictionary's induced feature space.
Comment: 10 page
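As a rough illustration of two of the measures named in this abstract, the sketch below computes the coherence and the Babel measure of a small dictionary under a Gaussian kernel, then checks the eigenvalue bound that follows from Gershgorin's circle theorem: for unit-norm atoms, every eigenvalue of the Gram matrix lies within the Babel measure of 1, so a Babel measure below 1 implies linearly independent atoms. The function names, the kernel bandwidth, and the sample points are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def gaussian_gram(X, bandwidth=1.0):
    """Gram matrix of the Gaussian kernel; every atom has unit norm (diagonal = 1)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def coherence(K):
    """Largest absolute off-diagonal entry of a unit-diagonal Gram matrix."""
    off_diag = np.abs(K - np.diag(np.diag(K)))
    return off_diag.max()

def babel(K):
    """Largest row sum of absolute off-diagonal entries of the Gram matrix."""
    off_diag = np.abs(K - np.diag(np.diag(K)))
    return off_diag.sum(axis=1).max()

# Hypothetical dictionary of four atoms in the plane (illustrative values only).
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
K = gaussian_gram(X, bandwidth=1.0)

mu = coherence(K)
beta = babel(K)
eigvals = np.linalg.eigvalsh(K)  # K is symmetric, so eigvalsh applies

print("coherence:", mu)
print("Babel:", beta)
print("eigenvalue range:", eigvals.min(), eigvals.max())
print("Gershgorin interval:", 1.0 - beta, 1.0 + beta)
```

Because the Babel measure is a sum of at most m - 1 off-diagonal terms, each bounded by the coherence, it never exceeds (m - 1) times the coherence for a dictionary of m atoms.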
Modeling Financial Time Series with Artificial Neural Networks
Financial time series convey the decisions and actions of a population of human actors over time. Econometric and regressive models have been developed in the past decades for analyzing these time series. More recently, biologically inspired artificial neural network models have been shown to overcome some of the main challenges of traditional techniques by better exploiting the non-linear, non-stationary, and oscillatory nature of noisy, chaotic human interactions. This review paper explores the options, benefits, and weaknesses of the various forms of artificial neural networks as compared with regression techniques in the field of financial time series analysis.
CELEST, a National Science Foundation Science of Learning Center (SBE-0354378); SyNAPSE program of the Defense Advanced Research Project Agency (HR001109-03-0001
Lazy learning in radial basis neural networks: A way of achieving more accurate models
Radial Basis Neural Networks have been successfully used in a large number of applications, with rapid convergence time being one of their most important advantages. However, their level of generalization is usually poor and very dependent on the quality of the training data, because some of the training patterns can be redundant or irrelevant. In this paper, we present a learning method that automatically selects the training patterns most appropriate to the new sample to be approximated. This training method follows a lazy learning strategy, in the sense that it builds approximations centered around the novel sample. The proposed method has been applied to three different domains: an artificial regression problem and two time series prediction problems. Results have been compared to the standard training method using the complete training data set, and the new method shows better generalization abilities.
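The lazy strategy described in this abstract can be sketched as follows. This is a simplified stand-in, not the paper's selection procedure: it assumes the "most appropriate" patterns are simply the k nearest neighbours of the query in Euclidean distance, places one Gaussian basis function on each selected pattern, and solves for the output weights with a small ridge term. The function name `lazy_rbf_predict` and the parameters `k`, `width`, and `ridge` are hypothetical.

```python
import numpy as np

def lazy_rbf_predict(X_train, y_train, x_query, k=10, width=0.5, ridge=1e-8):
    """Fit a small Gaussian RBF model around x_query and evaluate it there."""
    # 1. Select the k training patterns closest to the query (assumed criterion).
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    centers, targets = X_train[nearest], y_train[nearest]

    def design(A):
        # Gaussian basis functions centered on the selected patterns.
        sq = np.sum((A[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2.0 * width ** 2))

    # 2. Solve for the output weights; the ridge term keeps the system well-posed.
    Phi = design(centers)
    weights = np.linalg.solve(Phi + ridge * np.eye(len(centers)), targets)

    # 3. Evaluate the local model at the query point.
    return float((design(x_query[None, :]) @ weights)[0])

# Illustrative data: noiseless samples of sin(x) on [0, 2*pi].
X_train = np.linspace(0.0, 2.0 * np.pi, 200)[:, None]
y_train = np.sin(X_train[:, 0])
x_query = np.array([1.0])

prediction = lazy_rbf_predict(X_train, y_train, x_query)
print("prediction:", prediction, "target:", np.sin(1.0))
```

A new local model is built for every query, so training cost is deferred to prediction time; that trade-off is what makes the approach "lazy".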
Neural Networks with Non-Uniform Embedding and Explicit Validation Phase to Assess Granger Causality
A challenging problem when studying a dynamical system is to find the
interdependencies among its individual components. Several algorithms have been
proposed to detect directed dynamical influences between time series. Two of
the most used approaches are a model-free one (transfer entropy) and a
model-based one (Granger causality). Several pitfalls are related to the
presence or absence of assumptions in modeling the relevant features of the
data. We tried to overcome those pitfalls using a neural network approach in
which a model is built without any a priori assumptions. In this sense, this
method can be seen as a bridge between model-free and model-based approaches.
The experiments performed show that the method presented in this work can
detect the correct dynamical information flows occurring in a system of time
series. Additionally, we adopt a non-uniform embedding framework according to
which only the past states that actually help the prediction are entered into
the model, improving the prediction and avoiding the risk of overfitting. This
method also leads to a further improvement with respect to traditional Granger
causality approaches when redundant variables (i.e. variables sharing the same
information about the future of the system) are involved. Neural networks are
also able to recognize dynamics in data sets completely different from the ones
used during the training phase.
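For orientation, the model-based baseline mentioned in this abstract can be sketched with ordinary least squares. This is the classical linear Granger index with a uniform embedding (a fixed number of consecutive lags), not the neural network and non-uniform embedding approach the abstract proposes. The helper names, the lag order, and the simulated coupling coefficients are all assumptions made for illustration.

```python
import numpy as np

def residual_variance(target, predictors):
    """Variance of least-squares residuals of target on predictors plus an intercept."""
    A = np.column_stack([predictors, np.ones(len(target))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return np.var(target - A @ coef)

def lagged(series, order):
    """Matrix whose rows are [s[t-1], ..., s[t-order]] for t = order .. len(series)-1."""
    n = len(series)
    return np.column_stack([series[order - lag : n - lag] for lag in range(1, order + 1)])

def granger_index(source, target, order=2):
    """Log ratio of prediction error without and with the source's past."""
    future = target[order:]
    own_past = lagged(target, order)
    full_past = np.column_stack([own_past, lagged(source, order)])
    restricted = residual_variance(future, own_past)
    full = residual_variance(future, full_past)
    return np.log(restricted / full)

# Simulated system (hypothetical coefficients): x drives y, but y does not drive x.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

gi_xy = granger_index(x, y)  # influence of x on y
gi_yx = granger_index(y, x)  # influence of y on x
print("x -> y:", gi_xy)
print("y -> x:", gi_yx)
```

The index is close to zero when the source's past adds nothing to the prediction of the target and grows as the source becomes more informative, so the direction with the larger index indicates the dominant information flow.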