Least absolute deviation estimation of linear econometric models: A literature review
Econometricians generally take for granted that the error terms in econometric models are generated by distributions with finite variance. However, error distributions with infinite variance have been known since the time of Pareto. The work of many econometricians, namely Meyer & Glauber (1964), Fama (1965) and Mandelbrot (1967), on economic data series such as prices in financial and commodity markets confirms that infinite-variance distributions are abundant. The distribution of firms by size, the behaviour of speculative prices and various other recent economic phenomena display similar trends. Further, econometricians generally assume that the disturbance term, which reflects the influence of innumerable factors not accounted for in the model, approaches normality according to the Central Limit Theorem. But Bartels (1977) is of the opinion that there are limit theorems, just as likely to be relevant when considering the sum of a number of components in a regression disturbance, that lead to non-normal stable distributions characterized by infinite variance. Thus, the possibility of the error term following a non-normal distribution exists. The Least Squares method of estimating the parameters of linear (regression) models performs well provided that the residuals (disturbances or errors) are well behaved (preferably normally or near-normally distributed and free of large outliers) and satisfy the Gauss-Markov assumptions. However, the Least Squares method performs poorly on models whose disturbances are prominently non-normally distributed and contain sizeable outliers. Intensive research has established that in such cases estimation by the Least Absolute Deviation (LAD) method performs well.
This paper is an attempt to survey the literature on LAD estimation of single as well as multi-equation linear econometric models.
Keywords: LAD estimator; least absolute deviation estimation; econometric model; minimum absolute deviation; robust; outliers; L1 estimator; review of literature
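To make the contrast between Least Squares and LAD concrete, here is a minimal sketch that is not taken from the surveyed literature. It fits a toy simple regression containing one gross outlier. The function names `ols_line` and `lad_line` and the data are illustrative assumptions. The LAD fit uses brute force over pairs of observations, relying on the fact that some optimal LAD line passes through at least two data points; this is only sensible for tiny samples, and practical work would normally use linear programming or a quantile-regression routine.

```python
from itertools import combinations

def ols_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def lad_line(xs, ys):
    """LAD fit of y = a + b*x by exhaustive search over lines through two points."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, not a valid regression line
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        loss = sum(abs(y - a - b * x) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]

# Illustrative data: y = 1 + 2x, with the last observation replaced by an outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 2.0 * x for x in xs]
ys[-1] = 100.0

print("OLS (intercept, slope):", ols_line(xs, ys))
print("LAD (intercept, slope):", lad_line(xs, ys))
```

On this toy sample the LAD line stays on the five uncontaminated points, while the single outlier pulls the OLS slope far from 2.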
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in real
world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embedding on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node embedding learning in heterogeneous graphs,
edge2vec significantly outperforms state-of-the-art models on all
three tasks. We propose this method for its added value relative to existing
graph analytical methodology, and for its applicability in the real-world
context of biomedical knowledge discovery.
Comment: 10 pages
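The idea of walking a heterogeneous graph under an edge-type transition matrix can be sketched roughly as follows. This is an illustrative assumption about how such a walk might look, not the edge2vec implementation: the toy graph, the hand-set matrix `M` (which the abstract says is trained by Expectation-Maximization), and the function `edge_type_walk` are all hypothetical.

```python
import random

# Toy heterogeneous graph: node -> list of (neighbor, edge_type).
# Edge types (illustrative): 0 = drug-gene, 1 = gene-gene, 2 = gene-disease.
GRAPH = {
    "drugA": [("gene1", 0)],
    "gene1": [("drugA", 0), ("gene2", 1), ("disease1", 2)],
    "gene2": [("gene1", 1), ("disease1", 2)],
    "disease1": [("gene1", 2), ("gene2", 2)],
}

# Hypothetical edge-type transition weights M[prev_type][next_type].
# Fixed by hand here; the abstract describes learning this matrix with EM.
M = [
    [0.2, 0.5, 0.3],
    [0.3, 0.2, 0.5],
    [0.4, 0.4, 0.2],
]

def edge_type_walk(graph, trans, start, length, rng):
    """Random walk whose next edge is weighted by the previous edge's type."""
    walk = [start]
    prev_type = None
    while len(walk) < length:
        nbrs = graph[walk[-1]]
        if not nbrs:
            break  # dead end
        if prev_type is None:
            weights = [1.0] * len(nbrs)  # first step: uniform over neighbors
        else:
            weights = [trans[prev_type][t] for _, t in nbrs]
        nxt, prev_type = rng.choices(nbrs, weights=weights, k=1)[0]
        walk.append(nxt)
    return walk

walk = edge_type_walk(GRAPH, M, "drugA", 6, random.Random(1))
print(walk)
```

Walks generated this way could then be fed to a skip-gram style embedding model, which is the role the abstract assigns to the stochastic gradient descent step.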
DeepWalk: Online Learning of Social Representations
We present DeepWalk, a novel approach for learning latent representations of
vertices in a network. These latent representations encode social relations in
a continuous vector space, which is easily exploited by statistical models.
DeepWalk generalizes recent advancements in language modeling and unsupervised
feature learning (or deep learning) from sequences of words to graphs. DeepWalk
uses local information obtained from truncated random walks to learn latent
representations by treating walks as the equivalent of sentences. We
demonstrate DeepWalk's latent representations on several multi-label network
classification tasks for social networks such as BlogCatalog, Flickr, and
YouTube. Our results show that DeepWalk outperforms challenging baselines which
are allowed a global view of the network, especially in the presence of missing
information. DeepWalk's representations can provide scores up to 10%
higher than competing methods when labeled data is sparse. In some experiments,
DeepWalk's representations are able to outperform all baseline methods while
using 60% less training data. DeepWalk is also scalable. It is an online
learning algorithm which builds useful incremental results, and is trivially
parallelizable. These qualities make it suitable for a broad class of real
world applications such as network classification and anomaly detection.
Comment: 10 pages, 5 figures, 4 tables
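The "walks as sentences" idea can be illustrated with a short sketch. It assumes a tiny adjacency-list graph and hypothetical helper names (`truncated_walks`, `skipgram_pairs`); it only produces the walk corpus and the (center, context) pairs that a skip-gram language model would consume, and does not train any embeddings.

```python
import random

def truncated_walks(adj, walks_per_node, walk_len, rng):
    """Generate fixed-length uniform random walks starting from every node."""
    corpus = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)  # visit start nodes in random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_len and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            corpus.append(walk)
    return corpus

def skipgram_pairs(walk, window):
    """Treat a walk like a sentence: pair each node with its window neighbors."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs

# Illustrative undirected graph as adjacency lists.
ADJ = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

corpus = truncated_walks(ADJ, walks_per_node=2, walk_len=5, rng=random.Random(0))
print(len(corpus), "walks; first walk:", corpus[0])
print(skipgram_pairs(corpus[0], window=1))
```

Because each walk depends only on local neighborhoods, the walks can be generated independently, which is consistent with the abstract's remark that the method is trivially parallelizable.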
Decentralized Differentially Private Without-Replacement Stochastic Gradient Descent
While machine learning has achieved remarkable results in a wide variety of
domains, the training of models often requires large datasets that may need to
be collected from different individuals. As sensitive information may be
contained in the individual's dataset, sharing training data may lead to severe
privacy concerns. Therefore, there is a compelling need to develop
privacy-aware machine learning methods, for which one effective approach is to
leverage the generic framework of differential privacy. Considering that
stochastic gradient descent (SGD) is one of the most widely adopted methods for
large-scale machine learning problems, two decentralized differentially private
SGD algorithms are proposed in this work. In particular, we focus on SGD without
replacement due to its favorable structure for practical implementation. In
addition, both privacy and convergence analyses are provided for the proposed
algorithms. Finally, extensive experiments are performed to verify the
theoretical results and demonstrate the effectiveness of the proposed
algorithms.
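As a rough illustration of what a noisy without-replacement SGD pass looks like, consider the sketch below. It is not either of the algorithms proposed in the paper: it runs on a single node (no decentralization), uses a generic clip-then-add-Gaussian-noise step, and the noise scale is not calibrated to any particular privacy guarantee. The names `clip` and `noisy_epoch`, the 1-D model, and all parameter values are assumptions made for the example.

```python
import random

def clip(g, c):
    """Clip a scalar gradient to the interval [-c, c]."""
    return max(-c, min(c, g))

def noisy_epoch(w, data, lr, clip_norm, sigma, rng):
    """One pass over `data` without replacement for the 1-D model y ~ w * x."""
    order = list(range(len(data)))
    rng.shuffle(order)  # random permutation: every sample is used exactly once
    for i in order:
        x, y = data[i]
        grad = 2.0 * (w * x - y) * x               # gradient of (w*x - y)^2
        grad = clip(grad, clip_norm)                # bound each sample's influence
        grad += rng.gauss(0.0, sigma * clip_norm)   # Gaussian perturbation (uncalibrated)
        w -= lr * grad
    return w, order

# Illustrative data generated from y = 3x.
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]

rng = random.Random(0)
w = 0.0
for _ in range(20):
    w, order = noisy_epoch(w, data, lr=0.1, clip_norm=10.0, sigma=0.05, rng=rng)
print("estimate of w after 20 noisy epochs:", w)
```

Sampling by permutation, rather than drawing indices independently at each step, is what "without replacement" refers to: each epoch touches every sample exactly once.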
Sequential Monte Carlo EM for multivariate probit models
Multivariate probit models (MPM) have the appealing feature of capturing some
of the dependence structure between the components of multidimensional binary
responses. The key for the dependence modelling is the covariance matrix of an
underlying latent multivariate Gaussian. Most approaches to MLE in multivariate
probit regression rely on MCEM algorithms to avoid computationally intensive
evaluations of multivariate normal orthant probabilities. As an alternative to
the much-used Gibbs sampler, a new SMC sampler for truncated multivariate
normals is proposed. The algorithm proceeds in two stages where samples are
first drawn from truncated multivariate Student distributions and then
further evolved towards a Gaussian. The sampler is then embedded in an MCEM
algorithm. The sequential nature of SMC methods can be exploited to design a
fully sequential version of the EM, where the samples are simply updated from
one iteration to the next rather than resampled from scratch. Recycling the
samples in this manner significantly reduces the computational cost. An
alternative view of the standard conditional maximisation step provides the
basis for an iterative procedure to fully perform the maximisation needed in
the EM algorithm. The identifiability of MPM is also thoroughly discussed. In
particular, the likelihood invariance can be embedded in the EM algorithm to
ensure that constrained and unconstrained maximisation are equivalent. A simple
iterative procedure is then derived for either maximisation which takes
effectively no computational time. The method is validated by applying it to
the widely analysed Six Cities dataset and on a higher dimensional simulated
example. Previous approaches to the Six Cities dataset overly restrict the parameter
space but, by considering the correct invariance, the maximum likelihood is
quite naturally improved when treating the full unrestricted model.
Comment: 26 pages, 2 figures. In press, Computational Statistics & Data
Analysis
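For orientation, the latent-Gaussian formulation and the likelihood invariance mentioned in the abstract can be written out as below. The notation ($z_i$, $x_i$, $B$, $\Sigma$, $D$, $A(y_i)$) is assumed here for illustration and may differ from the paper's; it is the standard way a multivariate probit model is stated.

```latex
\begin{aligned}
z_i &\sim \mathcal{N}_d(B x_i,\ \Sigma), \qquad
y_{ij} = \mathbf{1}\{z_{ij} > 0\}, \quad j = 1,\dots,d, \\
\Pr(y_i \mid x_i;\, B, \Sigma) &= \int_{A(y_i)} \phi_d(z;\, B x_i,\, \Sigma)\,\mathrm{d}z, \qquad
A(y_i) = \{ z : z_j > 0 \text{ if } y_{ij} = 1,\ z_j \le 0 \text{ if } y_{ij} = 0 \}, \\
\Pr(y_i \mid x_i;\, B, \Sigma) &= \Pr(y_i \mid x_i;\, D B,\, D \Sigma D)
\quad \text{for any diagonal } D \text{ with positive entries.}
\end{aligned}
```

The second line is the orthant probability whose direct evaluation MCEM avoids. The third line holds because rescaling each latent coordinate by a positive constant does not change its sign, so $\Sigma$ is identified only up to such rescalings (for example, it can be normalised to a correlation matrix).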
Multilevel Aggregation Methods for Small-World Graphs with Application to Random-Walk Ranking
We describe multilevel aggregation in the specific context of using Markov chains to rank the nodes of graphs. More generally, aggregation is a graph coarsening technique with a wide range of possible uses in information retrieval applications. Aggregation successfully generates efficient multilevel methods for solving nonsingular linear systems and various eigenproblems from discretized partial differential equations, which tend to involve mesh-like graphs. Our primary goal is to extend the applicability of aggregation to similar problems on small-world graphs, with a secondary goal of developing these methods for eventual application to many other tasks, such as using the information in the hierarchies for node clustering or pattern recognition. The nature of small-world graphs makes it difficult for many coarsening approaches to obtain useful hierarchies that have complexity on the order of the number of edges in the original graph while retaining the relevant properties of the original graph. Here, for a set of synthetic graphs with the small-world property, we show how multilevel hierarchies formed with non-overlapping strength-based aggregation have optimal or near-optimal complexity. We also provide an example of how these hierarchies are employed to accelerate convergence of methods that calculate the stationary probability vector of large, sparse, irreducible, slowly-mixing Markov chains on such small-world graphs. The stationary probability vector of a Markov chain allows one to rank the nodes in a graph based on the likelihood that a long random walk visits each node. These ranking approaches have a wide range of applications including information retrieval and web ranking, performance modeling of computer and communication systems, analysis of social networks, dependability and security analysis, and analysis of biological systems.
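The ranking idea in the last sentences can be illustrated with plain power iteration on a small chain. This is a baseline sketch only: it is not the multilevel aggregation method the abstract describes, which aims to accelerate exactly this kind of computation on large, slowly-mixing chains. The function name `stationary_vector` and the 3-state transition matrix are assumptions made for the example.

```python
def stationary_vector(P, tol=1e-12, max_iter=10_000):
    """Power iteration for pi = pi * P, with P a row-stochastic matrix (list of rows)."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Illustrative irreducible, aperiodic chain on three nodes (rows sum to 1).
P = [
    [0.00, 0.50, 0.50],
    [0.25, 0.50, 0.25],
    [0.50, 0.50, 0.00],
]

pi = stationary_vector(P)
ranking = sorted(range(len(pi)), key=lambda i: -pi[i])
print("stationary vector:", pi)
print("nodes ranked by long-run visit probability:", ranking)
```

For this matrix the stationary vector is (0.25, 0.5, 0.25), so the middle node ranks first. On large slowly-mixing chains, plain power iteration can need many sweeps, which is the motivation for acceleration.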