Targeted matrix completion
Matrix completion is a problem that arises in many data-analysis settings
(e.g., recommender systems and traffic matrix analysis) where the input
consists of a partially-observed matrix. Classical approaches to matrix
completion assume that the input partially-observed matrix is low rank. The
success of these methods depends on the number of observed entries and the rank
of the matrix; the larger the rank, the more entries need to be observed in
order to accurately complete the matrix. In this paper, we deal with matrices
that are not necessarily low rank themselves, but rather they contain low-rank
submatrices. We propose Targeted, which is a general framework for completing
such matrices. In this framework, we first extract the low-rank submatrices and
then apply a matrix-completion algorithm to these low-rank submatrices as well
as the remainder matrix separately. Although for the completion itself we use
state-of-the-art completion methods, our results demonstrate that Targeted
achieves significantly smaller reconstruction errors than other classical
matrix-completion methods. One of the key technical contributions of the paper
lies in the identification of the low-rank submatrices from the input
partially-observed matrices. Comment: Proceedings of the 2017 SIAM International Conference on Data Mining
(SDM
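As a rough illustration of the classical low-rank completion that the abstract above builds on, the sketch below alternates a truncated SVD with re-imposing the observed entries (often called hard imputation). It is a generic baseline chosen for illustration, not the Targeted framework or the specific completion methods used in the paper; the function name `complete_low_rank`, the matrix sizes, and the 70% observation rate are all assumptions.

```python
import numpy as np

def complete_low_rank(observed, mask, rank, n_iters=500):
    """Hypothetical helper: fill missing entries under a rank-`rank` assumption.

    observed: matrix whose entries are trusted only where mask is True.
    mask:     boolean matrix, True where an entry was observed.
    """
    # Start from the observed entries with zeros in the missing positions.
    X = np.where(mask, observed, 0.0)
    for _ in range(n_iters):
        # Project the current guess onto matrices of rank at most `rank`.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep the observed entries fixed; take the rest from the projection.
        X = np.where(mask, observed, low_rank)
    return low_rank

# Synthetic example (assumed data): a rank-2 matrix with about 70% of entries observed.
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))
mask = rng.random(truth.shape) < 0.7
estimate = complete_low_rank(truth, mask, rank=2)
rel_err = np.linalg.norm(estimate - truth) / np.linalg.norm(truth)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Raising the true rank or lowering the observation rate in this toy setup is one way to see the dependence described above: more entries are needed for an accurate completion as the rank grows.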
Matrix completion with structure
Often, data organized in matrix form contains missing entries. Such data has also been observed to be effectively low rank, which has led to interest in the problem of low-rank matrix completion: given a partially-observed matrix, estimate the missing entries such that the completed matrix is low rank. The goal of this thesis is to improve matrix-completion algorithms by explicitly analyzing two sources of information in the observed entries: their locations and their values.
First, we provide a categorization of a new approach to matrix-completion, which we call structural. Structural methods quantify the possibility of completion using tests applied only to the locations of known entries. By framing each test as the class of partially-observed matrices that pass the test, we provide the first organizing framework for analyzing the relationship among structural completion methods.
Building on the structural approach, we then develop a new algorithm for active matrix-completion that is combinatorial in nature. The algorithm uses just the locations of known entries to suggest a small number of queries to be made on the missing entries that allow it to produce a full and accurate completion. If a budget is placed on the number of queries, the algorithm outputs a partial completion, indicating which entries it can and cannot accurately estimate given the observations at hand.
Finally, we propose a local approach to matrix-completion that analyzes the values of the observed entries to discover a structure that is more fine-grained than the traditional low-rank assumption. Motivated by the Singular Value Decomposition, we develop an algorithm that finds low-rank submatrices using only the first few singular vectors of a matrix. By completing low-rank submatrices separately from the rest of the matrix, the local approach to matrix-completion produces more accurate reconstructions than traditional algorithms.
CSI: A Hybrid Deep Model for Fake News Detection
The topic of fake news has drawn attention both from the public and the
academic communities. Such misinformation has the potential of affecting public
opinion, providing an opportunity for malicious parties to manipulate the
outcomes of public events such as elections. Because such high stakes are at
play, automatically detecting fake news is an important, yet challenging
problem that is not yet well understood. Nevertheless, there are three
generally agreed upon characteristics of fake news: the text of an article, the
user response it receives, and the source users promoting it. Existing work has
largely focused on tailoring solutions to one particular characteristic, which
has limited their success and generality. In this work, we propose a model that
combines all three characteristics for a more accurate and automated
prediction. Specifically, we incorporate the behavior of both parties, users
and articles, and the group behavior of users who propagate fake news.
Motivated by the three characteristics, we propose a model called CSI which is
composed of three modules: Capture, Score, and Integrate. The first module is
based on the response and text; it uses a Recurrent Neural Network to capture
the temporal pattern of user activity on a given article. The second module
learns the source characteristic based on the behavior of users, and the two
are integrated with the third module to classify an article as fake or not.
Experimental analysis on real-world data demonstrates that CSI achieves higher
accuracy than existing models, and extracts meaningful latent representations
of both users and articles. Comment: In Proceedings of the 26th ACM International Conference on
Information and Knowledge Management (CIKM) 201
Matrix completion with queries
In many applications, e.g., recommender systems and traffic monitoring, the
data comes in the form of a matrix that is only partially observed and low
rank. A fundamental data-analysis task for these datasets is matrix completion,
where the goal is to accurately infer the entries missing from the matrix. Even
when the data satisfies the low-rank assumption, classical matrix-completion
methods may output completions with significant error -- in that the
reconstructed matrix differs significantly from the true underlying matrix.
Often, this is due to the fact that the information contained in the observed
entries is insufficient. In this work, we address this problem by proposing an
active version of matrix completion, where queries can be made to the true
underlying matrix. Subsequently, we design Order&Extend, which is the first
algorithm to unify a matrix-completion approach and a querying strategy into a
single algorithm. Order&Extend is able to identify and alleviate insufficient
information by judiciously querying a small number of additional entries. In an
extensive experimental evaluation on real-world datasets, we demonstrate that
our algorithm is efficient and is able to accurately reconstruct the true
matrix while asking only a small number of queries. Comment: Proceedings of the 21st ACM SIGKDD International Conference on
Knowledge Discovery and Data Minin
Con/tra la maquinaria del castigo
Thirty years after the Bazterrica ruling, the setbacks persist. One sector of the judiciary calls for changes toward a drug policy that respects human rights, but another continues to criminalize small-time scapegoats ("perejiles") and to imprison the very victims of the trafficking networks and the police revenue-collection system. Meanwhile, the law that penalizes possession for personal use remains in force, and the punitive apparatus that leaves the mafia structure unscathed does not stop. Facultad de Periodismo y Comunicación Socia
A Kernel of Truth: Determining Rumor Veracity on Twitter by Diffusion Pattern Alone
Recent work in the domain of misinformation detection has leveraged rich
signals in the text and user identities associated with content on social
media. But text can be strategically manipulated and accounts reopened under
different aliases, suggesting that these approaches are inherently brittle. In
this work, we investigate an alternative modality that is naturally robust: the
pattern in which information propagates. Can the veracity of an unverified
rumor spreading online be discerned solely on the basis of its pattern of
diffusion through the social network?
Using graph kernels to extract complex topological information from Twitter
cascade structures, we train accurate predictive models that are blind to
language, user identities, and time, demonstrating for the first time that such
"sanitized" diffusion patterns are highly informative of veracity. Our results
indicate that, with proper aggregation, the collective sharing pattern of the
crowd may reveal powerful signals of rumor truth or falsehood, even in the
early stages of propagation. Comment: Published at The Web Conference (WWW) 202
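To make the idea of comparing cascades by structure alone more concrete, here is a minimal sketch of one standard graph kernel, the Weisfeiler-Lehman subtree kernel. The abstract above does not say which kernels were used, so this choice is an assumption, as are the helper names `wl_features` and `wl_kernel` and the two toy cascades. Every node starts with the same label, so the comparison sees no text, user identity, or timing.

```python
from collections import Counter

def wl_features(adjacency, n_iters=2):
    """Count Weisfeiler-Lehman subtree labels for one graph.

    adjacency: dict mapping each node to a list of its neighbours.
    """
    # Uniform initial labels: only the topology can distinguish nodes.
    labels = {node: "0" for node in adjacency}
    counts = Counter(labels.values())
    for _ in range(n_iters):
        new_labels = {}
        for node, neighbours in adjacency.items():
            # A node's new label is its old label plus the sorted
            # multiset of its neighbours' old labels.
            neighbour_labels = sorted(labels[n] for n in neighbours)
            new_labels[node] = labels[node] + "(" + ",".join(neighbour_labels) + ")"
        labels = new_labels
        counts.update(labels.values())
    return counts

def wl_kernel(g1, g2, n_iters=2):
    """Kernel value: dot product of the two graphs' label-count vectors."""
    f1, f2 = wl_features(g1, n_iters), wl_features(g2, n_iters)
    return sum(f1[label] * f2[label] for label in f1.keys() & f2.keys())

# Two toy cascades (assumed shapes): a broadcast from one account
# versus a chain of successive reshares.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

print(wl_kernel(star, star), wl_kernel(path, path), wl_kernel(star, path))
```

A kernel matrix built this way over many cascades could then be passed to any kernel-based classifier; the cross-cascade value is lower than either self-similarity because the broadcast and chain shapes share only their smallest substructures.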