7,971 research outputs found
The Incremental Multiresolution Matrix Factorization Algorithm
Multiresolution analysis and matrix factorization are foundational tools in
computer vision. In this work, we study the interface between these two
distinct topics and obtain techniques to uncover hierarchical block structure
in symmetric matrices -- an important aspect in the success of many vision
problems. Our new algorithm, the incremental multiresolution matrix
factorization, uncovers such structure one feature at a time, and hence scales
well to large matrices. We describe how this multiscale analysis goes much
farther than what a direct global factorization of the data can identify. We
evaluate the efficacy of the resulting factorizations for relative leveraging
within regression tasks using medical imaging data. We also use the
factorization on representations learned by popular deep networks, providing
evidence of their ability to infer semantic relationships even when they are
not explicitly trained to do so. We show that this algorithm can be used as an
exploratory tool to improve the network architecture, and within numerous other
settings in vision. Comment: Computer Vision and Pattern Recognition (CVPR) 2017, 10 pages
Abstract Meaning Representation for Multi-Document Summarization
Generating an abstract from a collection of documents is a desirable
capability for many real-world applications. However, abstractive approaches to
multi-document summarization have not been thoroughly investigated. This paper
studies the feasibility of using Abstract Meaning Representation (AMR), a
semantic representation of natural language grounded in linguistic theory, as a
form of content representation. Our approach condenses source documents to a
set of summary graphs following the AMR formalism. The summary graphs are then
transformed to a set of summary sentences in a surface realization step. The
framework is fully data-driven and flexible. Each component can be optimized
independently using small-scale, in-domain training data. We perform
experiments on benchmark summarization datasets and report promising results.
We also describe opportunities and challenges for advancing this line of
research. Comment: 13 pages
PDF-Malware Detection: A Survey and Taxonomy of Current Techniques
Portable Document Format, more commonly known as PDF, has become, in the last 20 years, a standard for document exchange and dissemination due to its portable nature and widespread adoption. The flexibility and power of this format are leveraged not only by benign users but also by hackers, who have been working to exploit various types of vulnerabilities, overcome security restrictions, and turn the PDF format into one of the leading vectors for spreading malicious code. Analyzing the content of malicious PDF files to extract the main features that characterize the malware's identity and behavior is a fundamental task for modern threat intelligence platforms that need to learn how to automatically identify new attacks. This paper surveys the state of the art in systems for detecting malicious PDF files and organizes them in a taxonomy that separately considers the approaches used and the data analyzed to detect the presence of malicious code. © Springer International Publishing AG, part of Springer Nature 2018
Scalable Recollections for Continual Lifelong Learning
Given the recent success of Deep Learning applied to a variety of single
tasks, it is natural to consider more human-realistic settings. Perhaps the
most difficult of these settings is that of continual lifelong learning, where
the model must learn online over a continuous stream of non-stationary data. A
successful continual lifelong learning system must have three key capabilities:
it must learn and adapt over time, it must not forget what it has learned, and
it must be efficient in both training time and memory. Recent techniques have
focused their efforts primarily on the first two capabilities while questions
of efficiency remain largely unexplored. In this paper, we consider the problem
of efficient and effective storage of experiences over very large time-frames.
In particular we consider the case where typical experiences are O(n) bits and
memories are limited to O(k) bits for k << n. We present a novel scalable
architecture and training algorithm in this challenging domain and provide an
extensive evaluation of its performance. Our results show that we can achieve
considerable gains on top of state-of-the-art methods such as GEM. Comment: AAAI 201
A 'glocal' approach for real-time emergency event detection in Twitter
Social media like Twitter not only offer an unprecedented amount of user-generated content covering developing emergencies but also act as a collector of news produced by heterogeneous sources, including big and small media companies as well as public authorities. However, this volume, velocity, and variety of data constitute both the main value and the key challenge in implementing automatic detection and tracking of independent emergency events from the real-time stream of tweets. Leveraging online clustering and considering both textual and geographical features, we propose, implement, and evaluate an algorithm to automatically detect emergency events applying a ‘glocal’ approach, i.e., offering global coverage while detecting events at a local (municipality-level) scale.
Automated user modeling for personalized digital libraries
Digital libraries (DL) have become one of the most typical ways of accessing any kind of digitized information. Due to this key role, users welcome any improvements to the services they receive from digital libraries. One trend for improving digital services is personalization. Up to now, the most common approach to personalization in digital libraries has been user-driven. Nevertheless, the design of efficient personalized services has to be done, at least in part, automatically. In this context, machine learning techniques automate the process of constructing user models. This paper proposes a new approach to constructing digital libraries that satisfy users’ need for information: Adaptive Digital Libraries, libraries that automatically learn user preferences and goals and personalize their interaction using this information.