Temporal search in document streams
In this thesis, we address major challenges in searching temporal document collections. In such collections, documents are created and/or edited over time. Examples of temporal document collections are web archives, news archives, blogs, personal emails and enterprise documents. Unfortunately, traditional IR approaches based only on term matching can give unsatisfactory results when searching temporal document collections. The reason for this is twofold: the contents of documents are strongly time-dependent, i.e., documents are about events that happened at particular time periods, and a query representing an information need can be time-dependent as well, i.e., a temporal query. On the other hand, time-only-based methods fall short when it comes to reasoning about events in social media. Over the last few years, users have been creating chronologically ordered documents about topics that draw their attention at an ever-increasing pace. However, with the vast adoption of social media, new types of marketing campaigns have been developed in order to promote content, i.e. brands, products, celebrities, etc
The Utility of Data Transformation for Alignment, De Novo Assembly and Classification of Short Read Virus Sequences.
Advances in DNA sequencing technology are facilitating genomic analyses of unprecedented scope and scale, widening the gap between our abilities to generate and fully exploit biological sequence data. Comparable analytical challenges are encountered in other data-intensive fields involving sequential data, such as signal processing, in which dimensionality reduction (i.e., compression) methods are routinely used to lessen the computational burden of analyses. In this work, we explored the application of dimensionality reduction methods to numerically represent high-throughput sequence data for three important biological applications of virus sequence data: reference-based mapping, short sequence classification and de novo assembly. Leveraging highly compressed sequence transformations to accelerate sequence comparison, our approach yielded comparable accuracy to existing approaches, further demonstrating its suitability for sequences originating from diverse virus populations. We assessed the application of our methodology using both synthetic and real viral pathogen sequences. Our results show that the use of highly compressed sequence approximations can provide accurate results, with analytical performance retained and even enhanced through appropriate dimensionality reduction of sequence data
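To make the idea of comparing compressed numerical representations of sequences concrete, the following is a minimal sketch. The abstract does not name the transformations used, so every choice here is an assumption for illustration only: the purine/pyrimidine "DNA walk" encoding, Piecewise Aggregate Approximation (PAA) as the dimensionality reduction step, and the helper names `dna_walk`, `paa` and `reduced_distance`.

```python
import math


def dna_walk(seq):
    """Encode a nucleotide string as a cumulative walk (assumed encoding).

    Purines (A, G) step up by one and pyrimidines (C, T) step down by one.
    Ambiguous bases such as N are not handled and raise KeyError.
    """
    steps = {"A": 1, "G": 1, "C": -1, "T": -1}
    walk, level = [], 0
    for base in seq.upper():
        level += steps[base]
        walk.append(level)
    return walk


def paa(values, segments):
    """Piecewise Aggregate Approximation: mean of each of `segments` chunks."""
    n = len(values)
    if not 1 <= segments <= n:
        raise ValueError("segments must be between 1 and len(values)")
    reduced = []
    for s in range(segments):
        start = s * n // segments
        end = (s + 1) * n // segments
        chunk = values[start:end]
        reduced.append(sum(chunk) / len(chunk))
    return reduced


def reduced_distance(seq_a, seq_b, segments):
    """Euclidean distance between the PAA-reduced walks of two sequences."""
    a = paa(dna_walk(seq_a), segments)
    b = paa(dna_walk(seq_b), segments)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    read = "AGCTAGGA"
    print(paa(dna_walk(read), 4))
    print(reduced_distance(read, "AGCTAGGC", 4))
```

Comparing the short reduced vectors instead of the full sequences is what lowers the cost of each comparison; how much accuracy is retained depends on the transformation and compression level chosen, which this sketch does not evaluate.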
SmartMonitor: Using smart devices to perform structural health monitoring
In this demonstration, we present SmartMonitor, a distributed Structural Health Monitoring (SHM) system consisting of smart devices. Over the last few years, the vast majority of smart devices have come equipped with accelerometers that can be utilized to build SHM systems with hundreds of nodes. We describe a scalable, fault-tolerant communication protocol that performs best-effort time synchronization of the nodes and is used to implement a decentralized version of the popular peak-picking SHM method. The implemented interactive system can be easily installed on any accelerometer-equipped Android device, and the user has a number of options for configuring the system or analyzing the collected data and computed outcomes. © 2013 VLDB Endowment
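As a rough illustration of the peak-picking idea mentioned above, the sketch below estimates dominant vibration frequencies from a single accelerometer trace by locating the strongest local maxima in its amplitude spectrum. It is a single-node simplification and not the decentralized protocol of the demonstration; the function names `amplitude_spectrum` and `pick_peaks`, the naive DFT, and the synthetic test signal are all assumptions made for this example.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for bins 0..n//2, after removing the mean."""
    n = len(samples)
    mean = sum(samples) / n  # removes a constant offset such as gravity
    centered = [s - mean for s in samples]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(centered)))
        for k in range(n // 2 + 1)
    ]


def pick_peaks(samples, sample_rate_hz, num_peaks=2):
    """Return the frequencies (Hz) of the strongest local spectral maxima."""
    spectrum = amplitude_spectrum(samples)
    n = len(samples)
    # A bin is a candidate peak if it exceeds both of its neighbours.
    candidates = [
        k for k in range(1, len(spectrum) - 1)
        if spectrum[k] > spectrum[k - 1] and spectrum[k] > spectrum[k + 1]
    ]
    strongest = sorted(candidates, key=lambda k: spectrum[k], reverse=True)
    return sorted(k * sample_rate_hz / n for k in strongest[:num_peaks])


if __name__ == "__main__":
    rate = 50.0  # assumed sampling rate in Hz
    # Synthetic trace with components at 2 Hz and 5 Hz (hypothetical modes).
    trace = [
        math.sin(2 * math.pi * 2 * i / rate)
        + 0.5 * math.sin(2 * math.pi * 5 * i / rate)
        for i in range(200)
    ]
    print(pick_peaks(trace, rate))
```

In a multi-node setting each device would have to sample over a comparable time window, which is where the best-effort time synchronization described in the abstract comes in.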
Language agnostic meme-filtering for hashtag-based social network analysis
Users in social networks utilize hashtags for a variety of reasons. In many cases, hashtags serve retrieval purposes by labeling the content they accompany. More often than not, hashtags are used to promote content, ideas, or conversations producing viral memes. This paper addresses a specific case of hashtag classification: meme-filtering. We argue that hashtags that are correlated with memes may hinder many valuable social media algorithms like trend detection and event identification. We propose and evaluate a set of language-agnostic features that aid the separation of these two classes: meme-hashtags and event-hashtags. The proposed approach is evaluated on two large datasets of Twitter messages written in English and German. A proof-of-concept application of the meme-filtering approach to the problem of event detection is presented. © 2015, Springer-Verlag Wien