6,861 research outputs found
Análise colaborativa de grandes conjuntos de séries temporais
The recent expansion of metrification on a daily basis has led to the production
of massive quantities of data, and in many cases, these collected metrics
are only useful for knowledge building when seen as a full sequence of
data ordered by time, which constitutes a time series. To find and interpret
meaningful behavioral patterns in time series, a multitude of analysis software
tools have been developed. Many of the existing solutions use annotations
to enable the curation of a knowledge base that is shared between a group
of researchers over a network. However, these tools also lack appropriate
mechanisms to handle a high number of concurrent requests and to properly
store massive data sets and ontologies, as well as suitable representations
for annotated data that are visually interpretable by humans and explorable by
automated systems. The goal of the work presented in this dissertation is to
iterate on existing time series analysis software and build a platform for the
collaborative analysis of massive time series data sets, leveraging state-of-the-art technologies for querying, storing and displaying time series and annotations.
A theoretical and domain-agnostic model was proposed to enable
the implementation of a distributed, extensible, secure and high-performance
architecture that handles multiple annotation proposals simultaneously and
avoids any data loss from overlapping contributions or unsanctioned changes.
Analysts can share annotation projects with peers, restricting a set of collaborators
to a smaller scope of analysis and to a limited catalog of annotation
semantics. Annotations can express meaning not only over a segment of time,
but also over a subset of the series that coexist in the same segment. A novel
visual encoding for annotations is proposed, where annotations are rendered
as arcs traced only over the affected series’ curves in order to reduce visual
clutter. Moreover, the implementation of a full-stack prototype with a reactive
web interface was described, directly following the proposed architectural and
visualization model while applied to the HVAC domain. The performance of
the prototype under different architectural approaches was benchmarked, and
the interface was tested for usability. Overall, the work described in this dissertation
contributes a more versatile, intuitive and scalable time series
annotation platform that streamlines the knowledge-discovery workflow.
Master's in Informatics Engineering
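As a rough illustration of the annotation model described in this abstract (meaning attached to a time segment and to a subset of the series within it), the following is a minimal sketch. The class name `Annotation`, the function `conflicts`, and the series names are hypothetical and are not taken from the dissertation; the overlap rule is an assumption about how overlapping contributions might be detected.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: a label over a time segment and a subset of series."""
    label: str
    start: datetime
    end: datetime
    series: frozenset  # names of the affected series (hypothetical identifiers)

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("annotation segment must not end before it starts")


def conflicts(a: Annotation, b: Annotation) -> bool:
    """True when both annotations share a time instant and at least one series."""
    time_overlap = a.start <= b.end and b.start <= a.end
    return time_overlap and bool(a.series & b.series)


# Illustrative HVAC-style data; all values are made up for the example.
a = Annotation("compressor fault", datetime(2020, 1, 1, 8), datetime(2020, 1, 1, 10),
               frozenset({"supply_temp", "power"}))
b = Annotation("sensor drift", datetime(2020, 1, 1, 9), datetime(2020, 1, 1, 11),
               frozenset({"return_temp"}))
c = Annotation("peak load", datetime(2020, 1, 1, 9), datetime(2020, 1, 1, 11),
               frozenset({"power"}))
```

Here `a` and `b` overlap in time but touch different series, so under this assumed rule they do not conflict, while `a` and `c` share the `power` series and do.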
M-SSE: an effective searchable symmetric encryption with enhanced security for mobile devices
Searchable Encryption (SE) allows mobile devices with limited computing and storage resources to outsource data to an untrusted cloud server. Users are able to search and retrieve the outsourced data; however, SE suffers from information and privacy leakage. The reason is that most previous works rely on a single-cloud model, which allows the cloud server to obtain all the search information from users. In this paper, we present a new scheme, M-SSE, that achieves both forward and backward security based on a multi-cloud technique. The new scheme is secure against both adaptive file injection attacks and size pattern attacks by utilizing multiple cloud servers. Experimental results show that our scheme is effective compared with other existing schemes.
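To make the general idea of searchable encryption concrete, here is a toy sketch of a keyword index built from keyed hashes. It is not M-SSE: it uses a single server, provides neither forward nor backward security, and leaks access patterns. The function names `token`, `build_index` and `search`, the demo key, and the documents are all assumptions made for illustration.

```python
import hashlib
import hmac
from collections import defaultdict


def token(key: bytes, keyword: str) -> str:
    """Client side: derive a deterministic search token with HMAC-SHA256."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, documents: dict) -> dict:
    """Client side: map each keyword token to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in set(text.split()):
            index[token(key, word)].add(doc_id)
    return dict(index)


def search(index: dict, search_token: str) -> list:
    """Server side: look up a token without ever seeing the plaintext keyword."""
    return sorted(index.get(search_token, set()))


key = b"demo-key-not-for-production"  # hypothetical key, for illustration only
docs = {"d1": "cloud storage audit", "d2": "mobile cloud search"}
index = build_index(key, docs)
matches = search(index, token(key, "cloud"))
```

Because the tokens are deterministic, the server in this sketch learns which queries repeat and which documents match them, which is the kind of leakage the abstract attributes to single-cloud designs.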
Social media analytics: a survey of techniques, tools and platforms
This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and News services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discusses the requirement of an experimental computational environment for social media research and presents as an illustration the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics either in their research or business. The data retrieval techniques that are presented in this paper are valid at the time of writing this paper (June 2014), but they are subject to change since social media data scraping APIs are rapidly changing.
A sentence-based image search engine
Nowadays, people increasingly search for relevant images directly through search engines such as Google, Yahoo or Bing. These image search engines have dedicated extensive research effort to the problem of keyword-based image retrieval. However, Google, the most widely used keyword-based image search engine, is reported to have a precision of only 39%, and all of these systems are limited in their support for sentence-based image queries.
This thesis studies a practical image search scenario, where many people feel annoyed by using only keywords to find images for their ideas of speech or presentation through trial and error. This thesis proposes and realizes a sentence-based image search engine (SISE) that offers the option of querying images by sentence. Users can naturally create sentence-based queries simply by inputting one or several sentences to retrieve a list of images that match their ideas well.
The SISE relies on automatic concept detection and tagging techniques to provide support for searching visual content using sentence-based queries. The SISE gathered thousands of input sentences from TED talks, covering many areas such as science, economy, politics and education. The comprehensive evaluation of this system focused on the usability (perceived image usefulness) aspect. The final comprehensive precision reached 60.7%. The SISE is found to be able to retrieve matching images for a wide variety of topics, across different areas, and to provide subjectively more useful results than keyword-based image search engines --Abstract, page iii
Computing phonological generalization over real speech exemplars
Though it has attracted growing attention from phonologists and phoneticians, Exemplar Theory (e.g. Bybee 2001) has hitherto lacked an explicit production model that can apply to speech signals. An adequate model must be able to generalize, but this presents the problem of how to generate an output that generalizes over a collection of unique, variable-length signals. Rather than resorting to a priori phonological units such as phones, we adopt a dynamic programming approach using an optimization criterion that is sensitive to the frequency of similar subsequences within other exemplars: the Phonological Exemplar-Based Learning System (PEBLS). We show that PEBLS displays pattern-entrenchment behaviour central to Exemplar Theory's account of phonologization.
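The abstract does not spell out the PEBLS algorithm, so the sketch below is not that system. It only illustrates, with classic dynamic time warping (a different and much simpler technique), how dynamic programming can compare signals of different lengths without first segmenting them into units. The function name `dtw_distance` and the toy sequences are assumptions.

```python
def dtw_distance(x, y):
    """Dynamic time warping cost between two numeric sequences of any length."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the best of: a step in x only, a step in y only, or a step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The second sequence repeats one sample, so warping should absorb the difference.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

The table-filling step is the part that carries over to other dynamic programming formulations; the local cost and the optimization criterion are where a model such as the one described in the abstract would differ.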