Visual analysis of social networks in space and time using smartphone logs
We designed and applied novel interactive visualisation to investigate how social networks - derived from smartphone logs - are embedded in time and space. Social networks were identified through direct calls between participants and calls to mutual contacts of participants. Direct contact between participants was sparse and deriving networks through mutual contacts helped enrich the social networks. Our resulting interactive visualisation tool offers four linked and co-ordinated views of spatial, temporal, individual and social network aspects of the data. Brushing and altering techniques help us investigate how these aspects relate. We also simultaneously display some demographic and attitudinal variables to help add context to the behaviours we observe. Using these techniques, we were able to characterise spatial and temporal aspects of participants' social networks and suggest explanations for some of them. We reflect on the extent to which such analysis helps us understand social communication behaviour.
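As an illustrative sketch only (not the authors' tool), the two ways of deriving ties described in this abstract, direct calls between participants and calls to mutual contacts, could be computed from a call log roughly as follows. The log format, the function name `derive_ties`, and the sample identifiers are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def derive_ties(calls, participants):
    """Return (direct, mutual) sets of participant pairs.

    calls: iterable of (caller, callee) pairs; a hypothetical log format.
    participants: set of IDs that belong to study participants.
    """
    direct = set()
    contacts = defaultdict(set)  # participant -> non-participant contacts called
    for caller, callee in calls:
        if caller in participants and callee in participants:
            if caller != callee:
                direct.add(frozenset((caller, callee)))
        elif caller in participants:
            contacts[caller].add(callee)

    # Two participants are tied if they called at least one common contact.
    mutual = set()
    for a, b in combinations(sorted(participants), 2):
        if contacts[a] & contacts[b]:
            mutual.add(frozenset((a, b)))
    return direct, mutual


# Hypothetical sample log: p1 calls p2 directly; p1 and p3 both call "x".
calls = [("p1", "p2"), ("p1", "x"), ("p3", "x"), ("p2", "y")]
direct, mutual = derive_ties(calls, {"p1", "p2", "p3"})
```

In this sample the direct network has one tie (p1, p2), while the mutual-contact rule adds a tie between p1 and p3, which mirrors how mutual contacts can enrich a sparse direct network.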
Semantic Modeling of Analytic-based Relationships with Direct Qualification
Successfully modeling state and analytics-based semantic relationships of
documents enhances representation, importance, relevancy, provenience, and
priority of the document. These attributes are the core elements that form the
machine-based knowledge representation for documents. However, modeling
document relationships that can change over time can be inelegant, limited,
complex or overly burdensome for semantic technologies. In this paper, we
present Direct Qualification (DQ), an approach for modeling any semantically
referenced document, concept, or named graph with results from associated
applied analytics. The proposed approach supplements the traditional
subject-object relationships by providing a third leg to the relationship; the
qualification of how and why the relationship exists. To illustrate, we show a
prototype of an event-based system with a realistic use case for applying DQ to
relevancy analytics of PageRank and Hyperlink-Induced Topic Search (HITS).
Comment: Proceedings of the 2015 IEEE 9th International Conference on Semantic
Computing (IEEE ICSC 2015)
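To make the idea of a qualified relationship concrete, here is a minimal sketch, assuming a link between two documents is annotated with the name and result of an analytic. It is not the DQ model from the paper: the `QualifiedLink` class, its fields, and the sample document names are hypothetical, and `pagerank` is a plain power-iteration implementation written for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedLink:
    subject: str
    obj: str
    analytic: str  # which analytic produced the qualification (assumed field)
    score: float   # the analytic's result for the linked object (assumed field)


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over {node: [outgoing targets]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical document graph; each link is qualified with its target's score.
links = {"docA": ["docB", "docC"], "docB": ["docC"], "docC": ["docA"]}
ranks = pagerank(links)
qualified = [
    QualifiedLink(s, t, "PageRank", ranks[t])
    for s, targets in links.items()
    for t in targets
]
```

Keeping the analytic's name next to its score is one simple way to record why a link is considered relevant; a HITS score could be attached in the same manner.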
Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture
We present the architecture behind Twitter's real-time related query
suggestion and spelling correction service. Although these tasks have received
much attention in the web search literature, the Twitter context introduces a
real-time "twist": after significant breaking news events, we aim to provide
relevant results within minutes. This paper provides a case study illustrating
the challenges of real-time data processing in the era of "big data". We tell
the story of how our system was built twice: our first implementation was built
on a typical Hadoop-based analytics stack, but was later replaced because it
did not meet the latency requirements necessary to generate meaningful
real-time results. The second implementation, which is the system deployed in
production, is a custom in-memory processing engine specifically designed for
the task. This experience taught us that the current typical usage of Hadoop as
a "big data" platform, while great for experimentation, is not well suited to
low-latency processing, and points the way to future work on data analytics
platforms that can handle "big" as well as "fast" data.
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for an optimized
solution to a specific real-world problem, big data systems are no exception
to this rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models,
namely document-oriented, key-value, graph and wide-column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects have
also been discussed.
Semantic processing of EHR data for clinical research
There is a growing need to semantically process and integrate clinical data
from different sources for clinical research. This paper presents an approach
to integrate EHRs from heterogeneous resources and generate integrated data in
different data formats or semantics to support various clinical research
applications. The proposed approach builds semantic data virtualization layers
on top of data sources, which generate data in the requested semantics or
formats on demand. This approach avoids upfront dumping to and synchronizing of
the data with various representations. Data from different EHR systems are
first mapped to RDF data with source semantics, and then converted to
representations with harmonized domain semantics where domain ontologies and
terminologies are used to improve reusability. It is also possible to further
convert data to application semantics and store the converted results in
clinical research databases, e.g. i2b2, OMOP, to support different clinical
research settings. Semantic conversions between different representations are
explicitly expressed using N3 rules and executed by an N3 Reasoner (EYE), which
can also generate proofs of the conversion processes. The solution presented in
this paper has been applied to real-world applications that process large scale
EHR data.
Comment: Accepted for publication in Journal of Biomedical Informatics, 2015,
preprint version
SensePath: Understanding the Sensemaking Process Through Analytic Provenance
Sensemaking is described as the process of comprehension, finding meaning and gaining insight from information, producing new knowledge and informing further action. Understanding the sensemaking process allows building effective visual analytics tools to make sense of large and complex datasets. Currently, it is often a manual and time-consuming undertaking to comprehend this: researchers collect observation data, transcribe screen capture videos and think-aloud recordings, identify recurring patterns, and eventually abstract the sensemaking process into a general model. In this paper, we propose a general approach to facilitate such a qualitative analysis process, and introduce a prototype, SensePath, to demonstrate the application of this approach with a focus on browser-based online sensemaking. The approach is based on a study of a number of qualitative research sessions including observations of users performing sensemaking tasks and post hoc analyses to uncover their sensemaking processes. Based on the study results and a follow-up participatory design session with HCI researchers, we decided to focus on the transcription and coding stages of thematic analysis. SensePath automatically captures users' sensemaking actions, i.e., analytic provenance, and provides multi-linked views to support their further analysis. A number of other requirements elicited from the design session are also implemented in SensePath, such as easy integration with existing qualitative analysis workflows and non-intrusiveness for participants. The tool was used by an experienced HCI researcher to analyze two sensemaking sessions. The researcher found the tool intuitive and found that it considerably reduced analysis time, allowing better understanding of the sensemaking process.