Designing Traceability into Big Data Systems
Providing an appropriate level of accessibility and traceability to data or
process elements (so-called Items) in large volumes of data, often
Cloud-resident, is an essential requirement in the Big Data era.
Enterprise-wide data systems need to be designed from the outset to support
usage of such Items across the spectrum of business use rather than from any
specific application view. The design philosophy advocated in this paper is to
drive the design process using a so-called description-driven approach which
enriches models with meta-data and description and focuses the design process
on Item re-use, thereby promoting traceability. Details are given of the
description-driven design of big data systems at CERN, in health informatics
and in business process management. Evidence is presented that the approach
leads to design simplicity and consequent ease of management thanks to loose
typing and the adoption of a unified approach to Item management and usage.
Comment: 10 pages; 6 figures in Proceedings of the 5th Annual International
Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore
July 2015. arXiv admin note: text overlap with arXiv:1402.5764,
arXiv:1402.575
Enabling quantitative data analysis through e-infrastructures
This paper discusses how quantitative data analysis in the social sciences can engage with and exploit an e-Infrastructure. We highlight how a number of activities central to quantitative data analysis, collectively referred to as ‘data management’, can benefit from e-Infrastructure support. We conclude by discussing how these issues are relevant to the DAMES (Data Management through e-Social Science) research Node, an ongoing project that aims to develop e-Infrastructural resources for quantitative data analysis in the social sciences.
Italian center for Astronomical Archives publishing solution: modular and distributed
The Italian center for Astronomical Archives aims to provide astronomical
data resources as interoperable services based on IVOA standards. Its VO
expertise and knowledge come from active participation within IVOA and VO at
European and international level, with a twofold goal: to learn from the
collaboration and to provide input to the community. The first attempt to build
an easy-to-configure, easy-to-maintain resource publisher conformant to VO
standards proved too optimistic. For this reason it has been necessary to re-think
the architecture with a modular system built around the messaging concept,
where each modular component speaks to the other interested parties through a
system of broker-managed queues. The first implemented protocol, the Simple
Cone Search, demonstrates the messaging task architecture: it connects the
parametric HTTP interface to the database backend access module and the logging
module, and allows multiple cone search resources to be managed together through
a configuration manager module. Although relatively young, it has already shown
the flexibility required by the overall system when the database backend changed
from MySQL to PostgreSQL+PgSphere. A further implementation test leveraged task
distribution over multiple servers to serve three functions simultaneously:
direct linking of FITS cubes, cube cutouts and positional merging of cubes. Currently the
implementation of the SIA-2.0 standard protocol is ongoing while for TAP we
will be adapting the TAPlib library. Alongside these tools a first
administration tool (TASMAN) has been developed to ease the build up and
maintenance of TAP_SCHEMA-ta including also ObsCore maintenance capability.
Future work will be devoted to widening the range of VO protocols covered by
the set of available modules, improving the configuration management and
developing special-purpose modules common to all the service components.
Comment: SPIE Astronomical Telescopes + Instrumentation 2018, Software and
Cyberinfrastructure for Astronomy V, pre-publishing draft proceeding (reduced
abstract)
Towards the design of a platform for abuse detection in OSNs using multimedial data analysis
Online social networks (OSNs) are becoming increasingly popular every day. The vast amount of data created by users and their actions yields interesting opportunities, both socially and economically. Unfortunately, these online communities are prone to abuse and inappropriate behaviour such as cyber bullying. For victims, this kind of behaviour can lead to depression and other severe problems. However, due to the huge number of users and the volume of data, it is impossible to manually check all content posted on the social network. We propose a pluggable architecture with reusable components, able to quickly detect harmful content. The platform uses text-, image-, audio- and video-based analysis modules to detect inappropriate content or high-risk behaviour. Domain services aggregate this data and flag user profiles if necessary. Social network moderators need only check the validity of the flagged profiles. This paper reports on key requirements of the platform, the architectural components and important challenges.