Hillview: A trillion-cell spreadsheet for big data
Hillview is a distributed spreadsheet for browsing very large datasets that
cannot be handled by a single machine. As a spreadsheet, Hillview provides a
high degree of interactivity that permits data analysts to explore information
quickly along many dimensions while switching visualizations on a whim. To
provide the required responsiveness, Hillview introduces visualization
sketches, or vizketches, as a simple idea to produce compact data
visualizations. Vizketches combine algorithmic techniques for data
summarization with computer graphics principles for efficient rendering. While
simple, vizketches are effective at scaling the spreadsheet by parallelizing
computation, reducing communication, providing progressive visualizations, and
offering precise accuracy guarantees. Using Hillview running on eight servers,
we can navigate and visualize datasets of tens of billions of rows and
trillions of cells, much beyond the published capabilities of competing
systems.
The WFCAM Science Archive
We describe the WFCAM Science Archive (WSA), which is the primary point of
access for users of data from the wide-field infrared camera WFCAM on the
United Kingdom Infrared Telescope (UKIRT), especially science catalogue
products from the UKIRT Infrared Deep Sky Survey (UKIDSS). We describe the
database design with emphasis on those aspects of the system that enable users
to fully exploit the survey datasets in a variety of different ways. We give
details of the database-driven curation applications that take data from the
standard nightly pipeline-processed and calibrated files for the production of
science-ready survey datasets. We describe the fundamentals of querying
relational databases with a set of astronomy usage examples, and illustrate the
results.
Comment: 28 pages, 18 figures; accepted for publication in MNRAS (2007 November 8)
Guiding legacy systems for evolution. PmatE: a case study of maintenance and engineering
Even though software change is inevitable, careful maintenance can extend a system's lifespan
when budget and time constraints rule out replacement. At the University of Aveiro, the
project PmatE – a quiz web platform created to encourage students to like Math – emerged in the early 1990s and stacked several applications over the decades without major planning, cleanup or upgrades. The result was a very large framework that had to remain available online at all times and carried high operational costs, leading to a growing amount of technical debt. After three decades, the project was studied, refactored and refurbished, yielding a stable, consistent framework ready for evolution and software spinouts. This work shows how to manage and engineer solutions to maintain a legacy system and evolve it even under heavy constraints.
The VISTA Science Archive
We describe the VISTA Science Archive (VSA) and its first public release of
data from five of the six VISTA Public Surveys. The VSA exists to support the
VISTA Surveys through their lifecycle: the VISTA Public Survey consortia can
use it during their quality control assessment of survey data products before
submission to the ESO Science Archive Facility (ESO SAF); it supports their
exploitation of survey data prior to its publication through the ESO SAF; and,
subsequently, it provides the wider community with survey science exploitation
tools that complement the data product repository functionality of the ESO SAF.
This paper has been written in conjunction with the first public release of
public survey data through the VSA and is designed to help its users understand
the data products available and how the functionality of the VSA supports their
varied science goals. We describe the design of the database and outline the
database-driven curation processes that take data from nightly
pipeline-processed and calibrated FITS files to create science-ready survey
datasets. Much of this design, and the codebase implementing it, derives from
our earlier WFCAM Science Archive (WSA), so this paper concentrates on the
VISTA-specific aspects and on improvements made to the system in the light of
experience gained in operating the WSA.
Comment: 22 pages, 16 figures. Minor edits to fonts and typos after
sub-editing. Published in A&
VIOLA - A multi-purpose and web-based visualization tool for neuronal-network simulation output
Neuronal network models and corresponding computer simulations are invaluable
tools to aid the interpretation of the relationship between neuron properties,
connectivity and measured activity in cortical tissue. Spatiotemporal patterns
of activity propagating across the cortical surface as observed experimentally
can for example be described by neuronal network models with layered geometry
and distance-dependent connectivity. The interpretation of the resulting stream
of multi-modal and multi-dimensional simulation data calls for integrating
interactive visualization steps into existing simulation-analysis workflows.
Here, we present a set of interactive visualization concepts called views for
the visual analysis of activity data in topological network models, and a
corresponding reference implementation VIOLA (VIsualization Of Layer Activity).
The software is a lightweight, open-source, web-based and platform-independent
application combining and adapting modern interactive visualization paradigms,
such as coordinated multiple views, for massively parallel neurophysiological
data. For a use-case demonstration we consider spiking activity data of a
two-population, layered point-neuron network model subject to a spatially
confined excitation originating from an external population. With the multiple
coordinated views, an explorative and qualitative assessment of the
spatiotemporal features of neuronal activity can be performed ahead of a
detailed quantitative data analysis of specific aspects of the data.
Furthermore, ongoing efforts including the European Human Brain Project aim at
providing online user portals for integrated model development, simulation,
analysis and provenance tracking, wherein interactive visual analysis tools are
one component. Browser-compatible, web-technology based solutions are therefore
required. Within this scope, with VIOLA we provide a first prototype.
Comment: 38 pages, 10 figures, 3 tables
Development of a centralized log management system
Logs are a crucial piece of any system and give a helpful insight into what it is
doing as well as what happened in case of failure. Every process running on a system
generates logs in some format. Generally, these logs are written to local storage
resources. As systems evolved, the number of logs to analyze increased, and, as a
consequence, a need arose for a standardized log format that
minimizes dependencies and makes the analysis process easier.
ams is a company that develops and creates sensor solutions. With twenty-two
design centers and three manufacturing locations, the company serves over eight
thousand clients worldwide. One design center is located in Funchal and includes a
team of application engineers who design and develop software applications for clients
inside the company. The application engineers' development process comprises
several applications and programs, each having its own logging system.
Log entries generated by different applications are kept in separate storage
systems. If a developer or administrator wants to troubleshoot an issue that spans
several applications, they have to visit different database systems or locations
to collect the logs and correlate them across the various requests. This is a tiresome
process, and if the environment is auto-scaled, troubleshooting such an issue becomes
inconceivable.
This project aimed to solve these problems by creating a Centralized Log
Management System capable of handling logs from a variety of sources and of
providing services that help developers and administrators better understand
the different affected environments.
The deployed solution was developed using a set of different open-source
technologies, such as the Elastic Stack (Elasticsearch, Logstash and Kibana), Node.js,
GraphQL and Cassandra.
The present document describes the process and decisions taken to achieve the
solution.
DSpace Manual: Software version 1.5
DSpace is an open source software platform that enables organizations to:
- Capture and describe digital material using a submission workflow module, or a
variety of programmatic ingest options
- Distribute an organization's digital assets over the web through a search and
retrieval system
- Preserve digital assets over the long term
This system documentation includes a functional overview of the system, which is a
good introduction to the capabilities of the system, and should be readable by nontechnical
personnel. Everyone should read this section first because it introduces
some terminology used throughout the rest of the documentation. For people
actually running a DSpace service, there is an installation guide, and sections on
configuration and the directory structure. Note that as of DSpace 1.2, the
administration user interface guide is now on-line help available from within the
DSpace system. Finally, for those interested in the details of how DSpace works, and
those potentially interested in modifying the code for their own purposes, there is a
detailed architecture and design section.
Bridging OPC UA and DPWS for Industrial SOA
Two web-service based specifications, OPC Unified Architecture (OPC UA) and Devices Profile for Web Services (DPWS), have been proposed by various researchers and organizations as possible enabling technologies for an event-driven Service Oriented Architecture for monitoring and control in manufacturing applications. This paper aims to propose and demonstrate an approach for bridging these two technologies in a way that is applicable in existing industrial applications.
A merger between OPC UA and DPWS that effectively combines their complementary strengths could help pave the path toward future industrial event-driven SOA applications, with the inherent modularity, agility, and interoperability envisioned by researchers today.
A representation of DPWS devices, services, operations and events in the OPC UA data model is proposed, and a DPWS Module is developed for Ignition, a commercially available HMI/SCADA and MES platform with integrated OPC UA Server. The module discovers DPWS devices in a local network, creates the representation in the address space, and handles subscriptions, input and output parameter values, and invoking operations. A Complex Event Processing component based on Microsoft’s StreamInsight is also integrated with the system, with input and output adapters exposing web service interfaces.
The system prototype developed will be used as the base for a use case demonstrator in the European Commission’s Framework Package 7 Project, “Architecture for Service-Oriented Process Monitoring and Control (IMC AESOP).” The project aims to develop a system of systems approach for monitoring and control, based on SOA for very large-scale systems in the process industries.