    An Approach to Publish Statistics from Open-Access Journals Using Linked Data Technologies

    The Semantic Web encourages digital libraries, including open access journals, to collect, link and share their data across the web so that machines and humans can process it more easily and obtain better query results. Linked Data technologies connect structured data across the web following the principles and recommendations set out by Tim Berners-Lee in 2006. Universities develop knowledge through scholarship and research under open access policies and disseminate it in several ways. Open access journals collect, preserve and publish scientific information in digital form using a peer review process. Evaluating the usage of such publications requires statistics that are linked to external resources, so that more can be learned about the resources and their relationships. Statistics organized in a data mart facilitate queries about the history of journal usage by several criteria. Linking this data to other datasets yields further information, such as the research topics, the origin of the authors, the relation to national plans, and the relation to study curricula. This paper reports a process for publishing an open access journal data mart on the Web using Linked Data technologies in such a way that it can be linked to related datasets. Methodological guidelines are also presented together with their related activities. The proposed process was applied by extracting statistical data from a university open journal system and publishing it through a SPARQL endpoint using the open source edition of OpenLink Virtuoso. In this process, the use of open standards facilitates the creation, development and exploitation of knowledge. The RDF Data Cube vocabulary was used as the model for publishing the multidimensional data on the Web. Visualization was done with CubeViz, a faceted browser that filters observations and presents them interactively in charts. The proposed process helps to publish statistical datasets in an easy way. This work has been partially supported by the Prometeo Project by SENESCYT, Ecuadorian Government.
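
As a rough illustration of how one journal usage statistic might be expressed with the RDF Data Cube vocabulary, the sketch below builds a small Turtle document by hand. The `qb`, `sdmx-dimension` and `sdmx-measure` namespaces are the published vocabularies; the `ex:` namespace, the `ex:journal` dimension, the dataset name and all values are assumptions invented for this example and do not come from the paper.

```python
# Real vocabularies: qb, sdmx-dimension, sdmx-measure, xsd.
# Hypothetical: the ex: namespace and every resource named under it.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "sdmx-dimension": "http://purl.org/linked-data/sdmx/2009/dimension#",
    "sdmx-measure": "http://purl.org/linked-data/sdmx/2009/measure#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.org/journal-stats/",
}


def observation_turtle(obs_id, journal, year, downloads):
    """Return one qb:Observation as a Turtle snippet (illustrative values only)."""
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    qb:dataSet ex:usageDataset ;\n"
        f"    ex:journal ex:{journal} ;\n"
        f'    sdmx-dimension:refPeriod "{year}"^^xsd:gYear ;\n'
        f'    sdmx-measure:obsValue "{downloads}"^^xsd:integer .\n'
    )


def build_document(rows):
    """Prepend the prefix declarations to the serialized observations."""
    header = "".join(f"@prefix {p}: <{iri}> .\n" for p, iri in PREFIXES.items())
    body = "\n".join(observation_turtle(*row) for row in rows)
    return header + "\n" + body


# Invented sample rows: (observation id, journal, year, download count).
turtle_doc = build_document([
    ("obs1", "journalA", 2013, 120),
    ("obs2", "journalA", 2014, 185),
])
print(turtle_doc)
```

A document of this shape could then be loaded into a triple store and queried through its SPARQL endpoint; a complete cube would also need a `qb:DataStructureDefinition`, which is omitted here.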

    Open Spatiotemporal Data Warehouse For Agriculture Production Analytics

    Business Intelligence (BI) technology, with Extract, Transform, and Load (ETL) processes, data warehouses, and OLAP, has demonstrated its ability to generate information and knowledge for supporting decision making. In the last decade, advances in Web 2.0 technology have improved the accessibility of the web of data across the cloud. Linked Open Data, Linked Open Statistical Data, and Open Government Data are growing massively, making more machine-readable data available for sharing. In agricultural production analytics, data resources with high availability and accessibility are a primary requirement. However, the data accessible today for production analytics is limited to the 2- or 3-star open data formats and rarely has attributes for spatiotemporal analytics. The new data warehouse concept combines the openness of data resources with data that is mobile or spatiotemporal in nature. This approach could help decision-makers use external data to make crucial decisions more intuitively and flexibly. This paper proposes the development of a spatiotemporal data warehouse with an integration process using service-oriented architecture and open data sources. The data sources originate from the Village and Rural Area Information System (SIDeKa), which captures agricultural production transactions daily. This paper also describes how to perform spatiotemporal analytics for agricultural production using the new spatiotemporal data warehouse approach. In the experiment, six relevant spatiotemporal sample queries were executed on a DW whose fact table contains 324096 tuples, each with temporal integer/float values; the field dimension contains 4495 tuples with geographic data as polygons, the village dimension contains 80 tuples, and the district, subdistrict, and province dimensions contain dozens of tuples. The DW time dimension contains 3653 tuples, one per date over ten years. The results showed that this new approach has a convenient, simple model and expressive performance for supporting executives in making decisions on agricultural production analytics based on spatiotemporal data. This research also underlines the prospects for scaling and nurturing the spatiotemporal data warehouse initiative.
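
The figure of 3653 tuples for the time dimension is consistent with one row per calendar day over a ten-year span that includes three leap years (7 × 365 + 3 × 366 = 3653). The sketch below generates such a dimension; the date range 2011 to 2020 and the column names are assumptions chosen for illustration, since the abstract does not state them.

```python
from datetime import date, timedelta


def build_time_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # surrogate key such as 20110101
            "date": day,
            "year": day.year,
            "month": day.month,
            "day": day.day,
        })
        day += timedelta(days=1)
    return rows


# Hypothetical range: ten years containing the leap years 2012, 2016 and 2020.
time_dim = build_time_dimension(date(2011, 1, 1), date(2020, 12, 31))
print(len(time_dim))  # prints 3653
```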

    A Linked Data Service for Nature Observation Datasets (Linked Data -palvelu luontohavaintoaineistoille)

    Publishing biological observation datasets as linked data makes it possible to combine several datasets with one another. By combining several datasets related to the same subject, a better understanding of the phenomenon of interest can be reached than by studying the datasets separately. This allows more precise conclusions to be drawn from the datasets and makes it possible to look for expected or unexpected connections between them. The RDF data model used in linked data makes the datasets machine-readable and provides an easy way to refer to every part of them. Datasets published as linked data can easily be enriched with further datasets. This thesis deals with the modelling, processing and use as linked data of the observation dataset of the Hanko Bird Observatory and the weather observation dataset of the Finnish Meteorological Institute's Hanko Russarö station. The datasets have been modelled using the RDF Data Cube vocabulary, which improves their interoperability. The bird observation dataset has been annotated with species information using an ontology of the birds of Finland, which has been enriched with, among other things, an ontology of species identification characteristics and with conservation status data. The datasets have been published on the Linked Data Finland platform, and a prototype visualization service has been developed to illustrate the connections between the datasets. Weather is known to be one of the most important factors affecting the intensity of daily bird migration. The visualization service aims to show the user how weather affects the numbers of bird observations and, in particular, the observed bird migration. Better knowledge of the relationships between the datasets allows more precise conclusions to be drawn from the bird observation dataset. The methods presented in the thesis can be generalized beyond bird and weather observation datasets to other datasets with a similar structure.
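
The idea of combining two observation datasets through a shared dimension can be sketched as a simple join on the observation date. This is only an illustration under stated assumptions: each dictionary stands in for one Data Cube observation, the field names are hypothetical, and every value is invented rather than taken from the Hanko datasets.

```python
# Stand-ins for observations from two separate cubes (all values invented).
bird_observations = [
    {"date": "2012-04-01", "species": "Fringilla coelebs", "count": 340},
    {"date": "2012-04-02", "species": "Fringilla coelebs", "count": 25},
    {"date": "2012-04-03", "species": "Fringilla coelebs", "count": 90},
]
weather_observations = [
    {"date": "2012-04-01", "wind_speed_ms": 3.0},
    {"date": "2012-04-02", "wind_speed_ms": 11.5},
]


def join_on_date(birds, weather):
    """Attach the weather measurement to each bird observation of the same date."""
    weather_by_date = {w["date"]: w for w in weather}
    joined = []
    for bird in birds:
        match = weather_by_date.get(bird["date"])
        if match is not None:  # skip days without a weather observation
            joined.append({**bird, "wind_speed_ms": match["wind_speed_ms"]})
    return joined


linked = join_on_date(bird_observations, weather_observations)
for row in linked:
    print(row["date"], row["count"], row["wind_speed_ms"])
```

In a linked data setting the same join would typically be expressed as a SPARQL query over both cubes, with the shared date dimension acting as the join key.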

    A Reference Model for Regional Innovation Indicators Supported by Linked Data (Modelo de referência para indicadores de inovação regional suportado por dados ligados)

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia e Gestão do Conhecimento, Florianópolis, 2016.
    Abstract: Innovation is the implementation of a new, or significantly improved, product or process, marketing method, or organizational method, and is recognized as a key factor for economic development. Due to its characteristics, innovation has come to be seen as a complex process that occurs in an environment where different types of actors interact in a system which, when considered in the regional context, is called a Regional Innovation System. Such a system demands strategies and policies that encourage and leverage innovation activities, as well as tools for evaluating those actions. Different indices have been proposed to measure innovation at the firm level or at the national level, but they are difficult to apply at the regional level because the choice of variables is subjective and data availability is lacking. This thesis proposes a reference model, conceived from an analysis of the composite indicator models for measuring regional innovation presented in the literature, that provides a hierarchical classification of the indicators. Through the use of semantic technologies, the model is supported by Linked Data and aims to exploit the potential of regional data made available on the Web by open data and public transparency initiatives for defining indices specific to regional innovation, in a form that allows automated processing. A proof of concept is presented to demonstrate the feasibility of using the model in real-world applications. Data on municipalities from two mesoregions of the state of Santa Catarina were collected, semantically annotated according to the model, and published as linked data, and a Web application prototype was developed for visualization and comparative analysis across different regional levels and levels of indicator aggregation.