Search CORE

32 research outputs found

Generic Statistical Information Model (GSIM)

Author: Gregory Arofan
Lalor Thérèse
Vale Steven
Publication venue
Publication date: 20/04/2013
Field of study

Presentation at the North American Data Documentation Conference (NADDI) 2013. Across the world, statistical organizations undertake similar activities. Each of these activities uses and produces similar information (for example, all agencies use classifications, create data sets and publish products). Although the information is at its core the same, organizations tend to describe it slightly differently (and often in different ways within each organization). There is no common means to describe the information. GSIM is a conceptual model that provides a set of standardized, consistently described information objects, which are the inputs and outputs in the design and production of statistics. DDI is a key standard both in the development of GSIM itself and as an implementation tool for organizations using GSIM. Beyond that, GSIM will also influence the future directions of DDI development, attracting a larger number of data producers into the DDI community. This presentation introduces GSIM, looks at the interaction between GSIM and DDI (and other related standards), and provides an update on a rapidly evolving vision for the use of DDI within the statistical institutes in Europe and elsewhere. It covers the direct interaction between DDI and GSIM and provides a broader context for understanding what that dynamic may mean in the future.

Institute for Policy & Social Research, University of Kansas; University of Kansas Libraries; Alfred P. Sloan Foundation; Data Documentation Initiative Alliance

KU ScholarWorks

Methods library of embedded R functions at Statistics Norway

Author: Langsrud Øyvind
Publication venue: The National Institute of Statistics (Romania)
Publication date: 01/01/2017
Field of study

Statistics Norway is modernising its production processes. An important element in this work is a library of functions for statistical computations. In principle, the functions in such a methods library can be programmed in several languages. A modernised production environment demands that these functions can be reused for different statistics products and that they are embedded within a common IT system. The embedding should be done in such a way that the users of the methods do not need to know the underlying programming language. As a proof of concept, Statistics Norway will soon have established a methods library offering a limited number of methods for macro-editing, imputation and confidentiality. This is done within an area of municipal statistics, with R as the only programming language. This paper presents the details of and experiences from this work. The problem of fitting real-world applications to simple and strict standards is discussed and exemplified by the development of solutions to regression imputation and table suppression.

Keywords: Official statistics, R, Common Statistical Production Architecture, Generic Statistical Information Model, Validation and Transformation Language, Imputation, Statistical disclosure control

JEL Classification: C18, C88

Directory of Open Access Journals

NORA - Norwegian Open Research Archives

Statistics Norway's Open Research Repository

Long-term Preservation of Longitudinal Statistical Surveys in Psycholinguistic Research

Author: Lendić Anabela
Poljičak Sušec Martina
Stančić Hrvoje
Publication venue: Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2015
Field of study

Psycholinguistics deals with different types of evidence and obtained data, including confidential information that needs to be protected from disclosure and other security threats. When it comes to speech-language pathologies, researchers in psycholinguistics are especially interested in aphasia. Aphasia is a loss of language ability as a consequence of brain damage, which may result from head injury or stroke. Research data has to be adequately stored, processed, protected and, if possible, preserved for secondary use. The authors propose applying models and tools used in official statistics, together with concepts from archival science, to help resolve open issues in aphasia research and its records management requirements in the context of long-term preservation, trust, and reuse.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Crossref

Digitalni arhiv Filozofskog fakulteta u Zagrebu

Generic Statistical Business Process Model (GSBPM), Version 5.1, January 2019: Norwegian translation (Norsk oversettelse)

Author
Publication venue: Statistisk sentralbyrå
Publication date: 02/12/2019
Field of study

The Generic Statistical Business Process Model (GSBPM) describes and defines the business processes needed to produce official statistics. It describes the phases, sub-processes and overarching processes of statistical production.

Statistics Norway's Open Research Repository

An analysis of existing production frameworks for statistical and geographic information: Synergies, gaps and integration

Author: Ariza-López F.J.
Díaz-Díaz E.
González-Yanes A.
Lopez-Pellicer F.J.
Masó J.
Rodríguez-Pascual A.
Ureña-Cámara M.A.
Vilches-Blázquez L.M.
Villar-Iglesias A.
Publication venue: MDPI AG
Publication date: 01/01/2021
Field of study

The production of official statistical and geospatial data is often in the hands of highly specialized public agencies that have traditionally followed their own paths and established their own production frameworks. In this article, we present the main frameworks of these two areas and focus on the possibility of, and need for, better integration between them through the interoperability of systems, processes, and data. The statistical area is well led and has well-defined frameworks. The geospatial area does not have clear leadership, and its large number of standards establishes a framework that is not always obvious. The lack of a general and common legal framework is also highlighted. Additionally, three examples are offered: the first is the application of the spatial data quality model to the case of statistical data, the second is the application of the statistical process model to the geospatial case, and the third is the use of linked geospatial and statistical data. These examples demonstrate the possibility of transferring experiences and advances from one area to the other. In this way, we emphasize the conceptual proximity of these two areas, highlighting synergies, gaps, and potential integration. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.

Repositorio Universidad de Zaragoza

On the Role of the Common Metadata Framework in the Development of Statistics in Azerbaijan (О роли Общей системы метаданных в развитии статистики Азербайджана)

Author: A. Sultanova A.
Айнур Султанова Айдын кызы
Publication venue: Information and publishing center "Statistics of Russia"
Publication date: 30/05/2018
Field of study

The author establishes the importance of the Common Metadata Framework as adapted to the practical conditions of the Republic of Azerbaijan, and the possibility for statistical agencies to use this experience when creating national statistical metadata systems. The article formulates proposals for upgrading the structure of the Common Metadata Framework with regard to its practical application. In the author's opinion, the conclusions and proposals made in the course of this systematic study can be used both to revise the Common Metadata Framework and to develop state programs aimed at improving official statistics and the metadata development strategy within national statistical systems.

Voprosy statistiki (E-Journal) / Вопросы статистики

Capability maturity models towards improved quality of the sustainable development goals indicators data

Author: Estévez Elsa Clara
Fillottrani Pablo Rubén
Marcovecchio Ignacio
Thinyane Mamello
Publication venue
Publication date: 05/12/2022
Field of study

Achieving the Sustainable Development Goals (SDGs) demands coping with the data revolution for sustainable development: the integration of new and traditional data to produce high-quality information that is detailed, timely, and relevant for multiple purposes and to a variety of users. The quality of this information, defined by its completeness, uniqueness, timeliness, validity, accuracy, and consistency, is crucial for appropriate decision making, which in turn advances the national development imperatives for reaching the goals and targets of the sustainable development agenda. In this paper, we posit that the more mature the organizations within the national data ecosystems are, the higher the quality of the data they produce. The paper argues for the adoption and mainstreaming of organizational Capability Maturity Models within SDG activities. It also presents the preliminary formulation of a multidimensional prescriptive Capability Maturity Model to assess and improve the maturity of organizations within national data ecosystems and, therefore, the effective monitoring of progress on the SDG targets through the production of better-quality indicator data. Furthermore, the paper provides recommendations for addressing the challenges within the increasingly data-driven domain of social indicators monitoring.

Facultad de Informática

Servicio de Difusión de la Creación Intelectual

Provenance of "after the fact" harmonised community-based demographic and HIV surveillance data from ALPHA cohorts

Author: Kanjala C
Publication venue
Publication date
Field of study

Background: Metadata (data about data) for describing Health and Demographic Surveillance System (HDSS) data have often received insufficient attention. This thesis studied how to develop provenance metadata within the context of HDSS data harmonisation in the network for Analysing Longitudinal Population-based HIV/AIDS data on Africa (ALPHA). Technologies from the data documentation community were customised, among them a process model, the Generic Longitudinal Business Process Model (GLBPM); two metadata standards, the Data Documentation Initiative (DDI) and Statistical Data and Metadata eXchange (SDMX); and a language for describing data transformations, the Structured Data Transformation Language (SDTL).

Methods: A framework with three complementary facets was used: (1) creating a recipe for annotating primary HDSS data using the GLBPM and DDI; (2) approaches for documenting data transformations, comprising prospective and retrospective documentation at a business level using GLBPM and DDI, and retrospective recovery of the more granular details using SDMX and SDTL; (3) a requirements analysis for a user-friendly provenance metadata browser.

Results: A recipe for the annotation of HDSS data was created, outlining considerations to guide HDSS on metadata entry, staff training and software costs. Regarding data transformations, at a business level, a specialised process model for the HDSS domain was created. It has algorithm steps for each data transformation sub-process, together with data inputs and outputs. At a lower level, SDMX and SDTL captured about 80% (17/21) of the variable-level transformations. The requirements elicitation study yielded requirements for a provenance metadata browser to guide developers.

Conclusions: This is the first attempt at creating detailed metadata for this resource or any similar resource in this field. HDSS can implement these recipes to document their data. This will increase transparency and facilitate reuse, thus potentially bringing down the costs of data management. It will arguably promote the longevity and the wide and accurate use of these data.

LSHTM Research Online

Statistical metadata in knowledge discovery.

Author: Burke Maria
Jiménez-Ramírez C
Rodríguez-Flores I
Publication venue: Universidad Nacional de Colombia
Publication date: 01/01/2017
Field of study

Metadata represent the semantic schema of the data collected over the years by an organization in order to apply the business intelligence approach. However, the metadata normally collected are not enough to facilitate knowledge discovery processes because they are conceived primarily for interoperability between information systems. Research undertaken in this study confirmed the need to enrich data warehousing systems with structured, meaningful metadata in order to increase the productivity and efficacy of any investigation, including data management and future business analytics. This need led us to adopt and extend the concept of "statistical metadata". Thus, our proposed conceptual model of statistical metadata not only considers recognized standards but also represents additional properties. This means that our conceptual model allows increased levels of detail about the data and the quality of the semantic contents.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Winchester Research Repository

DIALNET

Universidad Nacional De Colombia - Repositorio Institucional UN

Linked Statistical Data: Relevance and Prospects (Связанные статистические данные: актуальность и перспективы)

Author: E. Yasinovskaya D.
K. Laykam E.
Yu. Akatkin M.
Е. Ясиновская Д.
К. Лайкам Э.
Ю. Акаткин М.
Publication venue: Information and Publishing Centre Statistics of Russia
Publication date: 01/05/2020
Field of study

After arguing in detail for the study's relevance, this article discusses the prospects for introducing the concept of linked open statistics produced within the framework of a single information environment that ensures efficient production, dissemination, and reuse of statistical and administrative data. The implementation of this qualitatively new concept, based on technological innovations and aimed at meeting rapidly growing user demands, is a key task of digital transformation, defined by the Government of the Russian Federation in the field of official statistics. The major part of open data concerns statistics such as demographic, economic and social indicators. Describing and presenting them in the form of linked open statistics could provide an important basis for accelerating socio-economic development by introducing new socially significant state, municipal, non-commercial and commercial services and products.

Linked Open Statistical Data (LOSD) allows analysis to be performed on a coordinated, integrated information environment as an alternative to using disparate and often contradictory data sets. National statistical institutes and government bodies in many countries, together with international organizations, have already adopted the paradigm of linked open statistics. The authors discuss the advantages of this approach, as well as its practical application in international projects.

The article presents examples and best practices of linked open statistics from a number of publications and strategic documents within the European Statistical System. It also shows that the development of linked open statistics is constrained by the lack of accessible ontologies and standards, that is, the extensions necessary to meet the requirements for classifying and managing various concepts in the statistics domain. The analysis of projects and initiatives carried out in the article reflects the possibilities and prospects of solving this problem in the field of state statistics. The authors formulate a set of recommendations based both on the analysis of international practice and on the results of their own development experience within the research project "Center of Semantic Integration".

Voprosy statistiki (E-Journal) / Вопросы статистики