It's Public Knowledge: The National Digital Archive of Datasets

Author: Sleeman Patricia
Publication venue: Association of Canadian Archivists
Publication date: 01/01/2004

This article describes the history and development of the National Digital Archive of Datasets, a service run by the University of London Computer Centre for the National Archives of England. It discusses the project in light of the context in which it emerged in the 1990s, its departure in approach from traditional data archives, and the range of archival functions. Finally, it offers reflections on the project as a whole.

Archivaria

Archivaria - the journal of the Association of Canadian Archivists (ACA)

Harvesting Entities from the Web Using Unique Identifiers -- IBEX

Author: Banko M.
Baumgartner R.
Crescenzi V.
Freitag D.
Nakashole N.
Probst K.
Putthividhya D.
Talaika A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 04/05/2015

In this paper we study the prevalence of unique entity identifiers on the Web. These are, e.g., ISBNs (for books), GTINs (for commercial products), DOIs (for documents), email addresses, and others. We show how these identifiers can be harvested systematically from Web pages, and how they can be associated with human-readable names for the entities at large scale. Starting with a simple extraction of identifiers and names from Web pages, we show how we can use the properties of unique identifiers to filter out noise and clean up the extraction result on the entire corpus. The end result is a database of millions of uniquely identified entities of different types, with an accuracy of 73-96% and a very high coverage compared to existing knowledge bases. We use this database to compute novel statistics on the presence of products, people, and other entities on the Web. Comment: 30 pages, 5 figures, 9 tables. Complete technical report for A. Talaika, J. A. Biega, A. Amarilli, and F. M. Suchanek. IBEX: Harvesting Entities from the Web Using Unique Identifiers. WebDB workshop, 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Grids and the Virtual Observatory

Author: Williams Roy
Publication venue: Royal College of Obstetricians & Gynaecologists (RCOG)
Publication date: 01/01/2003

We consider several projects from astronomy that benefit from the Grid paradigm and associated technology, many of which involve either massive datasets or the federation of multiple datasets. We cover image computation (mosaicking, multi-wavelength images, and synoptic surveys); database computation (representation through XML, data mining, and visualization); and semantic interoperability (publishing, ontologies, directories, and service descriptions).

Caltech Authors

Exploring The Nature Of The Co-emergence Of Students’ Representational Fluency And Functional Thinking

Author: Altindis Nigar
Publication venue: SURFACE at Syracuse University
Publication date: 22/05/2021

In this dissertation, I explore ways to support secondary school students’ meaningful understanding of quadratic functions. Specifically, I investigate how students co-developed representational fluency (RF) and functional thinking (FT) when they gained meaningful understanding of quadratic functions. I also characterize students’ co-emergence of RF and FT on each representation (e.g., a graph, a symbolic equation, and a table) and across multiple representations. To accomplish these goals, I employed a design research methodology: a teaching experiment with eight Turkish-American secondary school students in an after-school context at a Turkish Community Center. I constructed the design principles and design elements for the study by networking two distinct domains of literature (representations and quantitative reasoning) to support students’ meaningful learning. I conducted ongoing and retrospective analyses on the enhanced transcriptions of small- and whole-group interactions. The analyses revealed a learning-ecology framework that supported secondary school students’ meaningful understanding of quadratic functions. The learning-ecology framework consisted of three components: enacted task characteristics, teacher pedagogical moves, and socio-mathematical norms. Furthermore, the findings showed that students employed two types of reasoning when they created and connected representations of quantities and the relationships between them: static thinking and lateral thinking. Static thinking is recalling a learned fact to represent a quantitative relationship with no attention to how quantities covary on a representation, while lateral thinking is a creative way of thinking wherein students conceive of concrete representations of functions as an emergent quantitative relationship.
The findings also showed that students’ co-emergence of RF and FT can be operationalized into four levels, ordered from less sophisticated to more sophisticated reasoning. Level 0 is a disconnection, level 1 is a partial connection, level 2 is a connection, and level 3 is a flexible connection between students’ RF and FT. The dissertation informs teachers and the mathematics education community by (a) reporting and verifying the learning-ecology framework that supported students’ meaningful understanding of quadratic functions; and (b) characterizing students’ co-emergence of RF and FT within and across multiple representations.

Syracuse University Research Facility and Collaborative Environment

Exploring the Nature of the Co-emergence of Students’ Representational Fluency and Functional Thinking

Author: Altindis Nigar
Publication venue: SURFACE at Syracuse University
Publication date: 23/05/2021

In this dissertation, I explore ways to support secondary school students’ meaningful understanding of quadratic functions. Specifically, I investigate how students co-developed representational fluency (RF) and functional thinking (FT) when they gained meaningful understanding of quadratic functions. I also characterize students’ co-emergence of RF and FT on each representation (e.g., a graph, a symbolic equation, and a table) and across multiple representations. To accomplish these goals, I employed a design research methodology: a teaching experiment with eight Turkish-American secondary school students in an after-school context at a Turkish Community Center. I constructed the design principles and design elements for the study by networking two distinct domains of literature (representations and quantitative reasoning) to support students’ meaningful learning. I conducted ongoing and retrospective analyses on the enhanced transcriptions of small- and whole-group interactions. The analyses revealed a learning-ecology framework that supported secondary school students’ meaningful understanding of quadratic functions. The learning-ecology framework consisted of three components: enacted task characteristics, teacher pedagogical moves, and socio-mathematical norms. Furthermore, the findings showed that students employed two types of reasoning when they created and connected representations of quantities and the relationships between them: static thinking and lateral thinking. Static thinking is recalling a learned fact to represent a quantitative relationship with no attention to how quantities covary on a representation, while lateral thinking is a creative way of thinking wherein students conceive of concrete representations of functions as an emergent quantitative relationship.
The findings also showed that students’ co-emergence of RF and FT can be operationalized into four levels, ordered from less sophisticated to more sophisticated reasoning. Level 0 is a disconnection, level 1 is a partial connection, level 2 is a connection, and level 3 is a flexible connection between students’ RF and FT. The dissertation informs teachers and the mathematics education community by (a) reporting and verifying the learning-ecology framework that supported students’ meaningful understanding of quadratic functions; and (b) characterizing students’ co-emergence of RF and FT within and across multiple representations.

Syracuse University Research Facility and Collaborative Environment

Inferring Tabular Analysis Metadata by Infusing Distribution and Knowledge Information

Author: Han Shi
He Xinyi
Li Tianle
Lv Xiao
Shao Yijia
Xu Jialiang
Yuan Zejian
Zhang Dongmei
Zhou Mengyu
Publication venue
Publication date: 02/09/2022

Many data analysis tasks rely heavily on a deep understanding of tables (multi-dimensional data). Across these tasks, there exist commonly used metadata attributes of table fields / columns. In this paper, we identify four such analysis metadata: measure/dimension dichotomy, common field roles, semantic field type, and default aggregation function. Inferring these metadata faces three challenges: insufficient supervision signals, utilizing existing knowledge, and understanding distribution. To infer these metadata for a raw table, we propose our multi-tasking Metadata model, which fuses field distribution and knowledge graph information into pre-trained tabular models. For model training and evaluation, we collect a large corpus (~582k tables from private spreadsheet and public tabular datasets) of analysis metadata by using diverse smart supervisions from downstream tasks. Our best model has accuracy = 98%, hit rate at top-1 > 67%, accuracy > 80%, and accuracy = 88% for the four analysis metadata inference tasks, respectively. It outperforms a series of baselines that are based on rules, traditional machine learning methods, and pre-trained tabular models. Analysis metadata models are deployed in a popular data analysis product, helping downstream intelligent features such as insights mining, chart / pivot table recommendation, and natural language QA... Comment: 13 pages, 7 figures, 9 tables

arXiv.org e-Print Archive

Knowledge Graphs 2021: A Data Odyssey

Author: Weikum G.
Publication venue
Publication date: 01/01/2021

MPG.PuRe