6,644 research outputs found

    It's Public Knowledge: The National Digital Archive of Datasets

    This article describes the history and development of the National Digital Archive of Datasets, a service run by the University of London Computer Centre for the National Archives of England. It discusses the project in light of the context in which it emerged in the 1990s, its departure in approach from traditional data archives, and the range of archival functions. Finally, it offers reflections on the project as a whole.

    Harvesting Entities from the Web Using Unique Identifiers -- IBEX

    In this paper we study the prevalence of unique entity identifiers on the Web. These are, e.g., ISBNs (for books), GTINs (for commercial products), DOIs (for documents), email addresses, and others. We show how these identifiers can be harvested systematically from Web pages, and how they can be associated with human-readable names for the entities at large scale. Starting with a simple extraction of identifiers and names from Web pages, we show how we can use the properties of unique identifiers to filter out noise and clean up the extraction result on the entire corpus. The end result is a database of millions of uniquely identified entities of different types, with an accuracy of 73--96% and a very high coverage compared to existing knowledge bases. We use this database to compute novel statistics on the presence of products, people, and other entities on the Web. Comment: 30 pages, 5 figures, 9 tables. Complete technical report for A. Talaika, J. A. Biega, A. Amarilli, and F. M. Suchanek. IBEX: Harvesting Entities from the Web Using Unique Identifiers. WebDB workshop, 201

    Grids and the Virtual Observatory

    We consider several projects from astronomy that benefit from the Grid paradigm and associated technology, many of which involve either massive datasets or the federation of multiple datasets. We cover image computation (mosaicking, multi-wavelength images, and synoptic surveys); database computation (representation through XML, data mining, and visualization); and semantic interoperability (publishing, ontologies, directories, and service descriptions).

    Exploring The Nature Of The Co-emergence Of Students’ Representational Fluency And Functional Thinking

    In this dissertation, I explore ways to support secondary school students’ meaningful understanding of quadratic functions. Specifically, I investigate how students co-developed representational fluency (RF) and functional thinking (FT) as they gained meaningful understanding of quadratic functions. I also characterize students’ co-emergence of RF and FT on each representation (e.g., a graph, a symbolic equation, and a table) and across multiple representations. To accomplish these goals, I employed a design research methodology: a teaching experiment with eight Turkish-American secondary school students in an after-school context at a Turkish Community Center. I constructed the design principles and design elements for the study by networking two distinct domains of literature (representations and quantitative reasoning) to support students’ meaningful learning. I conducted ongoing and retrospective analyses on the enhanced transcriptions of small- and whole-group interactions. The analyses revealed a learning-ecology framework that supported secondary school students’ meaningful understanding of quadratic functions. The learning-ecology framework consisted of three components: enacted task characteristics, teacher pedagogical moves, and socio-mathematical norms. Furthermore, the findings showed that students employed two types of reasoning when they created and connected representations of quantities and the relationships between them: static thinking and lateral thinking. Static thinking is recalling a learned fact to represent a quantitative relationship with no attention to how quantities covary on a representation, while lateral thinking is a creative way of thinking wherein students conceive of concrete representations of functions as an emergent quantitative relationship.
The findings also showed that students’ co-emergence of RF and FT can be operationalized into four levels, ranging from less sophisticated to more sophisticated reasoning. Level 0 is a disconnection, level 1 is a partial connection, level 2 is a connection, and level 3 is a flexible connection between students’ RF and FT. The dissertation informs teachers and the mathematics education community by (a) reporting and verifying the learning-ecology framework that supported students’ meaningful understanding of quadratic functions; and (b) characterizing students’ co-emergence of RF and FT within and across multiple representations.

    Inferring Tabular Analysis Metadata by Infusing Distribution and Knowledge Information

    Many data analysis tasks heavily rely on a deep understanding of tables (multi-dimensional data). Across the tasks, there exist commonly used metadata attributes of table fields / columns. In this paper, we identify four such analysis metadata: measure/dimension dichotomy, common field roles, semantic field type, and default aggregation function. Inferring these metadata faces three challenges: insufficient supervision signals, utilizing existing knowledge, and understanding distribution. To infer these metadata for a raw table, we propose our multi-tasking Metadata model, which fuses field distribution and knowledge graph information into pre-trained tabular models. For model training and evaluation, we collect a large corpus (~582k tables from private spreadsheet and public tabular datasets) of analysis metadata by using diverse smart supervisions from downstream tasks. Our best model has accuracy = 98%, hit rate at top-1 > 67%, accuracy > 80%, and accuracy = 88% for the four analysis metadata inference tasks, respectively. It outperforms a series of baselines that are based on rules, traditional machine learning methods, and pre-trained tabular models. Analysis metadata models are deployed in a popular data analysis product, helping downstream intelligent features such as insights mining, chart / pivot table recommendation, and natural language QA... Comment: 13 pages, 7 figures, 9 tables

    Knowledge Graphs 2021: A Data Odyssey