2,221 research outputs found

    Extracting Linked Data from statistic spreadsheets

    Fact-checking journalists typically check the accuracy of a claim against some trusted data source. Statistic databases such as those compiled by state agencies are often used as trusted data sources, as they contain valuable, high-quality information. However, their usability is limited when they are shared in a format such as HTML or spreadsheets: this makes it hard to find the most relevant dataset for checking a specific claim, or to quickly extract from a dataset the best answer to a given query. In this work, we provide a conceptual model for the open data comprised in statistics published by INSEE, the national French economic and societal statistics institute. Then, we describe a novel method for extracting RDF Linked Open Data, to populate an instance of this model. We used our method to produce RDF data out of 20k+ Excel spreadsheets, and our validation indicates a 91% rate of successful extraction. Further, we also present a novel algorithm enabling the exploitation of such statistic tables, by (i) identifying the statistic datasets most relevant for a given fact-checking query, and (ii) extracting from each dataset the best specific (precise) query answer it may contain. We have implemented our approach and experimented on the complete corpus of statistics obtained from INSEE. Our experiments and comparisons demonstrate the effectiveness of our proposed method.

    Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management

    Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make a first step towards this vision: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine for PDM. We develop a natural set of mechanisms for flexibly representing spreadsheet data and demonstrate that identifying the optimal representation is NP-Hard; however, we develop an efficient approach to identify the optimal representation from an important and intuitive subclass of representations. We extend our mechanisms with positional access mechanisms that don't suffer from cascading update issues, leading to constant time access and modification performance. We evaluate these representations on a workload of typical spreadsheets and spreadsheet operations, providing up to 20% reduction in storage, and up to 50% reduction in formula evaluation time.

    A Contribution to Conveying Quality Criteria in Mechanical CAD Models and Assemblies through Rubrics and Comprehensive Design Intent Quantification

    This research examined the use of assembly rubrics, described how they evolved from parts rubrics, and studied how they affect student self-evaluation. Instructor assessment of students was also evaluated, finding that while the assembly rubrics were partially understood and effectively used by the students, they were more successfully utilized by the instructors. This research addressed strategies designed to improve design intent communication in CAD models, and thereby enhance their quality, with guidelines targeted at evaluating their efficiency. It is apparent that metrics directed toward the instruction of design intent are needed, since design intent transferred through CAD models can be performed at three stages with competing tradeoffs that must be balanced to arrive at the best modeling strategy. The research included the development of a validation approach showing that rubrics are valuable devices for ensuring consistent design intent communication, and are vital not only for evaluation, but also for the communication of instructor expectations. This research examined how to clearly define qualities of design intent to enable easier CAD assembly assessment. For all rubric dimensions, there is more inter-rater agreement and correlation between instructors than between instructors and students. There is strong to moderate correlation between instructors for the dimensions of validity, completeness, conciseness, and clarity, while slight correlation exists for the dimensions of consistency and design intent. Secondly, rubrics can also be described as being either static or dynamic. Static rubrics, existing in paper form only, do not provide immediate feedback to the learner. Dynamic rubrics perform calculations that provide immediate evaluative observations to the user. In addition, they can be adapted to specific situations depending on the capability of the user. Electronic rubrics are ideally suited for dynamic rubrics, and permit the use and development of both adaptable and adaptive rubrics, as described next. Thirdly, rubrics need to be adaptable, which should make them easily understood and user-friendly, and adaptive (the rubric can change itself, depending on the usage pattern). Evaluative rubrics are used when an expert determines the pedagogical progress of a learner, while formative rubrics are employed by the learners themselves, in order to chart their progress and identify scholastic deficiencies that are in need of remediation. Rubrics must be continually refined and improved, in an iterative, collaborative process, until satisfactory agreement is attained, both between raters and between raters and learners. Thus, assertion maps were developed, illustrating how the expand-contract strategy adapts the rubrics to CAD trainee progress, while assisting the understanding of the different rubric dimensions. Based on the assembly rubric experiments, the small differences between instructors suggest that the proposed assemblies rubric is sufficiently sophisticated to furnish an unbiased accumulative assessment of student performance. Accordingly, it can be confidently stated that raters can be used interchangeably without sacrificing accuracy. However, the assembly rubric possesses finite efficacy to produce formative self-evaluation of CAD assembly skills for new learners.
    Otey, JM. (2017). A Contribution to Conveying Quality Criteria in Mechanical CAD Models and Assemblies through Rubrics and Comprehensive Design Intent Quantification [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/94627

    GeoLinked Data. An application case / Un caso de aplicación

    In this paper we present the process followed to develop an application that makes use of several heterogeneous Spanish public datasets related to three themes of the INSPIRE Directive, specifically Administrative Units, Hydrography, and Statistical Units. Our application aims at analysing existing relations between the Spanish coastal area and different statistical variables such as population, unemployment, dwelling, industry, and building trade. Besides providing methodological guidelines for the generation, publishing and exploitation of Linked Data from such datasets, we provide an important innovation with respect to similar processes followed in other initiatives by dealing with the geometrical information of features.

    What young graduates earn when they leave study

    This report examines outcomes for young people who complete a qualification in the New Zealand tertiary education system, looking at differences in incomes for different types of qualifications. Overview People take tertiary education for many reasons. They think about what they enjoy, what they are good at, what they are capable of and what will get them started on a career. Good careers are associated with better health, better well-being and more satisfying lives. So many young people are making their tertiary education choices to gain the skills they need for satisfying and rewarding work. They use a range of information sources to help them make these choices. The information in this report is designed to add to the data available to young people facing those decisions. This information is not just important to students and to their families. The Government makes a very large investment in tertiary education each year – funding tertiary education providers, providing subsidised student loans and granting student allowances. One major purpose of the Government’s investment is to help improve the New Zealand economy and society by raising the level of skill in the population – which helps make our society more productive, contributes to the creation of wealth and leads to better social outcomes. Studying the earnings of graduates is one way of looking at the contribution that the tertiary education system is making to New Zealand’s society and economy. So the information in this report contributes to an understanding of the value New Zealand receives for the investment we make in tertiary education. Key findings Earnings increase with the level of qualification completed. The biggest jump in earnings is between those with qualifications below degree level and those with degrees. Earnings remain consistently higher for those with higher qualifications. 
Those with higher qualifications consistently earn more for the first seven years post study, with no sign of these benefits decreasing. Employment rates increase with level of qualification gained. For example, in the first year after study, 54 percent of young bachelors graduates who stayed in New Zealand were in employment and 40 percent were in further study. Of young people who had completed a level 1-3 certificate and stayed in New Zealand, 35 percent were in employment and 48 percent were taking more study. Very few young people who complete a qualification at diploma level or above are on a benefit in the first seven years after study. For those who stay in New Zealand, the benefit rate is 6 percent for diploma graduates and 2 percent at bachelors level in each of the first seven years after study. But it is around 14 percent for those who graduated with certificates at levels 1-3. Earnings vary considerably by field of study. Young graduates with bachelors degrees in medicine earn the most of all bachelors graduates. The median income for medical graduates is over $110,300 five years after leaving study, compared to $51,600 for all young bachelors graduates. Bachelors degree graduates in creative arts have the lowest earnings among young bachelors graduates after five years and they have relatively high rates of benefit receipt. Some qualification types and some fields are associated with high rates of further study. Around half of all young people who complete a certificate or level 5-7 diploma move into further study the next year. Around 60 percent of young bachelors graduates in natural and physical sciences who stay in New Zealand were in further study one year after completion of a bachelors degree, and 32 percent after five years. Those who complete graduate certificates and diplomas have very high employment rates. 
Employment rates are around 80 percent or just below in the first three years after study for those who have completed a graduate certificate or diploma and who remain in New Zealand. Many of these graduates have completed this qualification as a way of improving their employment prospects or are studying while in employment. The effect of the recession on the earnings of young graduates is still apparent. Although the country as a whole has pulled out of recession, the effects on young people have lingered, with graduate earnings continuing to drop in real terms compared to those reported in our first study, Moving on up, for most years after study and at almost all qualification levels. However, there are indications that the rate of decrease in earnings may have been slowing down for recent graduates by the end of the 2012 tax year.

    Phylogenetic signal in phonotactics

    Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach opens the possibility of gaining historical insights from entirely new kinds of linguistic data, in this instance statistical phonotactics. We extract phonotactic data from 111 Pama-Nyungan vocabularies and apply tests for phylogenetic signal, quantifying the degree to which the data reflect phylogenetic history. We test three datasets: (1) binary variables recording the presence or absence of biphones (two-segment sequences) in a lexicon, (2) frequencies of transitions between segments, and (3) frequencies of transitions between natural sound classes. Australian languages have been characterized as having a high degree of phonotactic homogeneity. Nevertheless, we detect phylogenetic signal in all datasets. Phylogenetic signal is greater in finer-grained frequency data than in binary data, and greatest in natural-class-based data. These results demonstrate the viability of employing a new source of readily extractable data in historical and comparative linguistics. Comment: Main text: 32 pages, 17 figures, 1 table. Supplementary Information: 17 pages, 1 figure. Code and data available at http://doi.org/10.5281/zenodo.3936353. This article is in review but not yet accepted for publication in a journal.
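    The first two kinds of data described in this abstract (biphone presence and segment-transition frequencies) can be illustrated with a minimal sketch. This is not the authors' code (theirs is at the Zenodo link given in the abstract). The function names, the toy lexicon, and the choice to normalise each transition count by the total for its first segment are all assumptions made for illustration; the abstract does not say how the frequencies are normalised.

```python
from collections import Counter


def biphones(word):
    """Return the adjacent two-segment sequences in a word (a tuple of segments)."""
    return list(zip(word, word[1:]))


def binary_presence(lexicon):
    """Set of biphones attested at least once in the lexicon (presence/absence data)."""
    return {bp for word in lexicon for bp in biphones(word)}


def transition_frequencies(lexicon):
    """Relative frequency of each biphone among all biphones sharing its first segment.

    Assumption: normalising by the first segment is one plausible reading of
    "frequencies of transitions between segments", not the paper's stated definition.
    """
    counts = Counter(bp for word in lexicon for bp in biphones(word))
    totals = Counter()
    for (first, _second), n in counts.items():
        totals[first] += n
    return {bp: n / totals[bp[0]] for bp, n in counts.items()}


# Hypothetical toy lexicon; each word is a tuple of segments.
toy_lexicon = [("m", "a", "n", "a"), ("n", "a", "m", "a"), ("m", "i", "n", "a")]

presence = binary_presence(toy_lexicon)
freqs = transition_frequencies(toy_lexicon)
```

    Representing words as tuples of segments rather than strings keeps multi-character segments (digraphs, for example) intact. The phylogenetic-signal tests themselves are outside the scope of this sketch.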

    Geographies of landscape aesthetics : mapping landscape terminology in digitised historical travel accounts of Loch Lomond and the Trossachs

    Acknowledgements. Ogg: Conceptualisation, Methodology, Formal Analysis, Investigation, Visualisation, Writing – Original Draft Preparation. Wartmann: Conceptualisation, Supervision, Writing – Review & Editing. Peer reviewed.