    Work in progress: Data explorer - Assessment data integration, analytics, and visualization for STEM education research

    Citation: Weese, J. L., & Hsu, W. H. (2016). Work in progress: Data explorer - Assessment data integration, analytics, and visualization for STEM education research.

    We describe a comprehensive system for comparative evaluation of uploaded and preprocessed data in physics education research, with applicability to standardized assessments for discipline-based education research, especially in science, technology, engineering, and mathematics. Views are provided for inspecting aggregate statistics about student scores, comparing over time within one course, or comparing across multiple years. The design of this system includes a search facility for retrieving anonymized data from classes similar to the uploader's own. The visualizations include tracking of student performance on a range of standardized assessments. These assessments can be viewed as pre- and post-tests with comparative statistics (e.g., normalized gain), decomposed by answer in the case of multiple-choice questions, and manipulated using pre-specified data transformations such as aggregation and refinement (drill-down and roll-up). Furthermore, the system is designed to incorporate a scalable framework for machine-learning-based analytics, including clustering and similarity-based retrieval, time series prediction, and probabilistic reasoning. © American Society for Engineering Education, 2016.
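    One of the comparative statistics named above, normalized gain, measures the fraction of the possible pre-to-post improvement that students actually realized. A minimal sketch of the computation (the function name and the 0-100 percentage scale are illustrative assumptions, not details of the described system):

        def normalized_gain(pre_pct, post_pct):
            """Normalized gain: share of the available headroom
            (100 - pre) actually gained between pre- and post-test.
            Scores are class-average percentages in [0, 100]."""
            if pre_pct >= 100:  # no headroom left; gain is undefined
                return None
            return (post_pct - pre_pct) / (100.0 - pre_pct)

        # Example: class averages of 42% pre and 68% post
        g = normalized_gain(42.0, 68.0)  # (68 - 42) / (100 - 42) ≈ 0.448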

    Data Integration for Open Data on the Web

    In this lecture we discuss and introduce the challenges of integrating openly available Web data and how to solve them. First, while we approach this topic from the viewpoint of Semantic Web research, not all data is readily available as RDF or Linked Data, so we give an introduction to the different data formats prevalent on the Web, namely the standard formats for publishing and exchanging tabular, tree-shaped, and graph data. Second, not all Open Data is truly open, so we discuss and address issues around licences and terms of use associated with Open Data, as well as documentation of data provenance. Third, we discuss (meta-)data quality issues associated with Open Data on the Web and how Semantic Web techniques and vocabularies can be used to describe and remedy them. Fourth, we address issues of searchability and integration of Open Data and discuss to what extent semantic search can help to overcome them. We close by briefly summarizing further issues not covered explicitly herein, such as multilinguality, temporal aspects (archiving, evolution, temporal querying), and how, or whether, OWL and RDFS reasoning on top of integrated Open Data could help.
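    To make the first of these points concrete, the sketch below lifts a tabular Open Data file (CSV) into graph data (RDF) using the rdflib library. The namespace, file name, and one-resource-per-row modelling are illustrative assumptions, not a prescribed mapping:

        import csv
        from rdflib import RDF, Graph, Literal, Namespace

        EX = Namespace("http://example.org/")  # hypothetical namespace

        def csv_to_rdf(path):
            """Lift a CSV file into an RDF graph: one resource per row,
            one triple per cell (assumes URL-safe column names)."""
            g = Graph()
            with open(path, newline="", encoding="utf-8") as f:
                for i, row in enumerate(csv.DictReader(f)):
                    subject = EX[f"row/{i}"]
                    g.add((subject, RDF.type, EX.Record))
                    for column, value in row.items():
                        g.add((subject, EX[column], Literal(value)))
            return g

        # g = csv_to_rdf("open_data.csv")
        # print(g.serialize(format="turtle"))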

    A decision support system for eco-efficient biorefinery process comparison using a semantic approach

    Enzymatic hydrolysis of the main components of lignocellulosic biomass is one of the promising methods for upgrading it further into biofuels. Biomass pre-treatment is an essential step to reduce cellulose crystallinity, increase surface area and porosity, and separate the major constituents of biomass. Scientific literature in this domain is growing fast and could be a valuable source of data. As these abundant scientific data are mostly in textual format and heterogeneously structured, using them to compute biomass pre-treatment efficiency is not straightforward. This paper presents the implementation of a Decision Support System (DSS) based on an original pipeline coupling knowledge engineering (KE) grounded in semantic web technologies, soft computing techniques, and environmental factor computation. The DSS makes it possible to use data found in the literature to assess the environmental sustainability of biorefinery systems. The pipeline makes it possible to: (1) structure and integrate relevant experimental data, (2) assess data source reliability, and (3) compute and visualize green indicators that take into account data imprecision and source reliability. This pipeline was made possible by innovative research on the coupling of ontologies with uncertainty management and propagation. In this first version, data acquisition is done by experts and facilitated by a termino-ontological resource; data source reliability assessment is based on domain knowledge and is also done by experts. The operational prototype has been used by field experts on a realistic use case (rice straw), and the results obtained have validated the usefulness of the system. Further work will address the question of a higher level of automation for data acquisition and data source reliability assessment.
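    As a rough illustration of the uncertainty handling described above, the sketch below represents imprecise literature values as intervals and combines them under expert-assigned source reliabilities. This is a simplified stand-in assuming weighted-interval aggregation; the actual DSS relies on ontologies and fuzzy uncertainty propagation not reproduced here, and all names and numbers below are assumptions:

        from dataclasses import dataclass

        @dataclass
        class Measurement:
            low: float          # lower bound of the imprecise value
            high: float         # upper bound of the imprecise value
            reliability: float  # expert-assigned source reliability in [0, 1]

        def aggregate_indicator(measurements):
            """Reliability-weighted aggregation of interval-valued data:
            more reliable sources pull the indicator interval toward them."""
            total = sum(m.reliability for m in measurements)
            low = sum(m.low * m.reliability for m in measurements) / total
            high = sum(m.high * m.reliability for m in measurements) / total
            return low, high

        # Two literature values for the same pre-treatment green indicator
        values = [Measurement(0.60, 0.70, 0.9), Measurement(0.50, 0.80, 0.4)]
        print(aggregate_indicator(values))  # ≈ (0.569, 0.731)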

    Bringing computational thinking to K-12 and higher education

    Doctor of Philosophy, Department of Computer Science, William H. Hsu.

    Since the introduction of new curriculum standards in K-12 schools, computational thinking has become a major research area. Creating and delivering content to enhance computational thinking skills, as well as evaluating them, remain open problems. This work describes several interventions based on the Scratch programming language aimed at improving student self-efficacy in computer science and computational thinking. These interventions were applied at a STEM outreach program for 5th-9th grade students. Previous experience in STEM-related activities and subjects, as well as student self-efficacy, were measured using a purpose-built pre- and post-survey. The impact of these interventions on student performance and confidence, as well as the validity of the instrument, are discussed. To complement the attitude surveys, a translation of Scratch to Blockly is proposed, which will record student programming behaviors for quantitative analysis of computational thinking in support of student self-efficacy. Outreach work with Kansas Starbase, as well as the Girl Scouts of the USA, is also described and evaluated. A key goal for computational thinking in the past 10 years has been to bring computer science to other disciplines. To test the gap between computer science and STEM, computational thinking exercises were embedded in an electromagnetic fields course. Integrating computation into theory courses in physics has been a curricular need, yet there are many difficulties and obstacles to overcome when integrating with existing curricula and programs. Recommendations from this experimental study are given toward making the integration of CT into physics a reality. As part of a continuing collaboration with physics, a comprehensive system for automated extraction of assessment data for descriptive analytics and visualization is also described.
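    To illustrate the kind of data the proposed Scratch-to-Blockly instrumentation could produce, the sketch below logs timestamped block-editing events for later quantitative analysis. The schema and field names are hypothetical, not the dissertation's actual format:

        import json
        import time

        def log_block_event(log, student_id, action, block_type):
            """Append one timestamped editing event (a block being
            created, moved, or deleted) to an in-memory event log."""
            log.append({
                "student": student_id,
                "action": action,     # "create" | "move" | "delete"
                "block": block_type,  # e.g., "controls_repeat"
                "timestamp": time.time(),
            })

        events = []
        log_block_event(events, "s01", "create", "controls_repeat")
        log_block_event(events, "s01", "move", "controls_repeat")
        print(json.dumps(events, indent=2))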

    Robust Entity Linking in Heterogeneous Domains

    Entity Linking is the task of mapping terms in arbitrary documents to entities in a knowledge base by identifying the correct semantic meaning. It is applied in the extraction of structured data in RDF (Resource Description Framework) from textual documents, but equally in facilitating artificial intelligence applications such as Semantic Search, Reasoning, and Question Answering. Most existing Entity Linking systems were optimized for specific domains (e.g., general domain, biomedical domain), knowledge base types (e.g., DBpedia, Wikipedia), or document structures (e.g., tables) and types (e.g., news articles, tweets). This has led to very specialized systems that lack robustness and are applicable only to very specific tasks. Accordingly, this work focuses on the research and development of an Entity Linking system that is robust across domains, knowledge base types, and document structures and types. To create such a system, we first analyze the following three crucial components of an Entity Linking algorithm with respect to robustness criteria: (i) the underlying knowledge base, (ii) the entity relatedness measure, and (iii) the textual context matching technique. Based on the analyzed components, our scientific contributions are three-fold. First, we show that a federated approach leveraging knowledge from various knowledge base types can significantly improve robustness in Entity Linking systems. Second, we propose a new state-of-the-art, robust entity relatedness measure for topical coherence computation based on semantic entity embeddings. Third, we present the neural-network-based approach Doc2Vec as a textual context matching technique for robust Entity Linking. Building on these findings, our main contribution in this work is DoSeR (Disambiguation of Semantic Resources). DoSeR is a robust, knowledge-base-agnostic Entity Linking framework that extracts relevant entity information from multiple knowledge bases in a fully automatic way. The integrated algorithm is a collective, graph-based approach that utilizes semantic entity and document embeddings for entity relatedness and textual context matching computation. Our evaluation shows that DoSeR achieves state-of-the-art results over a wide range of document structures (e.g., tables), document types (e.g., news documents), and domains (e.g., general domain, biomedical domain). In this context, DoSeR outperforms all other (publicly available) Entity Linking algorithms on most data sets.
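    In the spirit of the Doc2Vec-based context matching described above (a sketch under simplifying assumptions, not the DoSeR implementation), candidate entities can be ranked by the similarity between an inferred context vector and embeddings of entity descriptions. The toy entity descriptions and model parameters below are assumptions:

        from gensim.models.doc2vec import Doc2Vec, TaggedDocument

        # Toy entity descriptions standing in for a knowledge base
        entity_docs = {
            "dbpedia:Apple_Inc.": "technology company iphone mac cupertino",
            "dbpedia:Apple": "fruit tree rosaceae orchard cultivar",
        }

        corpus = [TaggedDocument(text.split(), [eid])
                  for eid, text in entity_docs.items()]
        model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=100)

        # Embed the mention context and rank candidates by cosine similarity
        context = "the company released a new phone".split()
        vec = model.infer_vector(context)
        for eid, score in model.dv.most_similar([vec], topn=2):
            print(eid, round(score, 3))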