1,358 research outputs found

    Symbolic and Visual Retrieval of Mathematical Notation using Formula Graph Symbol Pair Matching and Structural Alignment

    Large data collections containing millions of math formulae in different formats are available online. Retrieving math expressions from these collections is challenging. We propose a framework for retrieval of mathematical notation using symbol pairs extracted from visual and semantic representations of mathematical expressions on the symbolic domain for retrieval of text documents. We further adapt our model for retrieval of mathematical notation in images and lecture videos. Graph-based representations are used on each modality to describe math formulas. For symbolic formula retrieval, where the structure is known, we use symbol layout trees and operator trees. For image-based formula retrieval, since the structure is unknown, we use a more general Line of Sight graph representation. Paths in these graphs define symbol-pair tuples that are used as the entries of our inverted index of mathematical notation. Our retrieval framework uses a three-stage approach: a fast selection of candidates in the first stage; a more detailed matching algorithm with similarity metric computation in the second stage; and, when relevance assessments are available, an optional third stage that uses linear regression over multiple similarity scores to estimate relevance for final re-ranking. Our model has been evaluated using large collections of documents, and preliminary results are presented for videos and cross-modal search. The proposed framework can be adapted for other domains, such as chemistry or technical diagrams, where two visually similar elements from a collection are usually related to each other.
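    The first-stage idea in the abstract above (an inverted index keyed by symbol pairs, used for fast candidate selection) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, for simplicity, that each formula is a flat left-to-right list of symbols rather than a symbol layout tree, operator tree, or Line of Sight graph, and it ranks candidates with a Dice-style overlap score chosen here for illustration. The names `symbol_pairs`, `build_index`, `search`, and `max_dist` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def symbol_pairs(symbols, max_dist=2):
    """Return the set of (first, second, distance) tuples for symbols
    that lie at most max_dist positions apart (an assumed simplification
    of paths in a formula graph)."""
    pairs = set()
    for (i, a), (j, b) in combinations(enumerate(symbols), 2):
        if j - i <= max_dist:
            pairs.add((a, b, j - i))
    return pairs


def build_index(formulas):
    """Map every symbol pair to the ids of the formulas containing it."""
    index = defaultdict(set)
    for fid, symbols in formulas.items():
        for pair in symbol_pairs(symbols):
            index[pair].add(fid)
    return index


def search(index, formulas, query, top_k=3):
    """Select candidates sharing at least one pair with the query and
    rank them by a Dice-style overlap of their pair sets."""
    q_pairs = symbol_pairs(query)
    hits = Counter()
    for pair in q_pairs:
        for fid in index.get(pair, ()):
            hits[fid] += 1
    scored = []
    for fid, matched in hits.items():
        total = len(q_pairs) + len(symbol_pairs(formulas[fid]))
        scored.append((2 * matched / total, fid))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored[:top_k]


# Hypothetical toy collection, for illustration only.
formulas = {
    "f1": ["x", "^", "2", "+", "y"],
    "f2": ["x", "+", "y"],
    "f3": ["a", "*", "b"],
}
index = build_index(formulas)
results = search(index, formulas, ["x", "+", "y"])
```

    In this toy collection the exact match `f2` ranks first and `f3`, which shares no pair with the query, is never scored. A fuller system would replace the flat sequence with graph paths and add the detailed matching and re-ranking stages the abstract describes.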

    Improving dimensionality reduction projections for data visualization

    In data science and visualization, dimensionality reduction techniques have been extensively employed for exploring large datasets. These techniques involve the transformation of high-dimensional data into reduced versions, typically in 2D, with the aim of preserving significant properties from the original data. Many dimensionality reduction algorithms exist, and nonlinear approaches such as the t-SNE (t-Distributed Stochastic Neighbor Embedding) and UMAP (Uniform Manifold Approximation and Projection) have gained popularity in the field of information visualization. In this paper, we introduce a simple yet powerful manipulation for vector datasets that modifies their values based on weight frequencies. This technique significantly improves the results of the dimensionality reduction algorithms across various scenarios. To demonstrate the efficacy of our methodology, we conduct an analysis on a collection of well-known labeled datasets. The results demonstrate improved clustering performance when attempting to classify the data in the reduced space. Our proposal presents a comprehensive and adaptable approach to enhance the outcomes of dimensionality reduction for visual data exploration. This research was funded by PID2021-122136OB-C21 from the Ministerio de Ciencia e Innovación, Spain, by 839 FEDER (EU) funds. Peer Reviewed. Postprint (published version).

    Bridging the gap between structures and properties: An investigation and evaluation of students' representational competence

    The heart of learning chemistry is the ability to connect a compound's structure to its function; Lewis structures provide an essential link in this process. In many cases, their construction is taught using an algorithmic approach, containing a set of step-by-step rules. We believe that this approach is in direct conflict with the precepts of meaningful learning. From a sequential, mixed methods study, we found that students have much difficulty constructing these structures and that the step-by-step rules do not make use of students' relevant prior knowledge. This causes students to develop 'home grown' rules when unsure of how to progress with the construction process. It also became clear that most students are uncertain of the importance of Lewis structures since they perceive them as being useful only for obtaining structural information but not property information. Using responses from student interviews and open-ended questions, the Information from Lewis Structures Survey (ILSS) was developed, validated, and found reliable to assess students' representational competence by determining their understanding of the purpose of Lewis structures. Since students had many problems with the relationship of structures and properties, an alternative curriculum was evaluated to determine if it could help students develop a more meaningful understanding of this process. This instruction was part of a larger NSF-funded general chemistry curriculum redesign called Chemistry, Life, the Universe and Everything (CLUE). Using a control and treatment group, the effectiveness of this new curriculum was evaluated for two main aspects: 1. the students' ability to construct Lewis structures using OrganicPad and 2. the students' representational competence using the ILSS.
Through four main studies (a pilot study, instructor effect study, main study, and retention study), we found that the CLUE curriculum helps students develop more expert-like strategies for constructing Lewis structures and a better understanding of why these structures are important by encouraging more meaningful learning.

    Uma abordagem de processamento de linguagem natural para avaliação de complexidade em literatura médica do século XVIII

    In this paper, we present an experiment for complexity-level analysis of Portuguese texts from the 18th century using NLP tools. The 18th century was the time for the realization of a new world that had been built since the Renaissance; it was the period of consolidation of many of the current sciences. One of its characteristics is the presentation of scientific written records in national languages, rather than Latin, and the expressed wish that specialized texts could be more understandable to people of lesser erudition. As such, we intend to help identify whether and how these wishes were fulfilled. To achieve this goal, we resort to an NLP-supported methodology to detect degrees of complexity in a medical work of this time period, and compare it with two other works hypothesized to have lesser and greater complexity. By using NILC-Metrix, we intend to identify features of a continuum of complexity in this kind of document.

    Experimental-Teaching: ‘Help-sheet’ in Examination of Engineering-students

    The-purpose of this-unfunded, miniature-study is to-examine the-potentials of student-created ‘help-sheets’ and attitudes of undergraduate-students, towards the-sheets, used, at university-examinations, at school of Engineering. A-specifically-designed-experiment, a-survey, and a-document-analysis, were used, as main-instruments, for this-study. A-paired t-test was run on a-sample of 24 students, to-determine whether there was a-statistically-significant mean-difference, between the-student-performances at the-CAT#1 (where ‘help-sheets’ were-used) and: (1) CAT #2, where ‘help-sheets’ were not used; (2) final-exam; and (3) student-average weighted-mean-score, for the-previous-year. Moreover, unpaired-t-test was-employed, to-compare performance, between the-students, who used ‘help-sheet’ (in CAT #1) and these who did not, assuming unequal-variances. Mean; Standard-Deviation (SD); and Standard-Error of the-Mean (SEM) were calculated via Minitab 17.3.1. This-study revealed vast-diversity, in the-quality and composition, of student-created ‘help-sheets’. Moreover, positive-attitudes towards ‘help-sheets’, were identified, in-particular: 88% of the-class have-prepared and utilized their-‘help-sheets’ for the-experiment; 76% reported to-be less-nervous, than usual; 95% agreed, that the-use of ‘help-sheet’ was-beneficial; and 81% confirmed, that they would-like to-use the-same-approach, in other-subjects.  Comparisons of student performance indicated, that the-preparation and use, of student-created ‘help-sheets’ have no impact on student-performance. Academic-performance, however, is just one-of the-many variables, potentially influenced, by the-use of ‘help-sheets’. As-such, the-research-findings show students self-reported reduction of test-anxiety; moreover cheating at-examinations, being-considered as pervasive-practice, at-the school, was not observed, during this-experiment. 
The-main-recommendations, of the-study were: (1) to-use ‘help-sheets’ in-examinations, on the-grounds that they potentially-reduce both; test-anxiety, and cheating, at-examinations; (2) to-deal with test-anxiety, lecturers should-help students, mastering-it, by self regulation relaxation-techniques; and (3) specific-areas, for future (more-deeper)-research, were identified. Moreover, to-give a-broader-reflection on the-subject-matter, the-following-topics were-also elaborated upon: Traditional examination-modes: ‘closed-book’ vs. ‘open-book’; Alternative-examination-approach: student created reference-material (‘help-sheet’); Cheating, at-exams, at local-context; and Anxiety (concepts, types, mechanism, and consequences; test-anxiety; and self-regulating relaxation-techniques). The-author trusts, findings of this-study, in-conjunction with theoretical-background, given,  adds to-the-body of knowledge, on experimental-teaching, particularly, on the-use of student-prepared reference-materials, such-as ‘help-sheets’, at university-examinations. The-results of the-experiment can also-help university-lecturers decide, whether to-allow their-students to-use ‘help-sheets’. Keywords: ‘cheat-sheet’, test anxiety, exam type, exam performance.
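
The paired t-test mentioned in the abstract above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's Minitab analysis. The scores are hypothetical placeholders, not data from the study, and the helper name `paired_t` is an assumption made for this example.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    Each student contributes one score to `first` and one to `second`;
    the test is run on the per-student differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample standard deviation (SD)
    sem = sd_diff / math.sqrt(n)        # standard error of the mean (SEM)
    return mean_diff / sem, n - 1


# Hypothetical scores for four students, for illustration only.
with_sheet = [12, 15, 11, 14]
without_sheet = [10, 14, 12, 11]
t_stat, dof = paired_t(with_sheet, without_sheet)
```

The resulting statistic is compared against a t distribution with `dof` degrees of freedom to judge significance. The sketch assumes the differences are not all identical, since a zero standard deviation would make the statistic undefined.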

    Advances in Manipulation and Recognition of Digital Ink

    Handwriting is one of the most natural ways for a human to record knowledge. Recently, this type of human-computer interaction has received increasing attention due to the rapid evolution of touch-based hardware and software. While hardware support for digital ink has reached maturity, algorithms for recognition of handwriting in certain domains, including mathematics, lack robustness. At the same time, users may possess several pen-based devices, and sharing training data in an adaptive recognition setting can be challenging. In addition, the resolution of pen-based devices keeps improving, making the ink cumbersome to process and store. This thesis develops several advances for efficient processing, storage and recognition of handwriting, which are applicable to classification methods based on functional approximation. In particular, we propose improvements to the classification of isolated characters and groups of rotated characters, as well as symbols of substantially different size. We then develop an algorithm for adaptive classification of a user's handwritten mathematical characters. The adaptive algorithm can be especially useful in the cloud-based recognition framework, which is described further in the thesis. We investigate whether the training data available in the cloud can be useful to a new writer during the training phase by extracting styles of individuals with similar handwriting and recommending styles to the writer. We also perform factorial analysis of the algorithm for recognition of n-grams of rotated characters. Finally, we show a fast method for compression of linear pieces of handwritten strokes and compare it with an enhanced version of the algorithm based on functional approximation of strokes. Experimental results demonstrate the validity of the theoretical contributions, which form a solid foundation for the next generation of handwriting recognition systems.

    SciTech News Volume 71, No. 3 (2017)

    Columns and Reports: From the Editor. Division News: Science-Technology Division; Chemistry Division; Conference Report, Marion E. Sparks Professional Development Award Recipient; Engineering Division; Engineering Division Award, Winners Reflect on their Conference Experience; Aerospace Section of the Engineering Division; Architecture, Building Engineering, Construction, and Design Section of the Engineering Division. Reviews: Sci-Tech Book News Reviews. Advertisements: IEEE.

    Final Report and Recommendations of the Data Rescue Project at the National Agricultural Library

    The National Agricultural Library (NAL) identified a need for a framework of guidance to support rapid appraisal and processing for scientific researchers’ collections after being offered collections of scientific data and data-rich materials that required immediate appraisal before acquisition. This Report and accompanying Data Rescue Processing Guide document the work and scholarship of the Data Rescue Project. National Agricultural Library

    Non-Visual Representation of Complex Documents for Use in Digital Talking Books

    Essential written information such as textbooks, bills, and catalogues needs to be accessible to everyone. However, access is not always available to vision-impaired people, as they require electronic documents to be available in specific formats. To address the accessibility issues of electronic documents, this research aims to design an affordable, portable, standalone and simple-to-use complete reading system that will convert and describe complex components in electronic documents for print-disabled users.