
    Exploring Document Clustering Techniques for Personalized Peer Assessment in Exploratory Courses

    Proceedings of: Computer-Supported Peer Review in Education: Synergies with Intelligent Tutoring Systems (CSPRED 2010), Pittsburgh, Pennsylvania, USA, June 14th, 2010.
    Peer review has been proposed as a complement to project-based learning in courses covering a wide and heterogeneous syllabus. By reviewing peers' projects, students can thoroughly explore subjects other than their own project topic. This objective relies, however, on a proper distribution of the works to review, which is a complex and time-consuming task. Beyond simple topic selection, students may submit different types of works, which influences their peers' assessment; for example, works focused on a project development approach versus in-depth literature reviews. Introducing detailed metadata is time-consuming (so users are typically reluctant to do it) and, even more importantly, prone to error. In this paper we explore the potential of text mining and natural language processing technologies for automatic classification of texts, in order to facilitate the adaptation and diversification of the works assigned to students for review, in the context of a course on Artificial Intelligence.
    This work was partially funded by the Best Practice Network ICOPER (Grant No. ECP-2007-EDU-417007), the Learn3 project, "Plan Nacional de I+D+I" TIN2008-05163/TSI, and the eMadrid network, S2009/TIC-1650, "Investigación y Desarrollo de tecnologías para el e-learning en la Comunidad de Madrid".
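
The paper does not include an implementation, but the kind of text-similarity machinery such automatic classification rests on can be sketched with a plain TF-IDF vector space and cosine similarity. Everything below (the toy "reports" and all names) is invented for illustration, not taken from the paper:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors represented as dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

reports = [
    "neural network training and backpropagation".split(),
    "training deep neural network models".split(),
    "prolog logic programming and inference".split(),
]
vecs = tfidf_vectors(reports)
# Reports on the same topic score higher than unrelated ones.
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # → True
```

Grouping works by such similarity scores is one plausible route to diversifying which projects each student is assigned to review.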

    A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

    Due to recent technological advances, large volumes of medical data are being collected. These data contain valuable information, so data mining techniques can be used to extract useful patterns. This paper introduces data mining and its various techniques and surveys the available literature on medical data mining, with an emphasis on the application of data mining to skin diseases. A categorization is provided based on the different data mining techniques, and the utility of the various methodologies is highlighted. Generally, association mining is suitable for extracting rules and has been used especially in cancer diagnosis. Classification is a robust method in medical mining; in this paper, we summarize its different uses in dermatology, where it is one of the most important methods for diagnosing erythemato-squamous diseases, with approaches including neural networks, genetic algorithms, and fuzzy classification. Clustering is a useful method in medical image mining. The purpose of clustering techniques is to find a structure in the given data by identifying similarities between data points according to their characteristics. Clustering also has applications in dermatology. Besides introducing the different mining methods, we investigate some challenges that exist in mining skin data.
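
As an illustration of the clustering idea the survey describes (finding structure by grouping similar data points), here is a minimal k-means sketch on synthetic two-dimensional points; the data stand in for, e.g., lesion feature vectors and are not from any surveyed study:

```python
import random

random.seed(42)

def kmeans(points, k, iters=20):
    """Plain k-means: assign each point to its nearest centroid, then recenter."""
    dist2 = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    centroids = points[::len(points) // k][:k]  # deterministic, spread-out init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist2(p, centroids[i]))].append(p)
        centroids = [
            tuple(sum(coords) / len(cl) for coords in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two well-separated synthetic groups of 30 points each.
group_a = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(30)]
group_b = [(random.gauss(3, 0.1), random.gauss(3, 0.1)) for _ in range(30)]
centroids, clusters = kmeans(group_a + group_b, k=2)
print(sorted(len(c) for c in clusters))  # → [30, 30]
```

Real medical-image clustering operates on far richer feature vectors, but the assign/recenter loop is the same.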

    A Model for Assessing Computational Thinking Skills

    The electronic version of this thesis does not include the publications.
    In the modernizing world, computer science is no longer a separate discipline only for scientists but plays an essential role in many fields. There is increasing interest in developing computational thinking (CT) skills at all education levels, from kindergarten to university. Research at the comprehensive school level is therefore needed to understand the dimensions of CT skills and to develop a practical model for assessing them. CT is described in several articles, but these are not in line with each other, and a common understanding is missing of which skill dimensions should be in focus while developing and assessing CT. In this doctoral study, a systematic literature review gives an overview of the dimensions of CT presented in scientific papers, synthesizing the most influential articles into a model that assesses CT skills in three stages: i) defining the problem, ii) solving the problem, and iii) analyzing the solution. These three stages comprise ten CT skills: problem formulation, abstraction, problem reformulation, decomposition, data collection and analysis, algorithmic design, parallelization and iteration, automation, generalization, and evaluation. The systematic development of CT skills requires an instrument for assessing them at the basic school level. Using tasks from the international Bebras (Kobras) challenge, this doctoral study examines which CT sub-skills can be distinguished from the challenge results; two emerged, characterized as algorithmic design and pattern recognition. The tasks were also used, in modified form, at the secondary school level, confirming that they can support the assessment of CT skills there as well. Eventually, a modified model for assessing CT skills is presented, combining the theoretical and empirical results of the three main studies.
    https://www.ester.ee/record=b543136

    Customisable e-training programmes based on trainees' profiles

    Dissertation presented at the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa to obtain the Master's degree in Electrical and Computer Engineering.
    Online training (e-training) is a major driver of the development of competencies and knowledge in enterprises. The lack of customisable e-training programmes based on trainees' profiles, and of continuous maintenance of the training materials, prevents sustainable industrial training deployment. This dissertation presents a training strategy and a methodology for building training courses with the purpose of providing trainee-oriented industrial training development. The training strategy is intended to facilitate the management of all training components and tasks, so that a training structure can be built around a specific planned objective. The methodology for building e-training courses proposes to create customisable training materials more easily, enabling various organizations to participate actively in their production. Additionally, a customisable training programme framework is presented. It is supported by a compliant ontology-based model able to support adaptable training contents and an orchestration service, facilitating the efficiency and acceptance of e-training programme delivery.

    Applying Recommender Systems and Adaptive Hypermedia for e-Learning Personalization

    Learners learn differently because they are different, and they grow more distinctive as they mature. Personalized learning occurs when e-learning systems make deliberate efforts to design educational experiences that fit the needs, goals, talents, and interests of their learners. Researchers have recently begun to investigate various techniques to help teachers improve e-learning systems. In this paper we present our design and implementation of an adaptive and intelligent web-based programming tutoring system, Protus, which applies recommendation and adaptive hypermedia techniques. The system aims to automatically guide the learner's activities and recommend relevant links and actions during the learning process. Experiments on real data sets show the suitability of using both recommendation and hypermedia techniques to suggest online learning activities to learners based on their preferences, knowledge, and the opinions of users with similar characteristics.
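
Protus's actual recommendation engine is not described in this abstract, but the general idea of recommending activities favored by learners with similar opinions (user-based collaborative filtering) can be sketched as follows; the ratings data, similarity measure, and scoring are all invented for illustration:

```python
# Hypothetical ratings: learner -> {activity: rating on a 1-5 scale}.
ratings = {
    "ana":  {"loops": 5, "recursion": 4, "arrays": 2},
    "ben":  {"loops": 4, "recursion": 5, "pointers": 4},
    "cara": {"arrays": 5, "strings": 4},
}

def similarity(a, b):
    """Agreement on commonly rated activities; 1.0 means identical ratings."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    return sum(1.0 / (1 + abs(a[x] - b[x])) for x in common) / len(common)

def recommend(user, ratings):
    """Rank unseen activities by similarity-weighted ratings of other learners."""
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], theirs)
        for activity, rating in theirs.items():
            if activity not in ratings[user]:
                scores[activity] = scores.get(activity, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("ana", ratings))  # → ['pointers', 'strings']
```

Ben agrees with Ana more than Cara does, so Ben's favored activity ranks first among the ones Ana has not yet tried.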

    Data-Driven Database Education: A Quantitative Study of SQL Learning in an Introductory Database Course

    The Structured Query Language (SQL) is widely used and challenging to master. Within the context of lab exercises in an introductory database course, this thesis analyzes the student learning process and seeks to answer the question: "Which SQL concepts, or concept combinations, trouble students the most?" We provide comprehensive taxonomies of SQL concepts and errors, identify common areas of student misunderstanding, and investigate the student problem-solving process. We present an interactive web application used by students to complete SQL lab exercises. In addition, we analyze data collected by this application and offer suggestions for improving database lab activities.
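
The thesis's taxonomy is not reproduced here, but one concept combination widely reported as troublesome in introductory courses is filtering on aggregates. This self-contained sqlite3 snippet (schema and data invented for illustration) shows the distinction students often miss, that aggregate conditions belong in HAVING rather than WHERE:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE enrollment (student TEXT, course TEXT, grade REAL);
    INSERT INTO enrollment VALUES
        ('alice', 'db', 3.5), ('alice', 'ai', 4.0),
        ('bob',   'db', 2.0), ('bob',   'ai', 3.0), ('bob', 'os', 4.0);
""")

# WHERE filters rows before grouping; HAVING filters the groups themselves,
# so a condition on AVG(grade) must go in HAVING.
rows = con.execute("""
    SELECT student, AVG(grade) AS avg_grade
    FROM enrollment
    GROUP BY student
    HAVING AVG(grade) >= 3.0
""").fetchall()
print(dict(rows))  # → {'alice': 3.75, 'bob': 3.0}
```

Writing `WHERE AVG(grade) >= 3.0` instead raises a "misuse of aggregate" error, which is exactly the kind of mistake an error taxonomy can catalogue.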

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
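
The second matrix class, the word-context matrix, can be sketched in a few lines: each word is represented by counts of the words that appear near it, and words with similar contexts get similar rows. The toy corpus is invented, and real VSMs add weighting (e.g. PMI) and dimensionality reduction on top of raw counts:

```python
import math
from collections import defaultdict

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "stocks fell on the market today".split(),
]

def word_context_matrix(sentences, window=2):
    """Count how often each word co-occurs with each context word nearby."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i != j:
                    counts[w][sent[j]] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two rows of the matrix (dicts of counts)."""
    dot = sum(u[t] * v.get(t, 0) for t in u)
    norm = lambda x: math.sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v)) if u and v else 0.0

m = word_context_matrix(corpus)
# "cat" and "dog" occur in similar contexts, so their rows are similar.
print(cosine(m["cat"], m["dog"]) > cosine(m["cat"], m["stocks"]))  # → True
```

This is the distributional hypothesis in miniature: similarity of meaning is approximated by similarity of contexts.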

    Software Plagiarism Detection Using N-grams

    Plagiarism is an act of copying in which one does not properly credit the original source. The motivation behind plagiarism can range from completing academic courses to gaining economic advantage. Plagiarism exists in various domains where people want to take credit for work that is not their own, including literature, art, and software, all of which carry a notion of authorship. In this thesis we conduct a systematic literature review on source code plagiarism detection methods; based on the literature, we propose a new approach to detecting plagiarism that combines similarity detection and authorship identification, introduce our tokenization method for source code, and evaluate the model using real-life data sets. The goal of our model is to point out possible plagiarism in a collection of documents, which in this thesis means a collection of source code files written by various authors. The data used in our statistical methods consist of three datasets: (1) documents from the University of Helsinki's first programming course, (2) documents from the University of Helsinki's advanced programming course, and (3) submissions to a source code re-use competition. The statistical methods in this thesis are inspired by the theory of search engines; they relate to data mining when detecting similarity between documents, and to machine learning when classifying a document with its most likely author in authorship identification. Results show that our similarity detection model can successfully retrieve documents for further plagiarism inspection, but false positives appear quickly even with a high threshold controlling the minimum allowed level of similarity between documents.
    We were unable to use the results of authorship identification in our study, as the results of our machine learning model were not good enough to be used sensibly. This was possibly caused by the high similarity between documents, which is due to the restricted tasks and the course setting, which teaches a specific programming style over the timespan of the course.
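
The thesis's actual tokenization is not reproduced here, but the core n-gram similarity idea can be sketched as follows. The token streams and the identifier-normalization rule below are simplified stand-ins: real tools use proper lexers, and normalizing identifiers to a placeholder is one common way to catch rename-based plagiarism:

```python
def ngrams(tokens, n=3):
    """Set of overlapping token n-grams from a token stream."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity between two n-gram sets: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Toy token streams standing in for tokenized source files.
original  = "def sum ( a , b ) : return a + b".split()
copied    = "def add ( x , y ) : return x + y".split()   # renamed copy
unrelated = "while queue : node = queue . pop ( )".split()

def normalize(tokens, keywords={"def", "return", "while"}):
    """Map every non-keyword identifier to a placeholder, keeping structure."""
    return [t if t in keywords or not t.isidentifier() else "ID"
            for t in tokens]

sim_copy = jaccard(ngrams(normalize(original)), ngrams(normalize(copied)))
sim_diff = jaccard(ngrams(normalize(original)), ngrams(normalize(unrelated)))
print(sim_copy, sim_diff)  # → 1.0 0.0
```

After normalization the renamed copy becomes token-for-token identical to the original, while the unrelated file shares no trigrams, which is the separation a similarity threshold then exploits.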

    Predicting Academic Performance: A Systematic Literature Review

    The ability to predict student performance in a course or program creates opportunities to improve educational outcomes. With effective performance prediction approaches, instructors can allocate resources and instruction more accurately. Research in this area seeks to identify features that can be used to make predictions, to identify algorithms that can improve predictions, and to quantify aspects of student performance. Moreover, research in predicting student performance seeks to determine interrelated features and to identify the underlying reasons why certain features work better than others. This working group report presents a systematic literature review of work in the area of predicting student performance. Our analysis shows a clearly increasing amount of research in this area, as well as an increasing variety of techniques used. At the same time, the review uncovered a number of issues with research quality that drive a need for the community to provide more detailed reporting of methods and results and to increase efforts to validate and replicate work.

    Acquiring Word-Meaning Mappings for Natural Language Interfaces

    This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. WOLFIE is part of an integrated system that learns to transform sentences into representations such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by WOLFIE is compared to that of lexicons acquired by a similar system, with results favorable to WOLFIE. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods attempt to select for annotation and training only the most informative examples, and are therefore potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.