An Investigation into the Pedagogical Features of Documents

Author: Burns Gully
Gordon Jonathan
Natarajan Prem
Sheng Emily
Publication date: 01/01/2017

Characterizing the content of a technical document in terms of its learning utility can be useful for applications related to education, such as generating reading lists from large collections of documents. We refer to this learning utility as the "pedagogical value" of the document to the learner. While pedagogical value is an important concept that has been studied extensively within the education domain, there has been little work exploring it from a computational, i.e., natural language processing (NLP), perspective. To allow a computational exploration of this concept, we introduce the notion of "pedagogical roles" of documents (e.g., Tutorial and Survey) as an intermediary component for the study of pedagogical value. Given the lack of available corpora for our exploration, we create the first annotated corpus of pedagogical roles and use it to test baseline techniques for automatic prediction of such roles. Comment: 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at EMNLP 2017; 12 pages

arXiv.org e-Print Archive

Crossref

Training Process Reduction Based On Potential Weights Linear Analysis To Accelarate Back Propagation Network

Author: Asadi Roya
Mustapha Norwati
Sulaiman Nasir
Publication date: 01/07/2009

Learning is a key property of the Back Propagation Network (BPN): training must find suitable weights and thresholds so as to reduce training time while achieving high accuracy. Currently, data pre-processing (such as dimension reduction of input values) and pre-training are the main contributing factors in developing efficient techniques that reduce training time while keeping accuracy high. Weight initialization is an important issue: it is random, creates a paradox, and leads to low accuracy with high training time. One good data pre-processing technique for accelerating BPN classification is dimension reduction, but it has the problem of missing data. In this paper, we study current pre-training techniques and a new pre-processing technique called Potential Weight Linear Analysis (PWLA), which combines normalization, dimension reduction of input values, and pre-training. In PWLA, data pre-processing is first performed to generate normalized input values, which are then used by the pre-training technique to obtain the potential weights. After these phases, the dimension of the input value matrix is reduced by using the real potential weights. For the experimental results, the XOR problem and three datasets, SPECT Heart, SPECTF Heart and Liver Disorders (BUPA), are evaluated. Our results show that PWLA changes BPN into a new Supervised Multi Layer Feed Forward Neural Network (SMFFNN) model with high accuracy in one epoch without a training cycle. PWLA also offers non-linear supervised and unsupervised dimension reduction, which can be applied to other supervised multi-layer feed forward neural network models in future work. Comment: 11 pages IEEE format, International Journal of Computer Science and Information Security, IJCSIS 2009, ISSN 1947 5500, Impact factor 0.42

arXiv.org e-Print Archive

Universiti Putra Malaysia Institutional Repository

Multilabel Classification with R Package mlr

Author: Au Quay
Bischl Bernd
Casalicchio Giuseppe
Probst Philipp
Stachl Clemens
Publication date: 03/04/2017

We implemented several multilabel classification algorithms in the machine learning package mlr. The implemented methods are binary relevance, classifier chains, nested stacking, dependent binary relevance and stacking, which can be used with any base learner that is accessible in mlr. Moreover, there is access to the multilabel classification versions of randomForestSRC and rFerns. All these methods can be easily compared using the different implemented multilabel performance measures and resampling methods in the standardized mlr framework. In a benchmark experiment with several multilabel datasets, the performance of the different methods is evaluated. Comment: 18 pages, 2 figures, to be published in R Journal; reference corrected
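
As a rough illustration of the first method named in this abstract, binary relevance trains one independent binary classifier per label and combines their outputs. The sketch below is plain Python, not the mlr API: the class names `NearestNeighbor` and `BinaryRelevance`, the 1-nearest-neighbour base learner, and the choice of Hamming loss as the performance measure are all assumptions made for illustration.

```python
class NearestNeighbor:
    """Toy base learner (assumption): 1-nearest neighbour on numeric features."""

    def fit(self, X, y):
        self.X_ = [list(row) for row in X]
        self.y_ = list(y)
        return self

    def predict(self, X):
        predictions = []
        for x in X:
            # Squared Euclidean distance from x to every training point.
            dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X_]
            predictions.append(self.y_[dists.index(min(dists))])
        return predictions


class BinaryRelevance:
    """Fit one independent binary model per label column."""

    def __init__(self, make_learner):
        # make_learner is a zero-argument factory returning a fresh base learner.
        self.make_learner = make_learner

    def fit(self, X, Y):
        n_labels = len(Y[0])
        self.models_ = [
            self.make_learner().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, X):
        # Each model predicts one label column; transpose back to rows.
        columns = [model.predict(X) for model in self.models_]
        return [list(row) for row in zip(*columns)]


def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label entries that are predicted incorrectly."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(
        t != p
        for row_t, row_p in zip(Y_true, Y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / total
```

Passing a factory rather than a fitted model mirrors the idea that the wrapper works with any base learner; `BinaryRelevance(NearestNeighbor).fit(X, Y).predict(X_new)` returns one list of 0/1 labels per input row.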

arXiv.org e-Print Archive

Data-driven design of intelligent wireless networks: an overview and tutorial

Author: De Poorter Eli
Deschrijver Dirk
Fortuna Carolina
Kulin Merima
Moerman Ingrid
Publication venue: 'MDPI AG'
Publication date: 01/01/2016

Data science or "data-driven research" is a research approach that uses real-life data to gain insight about the behavior of systems. It enables the analysis of small, simple systems as well as large, more complex ones in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware specific influences. These interactions can lead to a difference between real-world functioning and design time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; (v) provides the reader with the necessary datasets and scripts to go through the tutorial steps themselves.

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

Data mining based cyber-attack detection

Author: Tianfield Huaglory
Publication date: 31/05/2017

ResearchOnline@GCU

PetroPlot: A plotting and data management tool set for Microsoft Excel

Author: Asimow Paul D.
Langmuir Charles H.
Su Yongjun
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 01/01/2003

PetroPlot is a 4000-line software code written in Visual Basic for the spreadsheet program Excel that automates plotting and data management tasks for large amounts of data. The major plotting functions include: automation of large numbers of multiseries XY plots; normalized diagrams (e.g., spider diagrams); replotting of any complex formatted diagram with multiple series for any other axis parameters; addition of customized labels for individual data points; and labeling flexible log scale axes. Other functions include: assignment of groups for samples based on multiple customized criteria; removal of nonnumeric values; calculation of averages/standard deviations; calculation of correlation matrices; deletion of nonconsecutive rows; and compilation of multiple rows of data for a single sample to single rows appropriate for plotting. A cubic spline function permits curve fitting to complex time series, and comparison of data to the fits. For users of Excel, PetroPlot increases efficiency of data manipulation and visualization by orders of magnitude and allows exploration of large data sets that would not be feasible if plots were made individually. The source codes are open to all users.
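
One of the data management functions listed in this abstract, calculation of correlation matrices, can be sketched in a few lines. The example below is Python rather than PetroPlot's Visual Basic and does not reproduce PetroPlot's code. It assumes Pearson correlation, and the function names and any column names passed in are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict mapping column name to values.

    Returns a nested dict so that result[a][b] is the correlation of a with b.
    Columns must be non-constant, otherwise the denominator is zero.
    """
    names = list(columns)
    return {
        a: {b: pearson(columns[a], columns[b]) for b in names}
        for a in names
    }
```

Calling `correlation_matrix` on a dict of named columns gives a symmetric table with values of 1 on the diagonal (up to floating-point rounding).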

Harvard University - DASH

Caltech Authors