450 research outputs found

Usability of Error Messages for Introductory Students

Author: Schliep Paul A.
Publication venue: University of Minnesota Morris Digital Well
Publication date: 02/09/2015

Error messages are an important tool programmers use to help find and fix mistakes or issues in their code. When an error message is unhelpful, it can be difficult to find the issue and may impose additional challenges in learning the language and concepts. Error messages are especially critical for introductory programmers in understanding problems with their code. Unfortunately, not all error messages in programming are beneficial for novice programmers. This paper discusses the general usability of error messages for introductory programmers, analyses of error messages in compilers and DrRacket, and two methodologies intended to improve error handling.

University of Minnesota, Morris (UMM): Digital Well

Partially-supervised context-specific independence mixture modeling.

Author: Georgi B.
Schliep A.
Publication date: 17/09/2007

Partially supervised or semi-supervised learning refers to machine learning methods which fall between clustering and classification. In the context of clustering, labels can specify link and do-not-link constraints between data points in different ways and constrain the resulting clustering solutions. This is a very natural framework for many biological applications, as some labels are often available and even very few labels greatly improve clustering results. Context-specific independence (CSI) models constitute a framework for simultaneous mixture estimation and model structure determination to obtain meaningful models for high-dimensional data with many, possibly uninformative, variables. Here we present the first approach for partial learning of CSI models and demonstrate the effectiveness of modest amounts of labels for simulated data and for protein sub-family determination.

MPG.PuRe

Model-based clustering with Hidden Markov Models and its application to financial times-series data

Author: Knab B.
Schliep A.
Steckemetz B.
Wichern B.
Publication venue: 'NIDA/RTI International'
Publication date: 01/01/2003

We have developed a method to partition a set of data into clusters by use of Hidden Markov Models. Given a number of clusters, each of which is represented by one Hidden Markov Model, an iterative procedure finds the combination of cluster models and an assignment of data points to cluster models which maximizes the joint likelihood of the clustering. To reflect the non-Markovian nature of some aspects of the data, we also extend classical Hidden Markov Models to employ a non-homogeneous Markov chain, where the non-homogeneity is dependent not on the time of the observation but rather on a quantity derived from previous observations. We present the method, a proof of convergence for the training procedure, and an evaluation of the method on simulated time-series data as well as on large data sets of financial time-series from the Public Saving and Loan Banks in Germany.

MPG.PuRe

Model-based clustering with Hidden Markov Models and its application to financial times series data

Author: Knab B.
Schliep A.
Steckemetz B.
Wichern B.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2003

MPG.PuRe

Developing Beginner-Friendly Programming Error Messages

Author: Lemmon Aaron D.
Sax Emma
Schliep Paul A.
Publication venue: University of Minnesota Morris Digital Well
Publication date: 01/04/2015

The motivation for our work is to introduce a recently developed programming language, Clojure, in a beginner computer science (CSci) class at the University of Minnesota, Morris. Clojure is an industry-accepted programming language that provides significant benefits for beginner programmers, such as a focus on a functional approach to programming which, in UMM experience, provides a good foundation for the subsequent CSci curriculum. Learning Clojure in an introductory class opens opportunities for students to collaborate on numerous worldwide projects, as well as take advantage of improvements in modern computing hardware. However, Clojure is challenging to use because of its complicated handling of programmers’ mistakes. Mistakes in computer programming are a natural part of developing software. When a mistake happens, there is a system to notify the programmer of an error. The specific information that the programmer receives, known as an error message, may or may not be helpful in identifying the issue. Clojure error messages are notorious for being confusing to beginners. We are developing a system that intercepts the existing Clojure error messages and automatically rephrases them for beginner programmers. We will conduct usability tests by observing the interactions between beginner programmers and our system, and the feedback we receive will be used to further improve our project. We present our new error message handling and discuss testing our system with new programmers.

University of Minnesota, Morris (UMM): Digital Well

pGQL: A probabilistic graphical query language for gene expression time courses

Author: A Schliep
Alexander Schliep
H Hochheiser
IG Costa
Ivan G Costa
J Ernst
KY Yeung
LR Rabiner
M Ashburner
MF Ramoni
R Durbin
Ruben Schilling
S Chu
Z Bar-Joseph
Publication venue: BMC
Publication date: 01/01/2011

Background: Timeboxes are graphical user interface widgets that were proposed to specify queries on time course data. As queries can be very easily defined, an exploratory analysis of time course data is greatly facilitated. While timeboxes are effective, they have no provisions for dealing with noisy data or data with fluctuations along the time axis, which are very common in many applications. In particular, this is true for the analysis of gene expression time courses, which are mostly derived from noisy microarray measurements at few unevenly sampled time points. From a data mining point of view, the robust handling of data through a sound statistical model is of great importance. Results: We propose probabilistic timeboxes, which correspond to a specific class of Hidden Markov Models (HMMs), an established method in data mining. Since HMMs are a particular class of probabilistic graphical models, we call our method Probabilistic Graphical Query Language. It is implemented in the free software package pGQL. We evaluate its effectiveness in exploratory analysis on a yeast sporulation data set. Conclusions: We introduce a new approach to define dynamic, statistical queries on time course data. It supports an interactive exploration of reasonably large amounts of data and enables users without expert knowledge to specify fairly complex statistical models with ease. By its statistical nature, our approach is more expressive and more robust with respect to amplitude and frequency fluctuation than the prior, deterministic timeboxes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

The Entropy of a Binary Hidden Markov Process

Author: A. Schliep
B. Derrida
D.J.C. MacKay
D.S. Fisher
Eytan Domany
G. Grinstein
I. Kanter
Ido Kanter
L.R. Rabiner
Or Zuk
T.M. Cover
Y. Ephraim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

The entropy of a binary symmetric Hidden Markov Process is calculated as an expansion in the noise parameter epsilon. We map the problem onto a one-dimensional Ising model in a large field of random signs and calculate the expansion coefficients up to second order in epsilon. Using a conjecture, we extend the calculation to 11th order and discuss the convergence of the resulting series.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Identifying and characterizing extrapolation in multivariate response data

Author: Bartley Meridith L
Hanks Ephraim M
Schliep Erin M
Soranno Patricia A
Wagner Tyler
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019

Extrapolation is defined as making predictions beyond the range of the data used to estimate a statistical model. In ecological studies, it is not always obvious when and where extrapolation occurs because of the multivariate nature of the data. Previous work on identifying extrapolation has focused on univariate response data, but these methods are not directly applicable to multivariate response data, which are more and more common in ecological investigations. In this paper, we extend previous work that identified extrapolation by applying the predictive variance from the univariate setting to the multivariate case. We illustrate our approach through an analysis of jointly modeled lake nutrients and indicators of algal biomass and water clarity in over 7000 inland lakes from across the Northeast and Mid-west US. In addition, we illustrate novel exploratory approaches for identifying regions of covariate space where extrapolation is more likely to occur using classification and regression trees. Comment: 28 pages, 2 supplementary files, 6 main figures, 2 supplementary figures, 2 supplementary table

arXiv.org e-Print Archive

Directory of Open Access Journals