Modeling peer assessment as a personalized predictor of teacher's grades: The case of OpenAnswer
Questions with open answers are rarely used as e-learning assessment tools because of the high workload they impose on the teacher/tutor who must grade them. This can be mitigated by having students grade each other's answers, but the uncertainty about the quality of the resulting grades could be high.
In our OpenAnswer system we have modeled peer assessment as a Bayesian network connecting a set of sub-networks (each representing a participating student) to the corresponding answers of her graded peers. The model has shown a good ability to predict (without further information from the teacher) the exact teacher mark, and a very good ability to predict it within 1 mark of the right one (ground truth). From the available datasets we noticed that different teachers sometimes disagree in their assessment of the same answer. For this reason, in this paper we explore how the model can be tailored to the specific teacher to improve its prediction ability. To this aim, we parametrically define the CPTs (Conditional Probability Tables) describing the probabilistic dependence of a Bayesian variable on others in the modeled network, and we optimize the parameters generating the CPTs to obtain the smallest average difference between the predicted grades and the teacher's marks (ground truth). The optimization is carried out either separately for each teacher available in our datasets or with respect to the whole dataset.
The paper discusses the results and shows that the prediction performance of our model, when optimized separately for each teacher, improves on the case in which the model is globally optimized with respect to the whole dataset, which in turn improves on the predictions of the raw peer assessment. The improved prediction would allow us to use OpenAnswer, without teacher intervention, as a class monitoring and diagnostic tool.
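The idea of generating a CPT from a few parameters and tuning those parameters per teacher can be illustrated with a toy sketch. This is not the OpenAnswer network: it assumes a single hypothetical CPT P(peer grade | teacher grade) whose rows decay exponentially around a shifted center, peer grades that are independent given the teacher grade, a uniform prior, and a plain grid search. The names `make_cpt`, `predict_grade`, `fit_teacher`, `sharpness` and `bias` are all illustrative assumptions.

```python
import numpy as np

def make_cpt(n_grades, sharpness, bias):
    """Hypothetical parametric CPT: row i is P(peer grade | teacher grade i).

    Peers are assumed to grade around (i + bias); `sharpness` controls how
    quickly the probability decays away from that center.
    """
    grades = np.arange(n_grades)
    centers = grades[:, None] + bias
    weights = np.exp(-sharpness * np.abs(grades[None, :] - centers))
    return weights / weights.sum(axis=1, keepdims=True)

def predict_grade(peer_grades, cpt):
    """Most probable teacher grade given peer grades (uniform prior,
    peer grades treated as conditionally independent)."""
    log_posterior = np.log(cpt[:, peer_grades]).sum(axis=1)
    return int(np.argmax(log_posterior))

def fit_teacher(answers, teacher_marks, n_grades, sharpness_grid, bias_grid):
    """Grid search for the parameters minimizing the mean absolute
    difference between predicted grades and one teacher's marks."""
    best = None
    for sharpness in sharpness_grid:
        for bias in bias_grid:
            cpt = make_cpt(n_grades, sharpness, bias)
            error = float(np.mean([abs(predict_grade(p, cpt) - t)
                                   for p, t in zip(answers, teacher_marks)]))
            if best is None or error < best[2]:
                best = (sharpness, bias, error)
    return best

# Toy data (invented for illustration): this teacher marks one grade
# lower than the peers tend to.
answers = [[3, 3, 4], [2, 2, 2], [4, 4, 3]]
teacher_marks = [2, 1, 3]
best_sharpness, best_bias, best_error = fit_teacher(
    answers, teacher_marks, n_grades=6,
    sharpness_grid=[0.5, 1.0, 2.0], bias_grid=[-1, 0, 1])
print(best_sharpness, best_bias, best_error)
```

Running `fit_teacher` once per teacher corresponds to the per-teacher optimization, and running it on the pooled data of all teachers corresponds to the global one.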
Towards a quantitative evaluation of the relationship between the domain knowledge and the ability to assess peer work
In this work we present the preliminary results provided by the statistical modeling of the cognitive relationship between the knowledge about a topic and the ability to assess peer achievements on the same topic. Our starting point is Bloom's taxonomy of educational objectives in the cognitive domain, and our outcomes confirm the hypothesized ranking. A further consideration that can be derived is that meta-cognitive abilities (e.g., assessment) require deeper domain knowledge.
Faster than thought: Detecting sub-second activation sequences with sequential fMRI pattern analysis
Methods for Ordinal Peer Grading
MOOCs have the potential to revolutionize higher education with their wide
outreach and accessibility, but they require instructors to come up with
scalable alternates to traditional student evaluation. Peer grading -- having
students assess each other -- is a promising approach to tackling the problem
of evaluation at scale, since the number of "graders" naturally scales with the
number of students. However, students are not trained in grading, which means
that one cannot expect the same level of grading skills as in traditional
settings. Drawing on broad evidence that ordinal feedback is easier to provide
and more reliable than cardinal feedback, it is therefore desirable to allow
peer graders to make ordinal statements (e.g. "project X is better than project
Y") and not require them to make cardinal statements (e.g. "project X is a
B-"). Thus, in this paper we study the problem of automatically inferring
student grades from ordinal peer feedback, as opposed to existing methods that
require cardinal peer feedback. We formulate the ordinal peer grading problem
as a type of rank aggregation problem, and explore several probabilistic models
under which to estimate student grades and grader reliability. We study the
applicability of these methods using peer grading data collected from a real
class -- with instructor and TA grades as a baseline -- and demonstrate the
efficacy of ordinal feedback techniques in comparison to existing cardinal peer
grading methods. Finally, we compare these peer-grading techniques to
traditional evaluation techniques. Comment: Submitted to KDD 201
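As a rough illustration of estimating grades from ordinal statements such as "project X is better than project Y", the sketch below fits a Bradley-Terry model with the standard minorization-maximization update. The abstract does not say which probabilistic models the paper uses, so Bradley-Terry is an assumption here. The function name `bradley_terry` and the toy comparisons are hypothetical. Grader reliability is not modeled, and every item is assumed to win at least one comparison.

```python
import numpy as np

def bradley_terry(n_items, comparisons, n_iter=200):
    """Estimate Bradley-Terry scores from (winner, loser) pairs.

    Under this model, P(i beats j) = s_i / (s_i + s_j). Each iteration
    applies the MM update s_i = wins_i / sum_j n_ij / (s_i + s_j).
    Assumes every item wins at least once, so no score collapses to 0.
    """
    wins = np.zeros(n_items)
    pair_counts = np.zeros((n_items, n_items))
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[winner, loser] += 1
        pair_counts[loser, winner] += 1

    scores = np.full(n_items, 1.0 / n_items)
    for _ in range(n_iter):
        denom = (pair_counts / (scores[:, None] + scores[None, :])).sum(axis=1)
        scores = wins / denom
        scores /= scores.sum()  # fix the scale, which is otherwise arbitrary
    return scores

# Toy peer feedback (invented): each tuple reads "first is better than second".
comparisons = [(0, 1), (0, 1), (1, 0),
               (1, 2), (1, 2), (2, 1),
               (0, 2), (0, 2), (2, 0)]
scores = bradley_terry(3, comparisons)
ranking = [int(i) for i in np.argsort(-scores)]
print(ranking, scores)
```

The resulting scores give an ordering of the projects. Mapping them to letter or numeric grades would need an extra step that this sketch leaves out.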
Collective estimation of multiple bivariate density functions with application to angular-sampling-based protein loop modeling
This article develops a method for simultaneous estimation of density functions for a collection of populations of protein backbone angle pairs using a data-driven, shared basis that is constructed by bivariate spline functions defined on a triangulation of the bivariate domain. The circular nature of angular data is taken into account by imposing appropriate smoothness constraints across boundaries of the triangles. Maximum penalized likelihood is used to fit the model and an alternating blockwise Newton-type algorithm is developed for computation. A simulation study shows that the collective estimation approach is statistically more efficient than estimating the densities individually. The proposed method was used to estimate neighbor-dependent distributions of protein backbone dihedral angles (i.e., Ramachandran distributions). The estimated distributions were applied to protein loop modeling, one of the most challenging open problems in protein structure prediction, by feeding them into an angular-sampling-based loop structure prediction framework. Our estimated distributions compared favorably to the Ramachandran distributions estimated by fitting a hierarchical Dirichlet process model; and in particular, our distributions showed significant improvements on the hard cases where existing methods do not work well.
A methodology for the estimation of kappa (κ) for large datasets. Example application to rock sites in the NGA-East database
This report reviews four of the main approaches (two band-limited and two broadband) currently used for estimating the site κ0: the acceleration slope (AS) above the corner frequency, the displacement slope (DS) below the corner frequency, the broadband (BB) fit of the spectrum, and the response spectral shape (RESP) template. Using these four methods, estimates of κ0 for rock sites in Central Eastern North America (CENA) in the shallow crustal dataset from NGA-East are computed for distances less than 100 km.
Using all of the data within 100 km, the mean κ0 values are 8 msec for the AS approach and 27 msec for the DS approach. These mean values include negative κ estimates for some sites. If the negative κ values are removed, then the mean values are 25 msec and 42 msec, respectively. Stacking all spectra together led to mean κ0 values of 7 and 29 msec, respectively. Overall, the DS approach yields 2–3 times higher values than the AS, which agrees with previous observations, but the uncertainty of the estimates in each case is large. The AS approach seems consistent for magnitudes down to M3 but not below.
There is large within-station variability of κ that may be related to differences in distance, Q, complexity along the path, or particular source characteristics, such as higher or lower stress drop. The station-to-station differences may be due to site-related factors. Because most sites have been assigned Vs30 = 2000 m/sec, it is not possible to correlate variations in κ0 with rock stiffness.
Based on the available profile, the individual spectra are corrected for crustal amplification; the correction only affects results below 15 Hz. Since the AS and DS approaches are applied over different frequency ranges, we find that only the DS results are sensitive to the amplification correction. More detailed knowledge of individual near-surface profiles may have effects on AS results, too. Although κ is considered to be caused solely by damping in the shallow crust, measurement techniques often cannot separate the effects of damping and amplification, and yield the net effect of both phenomena.
The two broadband approaches, BB and RESP, yield similar results. The mean κ0_BB is 5 ± 0.5 msec across all NEHRP class A sites. The κ0_RESP for the two events examined is 5 and 6 msec. From literature, the average value of κ0 in CENA is 6 ± 2 msec. This typical value is similar to the broadband estimates of this study and to the mean κAS when all available recordings are used along with all flags. When only recordings with down-going FAS slope are selected from the dataset, the mean value of κAS increases by a factor of 2–3.
To evaluate the scaling of high-frequency ground motion with κ, we analyze residuals from ground motion prediction equations (GMPEs) versus κ estimates. Using the κ values from the AS approach, the average trend of the ln(PSA) residuals for hard-rock data does not show the expected strong dependence on κ, but when using κ values from the DS approach, there is a stronger correlation of the residuals, i.e., a κ that is more consistent with the commonly used analytically based scaling. The κDS estimates may better reflect the damping in the shallow crust, while the κAS estimates may reflect a net effect of damping and amplification that has not been decoupled. The κDS estimates are higher than the κAS estimates, so the expected effect on the high-frequency ground motion is smaller than that expected for the κAS estimates.
An empirical hard-rock site factor model is developed that represents the combined Vs-κ0 site factor relative to a 760 m/sec reference-site condition. At low frequencies ( 10 Hz), the residuals do not show the strong increase in the site factors as seen in the analytical model results. A second hard-rock dataset from British Columbia, Canada, is also used. These BC hard-rock residuals show an increase in the 15–50 Hz range that is consistent with the analytical κ0 scaling for a hard-rock κ0 of about 0.015 sec.
The variability of the PSA residuals is also used to evaluate the κ0 scaling for hard-rock sites from analytical modeling. The scatter in existing κ0 values found in literature is disproportionately large compared to the observed variability in high-frequency ground motions. We compared the predicted ground-motion variability based on analytical modeling to the observed variability in our residuals. While the hard-rock sites are more variable at high frequencies due to the additional κ0 variability, this additional variability is much less than the variability predicted by the analytical modeling using the variability from κ0-Vs30 correlations. This is consistent with weaker κ0 scaling compared to that predicted by the analytical modeling seen in the mean residuals.
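The acceleration slope (AS) approach named in this abstract is commonly formulated as a straight-line fit to the log of the Fourier acceleration spectrum at high frequency, where the spectrum is assumed to decay as A(f) = A0 · exp(−π κ f), so that κ = −slope / π. The sketch below shows that fit for a single record on synthetic data. The function name, the frequency band and the synthetic spectrum are assumptions made for illustration, and the abstract does not describe how the report turns record-level values into site κ0.

```python
import numpy as np

def kappa_acceleration_slope(freqs_hz, fas, f_min, f_max):
    """Estimate kappa (seconds) for one record from its Fourier
    acceleration spectrum, assuming A(f) = A0 * exp(-pi * kappa * f)
    between f_min and f_max (a band above the corner frequency).
    """
    band = (freqs_hz >= f_min) & (freqs_hz <= f_max)
    # Linear fit of ln A(f) against f; polyfit returns [slope, intercept].
    slope, _intercept = np.polyfit(freqs_hz[band], np.log(fas[band]), 1)
    # An upward-sloping spectrum gives a negative estimate, which is how
    # negative kappa values like those mentioned in the abstract can arise.
    return -slope / np.pi

# Synthetic, noise-free spectrum with a known kappa of 25 msec.
true_kappa = 0.025
freqs = np.linspace(1.0, 40.0, 200)
spectrum = 3.0 * np.exp(-np.pi * true_kappa * freqs)

estimate = kappa_acceleration_slope(freqs, spectrum, f_min=10.0, f_max=35.0)
print(f"kappa = {estimate * 1000:.1f} msec")
```

With real records the result depends strongly on the chosen frequency band and on noise, which is one plausible source of the large within-station variability reported above.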
Quantifying biosynthetic network robustness across the human oral microbiome
Metabolic interactions, such as cross-feeding, play a prominent role in microbial community structure. For example, they may underlie the ubiquity of uncultivated microorganisms. We investigated this phenomenon in the human oral microbiome, by analyzing microbial metabolic networks derived from sequenced genomes. Specifically, we devised a probabilistic biosynthetic network robustness metric that describes the chance that an organism could produce a given metabolite, and used it to assemble a comprehensive atlas of biosynthetic capabilities for 88 metabolites across 456 human oral microbiome strains. A cluster of organisms characterized by reduced biosynthetic capabilities stood out within this atlas. This cluster included several uncultivated taxa and three recently co-cultured Saccharibacteria (TM7) phylum species. Comparison across strains also allowed us to systematically identify specific putative metabolic interdependences between organisms. Our method, which provides a new way of converting annotated genomes into metabolic predictions, is easily extendible to other microbial communities and metabolic products. https://www.biorxiv.org/content/10.1101/392621v1 First author draft