Clustering student skill set profiles in a unit hypercube using mixtures of multivariate betas
This paper presents a finite mixture of multivariate betas as a new model-based clustering method tailored to applications where the feature space is constrained to the unit hypercube. The mixture component densities are taken to be conditionally independent, univariate unimodal beta densities (from the subclass of reparameterized beta densities given by Bagnato and Punzo 2013). The EM algorithm used to fit this mixture is discussed in detail, and results from both this beta mixture model and the more standard Gaussian model-based clustering are presented for simulated skill mastery data from a common cognitive diagnosis model and for real data from the Assistment System online mathematics tutor (Feng et al. 2009). The multivariate beta mixture appears to outperform the standard Gaussian model-based clustering approach, as would be expected on the constrained space. Fewer components are selected (by BIC-ICL) in the beta mixture than in the Gaussian mixture, and the resulting clusters seem more reasonable and interpretable.
This article is in technical report form; the final publication is available at http://www.springerlink.com/openurl.asp?genre=article&id=doi:10.1007/s11634-013-0149-z
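As a rough illustration of the EM scheme sketched in the abstract, the short Python example below fits a mixture of conditionally independent beta densities on the unit hypercube. It is only a sketch under simplifying assumptions: the function name is invented here, and the M-step uses weighted method-of-moments updates for generic Beta(a, b) marginals rather than the paper's constrained unimodal reparameterization (Bagnato and Punzo 2013).

```python
import numpy as np
from scipy.stats import beta

def fit_beta_mixture(X, K, n_iter=100, seed=0):
    """EM for a K-component mixture of conditionally independent beta densities.

    Simplified sketch: generic Beta(a, b) marginals with weighted
    method-of-moments M-step updates, not the paper's constrained MLE.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                   # mixing proportions
    a = rng.uniform(1.0, 5.0, size=(K, d))     # beta shape parameters per component/dimension
    b = rng.uniform(1.0, 5.0, size=(K, d))

    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to pi_k * prod_j Beta(x_ij; a_kj, b_kj)
        log_r = np.log(pi) + np.stack(
            [beta.logpdf(X, a[k], b[k]).sum(axis=1) for k in range(K)], axis=1
        )
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: update mixing proportions and moment-match each beta marginal
        pi = r.mean(axis=0)
        for k in range(K):
            w = r[:, k] / r[:, k].sum()
            m = w @ X                          # weighted mean per dimension
            v = w @ (X - m) ** 2               # weighted variance per dimension
            common = m * (1 - m) / np.clip(v, 1e-8, None) - 1
            a[k] = np.clip(m * common, 1e-3, None)
            b[k] = np.clip((1 - m) * common, 1e-3, None)

    return pi, a, b, r

# Hard cluster assignments come from the maximum responsibility per row:
# labels = fit_beta_mixture(X, K=3)[-1].argmax(axis=1)
```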
A survey of popular R packages for cluster analysis
Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring datasets. This article reviews the following popular libraries/commands in the R language for applying different types of cluster analysis: from the stats library, the kmeans and hclust functions; the mclust library; the poLCA library; and the clustMD library. The packages/functions cover a variety of cluster analysis methods for continuous data, categorical data or a combination of the two. The contrasting methods in the different packages are briefly introduced, and basic usage of the functions is discussed. The use of the different methods is compared and contrasted, and then illustrated on example data. In the discussion, links to information on other available libraries for different clustering methods and extensions beyond basic clustering methods are given. The code for the worked examples in Section 2 is available at http://www.stats.gla.ac.uk/~nd29c/Software/ClusterReviewCode.
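The worked examples themselves are in R (kmeans, hclust, mclust and so on, at the link above). Purely as an illustration of the continuous-data part of that workflow, here is a rough Python/scikit-learn analogue; the estimators and settings below are stand-ins chosen for this sketch, not the article's own code.

```python
# Rough analogue of kmeans(), hclust() + cutree(), and mclust model-based clustering.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])  # two synthetic groups

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)            # ~ stats::kmeans
hc = AgglomerativeClustering(n_clusters=2, linkage="ward").fit(X)       # ~ stats::hclust + cutree
gm = GaussianMixture(n_components=2, covariance_type="full",
                     random_state=0).fit(X)                             # ~ mclust-style Gaussian mixture

print(km.labels_[:5], hc.labels_[:5], gm.predict(X)[:5])
```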
Identifying Clusters in Bayesian Disease Mapping
Disease mapping is the field of spatial epidemiology interested in estimating the spatial pattern in disease risk across areal units. One aim is to identify units exhibiting elevated disease risks, so that public health interventions can be made. Bayesian hierarchical models with a spatially smooth conditional autoregressive prior are used for this purpose, but they cannot identify the spatial extent of high-risk clusters. Therefore, we propose a two-stage solution to this problem, with the first stage being a spatially adjusted hierarchical agglomerative clustering algorithm. This algorithm is applied to data prior to the study period, and produces potential cluster structures for the disease data. The second stage fits a separate Poisson log-linear model to the study data for each cluster structure, which allows for step-changes in risk where two clusters meet. The most appropriate cluster structure is chosen by model comparison techniques, specifically by minimising the Deviance Information Criterion. The efficacy of the methodology is established by a simulation study, and is illustrated by a study of respiratory disease risk in Glasgow, Scotland.
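A toy version of the two-stage idea, for illustration only: stage one builds candidate cluster structures with a connectivity-constrained agglomerative clustering (scikit-learn's connectivity argument standing in for the paper's spatially adjusted algorithm), and stage two fits a Poisson log-linear model per structure. For simplicity the sketch compares structures by AIC rather than by the Deviance Information Criterion, and the function and argument names outside the two libraries are hypothetical.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.cluster import AgglomerativeClustering

def choose_cluster_structure(risk_history, counts, expected, adjacency, k_range):
    """risk_history: (n_areas, T0) pre-study risk estimates used to build structures;
    counts/expected: study-period observed and expected disease counts per area;
    adjacency: (n_areas, n_areas) 0/1 neighbourhood matrix."""
    best = None
    for k in k_range:
        # Stage 1: spatially constrained hierarchical agglomerative clustering
        labels = AgglomerativeClustering(
            n_clusters=k, connectivity=adjacency, linkage="ward"
        ).fit_predict(risk_history)

        # Stage 2: Poisson log-linear model with one risk level per cluster and
        # offset log(expected), allowing step-changes where clusters meet
        design = np.eye(k)[labels]                 # cluster indicator matrix
        fit = sm.GLM(counts, design, family=sm.families.Poisson(),
                     offset=np.log(expected)).fit()

        if best is None or fit.aic < best[0]:      # AIC here stands in for DIC
            best = (fit.aic, labels, fit)
    return best                                    # (criterion, chosen labels, fitted model)
```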
Variable selection and updating in model-based discriminant analysis for high dimensional data with food authenticity applications
Food authenticity studies are concerned with determining if food samples have been correctly labelled or not. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervised manner using both labelled and unlabelled data. The method is shown to give excellent classification performance on several high-dimensional multiclass food authenticity datasets with more variables than observations. The variables selected by the proposed method provide information about which variables are meaningful for classification purposes. A headlong search strategy for variable selection is shown to be efficient in terms of computation and achieves excellent classification performance. In applications to several food authenticity datasets, our proposed method outperformed default implementations of Random Forests, AdaBoost, transductive SVMs and Bayesian Multinomial Regression by substantial margins.
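To make the headlong idea concrete, the sketch below runs a greedy forward search that accepts the first variable whose inclusion improves a cross-validated score. It is a simplified stand-in: the score here is plain LDA accuracy on the labelled data, whereas the paper uses a criterion within a semi-supervised model-based classifier, and the function name is invented for this example.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def headlong_forward_selection(X, y, min_gain=1e-3):
    """Greedy 'headlong' search: accept the first variable that improves the score."""
    selected, remaining = [], list(range(X.shape[1]))
    best_score, improved = 0.0, True
    while improved and remaining:
        improved = False
        for j in list(remaining):
            trial = selected + [j]
            score = cross_val_score(LinearDiscriminantAnalysis(),
                                    X[:, trial], y, cv=5).mean()
            if score > best_score + min_gain:
                selected.append(j)                 # keep the variable...
                remaining.remove(j)
                best_score = score
                improved = True
                break                              # ...and move on without scanning the rest
    return selected, best_score
```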
Appeasing the International Conscience or Providing Post-Conflict Justice: Expanding the Khmer Rouge Tribunal’s Restorative Role
Three decades after the Cambodian civil war, the leaders of the Khmer Rouge will finally be brought before an internationalized domestic tribunal. While the majority of those most responsible have died off or received immunity for their conduct, the Khmer Rouge Tribunal has the historic possibility of reaffirming the importance of international criminal justice and providing an historical narrative of the crimes committed and victims created.
This commentary evaluates the importance of restoration in transitional justice and the role victims and witnesses play in post-conflict justice. This article will argue that previous post-conflict remedies required a balance of restorative and retributive justice in order to effectuate transitional justice. In turn, the incorporation and protection of witnesses and victims was vital to reconciliation.
This article summarizes the importance of victims and witnesses in the context of Cambodia and describes mechanisms the Khmer Rouge Tribunal can use to enhance their participation and protection. By expanding its restorative role, the Khmer Rouge Tribunal can provide post-conflict justice rather than appease international guilt.
One Step Forward, Two Steps Backwards: Addressing Objections to the ICC’s Prescriptive and Adjudicative Powers
The Rome Statute of the International Criminal Court (ICC) permits the ICC to exercise subject-matter jurisdiction over individuals who engage in war crimes, genocide, crimes against humanity, and crimes of aggression. However, under Article 13, the ICC may only exercise personal jurisdiction over persons referred by the Security Council acting under Chapter VII, over nationals of a state party, or over persons whose alleged criminal conduct occurred on the territory of a state party.
This article evaluates the interplay between principles of public international law and international criminal law in determining whether the ICC’s grant of jurisdiction under the Rome Statute is performed within the limits of international law when exercised against non-party nationals. The importance of this inquiry is ever-increasing in light of greater and more expansive breaches of international humanitarian law in the modern world, committed by nations and individuals who have refused to become parties to the Rome Statute.
This paper suggests that the ICC’s grant of authority to exercise jurisdiction over non-party nationals is consistent with customary norms of international law, even where the state of nationality has not consented to the Court’s jurisdiction. This paper concludes that the ICC was established to prevent impunity by reinvigorating national institutions. It is the culmination of historical lessons that teach against non-cooperation. The Nuremberg and Tokyo tribunals, along with the tribunals for Rwanda and Yugoslavia, were built for the precise purpose of accounting for crimes which transcend national borders. Objections to the Rome Statute based on national interests and sovereignty fail in light of the crimes sought to be prevented by an International Criminal Court. Without the safety net provided by international cooperation and prevention of crimes, the world risks facing the dangers it promised “never again.”
Spatial clustering of average risks and risk trends in Bayesian disease mapping
Spatiotemporal disease mapping focuses on estimating the spatial pattern in disease risk across a set of nonoverlapping areal units over a fixed period of time. The key aim of such research is to identify areas that have a high average level of disease risk or where disease risk is increasing over time, thus allowing public health interventions to be focused on these areas. Such aims are well suited to the statistical approach of clustering, and while much research has been done in this area in a purely spatial setting, only a handful of approaches have focused on spatiotemporal clustering of disease risk. Therefore, this paper outlines a new modeling approach for clustering spatiotemporal disease risk data, by clustering areas based on both their mean risk levels and the behavior of their temporal trends. The efficacy of the methodology is established by a simulation study, and is illustrated by a study of respiratory disease risk in Glasgow, Scotland.
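As a simplified illustration of the two quantities being clustered, the sketch below summarises each area by its average risk and its least-squares temporal trend, and then clusters on those two features. The paper does this jointly within a Bayesian hierarchical model; k-means on plug-in summaries is only a stand-in, and the function name is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_mean_and_trend(risk, n_clusters=3):
    """risk: (n_areas, n_years) matrix of estimated disease risks."""
    years = np.arange(risk.shape[1])
    mean_risk = risk.mean(axis=1)                     # average risk level per area
    slope = np.polyfit(years, risk.T, deg=1)[0]       # least-squares trend per area
    features = np.column_stack([mean_risk, slope])
    features = (features - features.mean(0)) / features.std(0)   # standardise both features
    return KMeans(n_clusters=n_clusters, n_init=10,
                  random_state=0).fit_predict(features)
```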
Trend Analysis of Annual and Seasonal Rainfall in Tawa Command Area
The main objective of the study is to identify the trend in the annual rainfall time series as well as in the seasonal rainfall time series of the four rainy months, i.e. June, July, August and September, during the period 1971 to 2015. The annual and seasonal trends of rainfall were determined by the non-parametric Mann-Kendall test, and the non-parametric Sen's slope estimator was used to determine the magnitude of the trend. A functional relationship was developed between the variables using linear regression analysis in order to determine a linear trend of rainfall for the study area. Considering all the statistical test results, the study concludes that the study area has shown variability in the annual and seasonal rainfall pattern due to climatic variations. The seasonal trend analysis also suggests that there is a variation in the rainfall trend during the rainy months.
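A minimal sketch of the toolkit the study describes, applied to a synthetic annual series (the numbers below are made up for illustration): a Mann-Kendall style trend test via Kendall's tau against time, the Sen's slope (Theil-Sen) estimator, and an ordinary least-squares trend line.

```python
import numpy as np
from scipy import stats

years = np.arange(1971, 2016)
rain = 900 + 1.5 * (years - years[0]) + np.random.default_rng(1).normal(0, 80, years.size)

tau, p_value = stats.kendalltau(years, rain)                       # Mann-Kendall style trend test
sen_slope, sen_intercept, lo, hi = stats.theilslopes(rain, years)  # Sen's slope estimator
ols = stats.linregress(years, rain)                                # linear regression trend

print(f"Kendall tau = {tau:.3f}, p = {p_value:.3f}")
print(f"Sen's slope = {sen_slope:.2f} mm/year, OLS slope = {ols.slope:.2f} mm/year")
```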
Diversity driven Attention Model for Query-based Abstractive Summarization
Abstractive summarization aims to generate a shorter version of the document covering all the salient points in a compact and coherent fashion. On the other hand, query-based summarization highlights those points that are relevant in the context of a given query. The encode-attend-decode paradigm has achieved notable success in machine translation, extractive summarization, dialog systems, etc., but it suffers from the drawback of generating repeated phrases. In this work we propose a model for the query-based summarization task based on the encode-attend-decode paradigm with two key additions: (i) a query attention model (in addition to the document attention model) which learns to focus on different portions of the query at different time steps (instead of using a static representation for the query), and (ii) a new diversity-based attention model which aims to alleviate the problem of repeating phrases in the summary. In order to enable the testing of this model we introduce a new query-based summarization dataset built on Debatepedia. Our experiments show that with these two additions the proposed model clearly outperforms vanilla encode-attend-decode models with a gain of 28% (absolute) in ROUGE-L scores.
Comment: Accepted at ACL 2017
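A minimal numpy sketch of one way to read the diversity idea: after computing a standard attention context vector, subtract its component along the previous step's context so that successive contexts, and hence the phrases they trigger, are pushed apart. The shapes, the dot-product scoring and the orthogonalisation variant shown here are generic placeholders, not the paper's exact architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def diverse_context(enc_states, dec_state, prev_context=None):
    """enc_states: (T, d) encoder states; dec_state: (d,) current decoder state."""
    scores = enc_states @ dec_state             # dot-product attention scores
    alpha = softmax(scores)                     # attention weights over the document
    context = alpha @ enc_states                # standard context vector
    if prev_context is not None:                # diversity step: orthogonalise against
        proj = (context @ prev_context) / (prev_context @ prev_context + 1e-8)
        context = context - proj * prev_context # the previous context vector
    return context, alpha
```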
