
    Near-Optimal Algorithms for Differentially-Private Principal Components

    Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data in high dimension. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method. (37 pages, 8 figures; final version to appear in the Journal of Machine Learning Research; preliminary version was at NIPS 201)
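The abstract contrasts a new utility-optimized method with an existing procedure. A common baseline for differentially private PCA (not the paper's proposed method) is input perturbation: add symmetric Gaussian noise to the empirical second-moment matrix and eigendecompose the result. A minimal sketch, assuming rows are scaled to L2 norm at most 1 and using an illustrative Gaussian-mechanism noise scale:

```python
import numpy as np

def dp_pca_input_perturbation(X, k, epsilon, delta, rng=None):
    """Baseline (epsilon, delta)-DP PCA via input perturbation:
    perturb the empirical second-moment matrix with symmetric
    Gaussian noise, then return its top-k eigenvectors.
    Assumes each row of X has L2 norm <= 1, so changing one of the
    n rows moves the matrix by O(1/n) in Frobenius norm; the noise
    scale below is illustrative, not a tuned constant."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    A = X.T @ X / n                      # empirical second-moment matrix
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) / (n * epsilon)
    E = rng.normal(scale=sigma, size=(d, d))
    E = (E + E.T) / np.sqrt(2)           # symmetrize the noise
    vals, vecs = np.linalg.eigh(A + E)
    return vecs[:, np.argsort(vals)[::-1][:k]]   # top-k eigenvectors

# toy usage: 500 points in 10 dimensions, top-2 private subspace
X = np.random.default_rng(0).normal(size=(500, 10))
X /= np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1.0)  # norm bound
V = dp_pca_input_perturbation(X, k=2, epsilon=1.0, delta=1e-5, rng=1)
print(V.shape)  # (10, 2)
```

The utility gap the paper measures is between subspaces like `V` and the non-private top-k eigenvectors; as dimension `d` grows, the noise term increasingly dominates the signal, which is where the sample-complexity scaling matters.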

    Analyzing Learners Behavior in MOOCs: An Examination of Performance and Motivation Using a Data-Driven Approach

    Massive Open Online Courses (MOOCs) have been experiencing increasing use and popularity in highly ranked universities in recent years. The opportunity of accessing high quality courseware content within such platforms, while eliminating the burden of educational, financial and geographical obstacles, has led to a rapid growth in participant numbers. The increasing number and diversity of participating learners has opened up new horizons to the research community for the investigation of effective learning environments. Learning Analytics has been used to investigate the impact of engagement on student performance. However, an extensive literature review indicates that there is little research on the impact of MOOCs, particularly in analyzing the link between behavioral engagement and motivation as predictors of learning outcomes. In this study, we consider a dataset which originates from online courses provided by Harvard University and the Massachusetts Institute of Technology, delivered through the edX platform [1]. Two sets of empirical experiments are conducted using both statistical and machine learning techniques. Statistical methods are used to examine the association between engagement level and performance, including the consideration of learner educational backgrounds. The results indicate a significant gap between the success and failure outcome learner groups, where successful learners are found to read and watch course material to a higher degree. Machine learning algorithms are used to automatically detect learners who are lacking in motivation at an early time in the course, thus providing instructors with insight regarding student withdrawal.
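The early-detection task described above is, in essence, a supervised classifier trained on early-course engagement features to predict later withdrawal. A minimal sketch with invented synthetic data and hypothetical feature names (`videos`, `pages`), not the edX dataset or the authors' actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
videos = rng.poisson(8, n)                 # videos watched in early weeks
pages = rng.poisson(20, n)                 # course pages read in early weeks
# synthetic ground truth: low early engagement plus noise -> withdrawal
dropout = (videos + 0.4 * pages + rng.normal(0, 3, n) < 12).astype(float)

# standardize features, then fit logistic regression by gradient descent
feats = np.column_stack([(videos - videos.mean()) / videos.std(),
                         (pages - pages.mean()) / pages.std()])
X = np.column_stack([np.ones(n), feats])   # prepend intercept column
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))       # predicted withdrawal probability
    w -= 0.1 * X.T @ (p - dropout) / n     # gradient step on log-loss
acc = (((1.0 / (1.0 + np.exp(-X @ w))) > 0.5) == (dropout > 0.5)).mean()
print(f"training accuracy: {acc:.2f}")
```

In a real deployment the labels come from a completed prior course run, and the fitted model is applied to the current run's early clickstream so instructors can intervene before withdrawal happens.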

    Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method.

    We report an evaluation of the semi-empirical quantum chemical method PM7 from the perspective of uncertainty quantification. Specifically, we apply Bound-to-Bound Data Collaboration, an uncertainty quantification framework, to characterize (a) the variability of PM7 model parameter values consistent with the uncertainty in the training data and (b) uncertainty propagation from the training data to the model predictions. Experimental heats of formation of a homologous series of linear alkanes are used as the property of interest. The training data are chemically accurate, i.e., they have very low uncertainty by the standards of computational chemistry. The analysis does not find evidence of PM7 consistency with the entire data set considered, as no single set of parameter values is found that captures the experimental uncertainties of all training data. A set of parameter values for PM7 was able to capture the training data within ±1 kcal/mol, but not to the smaller level of uncertainty in the reported data. Nevertheless, PM7 was found to be consistent for subsets of the training data. In such cases, uncertainty propagation from the chemically accurate training data to the predicted values preserves error within the bounds of chemical accuracy if predictions are made for molecules of comparable size. Otherwise, the error grows linearly with the relative size of the molecules.
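The dataset-consistency question at the heart of this analysis is a feasibility problem: does any point in parameter space produce predictions inside every experimental error bar? A toy analogue, using an invented linear surrogate f(n) = a + b·n for heat of formation versus carbon count in place of PM7, with fabricated data purely for illustration:

```python
import numpy as np

# carbon counts and invented "measured" heats of formation (kcal/mol);
# the third point deviates slightly from the linear trend on purpose
n_c = np.array([1, 2, 3, 4, 5, 6])
y = np.array([-18.0, -23.0, -28.5, -33.0, -38.0, -43.0])
u_tight, u_loose = 0.2, 1.0          # two candidate uncertainty bounds

# brute-force scan of the (a, b) parameter box of the surrogate model
a_grid, b_grid = np.meshgrid(np.linspace(-20.0, -5.0, 301),
                             np.linspace(-7.0, -2.0, 301))
pred = a_grid[..., None] + b_grid[..., None] * n_c   # predictions per grid point

def consistent(u):
    # the feasible set is non-empty iff some parameter pair fits
    # every data point to within the uncertainty bound u
    return bool(np.any(np.all(np.abs(pred - y) <= u, axis=-1)))

print(consistent(u_tight), consistent(u_loose))  # False True
```

This mirrors the abstract's finding: the toy model is consistent with the data at a ±1 kcal/mol tolerance but inconsistent at the tighter, "chemically accurate" tolerance, because no single parameter pair can absorb the mid-series deviation. Bound-to-Bound Data Collaboration poses the same question for PM7 via constrained optimization rather than a grid scan.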