Foundational principles for large scale inference: Illustrations through correlation mining
When can reliable inference be drawn in the "Big Data" context? This paper
presents a framework for answering this fundamental question in the context of
correlation mining, with implications for general large scale inference. In
large scale data applications like genomics, connectomics, and eco-informatics
the dataset is often variable-rich but sample-starved: a regime where the
number of acquired samples (statistical replicates) is far fewer than the
number of observed variables (genes, neurons, voxels, or chemical
constituents). Much recent work has focused on understanding the
computational complexity of proposed methods for "Big Data." Sample complexity,
however, has received relatively less attention, especially in the setting where
the sample size is fixed and the dimension grows without bound. To
address this gap, we develop a unified statistical framework that explicitly
quantifies the sample complexity of various inferential tasks. Sampling regimes
can be divided into several categories: 1) the classical asymptotic regime
where the variable dimension is fixed and the sample size goes to infinity; 2)
the mixed asymptotic regime where both variable dimension and sample size go to
infinity at comparable rates; 3) the purely high dimensional asymptotic regime
where the variable dimension goes to infinity and the sample size is fixed.
Each regime has its niche but only the latter regime applies to exa-scale data
dimension. We illustrate this high dimensional framework for the problem of
correlation mining, where it is the matrix of pairwise and partial correlations
among the variables that is of interest. We demonstrate various regimes of
correlation mining based on the unifying perspective of high dimensional
learning rates and sample complexity for different structured covariance models
and different inference tasks.
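The contrast between these sampling regimes can be made concrete with a short simulation. The following sketch (parameters chosen purely for illustration, not taken from the paper) draws n samples of p uncorrelated variables with p much larger than n and screens the sample correlation matrix for large entries, the basic correlation-mining operation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sample-starved regime: far more variables p than
# statistical replicates n (both values are arbitrary choices here).
n, p = 10, 1000
X = rng.standard_normal((n, p))  # null model: all variables uncorrelated

# Sample correlation matrix of the p variables from only n replicates.
R = np.corrcoef(X, rowvar=False)

# Correlation mining: screen for variable pairs whose sample correlation
# exceeds a threshold. Under the null, spurious discoveries accumulate
# as p grows with n fixed, which is why sample complexity matters in the
# purely high dimensional regime.
rho = 0.8
iu = np.triu_indices(p, k=1)
hits = int(np.sum(np.abs(R[iu]) > rho))
print(f"pairs with |corr| > {rho}: {hits} of {iu[0].size}")
```

With n this small, even a high threshold yields false discoveries among the roughly half-million candidate pairs; the same threshold with n in the thousands would yield essentially none.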
A SYSTEMIC FUNCTIONAL ANALYSIS ON JAVANESE POLITENESS: TAKING SPEECH LEVEL INTO MOOD STRUCTURE
Speech level is an important aspect of Javanese grammar, comparable to, among other
features, tense in English. Thus, the involvement of speech level in any study of
Javanese grammar is highly necessary. On the other hand, speech level must also be
studied from the grammatical point of view. So far, however, grammatical studies of
the Javanese speech level are very scarce, if any exist at all. Most major studies
of Javanese speech level are sociolinguistic, lexical-taxonomic, or prescriptive.
This is probably because the idea of speech level as merely a social phenomenon has
been taken for granted. As a result, taking the speech level system into a
grammatical analysis seems hardly possible. It is assumed, however, that this
apparent impossibility holds only within the formal tradition of grammatical study,
which has neglected the social aspect of language. Hence, it is necessary to look
for an alternative grammatical approach that can cope with speech level both
grammatically and socially. A
particular approach of grammar which involves social context is systemic functional grammar (SFG).
SFG proposes that language has three functional components. One of them is the
interpersonal function. This function sees language as an interaction between
addresser and addressee: language is used for enacting participants' roles and the
relations among them. The interpersonal function is expressed through a particular
grammatical structure, namely the mood structure. This article presents a
demonstration of systemic functional analysis of the Javanese speech level by
taking it into the mood structure analysis. In addition, this paper aims at two
kinds of potential significance. First, it could provide an adequate description of
the grammaticalization of the Javanese speech level. Second, it can serve as a
typological supplement for SFG in dealing with languages that employ a speech
level system.
DESIGN OF AN INTERDEPARTMENTAL INFORMATION SYSTEM INTEROPERABILITY APPLICATION
This service application design for the interoperability of government agency information systems is intended to connect several government information systems running on different platforms (in terms of operating system, programming language, and database), so that these information systems can exchange data with one another through a web/Internet-based service using the web service concept. The interoperability application design is used to connect several agencies, namely Kependudukan (Depdagri), Kesehatan (Depkes), Bappenas, and BPS (Badan Pusat Statistik), so that they can exchange data and information. Each agency can use this application to send its data in the agreed standard format, namely XML, to a national repository server. From this server, each agency can view the data of the other agencies and use it as needed. In this study, an interoperability application design is proposed and tested using data originating from the four agencies above. The resulting software consists of three applications: an xmlCreator application, an xmlReader application, and a web-based information system. In addition to menus presenting the data of each agency, the information system also presents a list of XML files and web services.
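As a rough illustration of the roles of the xmlCreator and xmlReader applications described above, the following Python sketch serializes an agency's records to XML and parses them back. The record fields, values, and function names are hypothetical, since the abstract does not specify the agreed standard format:

```python
import xml.etree.ElementTree as ET

def create_xml(agency: str, records: list[dict]) -> str:
    """Serialize an agency's records to a shared XML exchange format
    (the role of the xmlCreator application)."""
    root = ET.Element("dataset", attrib={"agency": agency})
    for rec in records:
        row = ET.SubElement(root, "record")
        for key, value in rec.items():
            ET.SubElement(row, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

def read_xml(payload: str) -> list[dict]:
    """Parse a payload fetched from the national repository server
    (the role of the xmlReader application)."""
    root = ET.fromstring(payload)
    return [{child.tag: child.text for child in row} for row in root]

# Hypothetical example record from one agency.
xml_doc = create_xml("BPS", [{"region": "Jawa Tengah", "population": 36516035}])
print(read_xml(xml_doc))
```

In the design described above, the string produced by `create_xml` would be uploaded to the national repository server, and other agencies would apply `read_xml` to payloads downloaded from it.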
Exponential Strong Converse for Successive Refinement with Causal Decoder Side Information
We consider the K-user successive refinement problem with causal decoder
side information and derive an exponential strong converse theorem. The
rate-distortion region for the problem can be derived as a straightforward
extension of the two-user case by Maor and Merhav (2008). We show that for any
rate-distortion tuple outside the rate-distortion region of the K-user
successive refinement problem with causal decoder side information, the joint
excess-distortion probability approaches one exponentially fast. Our proof
follows by judiciously adapting the recently proposed strong converse technique
by Oohama using the information spectrum method, the variational form of the
rate-distortion region, and Hölder's inequality. The lossy source coding
problem with causal decoder side information considered by El Gamal and
Weissman is a special case (K = 1) of the current problem. Therefore, the
exponential strong converse theorem for the El Gamal and Weissman problem
follows as a corollary of our result.
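In schematic form (our notation, not necessarily the paper's, with the joint excess-distortion event taken as at least one decoder exceeding its distortion level), the exponential strong converse asserts the existence of a positive exponent:

```latex
% For any rate-distortion tuple outside the rate-distortion region
% \mathcal{R}, there is an exponent E > 0 such that, at blocklength n,
\Pr\bigl\{\, d_k(X^n, \hat{X}_k^n) > D_k \ \text{for some } k \,\bigr\}
  \;\geq\; 1 - \mathrm{e}^{-nE},
% so the joint excess-distortion probability approaches one
% exponentially fast in n.
```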
On Measure Transformed Canonical Correlation Analysis
In this paper, linear canonical correlation analysis (LCCA) is generalized by
applying a structured transform to the joint probability distribution of the
considered pair of random vectors, i.e., a transformation of the joint
probability measure defined on their joint observation space. This framework,
called measure transformed canonical correlation analysis (MTCCA), applies LCCA
to the data after transformation of the joint probability measure. We show that
judicious choice of the transform leads to a modified canonical correlation
analysis, which, in contrast to LCCA, is capable of detecting non-linear
relationships between the considered pair of random vectors. Unlike kernel
canonical correlation analysis, where the transformation is applied to the
random vectors, in MTCCA the transformation is applied to their joint
probability distribution. This results in performance advantages and reduced
implementation complexity. The proposed approach is illustrated for graphical
model selection in simulated data having non-linear dependencies, and for
measuring long-term associations between companies traded on the NASDAQ and
NYSE stock markets.
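To make the contrast with LCCA concrete, here is a toy sketch, not the paper's algorithm or its choice of transform: a quadratic dependence that linear CCA misses becomes visible after reweighting the empirical joint measure, here by an exponential tilt of one coordinate (an arbitrary illustrative transform):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data with a non-linear (quadratic) dependence that ordinary linear
# CCA cannot detect: y depends on x**2, so cov(x, y) is essentially zero.
n = 4000
x = rng.standard_normal((n, 1))
y = x**2 + 0.1 * rng.standard_normal((n, 1))

def cca_first_corr(x, y, w=None):
    """First canonical correlation computed from (optionally weighted)
    sample covariances. With w=None this is plain linear CCA (LCCA)."""
    if w is None:
        w = np.ones(len(x))
    w = w / w.sum()
    mx, my = w @ x, w @ y
    xc, yc = x - mx, y - my
    cxx = (w * xc.T) @ xc          # weighted covariance of x
    cyy = (w * yc.T) @ yc          # weighted covariance of y
    cxy = (w * xc.T) @ yc          # weighted cross-covariance
    m = np.linalg.solve(cxx, cxy) @ np.linalg.solve(cyy, cxy.T)
    return float(np.sqrt(np.max(np.linalg.eigvals(m).real)))

# Plain LCCA sees almost no correlation.
rho_lcca = cca_first_corr(x, y)

# Transform the empirical joint measure by reweighting each sample;
# the exponential tilt of the x-coordinate breaks the symmetry that
# hides the quadratic relationship from linear correlation.
w = np.exp(x[:, 0])
rho_mtcca = cca_first_corr(x, y, w)

print(f"first canonical correlation, LCCA:  {rho_lcca:.3f}")
print(f"first canonical correlation, MTCCA: {rho_mtcca:.3f}")
```

With one-dimensional vectors the first canonical correlation reduces to the absolute correlation coefficient; the point of the sketch is only that reweighting the joint measure, rather than transforming the vectors themselves as in kernel CCA, can expose a non-linear relationship to an otherwise linear analysis.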