Marginal AMP Chain Graphs
We present a new family of models that is based on graphs that may have
undirected, directed and bidirected edges. We name these new models marginal
AMP (MAMP) chain graphs because each of them is Markov equivalent to some AMP
chain graph under marginalization of some of its nodes. However, MAMP chain
graphs do not only subsume AMP chain graphs but also multivariate regression
chain graphs. We describe global and pairwise Markov properties for MAMP chain
graphs and prove their equivalence for compositional graphoids. We also
characterize when two MAMP chain graphs are Markov equivalent.
For Gaussian probability distributions, we also show that every MAMP chain
graph is Markov equivalent to some directed and acyclic graph with
deterministic nodes under marginalization and conditioning on some of its
nodes. This is important because it implies that the independence model
represented by a MAMP chain graph can be accounted for by some data generating
process that is partially observed and has selection bias. Finally, we modify
MAMP chain graphs so that they are closed under marginalization for Gaussian
probability distributions. This is a desirable feature because it guarantees
parsimonious models under marginalization.
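The defining feature of the graphs above is that they mix three edge types. A minimal Python sketch of such a mixed-graph structure (the class name and API are illustrative, not taken from the paper):

```python
# Hypothetical minimal representation of a graph with undirected ("--"),
# directed ("->") and bidirected ("<->") edges, as used by MAMP chain graphs.
class MixedGraph:
    def __init__(self):
        self.edges = set()  # triples (u, v, kind)

    def add(self, u, v, kind):
        assert kind in ("--", "->", "<->")
        # undirected and bidirected edges are symmetric; store them canonically
        if kind in ("--", "<->") and u > v:
            u, v = v, u
        self.edges.add((u, v, kind))

    def neighbors(self, node):
        """Nodes joined to `node` by an undirected edge."""
        return {v if u == node else u
                for (u, v, kind) in self.edges
                if kind == "--" and node in (u, v)}

g = MixedGraph()
g.add("a", "b", "--")
g.add("b", "c", "->")
g.add("c", "d", "<->")
```

This stores only the edge set; Markov properties and equivalence checks from the paper would be built on top of such a structure.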
Transfer pricing rules, OECD guidelines, and market distortions
We study the impact of transfer pricing rules on sales prices, firms' organizational structure, and consumers' utility within a two-country monopolistic competition model featuring source-based profit taxes that differ across countries. Firms can either become multinationals, i.e., they serve the foreign market through a fully controlled affiliate; or they can become exporters, i.e., they serve the foreign market by contracting with an independent distributor. Compared to the benchmark cases, where tax authorities are either unable to audit firms or where they are able to audit them perfectly, the use of the OECD's Comparable Uncontrolled Price (CUP) or Cost-Plus (CP) rule distorts firms' output and pricing decisions. The reason is that the comparable arm's length transactions between exporters and distributors, which serve as benchmarks, are not efficient. We show that implementing the CUP or CP rules is detrimental to consumers in the low-tax country, yet benefits consumers in the high-tax country.
Keywords: transfer pricing, OECD guidelines, multinationals and exporters, organizational choice, arm's length principle
Multilevel Bayesian framework for modeling the production, propagation and detection of ultra-high energy cosmic rays
Ultra-high energy cosmic rays (UHECRs) are atomic nuclei with energies over
ten million times energies accessible to human-made particle accelerators.
Evidence suggests that they originate from relatively nearby extragalactic
sources, but the nature of the sources is unknown. We develop a multilevel
Bayesian framework for assessing association of UHECRs and candidate source
populations, and Markov chain Monte Carlo algorithms for estimating model
parameters and comparing models by computing, via Chib's method, marginal
likelihoods and Bayes factors. We demonstrate the framework by analyzing
measurements of 69 UHECRs observed by the Pierre Auger Observatory (PAO) from
2004-2009, using a volume-complete catalog of 17 local active galactic nuclei
(AGN) out to 15 megaparsecs as candidate sources. An early portion of the data
("period 1," with 14 events) was used by PAO to set an energy cut maximizing
the anisotropy in period 1; the 69 measurements include this "tuned" subset,
and subsequent "untuned" events with energies above the same cutoff. Also,
measurement errors are approximately summarized. These factors are problematic
for independent analyses of PAO data. Within the context of "standard candle"
source models (i.e., with a common isotropic emission rate), and considering
only the 55 untuned events, there is no significant evidence favoring
association of UHECRs with local AGN vs. an isotropic background. The
highest-probability associations are with the two nearest, adjacent AGN,
Centaurus A and NGC 4945. If the association model is adopted, the fraction of
UHECRs that may be associated is likely nonzero but is well below 50%. Our
framework enables estimation of the angular scale for deflection of cosmic rays
by cosmic magnetic fields; relatively modest deflection scales are favored.
Models that assign a large fraction of UHECRs to a
single nearby source (e.g., Centaurus A) are ruled out unless very large
deflection scales are specified a priori, and even then they are disfavored.
However, including the period 1 data alters the conclusions significantly, and
a simulation study supports the idea that the period 1 data are anomalous,
presumably due to the tuning. Accurate and optimal analysis of future data will
likely require more complete disclosure of the data.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS654 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
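Chib's method rests on the basic marginal likelihood identity, log m(y) = log p(y | θ*) + log p(θ*) − log p(θ* | y), which holds at any evaluation point θ*. A minimal sketch for a conjugate normal model with known variance, where the posterior ordinate is available in closed form (all variable names are illustrative; the paper applies the method to a far richer hierarchical model):

```python
import math

def log_norm_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def chib_log_marginal(y, sigma2, mu0, tau02, theta_star):
    """Chib's identity: log m(y) = log lik + log prior - log posterior,
    evaluated at an arbitrary point theta_star."""
    n = len(y)
    ybar = sum(y) / n
    # conjugate posterior for the normal mean with known variance sigma2
    post_var = 1.0 / (1.0 / tau02 + n / sigma2)
    post_mean = post_var * (mu0 / tau02 + n * ybar / sigma2)
    log_lik = sum(log_norm_pdf(yi, theta_star, sigma2) for yi in y)
    log_prior = log_norm_pdf(theta_star, mu0, tau02)
    log_post = log_norm_pdf(theta_star, post_mean, post_var)
    return log_lik + log_prior - log_post
```

Because the identity is exact, the result is the same for any choice of θ*; in non-conjugate models the posterior ordinate is instead estimated from MCMC output, which is where the method earns its keep.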
Symbolic Planning and Code Generation for Grounded Dialogue
Large language models (LLMs) excel at processing and generating both text and
code. However, LLMs have had limited applicability in grounded task-oriented
dialogue as they are difficult to steer toward task objectives and fail to
handle novel grounding. We present a modular and interpretable grounded
dialogue system that addresses these shortcomings by composing LLMs with a
symbolic planner and grounded code execution. Our system consists of a reader
and planner: the reader leverages an LLM to convert partner utterances into
executable code, calling functions that perform grounding. The translated
code's output is stored to track dialogue state, while a symbolic planner
determines the next appropriate response. We evaluate our system's performance
on the demanding OneCommon dialogue task, involving collaborative reference
resolution on abstract images of scattered dots. Our system substantially
outperforms the previous state-of-the-art, including improving task success in
human evaluations from 56% to 69% in the most challenging setting.
Comment: Accepted to EMNLP 202
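The reader/planner decomposition described above can be caricatured in a few lines. In this hypothetical sketch, the reader (a stand-in for the LLM) maps an utterance to executable code over grounding functions, the executed result is stored as dialogue state, and a symbolic planner picks the next response; every name here is illustrative, not the paper's API:

```python
# Toy grounding context: named dots with (x, y) coordinates.
dots = {"a": (0.1, 0.2), "b": (0.5, 0.5), "c": (0.9, 0.1)}

def select_left_of(x_max):
    """Grounding function: dots whose x-coordinate is below a threshold."""
    return {k for k, (x, _) in dots.items() if x < x_max}

def reader(utterance):
    """Stand-in for the LLM reader: translate an utterance into code."""
    if "left" in utterance:
        return "select_left_of(0.5)"
    return "set()"

def planner(state):
    """Stand-in for the symbolic planner: choose a response from state."""
    candidates = state.get("partner_dots", set())
    return f"confirm {sorted(candidates)}" if candidates else "ask clarification"

state = {}
code = reader("the dot on the left")
state["partner_dots"] = eval(code)  # grounded code execution (sketch only)
response = planner(state)
```

The real system replaces the keyword match with LLM-generated code and the one-line planner with symbolic lookahead over the dialogue task.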
Graphical Markov models, unifying results and their interpretation
Graphical Markov models combine conditional independence constraints with
graphical representations of stepwise data generating processes. The models
started to be formulated about 40 years ago and vigorous development is
ongoing. Longitudinal observational studies as well as intervention studies are
best modeled via a subclass called regression graph models and, especially
traceable regressions. Regression graphs include two types of undirected graph
and directed acyclic graphs in ordered sequences of joint responses. Response
components may correspond to discrete or continuous random variables and may
depend exclusively on variables which have been generated earlier. These
aspects are essential when causal hypotheses are the motivation for the
planning of empirical studies.
To turn the graphs into useful tools for tracing developmental pathways and
for predicting structure in alternative models, the generated distributions
have to mimic some properties of joint Gaussian distributions. Here, relevant
results concerning these aspects are spelled out and illustrated by examples.
With regression graph models, it becomes feasible, for the first time, to
derive structural effects of (1) ignoring some of the variables, of (2)
selecting subpopulations via fixed levels of some other variables or of (3)
changing the order in which the variables might get generated. Thus, the most
important future applications of these models will aim at the best possible
integration of knowledge from related studies.
Comment: 34 pages, 11 figures, 1 table
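The structural effect of ignoring a variable is visible already in the smallest linear Gaussian example: in a chain X → Y → Z, marginalizing over Y leaves X and Z correlated, while conditioning on Y removes the dependence. A sketch with assumed coefficients and unit noise variances (all numbers are illustrative):

```python
def partial_cov(cxz, cxy, cyz, vy):
    """Covariance of X and Z after conditioning on Y (Gaussian case)."""
    return cxz - cxy * cyz / vy

# Linear Gaussian chain: X ~ N(0,1); Y = b*X + e1; Z = c*Y + e2.
b, c, s1, s2 = 0.8, 0.5, 1.0, 1.0  # assumed path coefficients and noise variances
vx = 1.0
vy = b * b * vx + s1          # Var(Y)
vz = c * c * vy + s2          # Var(Z)
cxy = b * vx                  # Cov(X, Y)
cxz = b * c * vx              # Cov(X, Z): nonzero after marginalizing out Y
cyz = c * vy                  # Cov(Y, Z)
cxz_given_y = partial_cov(cxz, cxy, cyz, vy)  # zero: X ⊥ Z | Y
```

Regression graph models make such effects derivable directly from the graph, without recomputing covariances case by case.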
A Theory of Emergent In-Context Learning as Implicit Structure Induction
Scaling large language models (LLMs) leads to an emergent capacity to learn
in-context from example demonstrations. Despite progress, theoretical
understanding of this phenomenon remains limited. We argue that in-context
learning relies on recombination of compositional operations found in natural
language data. We derive an information-theoretic bound showing how in-context
learning abilities arise from generic next-token prediction when the
pretraining distribution has sufficient amounts of compositional structure,
under linguistically motivated assumptions. A second bound provides a
theoretical justification for the empirical success of prompting LLMs to output
intermediate steps towards an answer. To validate theoretical predictions, we
introduce a controlled setup for inducing in-context learning; unlike previous
approaches, it accounts for the compositional nature of language. Trained
transformers can perform in-context learning for a range of tasks, in a manner
consistent with the theoretical results. Mirroring real-world LLMs in a
miniature setup, in-context learning emerges when scaling parameters and data,
and models perform better when prompted to output intermediate steps. Probing
shows that in-context learning is supported by a representation of the input's
compositional structure. Taken together, these results provide a step towards
theoretical understanding of emergent behavior in large language models.