
    Stochastic Discriminative EM

    Stochastic discriminative EM (sdEM) is an online-EM-type algorithm for discriminative training of probabilistic generative models belonging to the exponential family. In this work, we introduce and justify this algorithm as a stochastic natural gradient descent method, i.e. a method which accounts for the information geometry in the parameter space of the statistical model. We show how this learning algorithm can be used to train probabilistic generative models by minimizing different discriminative loss functions, such as the negative conditional log-likelihood and the hinge loss. The models trained by sdEM are always generative (i.e. they define a joint probability distribution) and, in consequence, can handle missing data and latent variables in a principled way, both during learning and when making predictions. The performance of this method is illustrated on several text classification problems, for which a multinomial naive Bayes classifier and a latent Dirichlet allocation based classifier are learned using different discriminative loss functions.
    Comment: UAI 2014 paper + Supplementary Material. In Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI 2014), edited by Nevin L. Zhang and Jian Tian. AUAI Pres

    Real-time datasets really do make a difference: definitional change, data release, and forecasting

    In this paper, the authors empirically assess the extent to which early release inefficiency and definitional change affect prediction precision. In particular, they carry out a series of ex-ante prediction experiments in order to examine: the marginal predictive content of the revision process; the trade-offs associated with predicting different releases of a variable; the importance of particular forms of definitional change, which the authors call "definitional breaks"; and the rationality of early releases of economic variables. An important feature of their rationality tests is that they are based solely on the examination of ex-ante predictions, rather than on in-sample regression analysis, as are many tests in the extant literature. Their findings point to the importance of making real-time datasets available to forecasters, because the revision process has marginal predictive content and because predictive accuracy increases when multiple releases of data are used in specifying and estimating prediction models. The authors also present new evidence that early releases of money are rational, whereas early releases of prices and output are irrational. Moreover, they find that regardless of which release of their price variable is specified as the "target" variable to be predicted, using only "first release" data in model estimation and prediction construction yields mean square forecast error (MSFE) "best" predictions. On the other hand, models estimated and implemented using "latest available release" data are MSFE-best for predicting all releases of money. The authors argue that these contradictory findings are due to the relevance of definitional breaks in the data generating processes of the variables that they examine. In an empirical analysis, they examine the real-time predictive content of money for income, and they find that vector autoregressions with money do not perform significantly worse than autoregressions when predicting output during the last 20 years.
    Economic forecasting; Econometrics

    Dephasing and Hyperfine Interaction in Carbon Nanotubes Double Quantum Dots: Disordered Case

    We study theoretically the return probability experiment, used to measure the dephasing time $T_2^*$, in a double quantum dot (DQD) in semiconducting carbon nanotubes (CNTs) with spin-orbit coupling and disorder-induced valley mixing. Dephasing is due to hyperfine interaction with the spins of the ${}^{13}$C nuclei. Due to the valley and spin degrees of freedom, four bound states exist for any given longitudinal mode in the quantum dot. At zero magnetic field, the spin-orbit coupling and the valley mixing split those four states into two Kramers doublets. The valley mixing term for a given dot is determined by the intra-dot disorder, and therefore the states in the Kramers doublets belonging to different dots are different. We show how nonzero single-particle interdot tunneling amplitudes between states belonging to different doublets give rise to new avoided crossings, as a function of detuning, in the relevant two-particle spectrum, which crosses over from the configuration with two electrons in one dot, (0,2), to the configuration with one electron in each dot, (1,1). In contrast to the clean system, multiple Landau-Zener processes affect the separation and the joining stages of each single-shot measurement, and they affect the outcome of the measurement in a way that strongly depends on the initial state. We find that a well-defined return probability experiment is realized when, at each single-shot cycle, the (0,2) ground state is prepared. In this case, valley mixing increases the saturation value of the measured return probability, whereas the probability to return to the (0,2) ground state remains unchanged. Finally, we study the effect of the valley mixing in the high magnetic field limit; for a parallel magnetic field the predictions coincide with those for a clean nanotube, while the disorder effect is always relevant for a magnetic field perpendicular to the nanotube axis.
    Comment: 22 pages, 11 figure

    Probabilistic Graphical Models on Multi-Core CPUs using Java 8

    In this paper, we discuss software design issues related to the development of parallel computational intelligence algorithms on multi-core CPUs, using the new Java 8 functional programming features. In particular, we focus on probabilistic graphical models (PGMs) and present the parallelisation of a collection of algorithms for inference and learning of PGMs from data, namely maximum likelihood estimation, importance sampling, and greedy search for solving combinatorial optimisation problems. Through these concrete examples, we tackle the problem of defining efficient data structures for PGMs and the parallel processing of same-size batches of data using Java 8 features. We also provide straightforward techniques to code parallel algorithms that seamlessly exploit multi-core processors. The experimental analysis, carried out using our open source AMIDST (Analysis of MassIve Data STreams) Java toolbox, shows the merits of the proposed solutions.
    Comment: Pre-print version of the paper presented in the special issue on Computational Intelligence Software at IEEE Computational Intelligence Magazine journa
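
    As a rough illustration of the kind of batch-parallel maximum likelihood estimation the abstract describes, the sketch below counts categorical observations per batch with a Java 8 parallel stream and normalises the merged counts. It is not taken from the paper or from the AMIDST toolbox: the class name `ParallelMleSketch`, the methods `countBatch`, `add`, and `estimate`, and the use of a single categorical variable are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not AMIDST code): maximum likelihood estimation of a
// categorical distribution from same-size batches using Java 8 streams.
public class ParallelMleSketch {

    // Sufficient statistics of one batch: how often each state occurs.
    static long[] countBatch(int[] batch, int numStates) {
        long[] counts = new long[numStates];
        for (int state : batch) {
            counts[state]++;
        }
        return counts;
    }

    // Element-wise sum of two count vectors. A new array is returned so the
    // identity value passed to reduce() is never mutated.
    static long[] add(long[] a, long[] b) {
        long[] out = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    // Batches are counted in parallel, merged, and normalised into the
    // maximum likelihood estimate (relative frequencies).
    static double[] estimate(List<int[]> batches, int numStates) {
        long[] total = batches.parallelStream()
                .map(batch -> countBatch(batch, numStates))
                .reduce(new long[numStates], ParallelMleSketch::add);
        long n = Arrays.stream(total).sum();
        return Arrays.stream(total).mapToDouble(c -> (double) c / n).toArray();
    }

    public static void main(String[] args) {
        List<int[]> batches = Arrays.asList(
                new int[] {0, 1, 1, 2},
                new int[] {1, 1, 0, 2});
        System.out.println(Arrays.toString(estimate(batches, 3)));
    }
}
```

    Because counting is associative, the per-batch count vectors can be merged in any order, which is what lets `reduce` run safely on a parallel stream in this sketch.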

    Luminous X-ray Flares from Low Mass X-ray Binary Candidates in the Early-Type Galaxy NGC 4697

    We report results of the first search specifically targeting short-timescale X-ray flares from low-mass X-ray binaries in an early-type galaxy. A new method for flare detection is presented. In NGC 4697, the nearest optically luminous, X-ray faint elliptical galaxy, 3 out of 157 sources are found to display flares at >99.95% probability, and all three show more than one flare. Two sources are coincident with globular clusters and show flare durations and luminosities similar to (but larger than) those of Type-I X-ray superbursts found in Galactic neutron star (NS) X-ray binaries (XRBs). The third source shows more extreme flares. Its flare luminosity (~6E39 erg/s) is very super-Eddington for an NS and is similar to the peak luminosities of the brightest Galactic black hole (BH) XRBs. However, the flare duration (~70 s) is much shorter than is typically seen for outbursts reaching those luminosities in Galactic BH sources. Alternative models for the flares are considered.
    Comment: Astrophysical Journal Letters, accepted: 4 page