295 research outputs found

    Extracting predictive models from marked-up free-text documents at the Royal Botanic Gardens, Kew, London

    In this paper we explore the combination of text mining, unsupervised learning and supervised learning to extract predictive models from a corpus of digitised historical floras. These documents deal with the nomenclature, geographical distribution, ecology and comparative morphology of the species of a region. Here we exploit the fact that portions of text in the floras are marked up as different types of trait and habitat. From these texts we infer models that predict habitat types from the traits of plant species. We also integrate plant taxonomy data to assist in the validation of our models. We have shown that by clustering text describing the habitat of different floras we can identify a number of important and distinct habitats that are associated with particular families of species, along with statistical significance scores. We have also shown that by using these discovered habitat types as labels for supervised learning we can predict them from a subset of traits, identified using wrapper feature selection.

    Stochastic Budget Optimization in Internet Advertising

    Internet advertising is a sophisticated game in which the many advertisers "play" to optimize their return on investment. There are many "targets" for the advertisements, and each "target" has a collection of games with a potentially different set of players involved. In this paper, we study the problem of how advertisers allocate their budget across these "targets". In particular, we focus on formulating their best response strategy as an optimization problem. Advertisers have a set of keywords ("targets") and some stochastic information about the future, namely a probability distribution over scenarios of cost vs. click combinations. This summarizes the potential states of the world assuming that the strategies of other players are fixed. The best response can then be abstracted as a family of stochastic budget optimization problems: how to spread a given budget across these keywords to maximize the expected number of clicks. We present the first known non-trivial poly-logarithmic approximation for these problems, as well as the first known hardness results showing that approximation ratios better than logarithmic in the various parameters involved are hard to obtain. We also identify several special cases of practical interest, such as a fixed number of scenarios or polynomial-sized parameters related to cost, which are solvable either in polynomial time or with improved approximation ratios. Stochastic budget optimization with scenarios has a sophisticated technical structure. Our approximation and hardness results come from relating these problems to a special type of (0/1, bipartite) quadratic programs inherent in them. Our research answers some open problems raised by the authors in (Stochastic Models for Budget Optimization in Search-Based Advertising, Algorithmica, 58 (4), 1022-1044, 2010).
    Comment: FINAL version
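To make the objective described in this abstract concrete, here is a minimal brute-force sketch in Python. It is only a toy reading of "spread a given budget across keywords to maximize the expected number of clicks", not the paper's formal model or its approximation algorithm. The function names, the scenario format, and in particular the rule that a partial spend buys a proportional share of a keyword's clicks are all assumptions made for illustration.

```python
from itertools import product


def expected_clicks(allocation, scenarios):
    """Expected clicks for a per-keyword budget allocation.

    scenarios: list of (probability, [(cost, clicks), ...]) with one
    (cost, clicks) pair per keyword.
    ASSUMPTION (not from the paper): spending b on a keyword whose
    scenario cost is c buys the fraction min(1, b / c) of its clicks.
    """
    total = 0.0
    for prob, outcomes in scenarios:
        for spend, (cost, clicks) in zip(allocation, outcomes):
            fraction = 1.0 if cost == 0 else min(1.0, spend / cost)
            total += prob * clicks * fraction
    return total


def best_allocation(budget, scenarios, step=1):
    """Exhaustively search integer allocations with total spend <= budget.

    Exponential in the number of keywords; a toy baseline only.
    """
    n_keywords = len(scenarios[0][1])
    levels = range(0, budget + 1, step)
    best_value, best_alloc = 0.0, tuple([0] * n_keywords)
    for alloc in product(levels, repeat=n_keywords):
        if sum(alloc) > budget:
            continue
        value = expected_clicks(alloc, scenarios)
        if value > best_value:
            best_value, best_alloc = value, alloc
    return best_value, best_alloc


if __name__ == "__main__":
    # Two keywords, two equally likely scenarios (hypothetical numbers).
    toy = [
        (0.5, [(2, 10), (4, 4)]),
        (0.5, [(4, 10), (2, 4)]),
    ]
    print(best_allocation(4, toy))
```

Exhaustive search is exponential in the number of keywords, which is the practical reason approximation algorithms and tractable special cases, as studied in the paper, matter.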

    Self-dual noncommutative \phi^4-theory in four dimensions is a non-perturbatively solvable and non-trivial quantum field theory

    We study quartic matrix models with partition function Z[E,J]=\int dM \exp(trace(JM-EM^2-(\lambda/4)M^4)). The integral is over the space of Hermitean NxN-matrices, the external matrix E encodes the dynamics, \lambda>0 is a scalar coupling constant and the matrix J is used to generate correlation functions. For E not a multiple of the identity matrix, we prove a universal algebraic recursion formula which gives all higher correlation functions in terms of the 2-point function and the distinct eigenvalues of E. The 2-point function itself satisfies a closed non-linear equation which must be solved case by case for given E. These results imply that if the 2-point function of a quartic matrix model is renormalisable by mass and wavefunction renormalisation, then the entire model is renormalisable and has vanishing \beta-function. As main application we prove that Euclidean \phi^4-quantum field theory on four-dimensional Moyal space with harmonic propagation, taken at its self-duality point and in the infinite volume limit, is exactly solvable and non-trivial. This model is a quartic matrix model, where E has for N->\infty the same spectrum as the Laplace operator in 4 dimensions. Using the theory of singular integral equations of Carleman type we compute (for N->\infty and after renormalisation of E,\lambda) the free energy density (1/volume)\log(Z[E,J]/Z[E,0]) exactly in terms of the solution of a non-linear integral equation. Existence of a solution is proved via the Schauder fixed point theorem. The derivation of the non-linear integral equation relies on an assumption which we verified numerically for coupling constants 0<\lambda\leq (1/\pi).
    Comment: LaTeX, 64 pages, xypic figures. v4: We prove that recursion formulae and vanishing of \beta-function hold for general quartic matrix models. v3: We add the existence proof for a solution of the non-linear integral equation. A rescaling of matrix indices was necessary. v2: We provide Schwinger-Dyson equations for all correlation functions and prove an algebraic recursion formula for their solution
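For readability, the two inline formulas quoted in this abstract can be typeset as follows. This is only a transcription of the expressions given in the text; the symbol V for the volume is notation introduced here.

```latex
% Partition function of the quartic matrix model
% (integral over Hermitian N x N matrices M)
\[
  Z[E,J] = \int dM \,
  \exp\!\Big( \operatorname{tr}\big( JM - EM^{2} - \tfrac{\lambda}{4} M^{4} \big) \Big),
  \qquad \lambda > 0 .
\]
% Free energy density, computed for N -> infinity after renormalisation
\[
  \frac{1}{V} \, \log \frac{Z[E,J]}{Z[E,0]} .
\]
```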

    To be or not to be? What molecules say about Runcina brenkoae Thompson, 1980 (Gastropoda: Heterobranchia: Runcinida)

    Runcinids are poorly known, minute marine slugs inhabiting intertidal and shallow subtidal rocky shores. Among the European species, Runcina brenkoae, described from the Adriatic Sea in the Mediterranean, has been reported to display chromatic variability, which calls into question the true identity and geographic distribution of the species. In this paper we investigate the taxonomic status of R. brenkoae based on specimens from the central and western Mediterranean Sea and the southern Iberian coastline of Portugal and Spain. We follow an integrative approach that combines multi-locus molecular phylogenetics, based on the mitochondrial markers cytochrome c oxidase subunit I and 16S rRNA and the nuclear gene histone H3, with the study of morpho-anatomical characters investigated by scanning electron microscopy. To aid in species delimitation, the Automatic Barcode Gap Discovery and Bayesian Poisson tree process methods were employed. Our results indicate the existence of a complex of three species previously identified as R. brenkoae, namely two new species here described (R. marcosi n. sp. and R. lusitanica n. sp.) and R. brenkoae proper.

    Probing Primordial Non-Gaussianity with Large-Scale Structure

    We consider primordial non-Gaussianity due to quadratic corrections in the gravitational potential, parametrized by a non-linear coupling parameter fnl. We study constraints on fnl from measurements of the galaxy bispectrum in redshift surveys. Using estimates for idealized survey geometries of the 2dF and SDSS surveys and realistic ones from SDSS mock catalogs, we show that it is possible to probe |fnl|~100, after marginalization over bias parameters. We apply our methods to the galaxy bispectrum measured from the PSCz survey, and obtain a 2-sigma constraint |fnl| < 1800. We estimate that an all-sky redshift survey up to z~1 can probe |fnl|~1. We also consider the use of cluster abundance to constrain fnl and find that in order to be sensitive to |fnl|~100, cluster masses need to be determined with an accuracy of a few percent, assuming perfect knowledge of the mass function and cosmological parameters.
    Comment: 15 pages, 7 figures
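The abstract does not spell out what "quadratic corrections in the gravitational potential" means. The block below gives the form in which this parametrization is usually written in the literature (the local-type model). It is included as an assumption about the intended definition rather than as a quotation from the paper, and the symbols \Phi and \phi are notation introduced here.

```latex
% Local-type primordial non-Gaussianity (standard convention, assumed here):
% \phi is a Gaussian random field, f_NL the non-linear coupling parameter.
\[
  \Phi(\mathbf{x}) = \phi(\mathbf{x})
  + f_{\mathrm{NL}} \left( \phi^{2}(\mathbf{x}) - \langle \phi^{2} \rangle \right).
\]
```

With this form the leading non-Gaussian signature appears in three-point statistics, which is consistent with the abstract's use of the galaxy bispectrum to constrain fnl.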

    Parity nonconservation in deuteron photoreactions

    We calculate the parity-nonconserving asymmetries in two reactions: deuteron photodisintegration by circularly polarized photons (gamma + d -> n + p), with the photon laboratory energy ranging from threshold up to 10 MeV, and the radiative capture of thermal polarized neutrons by protons (n + p -> gamma + d). We use the leading-order electromagnetic Hamiltonian, neglecting the smaller nuclear exchange currents. Comparative calculations are done using the Reid93 and Argonne v18 potentials for the strong interaction and the DDH and FCDH "best" values for the weak couplings in a weak one-meson exchange potential. A weak N-Delta transition potential is used to also incorporate the Delta(1232)-isobar excitation in the coupled-channels formalism.
    Comment: 14 pages, 13 figures (18 eps files), LaTeX2