    A Latent Dirichlet Framework for Relevance Modeling

    Abstract. Relevance-based language models operate by estimating the probabilities of observing words in documents relevant (or pseudo-relevant) to a topic. However, these models assume that if a document is relevant to a topic, then all tokens in the document are relevant to that topic. This could limit model robustness and effectiveness. In this study, we propose a Latent Dirichlet relevance model, which relaxes this assumption. Our approach derives from current research on Latent Dirichlet Allocation (LDA) topic models. LDA has been extensively explored, especially for generating a set of topics from a corpus. A key attraction is that in LDA a document may be about several topics. LDA itself, however, has a limitation that is also addressed in our work. Topics generated by LDA from a corpus are synthetic, i.e., they do not necessarily correspond to topics identified by humans for the same corpus. In contrast, our model explicitly considers the relevance relationships between documents and given topics (queries). Thus, unlike standard LDA, our model is directly applicable to goals such as relevance feedback for query modification and text classification, where topics (classes and queries) are provided upfront. Although the focus of our paper is on improving relevance-based language models, in effect our approach bridges relevance-based language models and LDA, addressing limitations of both. Finally, we propose an idea that takes advantage of the “bag-of-words” assumption to reduce the complexity of the Gibbs-sampling-based learning algorithm.

    Effect of degree of salinity on seed germination and initial growth of chickpea (Cicer arietinum)

    Chickpea (Cicer arietinum L.) is one of the main pulse crops, cultivated mostly in the arid and semi-arid regions of the world, very often on saline lands. It has not yet been clearly determined what degree of salinity is safe for obtaining uniform and vigorous sprouts of the crop without significant suppression of initial growth and development. The goal of our study was to determine the effect of different NaCl concentrations in solution on chickpea germination and initial growth, and thereby the safe degree of salinity for cultivating the crop. The study was carried out in greenhouse conditions at Kherson State Agrarian University. We studied the effect of five gradually increasing NaCl concentrations on the germination percentage and initial growth of chickpea (variety Rosanna, kabuli type) germinated in laboratory conditions in flasks filled with sand at a temperature of 25 °C. A significant decrease in all the studied parameters was observed as salinity increased. However, we consider that a considerable decrease in germination and initial growth started at an NaCl concentration of 1.79 g/L: germination percentage decreased by 33.9%, plant height by 7.8 cm, and root length by 5.5 cm in comparison to the control variant (non-saline conditions). Therefore, we conclude that chickpea can be efficiently cultivated on slightly saline lands. In addition, linear regression analysis revealed that germination is the most susceptible stage of chickpea growth and development, because this stage was strongly related to the degree of salinity. Further growth of the crop was less affected by salinity stress. We recommend cultivating chickpea on saline lands only where the salinity level is slight.

    Grain yield of common bean depending on soil tillage, mineral fertilizers, and row spacing under irrigation

    Relevance. The attitude of agricultural producers toward leguminous crops is changing, first of all toward winter pea, chickpea, lentil, and haricot. At present, the sown area under haricot is expanding mainly in the private sector. Industrial cultivation is insignificant and is not expanding, mainly because too little is known about the elements of haricot cultivation technology in different agro-climatic zones. Different aspects of haricot cultivation technology have recently been studied by Saiko O.Yu., Horova T.K., Cherkasova V.K., Akulenko V.V., Bakhmat M.I., Omae H., Kumar A., and Egawa E. However, the effect of tillage depth, fertilizers, and inter-row spacing on haricot productivity under conditions of energy and financial crisis has not been fully studied for the irrigated conditions of the South Steppe zone of Ukraine, which determined the need for this study.
    Methods. The research on improving the elements of haricot bean cultivation technology in the south of Ukraine was carried out as a three-factor field experiment on the territory of the agricultural cooperative «Radianska zemlia» in Bilozerskyi district of Kherson region. The field experiments were repeated four times. The variants were arranged by the split-plot method with partial randomization. The following factors and variants were examined. Factor A, basic soil tillage: 20-22 cm deep; 28-30 cm deep. Factor B, nutrition background: no fertilizer; N45P45; N90P90. Factor C, row spacing, cm: 15; 30; 45; 60. The studies followed the common methodology of field experiments. Precipitation during the crop's vegetation period was measured with a rain gauge installed at the research plot. Temperature and relative air humidity were taken from the data of the Kherson regional meteorological station. Crop yields were established by the method of entire harvesting. Grain yield data were recalculated to the standard moisture of 14% and one hundred percent purity. Yield data were assessed using agronomic criteria and statistically processed.
    Results. The investigated elements of haricot bean cultivation technology have a significant effect on crop productivity under irrigation from the Ingulets main canal, whose water belongs to the second quality grade (limited suitability). On average in 2014-2016, grain yields ranged from 1.47 to 3.37 t/ha. Tillage 28-30 cm deep ensured better conditions for the growth and development of haricot plants, which affected the yield level. On average over the years of the research, it increased the yield by 0.02 to 0.08 t/ha compared to tillage 20-22 cm deep. According to the analysis of variance, however, these figures are within the limits of error (LSD (P<0.05) over the years of the research was 0.04-0.05 t/ha for factor A). With tillage 20-22 cm deep, the unfertilized variants yielded 1.95 t/ha on average over the years of the research. Applying nitrogen-phosphorus fertilizers at 45 kg/ha of active substance increased haricot grain yields by 0.41-0.78 t/ha. Doubling the amount of active substance did not provide a similar increase in grain yield: the gain compared to the unfertilized variants was 0.54-0.78 t/ha, and compared to the previous variant it was only 6.1%. The multi-year data show that the maximum productivity of haricot plants was recorded at a row spacing of 45 cm. As the row spacing widened from 15 to 45 cm, the crop yield increased, on average, from 1.79 to 2.97 t/ha with tillage 20-22 cm deep and from 1.81 to 3.04 t/ha with tillage 28-30 cm deep. A further increase in row spacing to 60 cm led to a significant reduction in crop yield.
    Conclusions. The research conducted with haricot under irrigation with water of the second quality grade (the Ingulets irrigated area) during 2014-2016 showed that the highest productivity, 3.37 t/ha, is formed with tillage 28-30 cm deep, mineral fertilizers applied at the rate N90P90, and a row spacing of 45 cm. Taking the analysis of variance into account, we maintain that the most suitable elements of the agro-technological complex for introduction into production are tillage 20-22 cm deep, mineral fertilizers applied at the rate N45P45, and sowing with a row spacing of 45 cm. These technological elements will provide a haricot grain yield of 3.09 t/ha. The article presents the results of multi-year field research on the productivity of common bean plants depending on soil tillage, mineral fertilizers, and row spacing under irrigation in the conditions of the Southern Steppe of Ukraine. The experimental data and conclusions are analyzed and confirmed by analysis of variance. The elements of the crop's cultivation technology have been improved, making it possible to obtain high grain yields of common bean.
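
The recalculation of grain yield to 14% standard moisture and one hundred percent purity mentioned in the methods above can be illustrated with the commonly used dry-matter conversion. This is a minimal sketch only: the abstract does not state the exact formula the authors applied, and the function name, parameters, and sample values below are hypothetical.

```python
def to_standard_yield(yield_t_ha, moisture_pct, purity_pct=100.0,
                      standard_moisture_pct=14.0):
    """Convert a harvested yield to standard moisture and 100% purity.

    Assumption: the usual dry-matter conversion is used; the source
    abstract does not give the formula.
    """
    # Ratio of dry matter at harvest moisture to dry matter at standard moisture.
    dry_matter_ratio = (100.0 - moisture_pct) / (100.0 - standard_moisture_pct)
    # Purity scales the yield down to the share of clean grain.
    return yield_t_ha * (purity_pct / 100.0) * dry_matter_ratio


# Hypothetical sample: 2.0 t/ha harvested at 18% moisture, fully clean grain.
example = to_standard_yield(2.0, moisture_pct=18.0)
print(round(example, 3))  # 1.907
```

A yield measured at exactly 14% moisture and 100% purity is returned unchanged, which is a quick sanity check on the conversion.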

    A framework for evaluating automatic image annotation algorithms

    Several Automatic Image Annotation (AIA) algorithms have been introduced recently and have been found to outperform previous models. However, each of them has been evaluated using different descriptors, collections or parts of collections, or "easy" settings. This renders their results non-comparable, and we show that collection-specific properties, rather than the actual models, are responsible for the high reported performance measures. In this paper we introduce a framework for the evaluation of image annotation models, which we use to evaluate two state-of-the-art AIA algorithms. Our findings reveal that a simple Support Vector Machine (SVM) approach using global MPEG-7 features outperforms state-of-the-art AIA models across several collection settings. It seems that these models depend heavily on the set of features and the data used, and that it is easy to exploit collection-specific properties, such as tag popularity, especially in the commonly used Corel 5K dataset, and still achieve good performance.

    Should Part M lead to more inclusive designs?: built environment professionals' perspective

    An inclusive built environment design should reflect the fact that most people experience changes in their level of ability during the different stages of life. The design should facilitate greater participation and inclusion of people of all ages and abilities by providing accessible and usable environments. Unfortunately, some built environments pose challenges with regard to accessibility and usability for people with a range of impairments. The current Part M of the Building Regulations and the associated Approved Document set out the basic minimum statutory requirement and suggest reasonable provision to ensure buildings are accessible and usable. An e-survey of 104 construction professionals, such as building control officers, planners, and building surveyors, revealed a greater need for built environment professionals to engage with and understand the inclusive design perspective. This is because compliance with Part M of the Building Regulations does not necessarily cater to the needs of users with all types of impairment.

    Comparing Probabilistic Models for Melodic Sequences

    Modeling the real-world complexity of music is a challenge for machine learning. We address the task of modeling melodic sequences from the same music genre. We perform a comparative analysis of two probabilistic models: a Dirichlet Variable Length Markov Model (Dirichlet-VMM) and a Time Convolutional Restricted Boltzmann Machine (TC-RBM). We show that the TC-RBM learns descriptive music features, such as underlying chords and typical melody transitions and dynamics. We assess the models for future prediction and compare their performance to a VMM, which is the current state of the art in melody generation. We show that both models perform significantly better than the VMM, with the Dirichlet-VMM marginally outperforming the TC-RBM. Finally, we evaluate the short-order statistics of the models, using the Kullback-Leibler divergence between test sequences and model samples, and show that our proposed methods match the statistics of the music genre significantly better than the VMM. Comment: in Proceedings of ECML-PKDD 2011. Lecture Notes in Computer Science, vol. 6913, pp. 289-304. Springer (2011).
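
The evaluation step in the abstract above, comparing short-order statistics of test sequences and model samples with the Kullback-Leibler divergence, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes the statistics are bigram (adjacent-note) frequencies with add-one smoothing, and the function names and toy melodies are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def bigram_distribution(sequences, alphabet, smoothing=1.0):
    """Smoothed distribution over adjacent symbol pairs.

    Assumption: bigrams stand in for the 'short-order statistics';
    the abstract does not say which orders were used.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    pairs = list(product(alphabet, repeat=2))
    total = sum(counts[p] for p in pairs) + smoothing * len(pairs)
    return {p: (counts[p] + smoothing) / total for p in pairs}


def kl_divergence(p, q):
    """KL(p || q) in nats; p and q share keys and are strictly positive."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p)


# Toy data (hypothetical): melodies encoded as pitch letters.
alphabet = ["C", "D", "E", "G"]
test_sequences = [list("CDECDEG"), list("EDCEDCG")]
model_samples_a = [list("CDECDEG"), list("EDCEDCG")]  # same transitions as the test set
model_samples_b = [list("GGGGCCC"), list("CCCGGGG")]  # different transitions

p_test = bigram_distribution(test_sequences, alphabet)
kl_a = kl_divergence(p_test, bigram_distribution(model_samples_a, alphabet))
kl_b = kl_divergence(p_test, bigram_distribution(model_samples_b, alphabet))
print(kl_a, kl_b)
```

A lower divergence means the model's samples reproduce the transition statistics of the test melodies more closely; here `kl_a` is zero because the two sets of sequences are identical, while `kl_b` is positive.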

    “A term is known by the company it keeps”: On selecting a good expansion set in pseudo-relevance feedback

    Abstract. It is well known that pseudo-relevance feedback (PRF) generally improves the retrieval performance of Information Retrieval (IR) systems. However, a recent study by Cao et al. [3] has shown that a non-negligible fraction of the expansion terms used by PRF algorithms are harmful to retrieval. In other words, a PRF algorithm would be better off using only a subset of the feedback terms. The challenge then is to find a good expansion set among all candidate expansion terms. A natural approach is to make a term independence assumption and use one or more term selection criteria, or a statistical classifier, to identify good expansion terms independently of each other. In this work, we challenge this approach and show empirically that, in general, a feedback term is neither good nor bad in itself; the behavior of a term depends very much on the other expansion terms. Our finding implies that, in general, a good expansion set cannot be found under a term independence assumption. As a principled solution to the problem, we propose spectral partitioning of expansion terms using a specific term-term interaction matrix. We demonstrate on several test collections that expansion terms can be partitioned into two sets and that the better of the two sets gives substantial improvements in retrieval performance over model-based feedback.
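
The idea of splitting expansion terms into two sets by spectral partitioning can be illustrated with a generic two-way cut on the sign of the Fiedler vector of a graph Laplacian. This is a sketch under assumptions: the abstract does not define its term-term interaction matrix, so a toy symmetric, non-negative matrix is used, and the terms, weights, and function name are hypothetical. The eigenvector is found by shifted power iteration to keep the example dependency-free.

```python
import math


def fiedler_partition(weights, iterations=500):
    """Split node indices in two using the sign of the Fiedler vector.

    Assumes `weights` is a symmetric, non-negative matrix describing a
    connected graph.
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    # Graph Laplacian L = D - W.
    laplacian = [[(degrees[i] if i == j else 0.0) - weights[i][j]
                  for j in range(n)] for i in range(n)]
    # 2 * max degree bounds the largest eigenvalue of L, so power iteration
    # on (shift * I - L) favours the smallest eigenvalues of L.
    shift = 2.0 * max(degrees)
    v = [i - (n - 1) / 2.0 for i in range(n)]  # non-constant start vector
    for _ in range(iterations):
        w = [shift * v[i] - sum(laplacian[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]  # project out the constant eigenvector
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    positive = {i for i in range(n) if v[i] >= 0}
    return positive, set(range(n)) - positive


# Toy interaction matrix (hypothetical): strong ties inside each group of
# three terms, weak ties across the groups.
terms = ["jaguar", "car", "engine", "cat", "habitat", "prey"]
W = [[0.0 if i == j else (1.0 if (i < 3) == (j < 3) else 0.1)
      for j in range(6)] for i in range(6)]
set_a, set_b = fiedler_partition(W)
print(sorted(terms[i] for i in set_a), sorted(terms[i] for i in set_b))
```

Which of the two resulting sets is the better expansion set still has to be decided separately, for example by retrieval performance as in the abstract.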

    Sparse Kernel Learning for Image Annotation

    In this paper we introduce a sparse kernel learning framework for the Continuous Relevance Model (CRM). State-of-the-art image annotation models linearly combine evidence from several different feature types to improve image annotation accuracy. While previous authors have focused on learning the linear combination weights for these features, there has been no work examining the optimal combination of kernels. We address this gap by formulating a sparse kernel learning framework for the CRM, dubbed the SKL-CRM, that greedily selects an optimal combination of kernels. Our kernel learning framework rapidly converges to an annotation accuracy that substantially outperforms a host of state-of-the-art annotation models. We draw two surprising conclusions: firstly, if the kernels are chosen correctly, only a very small number of features are required to achieve superior performance over models that utilise a full suite of feature types; and secondly, the standard default selection of kernels commonly used in the literature is sub-optimal, and it is much better to adapt the kernel choice based on the feature type and image dataset.

    Study of the prevalence of hypersensitivity to β-lactam antibiotics among the population of Ukraine

    Allergic reactions to β-lactam antibiotics are the most common cause of adverse drug reactions mediated by specific immunological mechanisms. The purpose of this work was to determine the prevalence of hypersensitivity to β-lactam antibiotics among the population by studying anamnestic data and conducting allergological examinations, in order to improve the safety of antibiotic therapy and the pharmacoeconomic profile of treatment.

    An information-theoretic framework for semantic-multimedia retrieval

    This article is set in the context of searching text and image repositories by keyword. We develop a unified probabilistic framework for text, image, and combined text and image retrieval that is based on the detection of keywords (concepts) using automated image annotation technology. Our framework is deeply rooted in information theory and lends itself to use with other media types. We estimate a statistical model in a multimodal feature space for each possible query keyword. The key element of our framework is to identify feature space transformations that make them comparable in complexity and density. We select the optimal multimodal feature space with a minimum description length criterion from a set of candidate feature spaces that are computed with the average-mutual-information criterion for the text part and hierarchical expectation maximization for the visual part of the data. We evaluate our approach in three retrieval experiments (text-only retrieval, image-only retrieval, and text combined with image retrieval), verify the framework’s low computational complexity, and compare it with existing state-of-the-art ad-hoc models.