
    A Rapidly Deployable Classification System using Visual Data for the Application of Precision Weed Management

    In this work we demonstrate a rapidly deployable weed classification system that uses visual data to enable autonomous precision weeding without making prior assumptions about which weed species are present in a given field. Previous work in this area relies on prior knowledge of the weed species present in the field. This assumption does not hold for every field, and it limits the use of weed classification systems that depend on it. We remove this assumption and introduce a rapidly deployable approach able to operate on any field without any weed species assumptions prior to deployment. We present a three-stage pipeline for our weed classification system, consisting of initial field surveillance, offline processing and selective labelling, and automated precision weeding. The key characteristic of our approach is the combination of plant clustering and selective labelling, which is what enables our system to operate without prior weed species knowledge. In tests on field data, we label 12.3 times fewer images than traditional full labelling whilst reducing classification accuracy by only 14%. Comment: 36 pages, 14 figures, published Computers and Electronics in Agriculture Vol. 14
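The combination of clustering and selective labelling described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the plain k-means routine, the function names `kmeans` and `selective_labels`, and the `ask_label` callback standing in for a human annotator are all assumptions made for illustration. The idea shown is only that one representative per cluster is labelled by hand and the label is propagated to the rest of the cluster.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centres.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centres))
        )
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        assign = [
            min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centres[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign, centres


def selective_labels(points, k, ask_label):
    """Label one representative per cluster and propagate it to the cluster.

    ask_label(i) is a hypothetical callback that returns the human-provided
    label for the sample with index i.
    """
    assign, centres = kmeans(points, k)
    label_of_cluster = {}
    for j in range(k):
        members = [i for i, a in enumerate(assign) if a == j]
        if not members:
            continue
        # Representative: the member closest to the cluster centre.
        rep = min(members, key=lambda i: math.dist(points[i], centres[j]))
        label_of_cluster[j] = ask_label(rep)
    return [label_of_cluster[a] for a in assign], len(label_of_cluster)


# Toy 2-D "plant features": two well-separated groups.
features = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
truth = ["crop", "crop", "crop", "weed", "weed", "weed"]
queried = []


def annotator(i):
    queried.append(i)  # record which samples a human had to label
    return truth[i]


predicted, n_queries = selective_labels(features, 2, annotator)
```

In this toy case only two of the six samples are sent to the annotator. Real image features rarely separate this cleanly, so propagated labels will contain errors, which is consistent with the accuracy cost the abstract reports.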

    Multivariate Approaches to Classification in Extragalactic Astronomy

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is no exception and is now facing a deluge of data. For galaxies, the century-old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications, most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We emphasize the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies. Comment: Open Access paper. http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract. <10.3389/fspas.2015.00003>

    Clustering constrained by dependencies

    Clustering is the unsupervised method of grouping data samples to form a partition of a given dataset. Such grouping is typically based on homogeneity assumptions about clusters over an attribute space, and hence the precise definition of the similarity metric affects the clusters inferred. In recent years, new formulations of clustering have emerged that posit indirect constraints on clustering, typically in terms of preserving dependencies between data samples and auxiliary variables. These formulations find applications in bioinformatics, web mining, social network analysis, and many other domains. The purpose of this survey is to provide a gentle introduction to these formulations, their mathematical assumptions, and the contexts under which they are applicable.

    Objective Classification of Galaxy Spectra using the Information Bottleneck Method

    A new method for classification of galaxy spectra is presented, based on a recently introduced information theoretical principle, the 'Information Bottleneck'. For any desired number of classes, galaxies are classified such that the information content about the spectra is maximally preserved. The result is classes of galaxies with similar spectra, where the similarity is determined via a measure of information. We apply our method to approximately 6000 galaxy spectra from the ongoing 2dF redshift survey, and a mock-2dF catalogue produced by a Cold Dark Matter-based semi-analytic model of galaxy formation. We find a good match between the mean spectra of the classes found in the data and in the models. For the mock catalogue, we find that the classes produced by our algorithm form an intuitively sensible sequence in terms of physical properties such as colour, star formation activity, morphology, and internal velocity dispersion. We also show the correlation of the classes with the projections resulting from a Principal Component Analysis. Comment: submitted to MNRAS, 17 pages, LaTeX, with 14 figures embedded
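For orientation, the Information Bottleneck principle named in this abstract is usually stated as a trade-off between compression and preserved information. The form below is the generic textbook statement, not necessarily the exact formulation or algorithm used in the paper; the symbols $G$ (galaxies), $C$ (classes), $\Lambda$ (spectral bins) and the trade-off parameter $\beta$ are notation assumed here for illustration.

```latex
% Generic Information Bottleneck objective (assumed notation):
% compress galaxies G into classes C while keeping information
% about the spectral variable \Lambda.
\min_{p(c \mid g)} \; \mathcal{L} \;=\; I(C;G) \;-\; \beta \, I(C;\Lambda),
\qquad
I(C;\Lambda) \;=\; \sum_{c,\lambda} p(c,\lambda)\,
  \log \frac{p(c,\lambda)}{p(c)\,p(\lambda)} .
```

With a fixed number of classes, as in the abstract, the goal amounts to choosing the assignment of galaxies to classes that keeps $I(C;\Lambda)$ as large as possible.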

    Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project

    In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, dynamically changing multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings and labels them in terms of interconnected “hyper-events” (a notion inspired by hyper-texts). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks. Index Terms: Networked multimedia events; audio processing: speech recognition; speaker diarization and linking; multimedia indexing and searching; hyper-events.

    Compression and Classification Methods for Galaxy Spectra in Large Redshift Surveys

    Methods for compression and classification of galaxy spectra, which are useful for large galaxy redshift surveys (such as the SDSS, 2dF, 6dF and VIRMOS), are reviewed. In particular, we describe and contrast three methods: (i) Principal Component Analysis, (ii) Information Bottleneck, and (iii) Fisher Matrix. We show applications to 2dF galaxy spectra and to mock semi-analytic spectra, and we discuss how these methods can be used to study physical processes of galaxy formation, clustering and galaxy biasing in the new large redshift surveys. Comment: Review talk, proceedings of MPA/MPE/ESO Conference "Mining the Sky", 2000, Garching, Germany; 20 pages, 5 figures
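As a rough illustration of the first method listed above, the sketch below compresses a set of toy "spectra" to a single number each using the leading principal component. It is a simplified stand-in, not code from the review: the function names, the power-iteration approach, and the three-bin toy spectra are assumptions chosen to keep the example dependency-free.

```python
import math


def leading_component(spectra, iters=200):
    """Return (mean spectrum, leading principal direction, per-spectrum scores).

    Uses power iteration on X^T X, where X holds the mean-subtracted spectra.
    """
    n, d = len(spectra), len(spectra[0])
    mean = [sum(s[j] for s in spectra) / n for j in range(d)]
    centred = [[s[j] - mean[j] for j in range(d)] for s in spectra]
    # Arbitrary non-zero start vector; power iteration stalls only if it is
    # exactly orthogonal to the leading direction.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iters):
        proj = [sum(row[j] * v[j] for j in range(d)) for row in centred]
        w = [sum(proj[i] * centred[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all spectra identical: no variance to capture
        v = [x / norm for x in w]
    scores = [sum(row[j] * v[j] for j in range(d)) for row in centred]
    return mean, v, scores


def reconstruct(mean, v, score):
    """Approximate a spectrum from its single compressed coefficient."""
    return [m + score * x for m, x in zip(mean, v)]


# Toy spectra with three wavelength bins; they differ only in overall scale.
toy_spectra = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
mean, direction, scores = leading_component(toy_spectra)
rebuilt = [reconstruct(mean, direction, s) for s in scores]
```

Because the toy spectra vary along a single direction, one coefficient per spectrum reconstructs them to within rounding error. Real spectra need more components, and the number retained sets the compression level.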