11 research outputs found

    Plant identification using deep convolutional networks based on principal component analysis

    Plants have substantial effects on human vitality through their different uses in agriculture, the food industry, pharmacology, and climate control. The large number of herbs and plant species and the shortage of skilled botanists have increased the need for automated plant identification systems in recent years. As one of the challenging problems in object recognition, automatic plant identification aims to assign the plant in an image to a known taxon or species using machine learning and computer vision algorithms. However, this problem is challenging due to the inter-class similarities within a plant family and large intra-class variations in background, occlusion, pose, color, and illumination. In this thesis, we propose an automatic plant identification system based on deep convolutional networks. This system uses a simple baseline and applies principal component analysis (PCA) to patches of images to learn the network weights in an unsupervised learning approach. After multi-stage PCA filter banks are learned, a simple binary hashing is applied to the output maps and the resulting maps are subsampled through max-pooling. Finally, spatial pyramid pooling is applied to the downsampled data to extract features from block histograms. A multi-class linear support vector machine is then trained to classify the different species. The system performance is evaluated on the plant identification datasets of LifeCLEF 2014 in terms of classification accuracy, inverse rank score, and robustness against pose (translation, scaling, and rotation) and illumination variations. A comparison of our results with those of the top systems submitted to the LifeCLEF 2014 campaign reveals that our proposed system would have placed second in the categories of Entire, Branch, Fruit, Leaf, Scanned Leaf, and Stem, and third in the Flower category, while having a simpler architecture and lower computational complexity than the winning system(s). We achieved the best accuracy on scanned leaves, with an inverse rank score of 0.6157 and a classification accuracy of 68.25%.
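The binary hashing and block-histogram steps mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a PCANet-style formulation, in which each filter response map is thresholded at zero and the resulting bits are packed into one integer code per pixel. The function names (`binary_hash`, `block_histogram`) and the toy values are hypothetical and are not taken from the thesis; the max-pooling and spatial pyramid pooling stages are omitted.

```python
from collections import Counter


def binary_hash(maps):
    """Pack L same-sized filter response maps into one integer-coded map.

    Each response is binarized (1 if > 0, else 0) and used as one bit,
    so every pixel receives a code in the range [0, 2**L - 1].
    """
    rows, cols = len(maps[0]), len(maps[0][0])
    coded = [[0] * cols for _ in range(rows)]
    for bit, response in enumerate(maps):
        for r in range(rows):
            for c in range(cols):
                if response[r][c] > 0:
                    coded[r][c] += 1 << bit
    return coded


def block_histogram(coded, n_bits, top, left, height, width):
    """Count how often each of the 2**n_bits codes occurs inside one block."""
    counts = Counter(
        coded[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )
    return [counts.get(code, 0) for code in range(2 ** n_bits)]


# Toy input (assumed values): two 2x2 response maps from two filters.
maps = [
    [[0.5, -1.0], [2.0, -0.1]],
    [[-0.3, 0.7], [1.5, -2.0]],
]
coded = binary_hash(maps)
hist = block_histogram(coded, n_bits=2, top=0, left=0, height=2, width=2)
print(coded)
print(hist)
```

In a full system the histograms of all blocks would be concatenated into the feature vector passed to the linear classifier; this sketch covers a single block only.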

    Towards Multi-Level Classification in Deep Plant Identification

    Graduation thesis (Academic Doctorate in Engineering), Instituto Tecnológico de Costa Rica, 2018. In the last decade, automatic identification of organisms based on computer vision techniques has been a hot topic for both biodiversity scientists and machine learning specialists. Early on, plants became particularly attractive as a subject of study for two main reasons. On the one hand, quick and accurate inventories of plants are critical for biodiversity conservation; for example, they are indispensable in conducting ecosystem inventories, defining models for environmental service payments, and tracking populations of invasive plant species, among others. On the other hand, plants are a more tractable group than, for instance, insects. First of all, the number of species is smaller (around 400,000 compared to more than 8 million). Secondly, they are better understood by the scientific community, particularly with respect to their morphometric features. Thirdly, there are large, fast-growing databases of digital images of plants generated by both scientists and the general public. Finally, an incremental approach based first on "flat elements" such as leaves and then on the whole plant made it feasible to use computer vision techniques early on. As a result, even mobile apps for the general public are available nowadays. This document presents the key results obtained while tackling the general problem of fully automating the identification of plant species based solely on images. It describes the key findings in a research path that started with a restricted scope, namely, identification of plants from Costa Rica by using a morphometric approach that considers images of fresh leaves only. Then, species from other regions of the world were included, but still using hand-crafted feature extractors. A key methodological turn was the subsequent use of Deep Learning techniques on images of any components of a plant. We then compared, for the first time, the accuracy of a Deep Learning approach on datasets of fresh plant images with its accuracy on datasets of herbarium sheet images. Among the results obtained during this research, potential biases in automatic plant identification datasets were found and characterized. The feasibility of transfer learning between different regions of the world was also demonstrated. Even more importantly, it was demonstrated for the first time that herbarium sheets are a good resource for identifying plants mounted on herbarium sheets, which adds to the importance of herbaria around the globe. Finally, as a culmination of this research path, this document presents the results of developing a novel multi-level classification approach that uses knowledge about higher taxonomic levels not only to carry out family and genus level identifications but also to try to improve the accuracy of species level identifications. This last step focuses on the creation of a hierarchical loss function based on known plant taxonomies, coupled with multilevel Deep Learning architectures, to guide the model optimization with the prior knowledge of a given class hierarchy.
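The abstract above does not specify the form of its hierarchical loss, so the following is only a sketch of one common formulation, not the thesis's method. It assumes that genus and family probabilities are obtained by summing species probabilities over the taxonomy, and that the loss is a weighted sum of the negative log-likelihoods at the three levels. The toy taxonomy, the level weights, and all function names are hypothetical.

```python
import math

# Hypothetical toy taxonomy: 4 species, 3 genera, 2 families.
SPECIES_TO_GENUS = [0, 0, 1, 2]   # species index -> genus index
GENUS_TO_FAMILY = [0, 0, 1]       # genus index -> family index


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(probs, child_to_parent, n_parents):
    """Sum child probabilities into their parent classes."""
    out = [0.0] * n_parents
    for child, p in enumerate(probs):
        out[child_to_parent[child]] += p
    return out


def hierarchical_loss(species_logits, true_species, weights=(1.0, 0.5, 0.25)):
    """Weighted sum of cross-entropies at species, genus and family level.

    The weights are illustrative assumptions, not values from the thesis.
    """
    species_p = softmax(species_logits)
    genus_p = aggregate(species_p, SPECIES_TO_GENUS, len(GENUS_TO_FAMILY))
    family_p = aggregate(genus_p, GENUS_TO_FAMILY, max(GENUS_TO_FAMILY) + 1)
    true_genus = SPECIES_TO_GENUS[true_species]
    true_family = GENUS_TO_FAMILY[true_genus]
    w_species, w_genus, w_family = weights
    return -(
        w_species * math.log(species_p[true_species])
        + w_genus * math.log(genus_p[true_genus])
        + w_family * math.log(family_p[true_family])
    )


# True class is species 0. Confusing it with a sibling in the same genus
# should cost less than confusing it with a species from another family.
near_miss = hierarchical_loss([0.0, 5.0, 0.0, 0.0], true_species=0)
far_miss = hierarchical_loss([0.0, 0.0, 0.0, 5.0], true_species=0)
print(near_miss < far_miss)
```

Under this formulation a prediction that stays inside the correct genus or family is penalized less than one that leaves it, which is one way prior knowledge of the class hierarchy can guide optimization.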

    Unveiling the frontiers of deep learning: innovations shaping diverse domains

    Deep learning (DL) enables the development of computer models that are capable of learning, visualizing, optimizing, refining, and predicting data. In recent years, DL has been applied in a range of fields, including audio-visual data processing, agriculture, transportation prediction, natural language, biomedicine, disaster management, bioinformatics, drug design, genomics, face recognition, and ecology. To explore the current state of deep learning, it is necessary to investigate the latest developments and applications of deep learning in these disciplines. However, the literature is lacking in exploring the applications of deep learning in all potential sectors. This paper thus extensively investigates the potential applications of deep learning across all major fields of study as well as the associated benefits and challenges. As evidenced in the literature, DL exhibits accuracy in prediction and analysis, which makes it a powerful computational tool, and has the ability to articulate itself and optimize, making it effective in processing data with no prior training. Given its independence from training data, deep learning necessitates massive amounts of data for effective analysis and processing, much like data volume. To handle the challenge of compiling huge amounts of medical, scientific, healthcare, and environmental data for use in deep learning, gated architectures like LSTMs and GRUs can be utilized. For multimodal learning, shared neurons in the neural network for all activities and specialized neurons for particular tasks are necessary.
    Comment: 64 pages, 3 figures, 3 tables

    Large-scale Content-based Visual Information Retrieval

    Rather than restricting search to the use of metadata, content-based information retrieval methods attempt to index, search and browse digital objects by means of signatures or features describing their actual content. Such methods have been intensively studied in the multimedia community to allow managing the massive amount of raw multimedia documents created every day (e.g. video will account for 84% of U.S. internet traffic by 2018). Recent years have consequently witnessed a consistent growth of content-aware and multi-modal search engines deployed on massive multimedia data. Popular multimedia search applications such as Google Images, YouTube, Shazam, TinEye or MusicID clearly demonstrated that the first generation of large-scale audio-visual search technologies is now mature enough to be deployed on real-world big data. All these successful applications benefited greatly from 15 years of research on multimedia analysis and efficient content-based indexing techniques. Yet the maturity reached by the first generation of content-based search engines does not preclude an intensive research activity in the field. There are actually still many hard problems to be solved before we can retrieve any information in images or sounds as easily as we do in text documents. Content-based search methods have to reach a finer understanding of the contents as well as a higher semantic level. This requires modeling the raw signals by more and more complex and numerous features, so that the algorithms for analyzing, indexing and searching such features have to evolve accordingly. This thesis describes several of my works related to large-scale content-based information retrieval. The different contributions are presented in a bottom-up fashion reflecting a typical three-tier software architecture of an end-to-end multimedia information retrieval system. The lowest layer is only concerned with managing, indexing and searching large sets of high-dimensional feature vectors, whatever their origin or role in the upper levels (visual or audio features, global or part-based descriptions, low or high semantic level, etc.). The middle layer works at the document level and is in charge of analyzing, indexing and searching collections of documents. It typically extracts and embeds the low-level features, implements the querying mechanisms and post-processes the results returned by the lower layer. The upper layer works at the application level and is in charge of providing useful and interactive functionalities to the end-user. It typically implements the front-end of the search application, the crawler and the orchestration of the different indexing and search services.

    Collaborative Learning of Fine-grained Visual Data

    Problem: Deep learning based vision systems have achieved near-human accuracy in recognizing coarse object categories from visual data, but recognizing fine-grained sub-categories remains an open problem. Tasks like fine-grained species recognition pose further challenges: significant background variation compared to subtle differences between objects, high class imbalance due to scarcity of samples for endangered species, the cost of domain expert annotation and labeling, etc. Methodology: The existing approaches to learning from small specialized datasets, such as transfer learning, are still inadequate for fine-grained sub-categories. The hypothesis of this work is that collaborative filters should be incorporated into the present learning frameworks to better address these challenges. The intuition comes from the fact that collaborative representation based classifiers have previously been used for face recognition problems, which present similar challenges. Outcomes: With the above hypothesis in mind, the thesis achieves the following objectives: 1) it demonstrates the suitability of collaborative classifiers for fine-grained recognition; 2) it expands the state of the art by incorporating automated background suppression into the collaborative classification formulation; 3) it incorporates the collaborative cost function into supervised learning (deep convolutional networks) and unsupervised learning (clustering algorithms); 4) lastly, several benchmark fine-grained image datasets on NZ and Indian butterflies and bird species recognition were introduced during the work.

    Learning from small and imbalanced dataset of images using generative adversarial neural networks.

    The performance of deep learning models is unmatched by any other approach in supervised computer vision tasks such as image classification. However, training these models requires a lot of labeled data, which are not always available. Labelling a massive dataset is largely a manual and very demanding process. This problem has led to the development of techniques that bypass the need for labelling at scale. Despite this, existing techniques such as transfer learning, data augmentation and semi-supervised learning have not lived up to expectations. Some of these techniques do not account for other classification challenges, such as the class-imbalance problem. Thus, these techniques mostly underperform when compared with fully supervised approaches. In this thesis, we propose new methods to train a deep model on image classification with a limited number of labeled examples. This was achieved by extending state-of-the-art generative adversarial networks with multiple fake classes and network switchers. These new features enabled us to train a classifier using large unlabeled data, while generating class-specific samples. The proposed model is label agnostic and is suitable for different classification scenarios, ranging from weakly supervised to fully supervised settings. It was used to address classification challenges with limited labeled data and a class-imbalance problem. Extensive experiments were carried out on different benchmark datasets. Firstly, the proposed approach was used to train a classification model, and our findings indicated that it achieved better classification accuracies, especially when the number of labeled samples is small. Secondly, the proposed approach was able to generate high-quality samples from class-imbalanced datasets. The samples' quality is evident in the improved classification performance obtained when generated samples were used to neutralise class imbalance. The results are thoroughly analyzed and, overall, our method showed superior performance over popular resampling techniques and the AC-GAN model. Finally, we successfully applied the proposed approach as a new augmentation technique to two challenging real-world problems: face with attributes and legacy engineering drawings. The results obtained demonstrate that the proposed approach is effective even in extreme cases.

    Geographic information extraction from texts

    A large volume of unstructured texts containing valuable geographic information is available online. This information, whether provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction.

    Exploring Animal Behavior Through Sound: Volume 1

    This open-access book empowers its readers to explore the acoustic world of animals. By listening to the sounds of nature, we can study animal behavior, distribution, and demographics; their habitat characteristics and needs; and the effects of noise. Sound recording is an efficient and affordable tool, independent of daylight and weather, and recorders may be left in place for many months at a time, continuously collecting data on animals and their environment. This book builds the skills and knowledge necessary to collect and interpret acoustic data from terrestrial and marine environments. Beginning with a history of sound recording, the chapters provide an overview of off-the-shelf recording equipment and analysis tools (including automated signal detectors and statistical methods); audiometric methods; acoustic terminology, quantities, and units; sound propagation in air and under water; soundscapes of terrestrial and marine habitats; animal acoustic and vibrational communication; echolocation; and the effects of noise. This book will be useful to students and researchers of animal ecology who wish to add acoustics to their toolbox, as well as to environmental managers in industry and government.