222 research outputs found

    An uncertainty prediction approach for active learning - application to earth observation

    Mapping land cover and land-use dynamics is crucial in remote sensing, since farmers are encouraged to either intensify or extend crop use due to the ongoing rise in the world’s population. A major issue in this area is interpreting and classifying a scene captured in high-resolution satellite imagery. Several methods have been put forth, including neural networks, which generate data-dependent models (i.e. the model is biased toward the data), and static rule-based approaches with thresholds, which are limited in terms of diversity (i.e. the model lacks diversity in terms of rules). However, the problem of building a machine learning model that, given a large amount of training data, can classify multiple classes over Sentinel-2 imagery from different geographic areas and outperform existing approaches remains open. On the other hand, supervised machine learning has become an essential part of many areas due to the increasing number of labeled datasets. Examples include classifiers for applications that recognize images and voices, anticipate traffic, propose products, act as virtual personal assistants and detect online fraud, among many more. Since these classifiers are highly dependent on their training datasets, without human interaction or accurate labels their performance on unseen observations is uncertain. Thus, researchers have attempted to evaluate a number of independent models using a statistical distance. However, the problem of identifying a prediction error from the relation between train and test sets, given a train-test split and classifiers modeled over the train set, remains open. Moreover, while some training data is essential for supervised machine learning, what happens if there is insufficient labeled data? After all, assigning labels to unlabeled datasets is a time-consuming process that may need significant expert human involvement. 
When there are not enough expert manual labels for the vast amount of openly available data, active learning becomes crucial. However, given a large amount of training and unlabeled data, having an active learning model that reduces the training cost of the classifier and at the same time assists in labeling new data points remains an open problem. Based on the experimental approaches and findings, the main research contributions, which concentrate on optical satellite image scene classification, include: building labeled Sentinel-2 datasets with surface reflectance values; proposing machine learning models for pixel-based image scene classification; proposing a statistical-distance-based Evidence Function Model (EFM) to detect misclassification by ML models; and proposing a generalised sampling approach for active learning that, together with the EFM, provides a way of determining the most informative examples. Firstly, using a manually annotated Sentinel-2 dataset, Machine Learning (ML) models for scene classification were developed and their performance was compared to Sen2Cor, the reference package from the European Space Agency; the ML model attained a micro-F1 value of 84%, a significant improvement over the corresponding Sen2Cor performance of 59%. Secondly, to quantify the misclassification of the ML models, the Mahalanobis distance-based EFM was devised. For the labeled Sentinel-2 dataset, this model achieved a micro-F1 of 67.89% for misclassification detection. Lastly, the EFM was engineered as a sampling strategy for active learning, leading to an approach that attains the same level of accuracy with only 0.02% of the total training samples when compared to a classifier trained with the full training set. 
With the help of the above-mentioned research contributions, we were able to provide an open-source Sentinel-2 image scene classification package consisting of ready-to-use Python scripts and an ML model that classifies Sentinel-2 L1C images, generating a 20 m resolution RGB image with the six studied classes (Cloud, Cirrus, Shadow, Snow, Water, and Other) and giving academics a straightforward method for rapidly and effectively classifying Sentinel-2 scene images. Additionally, an active learning approach that uses, as its sampling strategy, the observed prediction uncertainty given by the EFM will allow labeling only the most informative points to be used as input to build classifiers. Summary (translated from the Portuguese "Sumário"): An Uncertainty Prediction Approach for Active Learning: Application to Earth Observation. Land cover mapping and land-use dynamics are crucial in remote sensing, since farmers are encouraged to intensify or extend their crops due to the continuous growth of the world population. An important issue in this area is interpreting and classifying scenes captured in high-resolution satellite images. Several approaches have been proposed, including neural networks, which produce data-dependent models (that is, the model is biased toward the data), and rule-based approaches, which have diversity restrictions (that is, the model lacks diversity in terms of rules). However, creating a machine learning model that, given a large amount of training data, can classify Sentinel-2 images from different geographic areas with superior performance remains an open problem. On the other hand, supervised learning techniques have been used to solve problems in a wide range of areas owing to the proliferation of labeled datasets. 
Examples include classifiers for applications that recognize images and voice, anticipate traffic, propose products, act as virtual personal assistants and detect online fraud, among many others. Since these classifiers depend heavily on the training dataset, without human interaction or accurate labels their performance on new data is uncertain. In this regard, there are proposals to evaluate independent models using a statistical distance. However, the problem of identifying the prediction error from the relation between those sets, given a train-test split and a classifier, remains open. Furthermore, although some training data is essential for supervised learning, what happens when the amount of labeled data is insufficient? After all, assigning labels is a time-consuming process that requires expertise, which translates into significant human involvement. When the amount of data manually labeled by experts is insufficient, active learning becomes crucial. However, given a large amount of unlabeled training data, having an active learning model that reduces the training cost of the classifier and, at the same time, assists in labeling new observations remains an open problem. 
Based on the experimental approaches and studies, the main contributions of this work, which focuses on scene classification of optical satellite images, include: the creation of labeled Sentinel-2 datasets with surface reflectance values; the proposal of pixel-based machine learning models for satellite image scene classification; the proposal of an Evidence Function Model (EFM) based on a statistical distance to detect classification errors of learning models; and the proposal of a generalised sampling approach for active learning that, together with the EFM, provides a way of determining the most informative examples. Firstly, using a manually labeled Sentinel-2 dataset, Machine Learning (ML) models for scene classification were developed and their performance was compared with that of Sen2Cor, the reference product of the European Space Agency; the classifier reached a micro-F1 value of 84%, a significant improvement over the corresponding Sen2Cor performance of 59%. Secondly, to quantify the classification error of the ML models, the Evidence Function Model based on the Mahalanobis distance was devised. For the labeled Sentinel-2 dataset, this model achieved a micro-F1 of 67.89% in detecting misclassification. Finally, the EFM was used as a sampling strategy for active learning, an approach that reached the same level of performance with only 0.02% of the total training examples when compared with a classifier trained on the full training set. 
With the help of the contributions mentioned above, it was possible to develop an open-source package for Sentinel-2 image scene classification that, using a set of Python scripts, a classification model and a Sentinel-2 L1C image, generates the corresponding RGB image (at 20 m resolution) with the six studied classes (Cloud, Cirrus, Shadow, Snow, Water and Other), giving academia a direct method for fast and effective scene classification of Sentinel-2 images. In addition, the active learning approach that uses, as its sampling strategy, the misclassification detection given by the EFM makes it possible to label only the most informative points to be used as input in building classifiers
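
The micro-F1 figures quoted in the abstract above (84% versus 59%) pool true positives, false positives and false negatives over all classes before computing a single F1 score. The following is a minimal sketch of that metric only, not the thesis's evaluation code; the function name `micro_f1` and the toy labels are assumptions made for illustration, with class names borrowed from the six classes listed in the abstract.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1 once."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # the predicted class gains a false positive
            fn[truth] += 1   # the true class gains a false negative
    tp_sum = sum(tp.values())
    fp_sum = sum(fp.values())
    fn_sum = sum(fn.values())
    denom = 2 * tp_sum + fp_sum + fn_sum
    # F1 = 2TP / (2TP + FP + FN); defined here as 0.0 when there are no samples
    return 2 * tp_sum / denom if denom else 0.0


# Hypothetical toy labels, not data from the thesis
y_true = ["Cloud", "Cloud", "Water", "Snow", "Other", "Shadow"]
y_pred = ["Cloud", "Cirrus", "Water", "Snow", "Other", "Water"]
score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.4f}")
```

In the single-label setting sketched here, each misclassified pixel counts once as a false positive and once as a false negative, so micro-F1 coincides with plain accuracy.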

    Use of Proteomics to Probe Dynamic Changes in Cyanobacteria

    Cyanobacteria are unicellular photosynthetic microorganisms that capture and convert light energy to chemical energy, which is the precursor for feed, fuel, and food. These oxygenic phototrophs appear blue-green in color due to the blue bilin pigments in their phycobilisomes and green chlorophyll pigments in their photosystems. They also have diverse morphologies and thrive in terrestrial, marine, and freshwater habitats, as well as in extreme environments. Cyanobacteria have developed a number of protective mechanisms and adaptive responses that allow the photosynthetic process to operate optimally under diverse and extreme conditions. Prolonged deprivation of essential nutrients, such as nitrogen and sulfur, which is common in the natural environments cyanobacteria grow in, can disrupt crucial metabolic activities and promote the production of lethal reactive oxygen species. The dynamic remodeling of protein complexes and structures facilitates adaptation to environmental stresses; however, the specific protein modifications involved are poorly understood. Synthetic and systems biology approaches have been used to study how photosynthetic microorganisms optimize their cellular metabolism in response to adverse environmental conditions. To gain insights into how cyanobacteria cope with environmental changes, we created a global proteomics map of redox-sensitive amino acid residues and examined the degradation of the light-harvesting apparatus in cyanobacteria. These studies offered significant insights into broad redox regulation and protein degradation, advancing knowledge of how photosynthetic microbial cells dynamically rely on protective mechanisms to survive changing environmental conditions

    Tree Peony Species Are a Novel Resource for Production of α-Linolenic Acid

    Tree peony is known worldwide for its excellent ornamental and medicinal value, but recent reports that its seeds contain over 40% α-linolenic acid (ALA), an essential fatty acid for humans, have drawn additional interest from biochemists. To understand the key factors that contribute to this rich accumulation of ALA, we carried out a comprehensive study of oil accumulation in developing seeds of nine wild tree peony species. The fatty acid content and composition were highly variable among the nine species; we selected a high-oil (P. rockii) and a low-oil (P. lutea) accumulating species for a comparative transcriptome analysis. Similar to other oilseed transcriptomic studies, upregulation of select genes involved in plastidial fatty acid synthesis, and in acyl editing, desaturation and triacylglycerol assembly in the endoplasmic reticulum, was noted in seeds of P. rockii relative to P. lutea. Also, in association with the ALA content, transcript levels for fatty acid desaturases (SAD, FAD2 and FAD3), which encode enzymes necessary for polyunsaturated fatty acid synthesis, were higher in P. rockii compared to P. lutea. We further showed that the overexpression of PrFAD2 and PrFAD3 in Arabidopsis increased linoleic and α-linolenic acid content, respectively, and modulated their final ratio in the seed oil. In conclusion, we identified the key steps that contribute to efficient ALA synthesis and validated the necessary desaturases in P. rockii that are responsible for not only increasing oil content but also modulating the 18:2/18:3 ratio in seeds. Together, these results will help improve essential fatty acid content in seeds of tree peonies and other crops of agronomic interest

    UVR8 mediated spatial differences as a prerequisite for UV-B induced inflorescence phototropism

    In Arabidopsis hypocotyls, phototropins are the dominant photoreceptors for the positive phototropism response towards unilateral ultraviolet-B (UV-B) radiation. We report a starkly contrasting response mechanism in inflorescence stems, with a central role for UV RESISTANCE LOCUS 8 (UVR8). The perception of UV-B occurs mainly in the epidermis and cortex, with a lesser contribution of the endodermis. Unilateral UV-B exposure does not lead to a spatial difference in UVR8 protein levels but does cause a differential UVR8 signal throughout the stem, with, at the irradiated side: 1) an increase of the transcription factor ELONGATED HYPOCOTYL 5 (HY5); 2) an associated strong activation of flavonoid biosynthesis genes and flavonoid accumulation; 3) increased GA2oxidase expression, diminished gibberellin1 levels and accumulation of the DELLA protein REPRESSOR OF GA1 (RGA); and 4) increased expression of the auxin transport regulator PINOID, contributing to locally diminished auxin signalling. Our molecular findings support the Blaauw theory (1919), suggesting that differential growth occurs through unilateral photomorphogenic growth inhibition. Together, the data indicate phototropin-independent inflorescence phototropism through multiple locally UVR8-regulated hormone pathways

    GEOBIA 2016 : Solutions and Synergies., 14-16 September 2016, University of Twente Faculty of Geo-Information and Earth Observation (ITC): open access e-book

    Future of Sustainable Agriculture in Saline Environments

    Food production on present and future saline soils deserves the world’s attention, particularly because food security is a pressing issue, millions of hectares of degraded soils are available worldwide, freshwater is becoming increasingly scarce, and the global sea-level rise threatens food production in fertile coastal lowlands. Future of Sustainable Agriculture in Saline Environments aims to showcase the global potential of saline agriculture. The book covers the essential topics, such as policy and awareness, soil management, future crops, and genetic developments, all supplemented by case studies that show how this knowledge has been applied. It offers an overview of current research themes and practical cases focused on enhancing food production on saline lands.
    FEATURES
    Describes the critical role of the revitalization of salt-degraded lands in achieving sustainability in agriculture on a global scale
    Discusses practical solutions toward using drylands and delta areas threatened by salinity for sustainable food production
    Presents strategies for adaptation to climate change and sea-level rise through food production under saline conditions
    Addresses the diverse aspects of crop salt tolerance and microbiological associations
    Highlights the complex problem of salinity and waterlogging and safer management of poor-quality water, supplemented by case studies
    A PDF version of this book is available for free in Open Access at www.taylorfrancis.com. It has been made available under a Creative Commons Attribution-Non Commercial-No Derivatives 4.0 license

    Mining complex trees for hidden fruit : a graph–based computational solution to detect latent criminal networks : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Technology at Massey University, Albany, New Zealand.

    The detection of crime is a complex and difficult endeavour. Public and private organisations focusing on law enforcement, intelligence, and compliance commonly apply the rational isolated actor approach, premised on observability and materiality. This manifests largely as entity-level risk management that sources ‘leads’ from reactive covert human intelligence sources and/or from proactive sources by applying simple rules-based models. Focusing on discrete observable and material actors ignores that criminal activity exists within a complex system deriving its fundamental structural fabric from the complex interactions between actors, with those most unobservable likely to be both criminally proficient and influential. The graph-based computational solution developed to detect latent criminal networks is a response to the inadequacy of the rational isolated actor approach, which ignores the connectedness and complexity of criminality. The core computational solution, written in the R language, consists of novel entity resolution, link discovery, and knowledge discovery technology. Entity resolution enables the fusion of multiple datasets with high accuracy (mean F-measure of 0.986 versus competitors' 0.872), generating a graph-based expressive view of the problem. Link discovery comprises link prediction and link inference, enabling the high-performance detection (accuracy of ~0.8 versus ~0.45 for relevant published models) of unobserved relationships such as identity fraud. Knowledge discovery takes the fused graph and applies the “GraphExtract” algorithm to create a set of subgraphs representing latent functional criminal groups, and a mesoscopic graph representing how these criminal groups are interconnected. Latent knowledge is generated from a range of metrics including the “Super-broker” metric and attitude prediction. 
The computational solution has been evaluated on a range of datasets that mimic an applied setting, demonstrating a scalable (tested on ~18 million node graphs) and performant (~33 hours runtime on a non-distributed platform) solution that successfully detects relevant latent functional criminal groups in around 90% of cases sampled and enables the contextual understanding of the broader criminal system through the mesoscopic graph and associated metadata. The augmented data assets generated provide a multi-perspective systems view of criminal activity that enables advanced, informed decision making across the microscopic-mesoscopic-macroscopic spectrum