    Incrementally Learning Objects by Touch: Online Discriminative and Generative Models for Tactile-Based Recognition

    The EnMAP Managed Vegetation Scientific Processor

    After years of scientific and technical preparation, the launch of an unmanned German space mission is expected towards the end of 2020. The Environmental Mapping and Analysis Program (EnMAP) will carry a hyperspectral imager, aboard the satellite of the same name, to map land surfaces. 
Scientists in environmental disciplines covering the monitoring of ecosystems, agricultural, forestry and urban areas, coastal and inland waters, geology and soils prepared for the upcoming data ahead of the launch. Although a variety of algorithms for the analysis of spectral data already exists, new challenges will arise from the uniqueness of the EnMAP mission in the global context of remote sensing: it covers the full optical spectrum (420 nm – 2450 nm) at a moderate spatial resolution of 30 m with a high signal-to-noise ratio of at least 180 in the shortwave infrared and above 400 in the visible spectrum. This enables an imaging quality that has so far been reached only by airborne systems. This dissertation comprises activities in the scientific preparation phase for agro-geographical applications. Algorithms and tools for the analysis of hyperspectral data are provided free of charge in the QGIS plugin EnMAP-Box 3. Urgent questions in the agricultural sector revolve around the derivation of biochemical and biophysical parameters from remote sensing data. For this reason, the overarching objective of this doctoral project is the development of a science-based EnMAP tool for managed vegetation (EnMAP Managed Vegetation Scientific Processor). First, an extensive field campaign was planned and carried out from April 2014 onwards. Alongside spectral observations of leaves, canopies and soils in a winter wheat field and a maize field, relevant plant parameters were measured non-destructively at exactly the same positions. These are the Leaf Area Index (LAI), leaf chlorophyll content (Ccab), leaf water content (EWT or Cw), relative dry leaf weight (LMA or Cm) and Average Leaf Inclination Angle (ALIA), as well as secondary parameters such as canopy height, phenological stage and the solar vector. 
Spectral measurements were taken from different observation angles to match the sensing geometry of the future EnMAP satellite, which can be tilted by up to 30° orthogonal to its direction of flight. A common procedure for deriving the relevant crop parameters is to use the radiative transfer model PROSAIL, which simulates the spectral signal of a vegetated surface from its underlying biophysical and biochemical parameters. If this process is inverted, these parameters can be derived from measured spectral data. To this end, a Look-Up-Table (LUT) of PROSAIL model runs was built and compared against the spectra measured in the field campaigns. With this LUT inversion from different observation angles, accuracies of 18 % for LAI and 20 % for Ccab were achieved. Strong anisotropic effects, i.e. a dependence of reflectance on illumination and viewing direction, were identified for winter wheat, mainly in the early stages of plant development. In a subsequent study on the uncertainties of the spectral model, PROSAIL results driven by in situ measured crop parameters were compared with the corresponding measured reflectance spectra. In some cases strong deviations between measured and modelled spectra were observed which, in the case of winter wheat, followed a seasonal pattern. The model tended to overestimate near-infrared reflectance in early growth stages and to underestimate it towards the end of the growing period. The parametrization of the model was identified as a source of uncertainty when the ALIA parameter is interpreted as a true physical leaf inclination. It was concluded that separating LAI and ALIA during the inversion of PROSAIL hinders a correct estimation of the less sensitive parameters. The development of the vegetation processor required Machine Learning Regression Algorithms (MLRA), since distributing large LUTs to users would be impracticable. 
The MLRAs were trained on synthetic datasets, with the initial focus on optimizing their hyperparameters before the algorithms were applied to real spectral data. Meaningful results were obtained only after artificial noise was added to the training data, because the algorithms had been overfitting to the model environment. With the processor, LAI, ALIA, Ccab and Cw could finally be derived from hyperspectral data. Artificial neural networks serve as black-box models that process large amounts of data in a short time and thus make a decisive contribution to modern applied remote sensing for a broad user community.
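
The LUT inversion described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: `toy_forward_model` is a hypothetical, non-physical stand-in for PROSAIL, the parameter grid is arbitrary, and only LAI and chlorophyll content are varied. The point is the mechanism: simulate spectra for many parameter combinations, then return the combination whose spectrum is closest (here by RMSE) to the measurement.

```python
import math

def toy_forward_model(lai, cab, wavelengths):
    """Hypothetical stand-in for PROSAIL (NOT physically valid).
    Returns one 'reflectance' value per wavelength in nm."""
    spectrum = []
    for wl in wavelengths:
        # Illustrative chlorophyll absorption feature centred near 670 nm.
        absorption = cab / 100.0 * math.exp(-((wl - 670.0) / 60.0) ** 2)
        # Illustrative near-infrared plateau that saturates with LAI.
        red_edge = 1.0 / (1.0 + math.exp(-(wl - 720.0) / 15.0))
        nir = 0.5 * (1.0 - math.exp(-0.5 * lai)) * red_edge
        spectrum.append(0.05 + nir - 0.04 * absorption)
    return spectrum

def build_lut(wavelengths):
    """Simulate spectra on an (assumed) regular grid of LAI and Cab values."""
    lut = []
    for i in range(1, 15):        # LAI from 0.5 to 7.0
        for j in range(13):       # Cab from 20 to 80
            lai, cab = 0.5 * i, 20.0 + 5.0 * j
            lut.append(((lai, cab), toy_forward_model(lai, cab, wavelengths)))
    return lut

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def invert(measured, lut):
    """Return the (LAI, Cab) pair whose simulated spectrum best fits `measured`."""
    params, _ = min(lut, key=lambda entry: rmse(measured, entry[1]))
    return params

wavelengths = list(range(420, 1000, 10))
lut = build_lut(wavelengths)
measured = toy_forward_model(3.0, 45.0, wavelengths)  # synthetic "measurement"
estimate = invert(measured, lut)
```

In practice the best-fitting entries are often averaged rather than taking a single minimum, and the LUT covers more parameters and viewing geometries; those choices are outside this sketch.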

    Towards Spatial Queries over Phenomena in Sensor Networks

    Today, technology developments enable the inexpensive production and deployment of tiny sensing and computing nodes. Networked through wireless radio, such sensor nodes form a new platform, wireless sensor networks, which provides a novel ability to monitor spatiotemporally continuous phenomena. By treating a wireless sensor network as a database system, users can pose SQL-based queries over phenomena without needing to program detailed sensor node operations. Inside the DBMS, intelligent and energy-efficient data collection and processing algorithms have to be implemented to support spatial query processing over sensor networks. This dissertation proposes spatial query support for two views of continuous phenomena: field-based and object-based. A field-based view of continuous phenomena depicts them as a value distribution over a geographical area. However, due to the discrete and comparatively sparse distribution of sensor nodes, estimation methods are necessary to generate a field-based query result, and the result has to be computed collaboratively 'in-the-network' due to energy constraints. This dissertation proposes SWOP, an in-network algorithm using Gaussian kernel estimation. The key contribution is the use of a small number of Hermite coefficients to approximate the Gaussian kernel function for sub-clustered sensor nodes, which allows the estimation result to be processed efficiently. An object-based view of continuous phenomena is concerned with aspects such as the boundary of an 'interesting region' (e.g. a toxic plume). This dissertation presents NED, which provides object boundary detection in sensor networks. NED encodes partial event estimation results, based on confidence levels, into optimized, variable-length messages that are exchanged locally among neighboring sensor nodes to save communication cost. Sensor nodes detect objects and boundaries based on moving averages to eliminate noise effects and enhance detection quality. 
Furthermore, the dissertation proposes the SNAKE-based approach, which uses deformable curves to track the spatiotemporal changes of such objects incrementally in sensor networks. In the proposed algorithm, only neighboring nodes exchange messages to maintain the curve structures. Based on in-network tracking of deformable curves, other spatial and spatiotemporal properties of objects, such as area, can be provided by the sensor network. The experimental results show that the proposed approaches are resource-friendly within the constrained sensor networks while providing high-quality query results.
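
To make the field-based view concrete, the following minimal sketch estimates a field value at an arbitrary location from sparse sensor readings using Gaussian kernel weights. It is a centralized illustration only: the Nadaraya-Watson form, the `bandwidth` parameter and the function name are assumptions, and the in-network computation and Hermite-coefficient approximation that SWOP contributes are not shown.

```python
import math

def gaussian_kernel_estimate(query, sensors, bandwidth=1.0):
    """Kernel-weighted average of sensor readings at `query`.

    sensors: list of ((x, y), reading) pairs.
    bandwidth: kernel width (assumed; controls how fast influence decays).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), reading in sensors:
        squared_distance = (x - query[0]) ** 2 + (y - query[1]) ** 2
        weight = math.exp(-squared_distance / (2.0 * bandwidth ** 2))
        weighted_sum += weight * reading
        weight_total += weight
    return weighted_sum / weight_total

# Two sensors, query point exactly halfway between them.
sensors = [((0.0, 0.0), 0.0), ((2.0, 0.0), 10.0)]
midpoint_value = gaussian_kernel_estimate((1.0, 0.0), sensors)
```

Because both sensors are equally far from the query point, they receive equal weight and the estimate is the plain average of the two readings.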

    On Novel Machine Learning Approaches for Acoustic Emission Source Localisation: A Probabilistic Perspective

    With the objective of making engineering infrastructure safer and more cost-effective to operate and maintain, the use of automated strategies for monitoring damage in structures and high-value assets is becoming increasingly common. A critical component in the assessment of a structure's condition is the localisation of defects. A promising solution is the monitoring of acoustic emissions, a technique concerned with passively listening to ultrasonic signals generated by damage mechanisms. However, a significant barrier to more widespread adoption of such techniques is their use in structures with intricate geometrical features and anisotropic materials. In these structures, propagation paths are complex and material parameters are often unknown, while stochasticity and an incomplete physical understanding introduce sources of uncertainty that are often unaccounted for. The work contained in this thesis develops and extends a probabilistic framework for localising acoustic emissions in complex structures, handling uncertainty in a principled manner through Bayesian inference. A forward mapping of expected arrival time information is first learnt through the use of Gaussian process regression. For an event with an unknown origin, it is shown that these maps can be used to quantify a likelihood of emission location, providing probable damage locations on the structure. Next, the use of a heteroscedastic noise model is presented, allowing predictions made by the localisation model to be locally weighted such that sensors contribute to the prediction relative to the quality of coverage offered, returning a more accurate, confident and robust localisation methodology. Regarding the practicality of implementing the proposed approach, the inclusion of physical insight is considered within a grey-box framework to constrain the Gaussian process to abide by known physical laws. 
It is demonstrated that the constraints improve performance as the availability of training data reduces, increasing the feasibility of implementing the developed methodology. Finally, localisation is extended to cases where the geometry is not most appropriately characterised in Euclidean space, such as for roller-element and many other types of bearings. It is demonstrated that localisation may also be performed in a condition monitoring setting, and that the method can handle measurements contaminated with significant noise levels.
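
The likelihood step described above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the learnt Gaussian process map is replaced by a hypothetical straight-line, constant-speed forward model (`predicted_dtoa`), the sensor layout and `WAVE_SPEED` are invented, and the per-sensor noise levels in `sigmas` are fixed by hand rather than learnt by a heteroscedastic model. Given observed differences in arrival time, each candidate location is scored by a Gaussian log-likelihood and the most probable one is returned.

```python
import math

# Assumed sensor layout (metres) on a unit plate and an assumed wave speed.
SENSORS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
WAVE_SPEED = 5000.0  # m/s, illustrative value only

def predicted_dtoa(location):
    """Stand-in forward map: arrival-time differences relative to sensor 0.
    In the thesis this mapping is learnt with Gaussian process regression."""
    times = [math.dist(location, sensor) / WAVE_SPEED for sensor in SENSORS]
    return [t - times[0] for t in times[1:]]

def log_likelihood(observed, location, sigmas):
    """Gaussian log-likelihood; a larger sigma down-weights that sensor pair."""
    total = 0.0
    for obs, pred, sigma in zip(observed, predicted_dtoa(location), sigmas):
        total += -0.5 * ((obs - pred) / sigma) ** 2
        total -= math.log(sigma * math.sqrt(2.0 * math.pi))
    return total

def localise(observed, sigmas, steps=21):
    """Evaluate the likelihood on a regular grid and return the best location."""
    grid = [(i / (steps - 1), j / (steps - 1))
            for i in range(steps) for j in range(steps)]
    return max(grid, key=lambda loc: log_likelihood(observed, loc, sigmas))

true_source = (0.3, 0.7)
observed = predicted_dtoa(true_source)   # noise-free synthetic observation
sigmas = [1e-6, 1e-6, 2e-6]              # assumed noise levels in seconds
estimate = localise(observed, sigmas)
```

Evaluating the likelihood over the whole grid, rather than keeping only the maximum, gives the map of probable damage locations that the abstract refers to.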

    Sparse Cholesky factorization by greedy conditional selection

    Dense kernel matrices resulting from pairwise evaluations of a kernel function arise naturally in machine learning and statistics. Previous work in constructing sparse approximate inverse Cholesky factors of such matrices by minimizing Kullback-Leibler divergence recovers the Vecchia approximation for Gaussian processes. These methods rely only on the geometry of the evaluation points to construct the sparsity pattern. In this work, we instead construct the sparsity pattern by leveraging a greedy selection algorithm that maximizes mutual information with target points, conditional on all points previously selected. For selecting $k$ points out of $N$, the naive time complexity is $\mathcal{O}(N k^4)$, but by maintaining a partial Cholesky factor we reduce this to $\mathcal{O}(N k^2)$. Furthermore, for multiple ($m$) targets we achieve a time complexity of $\mathcal{O}(N k^2 + N m^2 + m^3)$, which is maintained in the setting of aggregated Cholesky factorization where a selected point need not condition every target. We apply the selection algorithm to image classification and recovery of sparse Cholesky factors. By minimizing Kullback-Leibler divergence, we apply the algorithm to Cholesky factorization, Gaussian process regression, and preconditioning with the conjugate gradient, improving over $k$-nearest neighbors selection.
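
A minimal sketch of greedy conditional selection for a single target follows. It is not the paper's algorithm: it updates a dense conditional covariance matrix at every step instead of maintaining a partial Cholesky factor, so it does not reach the stated $\mathcal{O}(N k^2)$ cost, and the exponential kernel and all function names are assumptions. For Gaussian variables, picking the candidate that most reduces the target's conditional variance is equivalent to maximizing mutual information with the target given the points already selected.

```python
import math

def kernel(a, b, length_scale=1.0):
    """Exponential (Matern-1/2) kernel on scalars; an illustrative choice."""
    return math.exp(-abs(a - b) / length_scale)

def greedy_select(candidates, target, k):
    """Return indices of k candidates chosen greedily for one target point."""
    points = list(candidates) + [target]
    n = len(candidates)
    t = n  # index of the target in `points`
    # Starts as the prior covariance; becomes the covariance conditional
    # on the selected points as the loop proceeds.
    cov = [[kernel(a, b) for b in points] for a in points]
    selected = []
    for _ in range(k):
        best, best_score = None, -1.0
        for j in range(n):
            if j in selected or cov[j][j] <= 1e-12:
                continue
            # Reduction in the target's conditional variance if j is added.
            score = cov[j][t] ** 2 / cov[j][j]
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        # Rank-one update: condition every remaining point on the new one.
        pivot = cov[best][best]
        column = [cov[i][best] for i in range(n + 1)]
        for i in range(n + 1):
            for m in range(n + 1):
                cov[i][m] -= column[i] * column[m] / pivot
    return selected

candidates = [0.0, 0.4, 0.45, 0.9, 2.0]
chosen = greedy_select(candidates, target=0.5, k=2)
```

With this kernel the first pick is the nearest point, 0.45. The second pick is 0.9 rather than the second-nearest point 0.4, because 0.4 lies behind 0.45 and adds almost no information about the target once 0.45 is known. That difference from nearest-neighbour selection is the behaviour the abstract exploits.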