
    The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation

    Denoising diffusion probabilistic models have transformed image generation with their impressive fidelity and diversity. We show that they also excel at estimating optical flow and monocular depth, surprisingly without the task-specific architectures and loss functions that are predominant for these tasks. Compared to the point estimates of conventional regression-based methods, diffusion models also enable Monte Carlo inference, e.g., capturing uncertainty and ambiguity in flow and depth. With self-supervised pre-training, the combined use of synthetic and real data for supervised training, technical innovations (infilling and step-unrolled denoising diffusion training) to handle noisy, incomplete training data, and a simple form of coarse-to-fine refinement, one can train state-of-the-art diffusion models for depth and optical flow estimation. Extensive experiments focus on quantitative performance against benchmarks, ablations, and the model's ability to capture uncertainty and multimodality and to impute missing values. Our model, DDVM (Denoising Diffusion Vision Model), obtains a state-of-the-art relative depth error of 0.074 on the indoor NYU benchmark and an Fl-all outlier rate of 3.26% on the KITTI optical flow benchmark, about 25% better than the best published method. For an overview see https://diffusion-vision.github.io
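    The Monte Carlo inference described above amounts to drawing several stochastic reverse-diffusion samples and summarizing them per pixel. The toy loop below is an illustrative sketch, not DDVM's actual sampler; `denoise_fn` stands in for a trained denoiser, and the re-noising schedule is an assumption:

    ```python
    import numpy as np

    def sample_depth(denoise_fn, shape, steps=50, rng=None):
        """One stochastic reverse-diffusion sample (toy DDPM-style loop).

        denoise_fn is a stand-in for a trained network predicting the clean
        depth map from a noisy one -- an assumption, not the paper's model."""
        rng = np.random.default_rng() if rng is None else rng
        x = rng.standard_normal(shape)                         # start from pure noise
        for t in np.linspace(1.0, 0.0, steps):
            x0_hat = denoise_fn(x, t)                          # predicted clean depth
            x = x0_hat + t * 0.5 * rng.standard_normal(shape)  # re-noise toward level t
        return x

    def mc_depth_estimate(denoise_fn, shape, n_samples=8, seed=0):
        """Monte Carlo inference: draw several samples, report mean and spread."""
        rng = np.random.default_rng(seed)
        samples = np.stack([sample_depth(denoise_fn, shape, rng=rng)
                            for _ in range(n_samples)])
        return samples.mean(axis=0), samples.std(axis=0)  # point estimate + uncertainty
    ```

    The per-pixel standard deviation across samples is one simple way to surface the uncertainty and ambiguity the abstract refers to.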

    LiDAR Domain Adaptation - Automotive 3D Scene Understanding

    Environment perception and scene understanding play an essential role in autonomous vehicles. A vehicle must be aware of the geometry and semantics of its surroundings in order to predict the behavior of other road users and to localize itself within the drivable space, and thus navigate correctly. Today, virtually all modern perception systems for automated driving use deep neural networks. Training them requires enormous amounts of data with matching annotations. Acquiring the data is relatively inexpensive, since a vehicle equipped with the right sensors merely has to drive around. Creating annotations, however, is a very time-consuming and expensive process. To make matters worse, autonomous vehicles must operate practically everywhere (e.g., in Europe and Asia, in the countryside and in cities) and at all times (e.g., day and night, summer and winter, rain and fog). This requires the data to cover an even larger number of different scenarios and domains. Collecting and annotating data for such a multitude of domains is impractical. Yet training on data from only one domain leads, due to differences in the data, to poor performance in another target domain. For a safety-critical application this is unacceptable. The field of domain adaptation introduces methods that help close these domain gaps without using annotations from the target domain, thereby working toward scalable perception systems. The majority of work on domain adaptation focuses on two-dimensional camera perception. In autonomous vehicles, however, a three-dimensional understanding of the scene is essential, for which LiDAR sensors are commonly used today.
This dissertation addresses domain adaptation for LiDAR perception from several angles. First, a set of techniques is presented that improves the performance and runtime of semantic segmentation systems. The insights gained are integrated into the perception model used throughout this dissertation to evaluate the effectiveness of the proposed domain adaptation approaches. Second, existing approaches are discussed and research gaps are identified by formulating open research questions. To answer some of these questions, the dissertation introduces a novel quantitative metric that estimates the realism of LiDAR data, which is crucial for the performance of a perception system. The metric is used to assess the quality of LiDAR point clouds generated for domain mapping, in which data are transferred from one domain to another, allowing annotations from a source domain to be reused in the target domain. In a further area of domain adaptation, the dissertation proposes a novel method that exploits scene geometry to learn domain-invariant features. The geometric information improves the segmentation model's domain adaptation capabilities and achieves the best performance without any additional overhead at inference time. Finally, a novel method is proposed for generating semantically meaningful object shapes from continuous descriptions, which, with additional work, can be used to augment scenes and improve the models' recognition capabilities. In summary, this dissertation presents a comprehensive system for domain adaptation and semantic segmentation of LiDAR point clouds in the context of autonomous driving

    Detection of Power Line Supporting Towers via Interpretable Semantic Segmentation of 3D Point Clouds

    The inspection and maintenance of energy transmission networks are demanding and crucial tasks for any transmission system operator. They rely on a combination of on-the-ground staff and costly low-flying helicopters to visually inspect the power grid structure. Recently, LiDAR-based inspections have shown the potential to accelerate inspections and increase their precision. These high-resolution sensors allow one to scan an environment and store it in a 3D point cloud format for further processing and analysis by maintenance specialists to prevent fires and damage to the electrical system. However, this task is especially hard to complete in a timely manner given the extensive area that the transmission network covers. Nonetheless, the transition to point cloud data allows us to take advantage of Deep Learning to automate these inspections by detecting collisions between the grid and the surrounding scene. Deep Learning is a recent and powerful tool that has been successfully applied to a myriad of real-life problems, such as image recognition and speech generation. With the introduction of affordable LiDAR sensors, the application of Deep Learning to 3D data emerged, with numerous methods being proposed every day to address difficult problems, from 3D object detection to 3D point cloud segmentation. Alas, state-of-the-art methods are remarkably complex, composed of millions of trainable parameters, and take several weeks, if not months, to train on specific hardware, which makes it difficult for traditional companies, like utilities, to employ them. Therefore, we explore a novel mathematical framework that allows us to define tailored operators that incorporate prior knowledge about our problem. These operators are then integrated into a learning agent, called SCENE-Net, that detects power line supporting towers in 3D point clouds.
SCENE-Net offers interpretable results, which is not possible with conventional models, and achieves an efficient training time of 85 min and an inference time of 20 ms on a regular laptop. Our model is composed of 11 trainable geometric parameters, such as the height of a cylinder, and achieves a precision gain of 24% over a comparable CNN with 2190 parameters
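The appeal of a handful of interpretable geometric parameters can be illustrated with a toy vertical-cylinder operator. This is a simplified sketch under assumed parameterization (radius and height as the learnable quantities), not SCENE-Net's actual operator:

```python
import numpy as np

def cylinder_score(points, center_xy, radius=1.0, height=20.0):
    """Fraction of points falling inside a vertical cylinder.

    radius and height play the role of interpretable geometric parameters
    (SCENE-Net learns 11 of them; the exact parameterization here is an
    illustrative assumption). points is an (N, 3) array of x, y, z."""
    d_xy = np.linalg.norm(points[:, :2] - center_xy, axis=1)  # horizontal distance
    inside = (d_xy <= radius) & (points[:, 2] >= 0.0) & (points[:, 2] <= height)
    return inside.mean()
```

A tall, thin cluster of points centered on `center_xy` scores near 1, while scattered vegetation scores near 0, which is the kind of prior a tower detector can exploit directly, and each parameter remains physically readable after training.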

    Satellite based methane emission estimation for flaring activities in oil and gas industry: A data-driven approach (SMEEF-OGI)

    Climate change, precipitated in part by greenhouse gas emissions, presents a critical global challenge. Methane, a highly potent greenhouse gas with a global warming potential 80 times that of carbon dioxide, is a significant contributor to this crisis. Sources of methane emissions include the oil and gas industry, agriculture, and waste management, with flaring in the oil and gas industry constituting a significant emission source. Flaring, a standard process in the oil and gas industry, is often assumed to be 98% efficient at converting methane to less harmful carbon dioxide. However, recent research from the University of Michigan, Stanford, the Environmental Defense Fund, and Scientific Aviation indicates that this widely accepted 98% efficiency may be inaccurate. This investigation re-evaluates the flaring process's efficiency and its role in methane conversion, and focuses on creating a method to independently calculate methane emissions from oil and gas activities. Satellite data, a helpful tool for calculating greenhouse gas emissions from various sources, is included in the suggested methodology. In addition to standard monitoring techniques, satellite data offers an independent, non-intrusive, affordable, and continuous monitoring approach.
Based on this, the problem statement for this work is the following: “How can a data-driven approach be developed to enhance the accuracy and quality of methane emission estimation from flaring activities in the Oil and Gas industry, using satellite data from selected platforms to detect and quantify future emissions based on Machine learning more effectively?” To achieve this, the following objectives and activities were carried out:
    * Theoretical framework and key concepts
    * Technical review of current state-of-the-art satellite platforms and existing literature
    * Development of a proof of concept
    * Proposing an evaluation of the method
    * Recommendations and further work
This work adopts a systematic approach, starting with a comprehensive theoretical framework covering the use of flaring, the environmental implications of methane, the current state of research, and the state of the art in satellite remote sensing. Based on this framework, a data-driven methodology was formulated, using the VIIRS dataset to identify geographical areas of interest. Hyperspectral and methane data were aggregated from the Sentinel-2 and Sentinel-5P satellite datasets and processed via a proposed pipeline, with initial alignment and enhancement. In this work, the images were enhanced by calculating the Normalized Burn Index. The result was a dataset containing the locations of known flare sites, with data from both the Sentinel-2 and Sentinel-5P satellites.
The results underscore the disparities in coverage between Sentinel-2 and Sentinel-5P data, a factor that could potentially influence the precision of methane emission estimates. The applied preprocessing techniques markedly enhanced data clarity and usability, but their efficacy may hinge on the flaring sites' specific characteristics and the raw data quality. Moreover, despite certain limitations, the combination of Sentinel-2 and Sentinel-5P data effectively yielded a comprehensive dataset suitable for further analysis. In conclusion, this project introduces an encouraging methodology for estimating methane emissions from flaring activities within the oil and gas industry. It lays a foundational stepping stone for future research, continually enhancing the precision and quality of data for combating climate change. This methodology can be seen in the flow chart below. Based on the work done in this project, future work could focus on incorporating alternative sources of methane data, broadening the areas of interest through industry collaboration, and attempting to extract further features through image segmentation methods. This project signifies a start, paving the way for subsequent explorations to build upon
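The enhancement step above names the Normalized Burn Index, commonly formulated as a normalized difference of NIR and SWIR reflectance; for Sentinel-2 this is typically computed from bands B8 and B12, though the abstract does not specify the bands, so that choice is an assumption here. A minimal sketch:

```python
import numpy as np

def normalized_burn_ratio(nir, swir, eps=1e-9):
    """NBR = (NIR - SWIR) / (NIR + SWIR).

    For Sentinel-2, NIR and SWIR would typically be reflectance arrays from
    bands B8 and B12 (an assumption; the abstract only names the index).
    eps guards against division by zero over dark pixels."""
    nir = np.asarray(nir, dtype=np.float64)
    swir = np.asarray(swir, dtype=np.float64)
    return (nir - swir) / (nir + swir + eps)
```

Active flares are bright in SWIR relative to NIR, so strongly negative values of this index are one plausible cue for highlighting flare sites in the enhanced imagery.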

    Novel deep learning architectures for marine and aquaculture applications

    Alzayat Saleh's research applied artificial intelligence and machine learning to autonomously recognise fish and their morphological features from digital images. He created new deep learning architectures that solve various computer vision problems specific to the marine and aquaculture context, and found that these techniques can facilitate aquaculture management and environmental protection. Fisheries and conservation agencies can use his results to design better monitoring strategies and sustainable fishing practices