SCDNET: A novel convolutional network for semantic change detection in high resolution optical remote sensing imagery
Abstract: With the continuing improvement of remote-sensing (RS) sensors, it is crucial to monitor Earth surface changes at fine scale and in great detail. Thus, semantic change detection (SCD), which can locate and identify "from-to" change information simultaneously, is gaining growing attention in the RS community. However, because large-scale SCD datasets are scarce, most existing SCD methods focus on scene-level changes, where semantic change maps are generated with only coarse boundaries or scarce category information. To address this issue, we propose a novel convolutional network for large-scale SCD (SCDNet). It is based on a Siamese UNet architecture, which consists of two encoders and two decoders with shared weights. First, multi-temporal images are given as input to the encoders to extract multi-scale deep representations. A multi-scale atrous convolution (MAC) unit is inserted at the end of the encoders to enlarge the receptive field and capture multi-scale information. Then, difference feature maps are generated for each scale and combined with feature maps from the encoders to serve as inputs for the decoders. An attention mechanism and a deep supervision strategy are further introduced to improve network performance. Finally, a softmax layer produces a semantic change map for each temporal image. Extensive experiments on two large-scale high-resolution SCD datasets demonstrate the effectiveness and superiority of the proposed method.
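To make the "from-to" notion in the abstract above concrete, the following is a minimal sketch of how a binary change mask and the from-to class transitions can be read off two per-pixel label maps. It is not SCDNet itself: the function name `semantic_change`, the class names, and the tiny 2x3 label maps are all assumptions made for illustration, and the label maps stand in for what a network's per-date semantic outputs might look like.

```python
from collections import Counter

def semantic_change(labels_t1, labels_t2):
    """Return a binary change mask and a Counter of (from, to) class pairs.

    Both inputs are equally sized 2D lists of class labels, one per date.
    """
    mask = []
    transitions = Counter()
    for row1, row2 in zip(labels_t1, labels_t2):
        mask_row = []
        for c1, c2 in zip(row1, row2):
            changed = c1 != c2
            mask_row.append(1 if changed else 0)
            if changed:
                # Record the "from-to" pair only where the class differs.
                transitions[(c1, c2)] += 1
        mask.append(mask_row)
    return mask, transitions

# Hypothetical label maps for two acquisition dates (illustrative only).
t1 = [["water", "forest", "forest"],
      ["bare", "bare", "forest"]]
t2 = [["water", "building", "forest"],
      ["building", "bare", "bare"]]

mask, transitions = semantic_change(t1, t2)
```

Here `mask` marks where a change occurred, while `transitions` says what changed into what, which is the extra information that separates semantic change detection from binary change detection.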
Hypermaps - Beyond occupancy grids
Intelligent and autonomous robotic applications often require robots to have more information about their environment than traditional occupancy maps provide. One example is the semantic map, which provides a qualitative description of the environment. Although semantic mapping has been researched, most robotic frameworks still offer only occupancy maps.
In this thesis, a framework is developed to handle multi-layered 2D maps in ROS. The framework offers occupancy and semantic layers, but can be extended with new layer types in the future. Furthermore, an algorithm to automatically generate semantic maps from RGB-D images is presented.
Software tests were performed to check whether the framework fulfills all set requirements, and they showed that the requirements are met. Furthermore, the semantic mapping algorithm was evaluated with different configurations in two test environments, a laboratory and a floor. While the object shapes in the generated semantic maps were not always accurate and some false detections occurred, most objects were successfully detected and placed on the semantic map. Possible ways to improve the accuracy of the mapping in the future are discussed.
Weakly Supervised Silhouette-based Semantic Scene Change Detection
This paper presents a novel semantic scene change detection scheme with only
weak supervision. A straightforward approach for this task is to train a
semantic change detection network directly from a large-scale dataset in an
end-to-end manner. However, this requires a task-specific dataset, which is usually
labor-intensive and time-consuming to create. To avoid this
problem, we propose to train this kind of network from existing datasets by
dividing this task into change detection and semantic extraction. On the other
hand, the difference in camera viewpoints, for example, images of the same
scene captured from a vehicle-mounted camera at different time points, usually
brings a challenge to the change detection task. To address this challenge, we
propose a new siamese network structure that introduces a correlation
layer. In addition, we create a publicly available dataset for semantic change
detection to evaluate the proposed method. The experimental results verified
both the robustness to viewpoint differences in the change detection task and the
effectiveness of the proposed networks for semantic change detection. Our code
and dataset are available at https://github.com/xdspacelab/sscdnet.
Comment: Accepted at the 2020 IEEE International Conference on Robotics and
Automation (ICRA).
HealthCyberMap: Mapping the Health Cyberspace Using Hypermedia GIS and Clinical Codes
HealthCyberMap is a Semantic Web service for healthcare professionals and librarians, patients and the public in general that aims at mapping parts of medical/health information resources in cyberspace in novel ways to improve their retrieval and navigation. The Semantic Web aims to be the next-generation World Wide Web by giving machine-readable semantics and context to the currently presentation-based Web pages. HealthCyberMap features an unconventional use of GIS (Geographic Information Systems) to map conceptual spaces occupied by collections of medical/health information resources. Besides mapping the semantic and non-geographical aspects of these resources using suitable spatial metaphors, HealthCyberMap also collects and maps the geographical provenance of these resources. Some of the HealthCyberMap Web interfaces are visual (maps for browsing resources by clinical/health topic, by provenance and by type), while others are textual (multilingual interfaces for browsing resources by language, a directory of topical resource categories, and the HealthCyberMap Semantic Subject Search Engine, which goes beyond conventional free-text and keyword-based search engines and supports synonyms, disease variants, subtypes, as well as some semantic relationships between terms).
HealthCyberMap adopts a clinical metadata framework built upon a clinical coding scheme (vocabulary or ontology—ICD-9-CM* clinical classification in the current pilot service). Clinical coding schemes serve as a reliable common backbone for topical resource indexing, automated topical classification, topical visualisation and navigation of coded resource pools (using suitable metaphors), and enhanced information retrieval and linking. A resource metadata base based on Dublin Core metadata set with HealthCyberMap’s own extensions holds information about selected high-quality resources. HealthCyberMap then uses GIS spatialisation methods to generate interactive navigational cybermaps from the metadata base. These visual cybermaps are based on familiar metaphors for image-word association to give users a broad overview and understanding of what is available in this complex conceptual space of medical/ health Internet resources and help them navigate it more efficiently and effectively.
HealthCyberMap cybermaps can be considered as semantically-spatialised, ontology-based browsing views of the underlying resource metadata base. Using a clinical coding scheme as a metric for spatialisation (“semantic distance”) is unique to HealthCyberMap and is very much suited for the semantic categorisation and navigation of medical/ health Internet information resources. HealthCyberMap also introduces a useful form of cyberspatial analysis for the detection of topical coverage gaps in its resource pool using choropleth (shaded) maps of human body systems. The project features a cost-effective method for serving Web hypermaps with dynamic metadata base drill-down functionality. It also demonstrates the feasibility of Electronic Patient Record to Online Information Services (like HealthCyberMap) Problem to Knowledge Linking using clinical codes as crisp problem-knowledge linkers or knowledge hooks.
The Semantic Subject Search Engine queries the same HealthCyberMap resource metadata base. Explicit concepts in resource metadata map onto a brokering domain ontology (ICD-9-CM) allowing the search engine to infer implicit meanings (synonyms and semantic relationships) not directly mentioned in either the resource or its metadata. Similarly, user queries would map to the same ontology allowing the search engine to infer the implicit semantics of user queries and use them to optimise retrieval.
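The brokering idea in the paragraph above, where both resource metadata and user queries map onto the same clinical codes, can be sketched in a few lines. This is only an illustration under stated assumptions, not HealthCyberMap's implementation: the lookup table `TERM_TO_CODE`, the `RESOURCES` list with its placeholder titles, and the `search` function are hypothetical. The two codes used are the ICD-9-CM categories 410 (acute myocardial infarction) and 250 (diabetes mellitus).

```python
# Hypothetical synonym table: free-text terms mapped to ICD-9-CM categories.
TERM_TO_CODE = {
    "heart attack": "410",
    "myocardial infarction": "410",
    "diabetes": "250",
    "diabetes mellitus": "250",
}

# Hypothetical resource metadata, indexed by clinical code (titles are placeholders).
RESOURCES = [
    {"title": "Resource A", "code": "410"},
    {"title": "Resource B", "code": "250"},
    {"title": "Resource C", "code": "410"},
]

def search(query):
    """Return titles of resources whose code matches the code of the query term."""
    code = TERM_TO_CODE.get(query.strip().lower())
    if code is None:
        return []  # the term is not covered by the brokering vocabulary
    return [r["title"] for r in RESOURCES if r["code"] == code]

hits = search("Heart attack")
```

Because matching happens on the code rather than on the words, a query for "heart attack" retrieves resources indexed under myocardial infarction even though neither the resource nor its metadata mentions the query string.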
A formative evaluation study of the HealthCyberMap pilot service, using an online user evaluation questionnaire in addition to analysis of the HealthCyberMap server transaction log, was conducted from 18 April 2002 to 1 June 2002 with very encouraging results. This two-method evaluation approach was guided by methodologies described in the NIH Web Site Evaluation and Performance Measures Toolkit, among other resources.
Many exciting future possibilities have also been investigated by the author, including the further development of HealthCyberMap as a customisable, location-based medical/health information service.
Memory abstractions for parallel programming
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 156-163). A memory abstraction is an abstraction layer between the program execution and the memory that provides a different "view" of a memory location depending on the execution context in which the memory access is made. Properly designed memory abstractions help ease the task of parallel programming by mitigating the complexity of synchronization or admitting more efficient use of resources. This dissertation describes five memory abstractions for parallel programming: (i) cactus stacks that interoperate with linear stacks, (ii) efficient reducers, (iii) reducer arrays, (iv) ownership-aware transactions, and (v) location-based memory fences. To demonstrate the utility of memory abstractions, my collaborators and I developed Cilk-M, a dynamically multithreaded concurrency platform which embodies the first three memory abstractions. Many dynamic multithreaded concurrency platforms incorporate cactus stacks to support multiple stack views for all the active children simultaneously. The use of cactus stacks, albeit essential, forces concurrency platforms to trade off between performance, memory consumption, and interoperability with serial code because cactus stacks are incompatible with linear stacks. This dissertation proposes a new strategy to build a cactus stack using thread-local memory mapping (or TLMM), which enables Cilk-M to satisfy all three criteria simultaneously. A reducer hyperobject allows different branches of a dynamic multithreaded program to maintain coordinated local views of the same nonlocal variable. With reducers, one can use nonlocal variables in a parallel computation without restructuring the code or introducing races. This dissertation introduces memory-mapped reducers, which admit much more efficient access than existing implementations.
When used in large quantities, reducers incur unnecessarily high overhead in execution time and space consumption. This dissertation describes support for reducer arrays, which offer the same functionality as an array of reducers with significantly less overhead. Transactional memory is a high-level synchronization mechanism, designed to be easier to use and more composable than fine-grain locking. This dissertation presents ownership-aware transactions, the first transactional memory design that provides provable safety guarantees for "open-nested" transactions. On architectures that implement memory models weaker than sequential consistency, programs communicating via shared memory must employ memory fences to ensure correct execution. This dissertation examines the concept of location-based memory fences, which, unlike traditional memory fences, incur latency only when synchronization is necessary. By I-Ting Angelina Lee. Ph.D.
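The reducer idea described in this abstract, where parallel branches keep local views of one nonlocal variable that are later combined, can be illustrated with a small fork-join sketch. It is an assumption-laden toy written with Python threads, not Cilk-M's memory-mapped mechanism: the function `collect`, the base-case size of 2, and the choice of list concatenation as the associative reduce operation are all made up for illustration.

```python
import threading

def collect(items, lo, hi, view):
    """Append items[lo:hi] to `view` using fork-join parallelism.

    Each forked branch writes only to its own local view, so no lock is needed.
    """
    if hi - lo <= 2:
        view.extend(items[lo:hi])
        return
    mid = (lo + hi) // 2
    right_view = []  # fresh identity view for the forked branch
    worker = threading.Thread(target=collect, args=(items, mid, hi, right_view))
    worker.start()
    collect(items, lo, mid, view)   # this branch keeps the current view
    worker.join()
    # Reduce step: concatenation is associative, so combining left before
    # right reproduces the order a serial execution would have produced.
    view.extend(right_view)

data = list(range(10))
result = []
collect(data, 0, len(data), result)
```

The calling code appends to what looks like a single list, yet no two threads ever touch the same view at the same time, and the final contents match the serial order.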
Topological place recognition for life-long visual localization
Extraordinary Doctorate Award of the UAH for the 2016-2017 academic year. The navigation of intelligent vehicles or mobile robots over long periods of time has attracted great interest from the research community in recent years. Camera-based systems have become widespread in the recent past thanks to improvements in their features, price, and size, together with progress in computer vision techniques. Vision-based localization is therefore a key aspect of developing robust autonomous navigation in long-term situations. With this in mind, identifying locations by means of topological place recognition techniques can complement other approaches such as solutions based on the Global Positioning System (GPS), or even be supplementary to them when the GPS signal is unavailable. The state of the art in topological place recognition has shown satisfactory performance in the short term. However, long-term visual localization is problematic because of the large appearance changes a place undergoes as a result of dynamic elements, illumination, or weather, among others. The aim of this thesis is to tackle the difficulties of performing efficient and robust topological localization over time. Accordingly, two new approaches based on visual place recognition are contributed to solve the different problems associated with long-term visual localization. On the one hand, a visual place recognition method based on binary descriptors is proposed. The innovation of this approach lies in the global description of image sequences as binary codes, which are extracted by a descriptor based on the technique known as Local Difference Binary (LDB).
The descriptors are matched efficiently using the Hamming distance and a search method known as Approximate Nearest Neighbors (ANN). In addition, an illumination-invariant technique is applied to improve performance under changing lighting conditions. Using the binary description introduced above reduces computational and memory costs. On the other hand, a visual place recognition method based on deep learning is also presented, in which the descriptors are computed by a Convolutional Neural Network (CNN). This is a recently popularized concept in computer vision that has obtained impressive results in image classification problems. The novelty of our approach lies in fusing image information from multiple convolutional layers at several levels and granularities. In addition, the redundant data of the CNN-based descriptors are compressed into a reduced number of bits for more efficient localization. The final descriptor is condensed by applying compression and binarization techniques so that matching can again be performed with the Hamming distance. In general terms, CNN-centered methods improve precision by generating more detailed visual representations of locations, but they are more computationally expensive. Both visual place recognition approaches are extensively evaluated on several public datasets. These tests yield satisfactory precision in long-term situations, as corroborated by the reported results, which compare our methods against the main state-of-the-art algorithms and show better results in all cases. In addition, the applicability of our topological place recognition to different localization problems has also been analyzed.
These applications include loop closure detection based on the recognized places and the correction of accumulated drift in visual odometry using the information provided by the loop closures. Applications to the detection of geometric changes across the seasons of the year are also considered, which are essential for map updates in autonomous driving systems focused on long-term operation. All these contributions are discussed at the end of the thesis, including several conclusions about the presented work and future lines of research.
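The Hamming-distance matching of binary descriptors mentioned in this abstract can be sketched briefly. The example below is a simplification under stated assumptions: it uses an exhaustive linear search rather than the Approximate Nearest Neighbors method the thesis names, the 8-bit descriptors are invented values far shorter than real image descriptors, and the names `hamming` and `best_match` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")

def best_match(query: int, database: list) -> tuple:
    """Return (index, distance) of the closest database descriptor.

    Exhaustive search is used here for clarity; ties go to the lowest index.
    """
    candidates = ((i, hamming(query, d)) for i, d in enumerate(database))
    return min(candidates, key=lambda pair: pair[1])

# Invented 8-bit descriptors standing in for previously visited places.
database = [0b10110010, 0b01101100, 0b11110000]
query = 0b10110110  # descriptor of the current view

index, distance = best_match(query, database)
```

Because the distance is an XOR followed by a bit count, each comparison is cheap, which is consistent with the reduction in computational and memory cost that the abstract attributes to binary descriptors.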