671 research outputs found
Data Collection and Machine Learning Methods for Automated Pedestrian Facility Detection and Mensuration
Large-scale collection of pedestrian facility (crosswalks, sidewalks, etc.) presence data is vital to the success of efforts to improve pedestrian facility management, safety analysis, and road network planning. However, this kind of data is typically not available on a large scale due to the high labor and time costs that are the result of relying on manual data collection methods. Therefore, methods for automating this process using techniques such as machine learning are currently being explored by researchers. In our work, we mainly focus on machine learning methods for the detection of crosswalks and sidewalks from both aerial and street-view imagery. We test data from these two viewpoints individually and with an ensemble method that we refer to as our “dual-perspective prediction model”. In order to obtain this data, we developed a data collection pipeline that combines crowdsourced pedestrian facility location data with aerial and street-view imagery from Bing Maps. In addition to the Convolutional Neural Network used to perform pedestrian facility detection using this data, we also trained a segmentation network to measure the length and width of crosswalks from aerial images. In our tests with a dual-perspective image dataset that was heavily occluded in the aerial view but relatively clear in the street view, our dual-perspective prediction model was able to increase prediction accuracy, recall, and precision by 49%, 383%, and 15%, respectively (compared to using a single perspective model based on only aerial view images). In our tests with satellite imagery provided by the Mississippi Department of Transportation, we were able to achieve accuracies as high as 99.23%, 91.26%, and 93.7% for aerial crosswalk detection, aerial sidewalk detection, and aerial crosswalk mensuration, respectively. The final system that we developed packages all of our machine learning models into an easy-to-use system that enables users to process large batches of imagery or examine individual images in a directory using a graphical interface. Our data collection and filtering guidelines can also be used to guide future research in this area by establishing standards for data quality and labelling
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Machine learning techniques are now integral to the advancement of
intelligent urban services, playing a crucial role in elevating the efficiency,
sustainability, and livability of urban environments. The recent emergence of
foundation models such as ChatGPT marks a revolutionary shift in the fields of
machine learning and artificial intelligence. Their unparalleled capabilities
in contextual understanding, problem solving, and adaptability across a wide
range of tasks suggest that integrating these models into urban domains could
have a transformative impact on the development of smart cities. Despite
growing interest in Urban Foundation Models~(UFMs), this burgeoning field faces
challenges such as a lack of clear definitions, systematic reviews, and
universalizable solutions. To this end, this paper first introduces the concept
of UFM and discusses the unique challenges involved in building them. We then
propose a data-centric taxonomy that categorizes current UFM-related works,
based on urban data modalities and types. Furthermore, to foster advancement in
this field, we present a promising framework aimed at the prospective
realization of UFMs, designed to overcome the identified challenges.
Additionally, we explore the application landscape of UFMs, detailing their
potential impact in various urban contexts. Relevant papers and open-source
resources have been collated and are continuously updated at
https://github.com/usail-hkust/Awesome-Urban-Foundation-Models
Transformer Network for Multi-Person Tracking and Re-Identification in Unconstrained Environment
Multi-object tracking (MOT) has profound applications in a variety of fields,
including surveillance, sports analytics, self-driving, and cooperative
robotics. Despite considerable advancements, existing MOT methodologies tend to
falter when faced with non-uniform movements, occlusions, and
appearance-reappearance scenarios of the objects. Recognizing this inadequacy,
we put forward an integrated MOT method that not only marries object detection
and identity linkage within a singular, end-to-end trainable framework but also
equips the model with the ability to maintain object identity links over long
periods of time. Our proposed model, named STMMOT, is built around four key
modules: 1) candidate proposal generation, which generates object proposals via
a vision-transformer encoder-decoder architecture that detects the object from
each frame in the video; 2) scale variant pyramid, a progressive pyramid
structure to learn the self-scale and cross-scale similarities in multi-scale
feature maps; 3) spatio-temporal memory encoder, extracting the essential
information from the memory associated with each object under tracking; and 4)
spatio-temporal memory decoder, simultaneously resolving the tasks of object
detection and identity association for MOT. Our system leverages a robust
spatio-temporal memory module that retains extensive historical observations
and effectively encodes them using an attention-based aggregator. The
uniqueness of STMMOT lies in representing objects as dynamic query embeddings
that are updated continuously, which enables the prediction of object states
with attention mechanisms and eradicates the need for post-processing
Geo-Information Technology and Its Applications
Geo-information technology has been playing an ever more important role in environmental monitoring, land resource quantification and mapping, geo-disaster damage and risk assessment, urban planning and smart city development. This book focuses on the fundamental and applied research in these domains, aiming to promote exchanges and communications, share the research outcomes of scientists worldwide and to put these achievements better social use. This Special Issue collects fourteen high-quality research papers and is expected to provide a useful reference and technical support for graduate students, scientists, civil engineers and experts of governments to valorize scientific research
Your blush gives you away: detecting hidden mental states with remote photoplethysmography and thermal imaging
Multimodal emotion recognition techniques are increasingly essential for
assessing mental states. Image-based methods, however, tend to focus
predominantly on overt visual cues and often overlook subtler mental state
changes. Psychophysiological research has demonstrated that HR and skin
temperature are effective in detecting ANS activities, thereby revealing these
subtle changes. However, traditional HR tools are generally more costly and
less portable, while skin temperature analysis usually necessitates extensive
manual processing. Advances in remote-PPG and automatic thermal ROI detection
algorithms have been developed to address these issues, yet their accuracy in
practical applications remains limited. This study aims to bridge this gap by
integrating r-PPG with thermal imaging to enhance prediction performance.
Ninety participants completed a 20-minute questionnaire to induce cognitive
stress, followed by watching a film aimed at eliciting moral elevation. The
results demonstrate that the combination of r-PPG and thermal imaging
effectively detects emotional shifts. Using r-PPG alone, the prediction
accuracy was 77% for cognitive stress and 61% for moral elevation, as
determined by SVM. Thermal imaging alone achieved 79% accuracy for cognitive
stress and 78% for moral elevation, utilizing a RF algorithm. An early fusion
strategy of these modalities significantly improved accuracies, achieving 87%
for cognitive stress and 83% for moral elevation using RF. Further analysis,
which utilized statistical metrics and explainable machine learning methods
including SHAP, highlighted key features and clarified the relationship between
cardiac responses and facial temperature variations. Notably, it was observed
that cardiovascular features derived from r-PPG models had a more pronounced
influence in data fusion, despite thermal imaging's higher predictive accuracy
in unimodal analysis.Comment: 28 pages, 6 figure
Localization and Mapping for Self-Driving Vehicles:A Survey
The upsurge of autonomous vehicles in the automobile industry will lead to better driving experiences while also enabling the users to solve challenging navigation problems. Reaching such capabilities will require significant technological attention and the flawless execution of various complex tasks, one of which is ensuring robust localization and mapping. Recent surveys have not provided a meaningful and comprehensive description of the current approaches in this field. Accordingly, this review is intended to provide adequate coverage of the problems affecting autonomous vehicles in this area, by examining the most recent methods for mapping and localization as well as related feature extraction and data security problems. First, a discussion of the contemporary methods of extracting relevant features from equipped sensors and their categorization as semantic, non-semantic, and deep learning methods is presented. We conclude that representativeness, low cost, and accessibility are crucial constraints in the choice of the methods to be adopted for localization and mapping tasks. Second, the survey focuses on methods to build a vehicle’s environment map, considering both the commercial and the academic solutions available. The analysis proposes a difference between two types of environment, known and unknown, and develops solutions in each case. Third, the survey explores different approaches to vehicles’ localization and also classifies them according to their mathematical characteristics and priorities. Each section concludes by presenting the related challenges and some future directions. The article also highlights the security problems likely to be encountered in self-driving vehicles, with an assessment of possible defense mechanisms that could prevent security attacks in vehicles. Finally, the article ends with a debate on the potential impacts of autonomous driving, spanning energy consumption and emission reduction, sound and light pollution, integration into smart cities, infrastructure optimization, and software refinement. This thorough investigation aims to foster a comprehensive understanding of the diverse implications of autonomous driving across various domains
31th International Conference on Information Modelling and Knowledge Bases
Information modelling is becoming more and more important topic for researchers, designers, and users of information systems.The amount and complexity of information itself, the number of abstractionlevels of information, and the size of databases and knowledge bases arecontinuously growing. Conceptual modelling is one of the sub-areas ofinformation modelling. The aim of this conference is to bring together experts from different areas of computer science and other disciplines, who have a common interest in understanding and solving problems on information modelling and knowledge bases, as well as applying the results of research to practice. We also aim to recognize and study new areas on modelling and knowledge bases to which more attention should be paid. Therefore philosophy and logic, cognitive science, knowledge management, linguistics and management science are relevant areas, too. In the conference, there will be three categories of presentations, i.e. full papers, short papers and position papers
- …