13,532 research outputs found
Location Reference Recognition from Texts: A Survey and Comparison
A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to recognizing location references from texts and identifying their geospatial representations. While geoparsing can benefit many domains, a summary of its specific applications is still missing. Further, there is a lack of a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and core step of geoparsing. To fill these research gaps, this review first summarizes seven typical application domains of geoparsing: geographic information retrieval, disaster management, disease surveillance, traffic management, spatial humanities, tourism management, and crime management. We then review existing approaches for location reference recognition by categorizing these approaches into four groups based on their underlying functional principle: rule-based, gazetteer matching–based, statistical learning-–based, and hybrid approaches. Next, we thoroughly evaluate the correctness and computational efficiency of the 27 most widely used approaches for location reference recognition based on 26 public datasets with different types of texts (e.g., social media posts and news stories) containing 39,736 location references worldwide. Results from this thorough evaluation can help inform future methodological developments and can help guide the selection of proper approaches based on application needs
Dermoscopic dark corner artifacts removal: Friend or foe?
Background and Objectives: One of the more significant obstacles in classification of skin cancer is the presence of artifacts. This paper investigates the effect of dark corner artifacts, which result from the use of dermoscopes, on the performance of a deep learning binary classification task. Previous research attempted to remove and inpaint dark corner artifacts, with the intention of creating an ideal condition for models. However, such research has been shown to be inconclusive due to a lack of available datasets with corresponding labels for dark corner artifact cases. Methods: To address these issues, we label 10,250 skin lesion images from publicly available datasets and introduce a balanced dataset with an equal number of melanoma and non-melanoma cases. The training set comprises 6126 images without artifacts, and the testing set comprises 4124 images with dark corner artifacts. We conduct three experiments to provide new understanding on the effects of dark corner artifacts, including inpainted and synthetically generated examples, on a deep learning method. Results: Our results suggest that introducing synthetic dark corner artifacts which have been superimposed onto the training set improved model performance, particularly in terms of the true negative rate. This indicates that deep learning learnt to ignore dark corner artifacts, rather than treating it as melanoma, when dark corner artifacts were introduced into the training set. Further, we propose a new approach to quantifying heatmaps indicating network focus using a root mean square measure of the brightness intensity in the different regions of the heatmaps. Conclusions: The proposed artifact methods can be used in future experiments to help alleviate possible impacts on model performance. Additionally, the newly proposed heatmap quantification analysis will help to better understand the relationships between heatmap results and other model performance metrics
Integrating art and AI: Evaluating the educational impact of AI tools in digital art history learning
This study delves into the burgeoning intersection of Artificial Intelligence (AI) and art history education, an area that has been relatively unexplored. The research focuses on how AI art generators impact learning outcomes in art history for both undergraduate and graduate students enrolled in Ancient Art courses, covering eras from ancient Mesopotamia to the fall of Rome. Utilizing a mixed-methods approach, the study analyzes AI-generated artworks, reflective essays, and survey responses to assess how these generative tools influence students’ comprehension, engagement, and creative interpretation of historical artworks. The study reveals that the use of AI tools in art history not only enhances students’ understanding of artistic concepts but also fosters a deeper, more nuanced appreciation of art from the periods studied. The findings indicate that engaging with AI tools promotes critical thinking and creativity, which are crucial competencies in the study of art history. Survey data further suggest that the integration of AI in art history positively influences students’ perceptions of the discipline, aligning well with contemporary digital trends. One of the significant outcomes of the study is the varied experiences of students with AI tools. While some faced challenges with the technology, particularly in accurately capturing complex artwork details and crafting effective prompts, others found success in using AI to generate detailed and creative interpretations of historical pieces. These experiences underscore the potential of AI as a valuable pedagogical tool in art history and humanities education, offering novel insights into teaching methodologies
SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation Separation and BEV Fusion
Semantic scene completion (SSC) jointly predicts the semantics and geometry
of the entire 3D scene, which plays an essential role in 3D scene understanding
for autonomous driving systems. SSC has achieved rapid progress with the help
of semantic context in segmentation. However, how to effectively exploit the
relationships between the semantic context in semantic segmentation and
geometric structure in scene completion remains under exploration. In this
paper, we propose to solve outdoor SSC from the perspective of representation
separation and BEV fusion. Specifically, we present the network, named SSC-RS,
which uses separate branches with deep supervision to explicitly disentangle
the learning procedure of the semantic and geometric representations. And a BEV
fusion network equipped with the proposed Adaptive Representation Fusion (ARF)
module is presented to aggregate the multi-scale features effectively and
efficiently. Due to the low computational burden and powerful representation
ability, our model has good generality while running in real-time. Extensive
experiments on SemanticKITTI demonstrate our SSC-RS achieves state-of-the-art
performance.Comment: 8 pages, 5 figures, IROS202
Requirements for Explainability and Acceptance of Artificial Intelligence in Collaborative Work
The increasing prevalence of Artificial Intelligence (AI) in safety-critical
contexts such as air-traffic control leads to systems that are practical and
efficient, and to some extent explainable to humans to be trusted and accepted.
The present structured literature analysis examines n = 236 articles on the
requirements for the explainability and acceptance of AI. Results include a
comprehensive review of n = 48 articles on information people need to perceive
an AI as explainable, the information needed to accept an AI, and
representation and interaction methods promoting trust in an AI. Results
indicate that the two main groups of users are developers who require
information about the internal operations of the model and end users who
require information about AI results or behavior. Users' information needs vary
in specificity, complexity, and urgency and must consider context, domain
knowledge, and the user's cognitive resources. The acceptance of AI systems
depends on information about the system's functions and performance, privacy
and ethical considerations, as well as goal-supporting information tailored to
individual preferences and information to establish trust in the system.
Information about the system's limitations and potential failures can increase
acceptance and trust. Trusted interaction methods are human-like, including
natural language, speech, text, and visual representations such as graphs,
charts, and animations. Our results have significant implications for future
human-centric AI systems being developed. Thus, they are suitable as input for
further application-specific investigations of user needs
A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges
Measuring and evaluating source code similarity is a fundamental software
engineering activity that embraces a broad range of applications, including but
not limited to code recommendation, duplicate code, plagiarism, malware, and
smell detection. This paper proposes a systematic literature review and
meta-analysis on code similarity measurement and evaluation techniques to shed
light on the existing approaches and their characteristics in different
applications. We initially found over 10000 articles by querying four digital
libraries and ended up with 136 primary studies in the field. The studies were
classified according to their methodology, programming languages, datasets,
tools, and applications. A deep investigation reveals 80 software tools,
working with eight different techniques on five application domains. Nearly 49%
of the tools work on Java programs and 37% support C and C++, while there is no
support for many programming languages. A noteworthy point was the existence of
12 datasets related to source code similarity measurement and duplicate codes,
of which only eight datasets were publicly accessible. The lack of reliable
datasets, empirical evaluations, hybrid methods, and focuses on multi-paradigm
languages are the main challenges in the field. Emerging applications of code
similarity measurement concentrate on the development phase in addition to the
maintenance.Comment: 49 pages, 10 figures, 6 table
Continuous Online Extrinsic Calibration of Fisheye Camera and LiDAR
Automated driving systems use multi-modal sensor suites to ensure the
reliable, redundant and robust perception of the operating domain, for example
camera and LiDAR. An accurate extrinsic calibration is required to fuse the
camera and LiDAR data into a common spatial reference frame required by
high-level perception functions. Over the life of the vehicle the value of the
extrinsic calibration can change due physical disturbances, introducing an
error into the high-level perception functions. Therefore there is a need for
continuous online extrinsic calibration algorithms which can automatically
update the value of the camera-LiDAR calibration during the life of the vehicle
using only sensor data.
We propose using mutual information between the camera image's depth
estimate, provided by commonly available monocular depth estimation networks,
and the LiDAR pointcloud's geometric distance as a optimization metric for
extrinsic calibration. Our method requires no calibration target, no ground
truth training data and no expensive offline optimization. We demonstrate our
algorithm's accuracy, precision, speed and self-diagnosis capability on the
KITTI-360 data set.Comment: 4 page
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with less resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allows us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer
Continual Learning, Fast and Slow
According to the Complementary Learning Systems (CLS)
theory~\cite{mcclelland1995there} in neuroscience, humans do effective
\emph{continual learning} through two complementary systems: a fast learning
system centered on the hippocampus for rapid learning of the specifics,
individual experiences; and a slow learning system located in the neocortex for
the gradual acquisition of structured knowledge about the environment.
Motivated by this theory, we propose \emph{DualNets} (for Dual Networks), a
general continual learning framework comprising a fast learning system for
supervised learning of pattern-separated representation from specific tasks and
a slow learning system for representation learning of task-agnostic general
representation via Self-Supervised Learning (SSL). DualNets can seamlessly
incorporate both representation types into a holistic framework to facilitate
better continual learning in deep neural networks. Via extensive experiments,
we demonstrate the promising results of DualNets on a wide range of continual
learning protocols, ranging from the standard offline, task-aware setting to
the challenging online, task-free scenario. Notably, on the
CTrL~\cite{veniat2020efficient} benchmark that has unrelated tasks with vastly
different visual images, DualNets can achieve competitive performance with
existing state-of-the-art dynamic architecture
strategies~\cite{ostapenko2021continual}. Furthermore, we conduct comprehensive
ablation studies to validate DualNets efficacy, robustness, and scalability.
Code will be made available at \url{https://github.com/phquang/DualNet}.Comment: arXiv admin note: substantial text overlap with arXiv:2110.0017
FedForgery: Generalized Face Forgery Detection with Residual Federated Learning
With the continuous development of deep learning in the field of image
generation models, a large number of vivid forged faces have been generated and
spread on the Internet. These high-authenticity artifacts could grow into a
threat to society security. Existing face forgery detection methods directly
utilize the obtained public shared or centralized data for training but ignore
the personal privacy and security issues when personal data couldn't be
centralizedly shared in real-world scenarios. Additionally, different
distributions caused by diverse artifact types would further bring adverse
influences on the forgery detection task. To solve the mentioned problems, the
paper proposes a novel generalized residual Federated learning for face Forgery
detection (FedForgery). The designed variational autoencoder aims to learn
robust discriminative residual feature maps to detect forgery faces (with
diverse or even unknown artifact types). Furthermore, the general federated
learning strategy is introduced to construct distributed detection model
trained collaboratively with multiple local decentralized devices, which could
further boost the representation generalization. Experiments conducted on
publicly available face forgery detection datasets prove the superior
performance of the proposed FedForgery. The designed novel generalized face
forgery detection protocols and source code would be publicly available.Comment: The code is available at https://github.com/GANG370/FedForgery. The
paper has been accepted in the IEEE Transactions on Information Forensics &
Securit
- …