Association for the Advancement of Artificial Intelligence: AAAI Publications
    18,992 research outputs found

    A Hazard-Aware Metric for Ordinal Multi-Class Classification in Pathology

    No full text
    Artificial Intelligence (AI) for decision support and diagnosis in pathology could provide immense value to society, improving patient outcomes and alleviating workload demands on pathologists. However, this potential cannot be realized until sufficient methods for testing and evaluating such AI systems are developed and adopted. We present a novel metric for the evaluation of multi-class classification algorithms in pathology, the Error Severity Index (ESI), to address the needs of pathologists and pathology lab managers in evaluating AI systems.
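The abstract does not spell out the ESI formula, but a hazard-aware ordinal metric can be illustrated with a minimal sketch: each misclassification is charged a clinician-assigned hazard cost, and the total is normalised by the worst possible cost per case. The function name, the normalisation, and the hazard matrix below are all assumptions for illustration, not the paper's definition.

```python
import numpy as np

def error_severity_index(y_true, y_pred, hazard):
    """Hypothetical severity-weighted error score for ordinal classes.

    hazard[i, j] is the assumed cost of predicting class j when the
    true class is i (hazard[i, i] == 0). The score is the total hazard
    incurred over all cases, normalised by the worst possible hazard
    per case, so 0 is perfect and 1 is worst-case.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    incurred = hazard[y_true, y_pred]
    worst = hazard[y_true].max(axis=1)
    return incurred.sum() / worst.sum()

# Ordinal grades: 0 = benign, 1 = atypical, 2 = malignant.
# Under-calling a malignancy (row 2) is costed far more heavily
# than over-calling a benign case (row 0).
hazard = np.array([[0.0, 1.0, 2.0],
                   [1.0, 0.0, 1.0],
                   [8.0, 4.0, 0.0]])
```

Because row 2 carries the largest costs, a model that under-calls malignancies is penalised far more than one that over-calls benign cases, which is the behaviour a hazard-aware ordinal metric is meant to capture.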

    Building Intelligent Systems by Combining Machine Learning and Automated Commonsense Reasoning

    We present an approach to building systems that emulate human-like intelligence. Our approach uses machine learning technology (including generative AI systems) to extract knowledge from pictures, text, etc., and represents it as (pre-defined) predicates. Next, we use the s(CASP) automated commonsense reasoning system to check the consistency of this extracted knowledge and reason over it in a manner very similar to how a human would do it. We have used our approach to build systems for visual question answering, task-specific chatbots that can "understand" human dialogs and interactively talk to them, and autonomous driving systems that rely on commonsense reasoning. Essentially, our approach emulates how humans process knowledge: they use sensing and pattern recognition to gain knowledge (Kahneman's System 1 thinking, akin to using a machine learning model), and then use reasoning to draw conclusions, generate responses, or take actions (Kahneman's System 2 thinking, akin to automated reasoning).
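The pipeline shape described above (perception produces predicates; a reasoner checks them for consistency before drawing conclusions) can be shown with a toy Python sketch. This is not s(CASP), which is a full goal-directed answer set programming system; the facts and the single commonsense constraint below are invented for illustration.

```python
# Toy illustration of the pipeline shape: an ML front end would emit
# predicates like these; a rule base then checks their consistency
# before any reasoning is done over them.
facts = {("vehicle", "car1"), ("moving", "car1"), ("parked", "car1")}

# Commonsense constraint: nothing can be both moving and parked.
def inconsistencies(facts):
    found = []
    for (_, obj) in {f for f in facts if f[0] == "moving"}:
        if ("parked", obj) in facts:
            found.append(obj)
    return found
```

A real system would hand the consistent subset of predicates to the reasoner; here the check simply flags `car1` because perception asserted two mutually exclusive states for it.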

    Failure-Resistant Intelligent Interaction for Reliable Human-AI Collaboration

    My thesis focuses on how we can bridge the gap between people and machine learning techniques, which require a well-defined application scheme and can produce wrong results. I plan to discuss principles of interaction design that fill this gap, based on my past projects exploring better interactions for applying machine learning in various fields, such as malware analysis, executive coaching, and photo editing. To this aim, my thesis also sheds light on the limitations of machine learning techniques, such as adversarial examples, to highlight the importance of "failure-resistant intelligent interaction."

    Invertible Conditional GAN Revisited: Photo-to-Manga Face Translation with Modern Architectures (Student Abstract)

    Recent style translation methods have extended their transferability from texture to geometry. However, performing translation while preserving image content when there is a significant style difference is still an open problem. To overcome this problem, we propose the Invertible Conditional Fast GAN (IcFGAN), based on GAN inversion and cFGAN, which allows for unpaired photo-to-manga face translation. Experimental results show that our method can translate styles across significant style gaps, whereas state-of-the-art methods can hardly preserve image content.

    Summarizing Stream Data for Memory-Constrained Online Continual Learning

    Replay-based methods have proved their effectiveness in online continual learning by rehearsing past samples from an auxiliary memory. However, while much effort has gone into improving memory-based training schemes, the information carried by each sample in the memory remains under-investigated. Under restricted storage space, the informativeness of the memory becomes critical for effective replay. Although some works design specific strategies to select representative samples, storage is still not well utilized when only a small number of original images is kept. To this end, we propose to Summarize the knowledge from the Stream Data (SSD) into more informative samples by distilling the training characteristics of real images. By maintaining the consistency of training gradients and the relationship to past tasks, the summarized samples are more representative of the stream data than the original images. Extensive experiments on multiple online continual learning benchmarks show that the proposed SSD method significantly enhances the replay effects. We demonstrate that, with limited extra computational overhead, SSD provides more than a 3% accuracy boost for sequential CIFAR-100 under an extremely restricted memory buffer. Code is available at https://github.com/vimar-gu/SSD
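The core idea, distilling a small synthetic memory whose training gradients match those of the real stream, can be sketched with a linear model and squared loss. The model, learning rate, and matching objective below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_w(X, y, w):
    """Gradient of the squared loss 0.5 * mean((Xw - y)^2) w.r.t. w."""
    return X.T @ (X @ w - y) / len(y)

# A real stream batch (the data to summarise) and a much smaller
# synthetic memory, initialised at random.
X_real = rng.normal(size=(64, 5))
y_real = X_real @ rng.normal(size=5)
w = rng.normal(size=5)                 # parameters at which gradients are compared
X_syn = rng.normal(size=(4, 5))
y_syn = rng.normal(size=4)

def match_loss(X_syn):
    d = grad_w(X_syn, y_syn, w) - grad_w(X_real, y_real, w)
    return d @ d

# Update the synthetic inputs so their training gradient matches that
# of the real batch (analytic gradient of the matching loss w.r.t. X_syn).
before = match_loss(X_syn)
for _ in range(300):
    m = len(y_syn)
    r = X_syn @ w - y_syn
    d = grad_w(X_syn, y_syn, w) - grad_w(X_real, y_real, w)
    gX = (2.0 / m) * (np.outer(r, d) + np.outer(X_syn @ d, w))
    X_syn -= 1e-3 * gX
after = match_loss(X_syn)
```

After the updates, the four synthetic samples induce a parameter gradient closer to that of the 64 real samples, which is what makes a tiny summarised memory useful for replay.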

    Exploring Channel-Aware Typical Features for Out-of-Distribution Detection

    Detecting out-of-distribution (OOD) data is essential to ensure the reliability of machine learning models when deployed in real-world scenarios. Different from most previous test-time OOD detection methods that focus on designing OOD scores, we delve into the challenges in OOD detection from the perspective of typicality and regard the feature’s high-probability region as the feature’s typical set. However, the existing typical-feature-based OOD detection method implies an assumption: the proportion of typical feature sets for each channel is fixed. According to our experimental analysis, each channel contributes differently to OOD detection. Adopting a fixed proportion for all channels causes several channels to lose too many typical features or incorporate too many abnormal features, resulting in low performance. Therefore, exploring channel-aware typical features is crucial to better separating ID and OOD data. Driven by this insight, we propose expLoring channel-Aware tyPical featureS (LAPS). Firstly, LAPS obtains the channel-aware typical set by calibrating the channel-level typical set with the global typical set using the mean and standard deviation. Then, LAPS rectifies the features into channel-aware typical sets to obtain channel-aware typical features. Finally, LAPS leverages the channel-aware typical features to calculate the energy score for OOD detection. Theoretical and visual analyses verify that LAPS achieves a better bias-variance trade-off. Experiments verify the effectiveness and generalization of LAPS under different architectures and OOD scores.
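A common building block of typical-feature methods can be sketched as channel-wise clamping followed by an energy score. The per-channel widths `k` below stand in for LAPS's calibrated channel-aware typical sets; all statistics and the linear head are invented for illustration.

```python
import numpy as np

def rectify_channelwise(feats, mu, sigma, k):
    """Clamp each channel into its 'typical' interval
    [mu - k*sigma, mu + k*sigma]. Here k can differ per channel
    (channel-aware), unlike a single fixed proportion for all channels."""
    lo, hi = mu - k * sigma, mu + k * sigma
    return np.clip(feats, lo, hi)

def energy_score(logits):
    """Negative free energy; higher values indicate in-distribution."""
    return np.log(np.exp(logits).sum(axis=-1))

# Hypothetical example: per-channel statistics estimated on ID data.
rng = np.random.default_rng(1)
feats = rng.normal(size=(8, 16))       # a batch of feature vectors
mu, sigma = np.zeros(16), np.ones(16)
k = np.full(16, 1.5)                   # channel-aware interval widths
W = rng.normal(size=(16, 10))          # linear classifier head
typical = rectify_channelwise(feats, mu, sigma, k)
scores = energy_score(typical @ W)
```

Rectifying extreme activations before scoring suppresses the abnormal feature magnitudes that OOD inputs tend to produce, while per-channel widths avoid over-truncating channels that matter for detection.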

    Sequential Fusion Based Multi-Granularity Consistency for Space-Time Transformer Tracking

    Regarded as a template-matching task for a long time, visual object tracking has witnessed significant progress in space-wise exploration. However, since tracking is performed on videos with substantial time-wise information, it is important to simultaneously mine temporal contexts, which have not yet been deeply explored. Previous supervised works mostly consider template reform as the breakthrough point, but they are often limited by additional computational burdens or the quality of chosen templates. To address this issue, we propose a Space-Time Consistent Transformer Tracker (STCFormer), which uses a sequential fusion framework with multi-granularity consistency constraints to learn spatiotemporal context information. We design a sequential fusion framework that recombines template and search images based on tracking results from chronological frames, fusing updated tracking states in training. To further overcome the over-reliance on the fixed template without increasing computational complexity, we design three space-time consistency constraints: Label Consistency Loss (LCL) for label-level consistency, Attention Consistency Loss (ACL) for patch-level ROI consistency, and Semantic Consistency Loss (SCL) for feature-level semantic consistency. Specifically, in ACL and SCL, the label information is used to constrain the attention and feature consistency of the target and the background, respectively, to avoid mutual interference. Extensive experiments show that our STCFormer outperforms many of the best-performing trackers on several popular benchmarks.

    Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance

    Sketch-based terrain generation seeks to create realistic landscapes for virtual environments in various applications such as computer games, animation, and virtual reality. Recently, deep learning based terrain generation has emerged, notably methods based on generative adversarial networks (GAN). However, these methods often struggle to fulfill the requirements of flexible user control and to maintain generative diversity for realistic terrain. Therefore, we propose a novel diffusion-based method, the terrain diffusion network (TDN), which actively incorporates user guidance for enhanced controllability, taking into account terrain features like rivers, ridges, basins, and peaks. Instead of adhering to a conventional monolithic denoising process, which often compromises the fidelity of terrain details or the alignment with user control, a multi-level denoising scheme is proposed to generate more realistic terrains by taking into account fine-grained details, particularly those related to climatic patterns influenced by erosion and tectonic activities. Specifically, three terrain synthesizers are designed for structural, intermediate, and fine-grained level denoising purposes, which allows each synthesizer to concentrate on a distinct terrain aspect. Moreover, to maximise the efficiency of our TDN, we further introduce terrain and sketch latent spaces for the synthesizers with pre-trained terrain autoencoders. Comprehensive experiments on a new dataset constructed from NASA Topology Images clearly demonstrate the effectiveness of our proposed method, achieving state-of-the-art performance. Our code is available at https://github.com/TDNResearch/TDN

    DUEL: Duplicate Elimination on Active Memory for Self-Supervised Class-Imbalanced Learning

    Recent machine learning algorithms have been developed using well-curated datasets, which often require substantial cost and resources. On the other hand, the direct use of raw data often leads to overfitting towards frequently occurring class information. To address class imbalances cost-efficiently, we propose an active data filtering process during self-supervised pre-training in our novel framework, Duplicate Elimination (DUEL). This framework integrates an active memory inspired by human working memory and introduces distinctiveness information, which measures the diversity of the data in the memory, to optimize both the feature extractor and the memory. The DUEL policy, which replaces the most duplicated data with new samples, aims to enhance the distinctiveness information in the memory and thereby mitigate class imbalances. We validate the effectiveness of the DUEL framework in class-imbalanced environments, demonstrating its robustness and providing reliable results in downstream tasks. We also analyze the role of the DUEL policy in the training process through various metrics and visualizations.
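The DUEL policy itself, evicting the memory sample that is most duplicated, admits a short sketch. Here distinctiveness is approximated by mean cosine similarity to the rest of the memory; the actual framework learns distinctiveness jointly with the feature extractor, so this is only the eviction rule in isolation.

```python
import numpy as np

def most_duplicated(memory):
    """Index of the memory embedding most similar, on average, to the
    rest of the memory (i.e. the least distinctive one)."""
    M = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    sim = M @ M.T
    np.fill_diagonal(sim, 0.0)          # ignore self-similarity
    return int(sim.mean(axis=1).argmax())

def duel_update(memory, new_sample):
    """DUEL-style policy sketch: evict the most duplicated embedding
    and insert the incoming sample in its place."""
    memory = memory.copy()
    memory[most_duplicated(memory)] = new_sample
    return memory
```

Repeatedly evicting near-duplicates keeps the memory spread across classes, which is how the policy counteracts a stream dominated by frequent classes.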

    Inverse Weight-Balancing for Deep Long-Tailed Learning

    The performance of deep learning models often degrades rapidly when faced with imbalanced data characterized by a long-tailed distribution. Researchers have found that a fully connected layer trained with cross-entropy loss has large weight norms for classes with many samples, but not for classes with few samples. How to address the data imbalance problem with both the encoder and the classifier remains under-researched. In this paper, we propose an inverse weight-balancing (IWB) approach to guide model training and alleviate the data imbalance problem in two stages. In the first stage, an encoder and classifier (the fully connected layer) are trained using conventional cross-entropy loss. In the second stage, with a fixed encoder, the classifier is finetuned through an adaptive distribution for IWB in the decision space. Unlike existing inverse image frequency approaches, which implement a multiplicative margin adjustment transformation in the classification layer, our approach can be interpreted as an adaptive distribution alignment strategy using not only the class-wise number distribution but also the sample-wise difficulty distribution in both encoder and classifier. Experiments show that our method can greatly improve performance on imbalanced datasets such as CIFAR100-LT with different imbalance factors, ImageNet-LT, and iNaturalist 2018.
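As a hedged illustration of the "inverse" idea, the sketch below rescales each class's classifier weight vector by its inverse frequency, with a temperature `tau`. The paper's adaptive, difficulty-aware distribution alignment is more involved than this frequency-only version; the function and its parameters are assumptions for illustration.

```python
import numpy as np

def inverse_weight_balance(W, class_counts, tau=1.0):
    """Rescale each class's weight vector (rows of W) inversely to its
    normalised frequency, counteracting the large weight norms that
    cross-entropy training gives head classes. tau controls the
    strength of the correction (tau=0 leaves W unchanged)."""
    freq = class_counts / class_counts.sum()
    scale = (1.0 / freq) ** tau
    scale = scale / scale.mean()        # preserve the overall logit scale
    return W * scale[:, None]
```

After balancing, tail-class weight vectors have larger norms than head-class ones, shifting decision boundaries toward the head classes and away from the under-represented tail.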

    0 full texts
    18,992 metadata records
    Updated in last 30 days.
    Association for the Advancement of Artificial Intelligence: AAAI Publications is based in the United States.