
    Validating Multimedia Content Moderation Software via Semantic Fusion

    The exponential growth of social media platforms, such as Facebook and TikTok, has revolutionized communication and content publication in human society. Users on these platforms can publish multimedia content that delivers information through a combination of text, audio, images, and video. Meanwhile, this multimedia publishing facility has been increasingly exploited to propagate toxic content, such as hate speech, malicious advertisements, and pornography. Content moderation software has therefore been widely deployed on these platforms to detect and block toxic content. However, due to the complexity of content moderation models and the difficulty of understanding information across multiple modalities, existing content moderation software can fail to detect toxic content, which often leads to extremely negative impacts. We introduce Semantic Fusion, a general, effective methodology for validating multimedia content moderation software. Our key idea is to fuse two or more existing single-modal inputs (e.g., a textual sentence and an image) into a new input that combines the semantics of its ancestors in a novel manner and is toxic by construction. This fused input is then used to validate multimedia content moderation software. We realized Semantic Fusion as DUO, a practical content moderation testing tool. In our evaluation, we employ DUO to test five commercial content moderation systems and two state-of-the-art models against three kinds of toxic content. The results show that DUO achieves up to 100% error finding rate (EFR) when testing moderation software. In addition, we leverage the test cases generated by DUO to retrain the two models we explored, which largely improves model robustness while maintaining accuracy on the original test set. Comment: Accepted by ISSTA 202
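    The abstract reports error finding rate (EFR) as its headline metric. A minimal sketch of how such a metric might be computed over a batch of fused test inputs, assuming EFR is the fraction of known-toxic inputs that moderation fails to flag (the `verdicts` data is hypothetical; DUO's real fusion operators are not shown here):

    ```python
    def error_finding_rate(verdicts):
        """EFR sketch: fraction of known-toxic fused inputs that the
        moderation software failed to flag. Each verdict is True if
        the software correctly blocked the input."""
        misses = sum(1 for blocked in verdicts if not blocked)
        return misses / len(verdicts)

    # Hypothetical run: 10 fused toxic inputs, 3 slip through moderation.
    verdicts = [True] * 7 + [False] * 3
    print(error_finding_rate(verdicts))  # 0.3
    ```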

    Performance Issue Identification in Cloud Systems with Relational-Temporal Anomaly Detection

    Performance issues permeate large-scale cloud service systems and can lead to huge revenue losses. To ensure reliable performance, it is essential to accurately identify and localize these issues using service monitoring metrics. Given the complexity and scale of modern cloud systems, this task can be challenging and may require expertise and resources beyond the capacity of individual humans. Some existing methods tackle this problem by analyzing each metric independently to detect anomalies. However, this can incur overwhelming alert storms that are difficult for engineers to diagnose manually. To achieve better performance, not only the temporal patterns of metrics but also the correlations between metrics (i.e., relational patterns) should be considered, which can be formulated as a multivariate metric anomaly detection problem. However, most existing studies fall short of extracting these two types of features explicitly. Moreover, unlabeled anomalies may be mixed into the training data, which can hinder detection performance. To address these limitations, we propose the Relational-Temporal Anomaly Detection Model (RTAnomaly), which combines the relational and temporal information of metrics. RTAnomaly employs a graph attention layer to learn the dependencies among metrics, which in turn helps effectively pinpoint the anomalous metrics that may cause an anomaly. In addition, we exploit the concept of positive-unlabeled learning to address the issue of potential anomalies in the training data. To evaluate our method, we conduct experiments on a public dataset and two industrial datasets. RTAnomaly outperforms all baseline models, achieving an average F1 score of 0.929 and Hit@3 of 0.920, demonstrating its superiority.
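    The graph attention idea above can be illustrated with a toy example. This is not RTAnomaly's architecture, just a minimal single-head attention over per-metric embeddings showing how pairwise dependencies between metrics could be weighted (embedding dimensions and data are arbitrary):

    ```python
    import numpy as np

    def metric_attention(H):
        """Toy scaled dot-product attention over per-metric embeddings H
        (n_metrics x d). Row i of the result weights how strongly metric i
        attends to every metric, so large off-diagonal weights suggest a
        learned dependency between two metrics."""
        d = H.shape[1]
        scores = H @ H.T / np.sqrt(d)                        # pairwise similarity
        scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        return weights / weights.sum(axis=1, keepdims=True)  # softmax per row

    rng = np.random.default_rng(0)
    H = rng.normal(size=(5, 8))   # 5 metrics, 8-dimensional embeddings
    A = metric_attention(H)
    print(A.shape)                # (5, 5); each row sums to 1
    ```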

    Prism: Revealing Hidden Functional Clusters from Massive Instances in Cloud Systems

    Ensuring the reliability of cloud systems is critical for both cloud vendors and customers. Cloud systems often rely on virtualization techniques to create instances of hardware resources, such as virtual machines. However, virtualization hinders the observability of cloud systems, making it challenging to diagnose platform-level issues. To improve system observability, we propose to infer functional clusters of instances, i.e., groups of instances having similar functionalities. We first conduct a pilot study on a large-scale cloud system, Huawei Cloud, demonstrating that instances with similar functionalities share similar communication and resource usage patterns. Motivated by these findings, we formulate the identification of functional clusters as a clustering problem and propose a non-intrusive solution called Prism. Prism adopts a coarse-to-fine clustering strategy. It first partitions instances into coarse-grained chunks based on communication patterns. Within each chunk, Prism further groups instances with similar resource usage patterns to produce fine-grained functional clusters. This design reduces noise in the data and allows Prism to process massive numbers of instances efficiently. We evaluate Prism on two datasets collected from the real-world production environment of Huawei Cloud. Our experiments show that Prism achieves a v-measure of ~0.95, surpassing existing state-of-the-art solutions. Additionally, we illustrate the integration of Prism within monitoring systems for enhanced cloud reliability through two real-world use cases. Comment: The paper was accepted by the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023).
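    The evaluation metric above, v-measure, is the harmonic mean of homogeneity and completeness. A from-scratch computation using only the standard library (sklearn's `v_measure_score` computes the same quantity):

    ```python
    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

    def cond_entropy(a, b):
        """H(a | b): entropy of labels a within each cluster of b."""
        n = len(a)
        h = 0.0
        for cluster in set(b):
            members = [a[i] for i in range(n) if b[i] == cluster]
            h += len(members) / n * entropy(members)
        return h

    def v_measure(truth, pred):
        # Homogeneity: each predicted cluster contains one true class.
        h = 1 - cond_entropy(truth, pred) / entropy(truth) if entropy(truth) else 1.0
        # Completeness: each true class lands in one predicted cluster.
        c = 1 - cond_entropy(pred, truth) / entropy(pred) if entropy(pred) else 1.0
        if h + c == 0:
            return 0.0
        return 2 * h * c / (h + c)

    # Cluster ids are permuted but the grouping is perfect, so v = 1.0.
    print(v_measure([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
    ```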

    A Large-scale Benchmark for Log Parsing

    Log data is pivotal in activities like anomaly detection and failure diagnosis in the automated maintenance of software systems. Because log data is unstructured, log parsing is often required to transform it into a structured format for automated analysis. A variety of log parsers exist, making it vital to benchmark these tools to understand their features and performance. However, existing datasets for log parsing are limited in scale and representativeness, posing challenges for studies that aim to evaluate or develop log parsers. This problem becomes more pronounced when parsers are evaluated for production use. To address these issues, we introduce a new collection of large-scale annotated log datasets, named LogPub, which more accurately mirrors log data observed in real-world software systems. LogPub comprises 14 datasets, each averaging 3.6 million log lines. Using LogPub, we re-evaluate 15 log parsers in a more rigorous and practical setting. We also propose a new evaluation metric to lessen the sensitivity of current metrics to imbalanced data distributions. Furthermore, we are the first to scrutinize the detailed performance of log parsers on logs that represent rare system events and offer comprehensive information for system troubleshooting. Parsing such logs accurately is vital yet challenging. We believe that our work sheds light on the design and evaluation of log parsers in more realistic settings, thereby facilitating their adoption in production systems.
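    Log parsing, as described above, turns raw messages into templates by masking the variable parts. None of the 15 benchmarked parsers is reproduced here; a deliberately naive regex-based sketch just illustrates the task itself:

    ```python
    import re

    def naive_parse(line):
        """Mask common variable tokens (hex ids, numbers, IPs) with <*>
        to recover a log template. Real parsers (e.g. clustering-based
        ones) learn templates from data rather than fixed regexes."""
        line = re.sub(r'0x[0-9a-fA-F]+', '<*>', line)   # hex identifiers
        line = re.sub(r'\d+(\.\d+)*', '<*>', line)      # numbers and IPs
        return line

    print(naive_parse("Connected to 10.0.0.1 port 8080"))
    # Connected to <*> port <*>
    ```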

    FaultProfIT: Hierarchical Fault Profiling of Incident Tickets in Large-scale Cloud Systems

    Postmortem analysis is essential in the management of incidents within cloud systems and provides valuable insights for improving a system's reliability and robustness. At CloudA, fault pattern profiling is performed during the postmortem phase, which involves classifying incidents' faults into unique categories, referred to as fault patterns. By aggregating and analyzing these fault patterns, engineers can discern common faults, vulnerable components, and emerging fault trends. However, this process is currently conducted by manual labeling, which has inherent drawbacks. On the one hand, the sheer volume of incidents means only the most severe ones are analyzed, producing a skewed overview of fault patterns. On the other hand, the complexity of the task demands extensive domain knowledge, which leads to errors and inconsistencies. To address these limitations, we propose an automated approach, named FaultProfIT, for Fault pattern Profiling of Incident Tickets. It leverages hierarchy-guided contrastive learning to train a hierarchy-aware incident encoder and predicts fault patterns with enhanced incident representations. We evaluate FaultProfIT using production incidents from CloudA. The results demonstrate that FaultProfIT outperforms state-of-the-art methods. Our ablation study and analysis also verify the effectiveness of hierarchy-guided contrastive learning. Additionally, we have deployed FaultProfIT at CloudA for six months. To date, FaultProfIT has analyzed 10,000+ incidents from 30+ cloud services, successfully revealing several fault trends that have informed system improvements. Comment: Accepted to the Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice (ICSE SEIP 2024).
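    The contrastive-learning component above can be illustrated with the standard InfoNCE loss; FaultProfIT's hierarchy-guided variant is not detailed in the abstract, so the sketch below only shows the base mechanism, where an incident embedding is pulled toward a positive sharing its fault-pattern label and pushed from negatives (all vectors are toy data):

    ```python
    import numpy as np

    def info_nce(anchor, positive, negatives, temperature=0.1):
        """InfoNCE loss for one anchor: low when the anchor is much more
        similar to its positive than to any negative."""
        def cos(a, b):
            return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        logits = np.array([cos(anchor, positive)] +
                          [cos(anchor, n) for n in negatives]) / temperature
        logits -= logits.max()   # numerical stability
        return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

    anchor = np.array([1.0, 0.0])
    negatives = [np.array([0.0, 1.0]), np.array([-1.0, 0.0])]
    aligned = info_nce(anchor, np.array([1.0, 0.1]), negatives)   # same fault pattern
    opposed = info_nce(anchor, np.array([-1.0, 0.1]), negatives)  # mismatched pair
    print(aligned < opposed)  # True: a matching positive yields lower loss
    ```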

    Artificial Intelligence for Complex Network: Potential, Methodology and Application

    Complex networks pervade various real-world systems, from the natural environment to human societies. The essence of these networks lies in their ability to transition and evolve from microscopic disorder, where network topology and node dynamics intertwine, to a macroscopic order characterized by certain collective behaviors. Over the past two decades, complex network science has significantly enhanced our understanding of the statistical mechanics, structures, and dynamics underlying real-world networks. Despite these advancements, considerable challenges remain in exploring more realistic systems and enhancing practical applications. The emergence of artificial intelligence (AI) technologies, coupled with the abundance of diverse real-world network data, has heralded a new era in complex network science research. This survey aims to systematically address the potential advantages of AI in overcoming the lingering challenges of complex network research. It endeavors to summarize the pivotal research problems and provide an exhaustive review of the corresponding methodologies and applications. Through this comprehensive survey, the first of its kind on AI for complex networks, we expect to provide valuable insights that will drive further research and advancement in this interdisciplinary field. Comment: 51 pages, 4 figures, 10 tables

    Single-layer perceptron artificial visual system for orientation detection

    Orientation detection is an essential function of the visual system. In our previous works, we proposed a new orientation detection mechanism based on local orientation-selective neurons. We assume that there are neurons solely responsible for orientation detection, with each neuron dedicated to detecting a specific local orientation; the global orientation is then inferred from the local orientation information. Based on this mechanism, we propose an artificial visual system (AVS) that uses a single layer of McCulloch-Pitts neurons to realize these local orientation-selective neurons and a layer of sum pooling to realize global orientation detection neurons. We demonstrate that such a single-layer perceptron AVS is capable of detecting global orientation by identifying the orientation with the largest number of activated orientation-selective neurons as the global orientation. To evaluate the effectiveness of this single-layer perceptron AVS, we perform computer simulations. The results show that the AVS works perfectly for global orientation detection, in line with the majority of physiological experiments and models. Moreover, we compare the performance of the single-layer perceptron AVS with that of a traditional convolutional neural network (CNN) on orientation detection tasks. We find that the single-layer perceptron AVS outperforms the CNN in various aspects, including identification accuracy, noise resistance, computational and learning cost, hardware implementation feasibility, and biological plausibility.
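    The mechanism above (local orientation-selective McCulloch-Pitts neurons, sum pooling, then an argmax over orientations) can be sketched directly. The 3x3 receptive fields and firing threshold here are illustrative choices, not the paper's exact parameters:

    ```python
    import numpy as np

    # Illustrative 3x3 receptive fields for four local orientations.
    KERNELS = {
        "horizontal": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
        "vertical":   np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
        "diag_45":    np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]]),
        "diag_135":   np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]),
    }

    def detect_orientation(image, threshold=3):
        """Each McCulloch-Pitts neuron fires when its oriented receptive
        field is fully covered by the stimulus; sum pooling counts firings
        per orientation, and the argmax gives the global orientation."""
        counts = {}
        h, w = image.shape
        for name, k in KERNELS.items():
            fired = 0
            for i in range(h - 2):
                for j in range(w - 2):
                    if (image[i:i + 3, j:j + 3] * k).sum() >= threshold:
                        fired += 1
            counts[name] = fired
        return max(counts, key=counts.get)

    img = np.zeros((8, 8))
    img[:, 4] = 1                    # a vertical line stimulus
    print(detect_orientation(img))   # vertical
    ```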

    Fast-MC-PET: A Novel Deep Learning-aided Motion Correction and Reconstruction Framework for Accelerated PET

    Patient motion during PET is inevitable, and PET's long acquisition time not only increases motion and the associated artifacts but also the patient's discomfort, so PET acceleration is desirable. However, accelerating PET acquisition yields reconstructed images with low SNR, and image quality is still degraded by motion-induced artifacts. Most previous PET motion correction methods are specific to one motion type and require motion modeling, so they may fail when multiple types of motion are present together. Moreover, those methods are customized for standard long acquisitions and cannot be directly applied to accelerated PET. Modeling-free, universal motion-corrected reconstruction for accelerated PET thus remains highly under-explored. In this work, we propose a novel deep learning-aided motion correction and reconstruction framework for accelerated PET, called Fast-MC-PET. Our framework consists of a universal motion correction (UMC) module and a short-to-long acquisition reconstruction (SL-Recon) module. The UMC enables modeling-free motion correction by estimating quasi-continuous motion from ultra-short frame reconstructions and using this information for motion-compensated reconstruction. The SL-Recon then converts the accelerated UMC image with low counts into a high-quality image with high counts for the final reconstruction output. Our experimental results on human studies show that Fast-MC-PET enables 7-fold acceleration, using only a 2-minute acquisition to generate high-quality reconstructions that outperform or match previous motion-corrected reconstruction methods using standard 15-minute acquisition data. Comment: Accepted at Information Processing in Medical Imaging (IPMI 2023).

    Rv1985c, a promising novel antigen for diagnosis of tuberculosis infection from BCG-vaccinated controls

    Background: Antigens encoded in the region of difference (RD) of Mycobacterium tuberculosis constitute a potential source of specific antigens for immunodiagnosis. In the present study, the recombinant protein Rv1985c from RD2 was cloned, expressed, purified, immunologically characterized, and investigated for its diagnostic value for tuberculosis (TB) infection among BCG-vaccinated individuals. Methods: The T-cell response to Rv1985c was evaluated by IFN-γ ELISPOT in 56 TB patients, 20 latent TB infection (LTBI) cases, and 30 BCG-vaccinated controls, in comparison with the commercial T-SPOT.TB kit. The humoral response was evaluated by ELISA in 117 TB patients, 45 LTBI cases, and 67 BCG-vaccinated controls, including all those who had the T-cell assay, in comparison with a commercial IgG kit. Results: Rv1985c was specifically recognized by cellular and humoral responses in both the TB and LTBI groups compared with healthy controls. The Rv1985c IgG-ELISA achieved 52% and 62% sensitivity, respectively, outperforming the PATHOZYME-MYCO kit (34%) in detecting active TB (P = 0.011), whereas the IFN-γ Rv1985c-ELISPOT achieved 71% and 55% sensitivity in detecting active TB and LTBI, respectively. Adding Rv1985c increased the sensitivities of ESAT-6, CFP-10, and the ESAT-6/CFP-10 combination in detecting TB from 82.1% to 89.2% (P = 0.125), 67.9% to 87.5% (P < 0.001), and 85.7% to 92.9% (P = 0.125), respectively. Conclusions: Rv1985c is a novel antigen that can be used, alongside other immunodominant antigens, for the immunological diagnosis of TB infection in BCG-vaccinated populations.
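    The sensitivity gains from adding Rv1985c to an antigen panel follow from scoring a patient positive if any antigen in the panel is positive. A minimal sketch of that arithmetic (the per-patient results below are fabricated for illustration, not the study's data):

    ```python
    def sensitivity(results):
        """Sensitivity = true positives / all diseased. Each entry holds
        per-antigen positivity for one confirmed TB patient; the panel
        calls a patient positive if any antigen is positive."""
        positives = sum(1 for r in results if any(r))
        return positives / len(results)

    # Hypothetical panel results per patient: (ESAT-6, Rv1985c).
    patients = [(True, True), (True, False), (False, True), (False, False)]
    esat6_only = sensitivity([(r[0],) for r in patients])  # 0.5
    combined = sensitivity(patients)                        # 0.75
    print(esat6_only, combined)
    ```

    Adding an antigen can only keep or raise panel sensitivity, which is why the combinations in the abstract all improve; specificity, in contrast, can only drop, which is why the BCG-vaccinated control group matters.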