
    Deep Graph Embedding for IoT Botnet Traffic Detection

    Botnet attacks, a long-standing cybersecurity problem, have historically targeted computers. With the rapid growth of Internet of Things (IoT) devices, an increasing number of botnet attacks now target IoT devices. Researchers have proposed several mechanisms to counter botnet attacks, such as identification by communication patterns or network topology and defence by DNS blacklisting. A popular current direction for botnet detection relies on the specific topological characteristics of botnets and uses machine learning models; however, it depends on network experts’ domain knowledge for feature engineering. Recently, neural networks have shown a capacity for representation learning. This paper proposes a new approach to extracting graph features via graph neural networks. To capture the particular topology of the botnet, we transform the network traffic into graphs and train a graph neural network to extract features. In our evaluations, we use graph embedding features to train six machine learning models and compare their performance with that of traditional graph features in identifying botnet nodes. The experimental results show that botnet traffic detection remains challenging even with neural networks, and that the impact of data, features, and algorithms must all be considered for an accurate and robust solution.
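    The core idea of the abstract, turning traffic into a graph and letting neighborhood aggregation produce node features, can be sketched as a minimal, untrained GraphSAGE-style propagation. Everything here (edge list, the two toy node features, the 0.5 self/neighbour mix) is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

def neighborhood_embedding(num_nodes, edges, feats, hops=2):
    """Mean-aggregation embedding: each hop averages a node's own
    feature vector with its neighbours' (a simplified, untrained
    GraphSAGE-style propagation -- illustrative only)."""
    adj = [[] for _ in range(num_nodes)]
    for u, v in edges:                    # undirected communication graph
        adj[u].append(v)
        adj[v].append(u)
    h = np.asarray(feats, dtype=float)
    for _ in range(hops):
        new_h = np.empty_like(h)
        for n in range(num_nodes):
            nbrs = adj[n]
            agg = h[nbrs].mean(axis=0) if nbrs else np.zeros(h.shape[1])
            new_h[n] = 0.5 * (h[n] + agg)  # combine self and neighbourhood
        h = new_h
    return h

# toy traffic graph: node features = [packets sent, distinct peers]
edges = [(0, 1), (1, 2), (2, 3)]
feats = [[10, 1], [50, 3], [40, 2], [5, 1]]
emb = neighborhood_embedding(4, edges, feats)
```

    After two hops, each embedding row mixes information from the node's two-hop neighbourhood; in the paper's setting, such embeddings (produced by a trained GNN) are what feed the six downstream classifiers.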

    Is Model Attention Aligned with Human Attention? An Empirical Study on Large Language Models for Code Generation

    Large Language Models (LLMs) have been demonstrated to be effective for code generation. Due to the complexity and opacity of LLMs, little is known about how these models generate code. To deepen our understanding, we investigate whether LLMs attend to the same parts of a natural language description as human programmers during code generation. An analysis of five LLMs on a popular benchmark, HumanEval, revealed a consistent misalignment between LLMs' and programmers' attention. Furthermore, we found no correlation between the code generation accuracy of LLMs and their alignment with human programmers. Through a quantitative experiment and a user study, we confirmed that, among twelve different attention computation methods, attention computed by the perturbation-based method is most aligned with human attention and is consistently favored by human programmers. Our findings highlight the need for human-aligned LLMs for better interpretability and programmer trust. Comment: 13 pages, 8 figures, 7 tables
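    The perturbation-based attention the study favors can be sketched generically: delete each prompt token, re-score the model's output, and treat the score drop as that token's importance. The keyword-counting `score_fn` below is a stand-in for a real model's likelihood, purely for illustration:

```python
def perturbation_attention(tokens, score_fn):
    """Attribute importance to each token by deleting it and measuring
    how much the model's score for the generated code drops."""
    base = score_fn(tokens)
    weights = []
    for i in range(len(tokens)):
        perturbed = tokens[:i] + tokens[i + 1:]  # prompt with token i removed
        weights.append(max(base - score_fn(perturbed), 0.0))
    total = sum(weights) or 1.0
    return [w / total for w in weights]          # normalised attention map

# stand-in scorer: counts prompt keywords a hypothetical model relies on
KEYWORDS = {"sort", "list", "ascending"}
score = lambda toks: sum(t in KEYWORDS for t in toks)

attn = perturbation_attention("sort the list in ascending order".split(), score)
```

    With a real LLM, `score_fn` would be the log-probability of the generated code under the model; the resulting map is what gets compared against human eye-tracking or annotation data.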

    Exploring the axion potential and axion walls in dense quark matter

    We study the potential of the Quantum Chromodynamics axion in hot and/or dense quark matter, within a Nambu-Jona-Lasinio-like model that includes the coupling of the axion to quarks. Unlike previous studies, we implement local electrical neutrality and β-equilibrium, which are relevant for the description of the quark matter in the core of compact stellar objects. First, we compute the effects of the chiral crossover on the axion mass and self-coupling. We find that the low-energy properties of the axion are very sensitive to the phase transition of Quantum Chromodynamics, in particular when the bulk quark matter is close to criticality. Then, for the first time in the literature, we compute the axion potential at finite quark chemical potential and study the axion domain walls in bulk quark matter. We find that the energy barrier between two adjacent vacuum states decreases in the chirally restored phase: this results in a lower surface tension of the walls. Finally, we comment on the possibility of production of walls in dense quark matter. Comment: 10 pages, 7 figures
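    The link between the potential barrier and the wall surface tension can be made explicit with the standard thin-wall estimate (using the vacuum cosine potential as a textbook stand-in, not the in-medium NJL potential computed in the paper):

```latex
\sigma \;=\; \int_{0}^{2\pi f_a} \mathrm{d}a \,\sqrt{2\,V(a)},
\qquad
V(a) = m_a^2 f_a^2 \left[ 1 - \cos\!\left(\frac{a}{f_a}\right) \right]
\;\;\Longrightarrow\;\;
\sigma = 8\, m_a f_a^2 .
```

    Since σ grows with the square root of the barrier height, the reduced in-medium barrier found in the chirally restored phase directly translates into the lower wall surface tension reported in the abstract.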

    Towards Consistent Video Editing with Text-to-Image Diffusion Models

    Existing works have advanced Text-to-Image (TTI) diffusion models for video editing in a one-shot learning manner. Despite their low data and computation requirements, these methods can produce results with unsatisfactory consistency with both the text prompt and the temporal sequence, limiting their real-world applications. In this paper, we propose to address these issues with a novel EI² model for Enhancing vIdeo Editing consIstency of TTI-based frameworks. Specifically, our analysis finds that the inconsistency problem is caused by the modules newly added to TTI models for learning temporal information: these modules lead to covariate shift in the feature space, which harms the editing capability. Thus, we design EI² to tackle these drawbacks with two classical modules: a Shift-restricted Temporal Attention Module (STAM) and a Fine-coarse Frame Attention Module (FFAM). First, through theoretical analysis, we demonstrate that covariate shift is closely related to Layer Normalization, so STAM replaces it with an Instance Centering layer to preserve the distribution of temporal features. In addition, STAM employs an attention layer with normalized mapping to transform temporal features while constraining the variance shift. As the second part, we combine STAM with a novel FFAM, which efficiently leverages fine-coarse spatial information of all frames to further enhance temporal consistency. Extensive experiments demonstrate the superiority of the proposed EI² model for text-driven video editing.
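    The key difference between Instance Centering and Layer Normalization is small enough to show numerically: centering subtracts the per-instance mean but, unlike LayerNorm, does not rescale, so the variance of the temporal features is preserved. A minimal sketch (toy data, not the paper's actual layer):

```python
import numpy as np

def instance_centering(x):
    """Subtract each instance's mean over the feature axis, but keep the
    variance untouched -- unlike LayerNorm, which also divides by the
    standard deviation and thereby changes the feature distribution."""
    return x - x.mean(axis=-1, keepdims=True)

x = np.array([[1.0, 3.0, 5.0],
              [10.0, 20.0, 60.0]])
centered = instance_centering(x)
```

    After centering, each row has zero mean but exactly its original standard deviation; it is this preserved spread that the abstract argues avoids the covariate shift LayerNorm would introduce.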

    DiffBFR: Bootstrapping Diffusion Model Towards Blind Face Restoration

    Blind face restoration (BFR) is important yet challenging. Prior works prefer GAN-based frameworks for this task owing to their balance of quality and efficiency. However, these methods suffer from poor stability and poor adaptability to long-tailed distributions, failing to simultaneously retain source identity and restore detail. We propose DiffBFR, which introduces the Diffusion Probabilistic Model (DPM) for BFR to tackle this problem, given its superiority over GANs in avoiding training collapse and generating long-tailed distributions. DiffBFR adopts a two-step design that first restores identity information from low-quality (LQ) images and then enhances texture details according to the distribution of real faces. This design is implemented with two key components: 1) an Identity Restoration Module (IRM) for preserving face details in the results. Instead of denoising from a pure Gaussian random distribution with LQ images as the condition during the reverse process, we propose a novel truncated sampling method that starts from LQ images with partial noise added. We theoretically prove that this change shrinks the evidence lower bound of the DPM and thus restores more of the original details. Building on this proof, two cascaded conditional DPMs with different input sizes are introduced to strengthen this sampling effect and reduce the difficulty of directly generating high-resolution images. 2) a Texture Enhancement Module (TEM) for polishing the texture of the image. Here an unconditional DPM, an LQ-free model, is introduced to further force the restorations to appear realistic. We theoretically prove that this unconditional DPM trained on pure HQ images helps to correct the distribution of inference images output from the IRM in pixel-level space. Truncated sampling with a fractional time step is utilized to polish pixel-level textures while preserving identity information.
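    The truncated sampling idea, noise the LQ image up to an intermediate step and run the reverse chain from there rather than from pure Gaussian noise, can be sketched with a toy linear schedule. The `denoise_step` below is a dummy stand-in for a trained denoiser, and all sizes and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def q_sample(x0, t, alphas_bar):
    """Forward diffusion to step t: x_t = sqrt(ab_t) x0 + sqrt(1 - ab_t) eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1 - alphas_bar[t]) * eps

def truncated_restore(lq, denoise_step, t_start, alphas_bar):
    """Truncated sampling: start the reverse chain from the partially
    noised LQ image at t_start < T instead of pure noise at T."""
    x = q_sample(lq, t_start, alphas_bar)        # LQ image with part noise added
    for t in range(t_start, -1, -1):             # reverse process from t_start
        x = denoise_step(x, t)
    return x

T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

# stand-in denoiser: nudges the sample toward a fixed "HQ" target
target = np.zeros((8, 8))
denoise_step = lambda x, t: x + 0.1 * (target - x)

lq = rng.standard_normal((8, 8))
restored = truncated_restore(lq, denoise_step, t_start=30, alphas_bar=alphas_bar)
```

    Because the chain starts from a lightly noised LQ image rather than pure noise, low-frequency identity content survives into the reverse process, which is the intuition behind the IRM's truncated sampling.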

    BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering

    Developing blind video deflickering (BVD) algorithms to enhance video temporal consistency is gaining importance amid the flourishing of image processing and video generation. However, the intricate nature of video data complicates the training of deep learning methods, leading to high resource consumption and instability, notably under severe lighting flicker. This underscores the critical need for a compact representation beyond pixel values to advance BVD research and applications. Inspired by classic scale-time equalization (STE), our work introduces a histogram-assisted solution, called BlazeBVD, for high-fidelity and rapid BVD. Compared with STE, which directly corrects pixel values by temporally smoothing color histograms, BlazeBVD leverages smoothed illumination histograms within STE filtering to ease the challenge of learning temporal data with neural networks. Technically, BlazeBVD begins by condensing pixel values into illumination histograms that precisely capture flickering and local exposure variations. These histograms are then smoothed to produce a set of singular frames, filtered illumination maps, and exposure maps. Using these deflickering priors, BlazeBVD employs a 2D network to restore faithful and consistent texture affected by lighting changes or localized exposure issues. BlazeBVD also incorporates a lightweight 3D network to amend slight temporal inconsistencies while avoiding high resource consumption. Comprehensive experiments on synthetic, real-world, and generated videos showcase the superior qualitative and quantitative results of BlazeBVD, which achieves inference speeds up to 10x faster than the state of the art.
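    The first stage, condensing frames into illumination histograms and temporally smoothing them, can be sketched with a simple moving average. The bin count, window size, and toy flickering clip are illustrative assumptions, not BlazeBVD's actual STE filter:

```python
import numpy as np

def smooth_illumination_histograms(frames, bins=16, win=3):
    """Condense each frame into an illumination histogram, then smooth the
    histogram sequence with a temporal moving average -- an STE-style
    prior computed before any network sees the video."""
    hists = np.stack([np.histogram(f, bins=bins, range=(0.0, 1.0),
                                   density=True)[0] for f in frames])
    smoothed = np.empty_like(hists)
    half = win // 2
    for t in range(len(frames)):
        lo, hi = max(0, t - half), min(len(frames), t + half + 1)
        smoothed[t] = hists[lo:hi].mean(axis=0)  # average over the window
    return smoothed

rng = np.random.default_rng(1)
# toy flickering clip: same content, middle frame suddenly brighter
frames = [np.clip(rng.random((32, 32)) * g, 0, 1) for g in (0.5, 0.9, 0.5)]
priors = smooth_illumination_histograms(frames)
```

    The smoothed histograms act as the compact, flicker-free target distribution; the 2D network then only has to map each frame toward its smoothed prior instead of learning temporal consistency from raw pixels.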

    Quarterly GDP forecast based on coupled economic and energy feature WA-LSTM model

    Existing macroeconomic forecasting methods primarily focus on the characteristics of economic data but overlook the energy-related features concealed behind these economic characteristics, which may lead to inaccurate GDP predictions. Therefore, this paper meticulously analyzes the relationship between energy big data and economic data indicators, explores the mining of coupled features from energy big data and economic data, and constructs features coupling economic and energy data. Targeting the nonlinear coupled features in China’s quarterly GDP data and using a long short-term memory (LSTM) neural network model based on deep learning, we employ wavelet analysis (WA) to decompose selected macroeconomic variables and construct a prediction model combining LSTM and WA, which is further compared with multiple benchmark models. The findings show that, for quarterly GDP prediction, the combination of deep learning and wavelet analysis significantly outperforms other methods. When processing structurally complex, nonlinear, multi-variable data, the combined LSTM-WA prediction model demonstrates better generalization, with its prediction accuracy generally surpassing that of the benchmark models.
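    The WA step can be illustrated with a one-level Haar decomposition, the simplest wavelet: it splits a quarterly series into a smooth trend band and a fluctuation band, each of which a WA-LSTM pipeline would forecast separately before recombining. The GDP numbers below are toy values, and the Haar filter is a stand-in for whichever wavelet the paper actually uses:

```python
import numpy as np

def haar_decompose(series):
    """One-level Haar wavelet split into an approximation (trend) band
    and a detail (fluctuation) band."""
    x = np.asarray(series, dtype=float)
    if len(x) % 2:                               # pad odd-length series
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)    # low-frequency trend
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)    # high-frequency shocks
    return approx, detail

gdp = [21.0, 23.5, 24.1, 27.8, 25.0, 29.3]      # toy quarterly GDP levels
approx, detail = haar_decompose(gdp)
```

    The transform is exactly invertible (the even and odd samples are recovered as (approx ± detail)/√2), so per-band LSTM forecasts can be summed back into a forecast of the original series without information loss.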

    Catalytic Isomerization of Olefins and Their Derivatives: A Brief Overview

    Carbon–carbon double bond (CCDB) isomerization is a method for synthesizing new organic compounds from olefins and their derivatives, based on C=C migration along the carbon chain and cis/trans transformation. It plays a vital role in organic synthesis, the synthesis of daily chemicals, crude oil processing, the synthesis of natural products, and related fields. This paper discusses in detail the advances since the 1960s in five types of catalytic methods for the CCDB isomerization of olefins and their derivatives; drawing on his recent work, the author mainly introduces the application and development of photocatalysis in the CCDB isomerization of olefins and their derivatives.

    Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis

    Recent works on implicit representations, such as Neural Radiance Fields (NeRF), have advanced the generation of realistic and animatable head avatars from video sequences. These implicit methods are still confronted by visual artifacts and jitters, since the lack of explicit geometric constraints poses a fundamental challenge to accurately modeling complex facial deformations. In this paper, we introduce Dynamic Tetrahedra (DynTet), a novel hybrid representation that encodes explicit dynamic meshes via neural networks to ensure geometric consistency across various motions and viewpoints. DynTet is parameterized by coordinate-based networks that learn signed distance, deformation, and material texture, anchoring the training data into a predefined tetrahedral grid. Leveraging Marching Tetrahedra, DynTet efficiently decodes textured meshes with a consistent topology, enabling fast rendering through a differentiable rasterizer and supervision via a pixel loss. To enhance training efficiency, we incorporate classical 3D Morphable Models to facilitate geometry learning and define a canonical space to simplify texture learning. Owing to the effective geometric representation employed in DynTet, these advantages are readily achievable. Compared with prior works, DynTet demonstrates significant improvements in fidelity, lip synchronization, and real-time performance across various metrics. Beyond producing stable and visually appealing synthesized videos, our method also outputs dynamic meshes, which are promising for enabling many emerging applications. Comment: CVPR 202
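    The mesh-decoding step the abstract names, Marching Tetrahedra over a learned signed distance field, hinges on one small operation: linearly interpolating along each tetrahedron edge whose endpoints have opposite SDF signs to place a surface vertex. A minimal sketch of that edge interpolation (the coordinates and SDF values are illustrative):

```python
def zero_crossing(p1, s1, p2, s2):
    """Linearly interpolate the surface point on a tetrahedron edge whose
    endpoints carry signed distances of opposite sign -- the core step
    Marching Tetrahedra uses to decode a mesh from an SDF."""
    t = s1 / (s1 - s2)                 # fraction along the edge where s = 0
    return tuple(a + t * (b - a) for a, b in zip(p1, p2))

# edge from the origin to (1, 0, 0): SDF is -0.25 inside, +0.75 outside
v = zero_crossing((0.0, 0.0, 0.0), -0.25, (1.0, 0.0, 0.0), 0.75)
```

    Because the interpolation is differentiable in the SDF values, gradients from a pixel loss can flow back through the extracted mesh into the coordinate-based networks, which is what makes this decoding usable inside DynTet's training loop.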