MoEC: Mixture of Expert Clusters
Sparse Mixture of Experts (MoE) has received great interest due to its
promising scaling capability with affordable computational overhead. MoE
converts dense layers into sparse experts, and utilizes a gated routing network
to make experts conditionally activated. However, as the number of experts
grows, MoE models with extremely large parameter counts suffer from overfitting
and sparse data allocation. These problems are especially severe on tasks with
limited data, hindering MoE models from improving performance by scaling up. In
this work, we propose Mixture of Expert Clusters (MoEC), a general approach to
enable expert layers to learn more diverse and appropriate knowledge by
imposing variance-based constraints on the routing stage. We further propose a
cluster-level expert dropout strategy specifically designed for the expert
cluster structure. Our experiments show that MoEC improves performance on
machine translation and natural language understanding tasks, and raises the
upper bound of performance when scaling up experts under limited data. We also
verify that MoEC helps mitigate overfitting and sparse data allocation.
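The gated routing described above can be sketched minimally as follows. This is an illustrative top-1 routing example under assumed shapes, not the paper's implementation; all names (`router_w`, `moe_layer`, the random experts) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, n_tokens = 8, 4, 5
tokens = rng.standard_normal((n_tokens, d_model))
router_w = rng.standard_normal((d_model, n_experts))  # gating network weights

# Each expert is a single dense matrix standing in for an FFN expert.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x):
    logits = x @ router_w                       # (n_tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)  # softmax gate
    chosen = probs.argmax(axis=-1)              # top-1 expert per token
    out = np.empty_like(x)
    for i, e in enumerate(chosen):
        # Only the chosen expert is activated, scaled by its gate value,
        # which is what makes the layer sparse/conditionally computed.
        out[i] = probs[i, e] * (x[i] @ experts[e])
    return out, chosen

y, assignment = moe_layer(tokens)
```

MoEC's variance-based constraint would act on the `probs` distribution at this routing stage, encouraging experts within a cluster to receive more diverse assignments.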
LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection
The increasing volume of log data produced by software-intensive systems
makes manual analysis impractical. Many deep learning-based methods
have been proposed for log-based anomaly detection. These methods face several
challenges such as high-dimensional and noisy log data, class imbalance,
generalization, and model interpretability. Recently, ChatGPT has shown
promising results in various domains. However, there is still a lack of study
on the application of ChatGPT to log-based anomaly detection. In this work, we
propose LogGPT, a log-based anomaly detection framework built on ChatGPT. By
leveraging ChatGPT's language interpretation capabilities, LogGPT aims to
explore the transferability of knowledge from large-scale corpora to log-based
anomaly detection. We conduct experiments to evaluate the performance of LogGPT
and compare it with three deep learning-based methods on the BGL and Spirit
datasets. LogGPT shows promising results and good interpretability. This study
provides preliminary insights into prompt-based models, such as ChatGPT, for
the log-based anomaly detection task.
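A prompt-based setup of this kind can be sketched as below. The template, labels, and helper names are illustrative assumptions, not LogGPT's actual prompt or parsing logic; the model call itself is omitted so the sketch stays self-contained:

```python
def build_prompt(log_lines):
    """Format raw log lines into a per-line anomaly classification prompt
    (hypothetical template, not the paper's)."""
    numbered = "\n".join(f"{i + 1}. {line}" for i, line in enumerate(log_lines))
    return (
        "You are a log analysis assistant. For each log line below, answer "
        "'normal' or 'anomalous', one answer per line.\n\n"
        f"Logs:\n{numbered}"
    )

def parse_response(text, n_lines):
    """Map the model's per-line answers to boolean anomaly flags."""
    answers = [a.strip().lower() for a in text.splitlines() if a.strip()]
    return [a.startswith("anomal") for a in answers[:n_lines]]

prompt = build_prompt([
    "instruction cache parity error corrected",
    "user authenticated successfully",
])
# The prompt would be sent to a chat model; here we parse a mock reply.
flags = parse_response("anomalous\nnormal", 2)
```

One appeal of this design, as the abstract notes, is interpretability: the model's free-text answer can be inspected directly rather than decoded from a learned embedding.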