612 research outputs found

    Tiresias: Online Anomaly Detection for Hierarchical Operational Network Data

    Operational network data, that is, management data such as customer care call logs and equipment system logs, is an important source of information for network operators to detect problems in their networks. Unfortunately, there is a lack of efficient tools to automatically track and detect anomalous events in operational data, causing ISP operators to rely on manual inspection of this data. While anomaly detection has been widely studied in the context of network data, operational data presents several new challenges, including the volatility and sparseness of the data and the need to perform fast detection (complicating the application of schemes that require offline processing or large/stable data sets to converge). To address these challenges, we propose Tiresias, an automated approach to locating anomalous events in hierarchical operational data. Tiresias leverages the hierarchical structure of operational data to identify high-impact aggregates (e.g., locations in the network, failure modes) likely to be associated with anomalous events. To accommodate different kinds of operational network data, Tiresias uses an online detection algorithm with low time and space complexity that preserves high detection accuracy. We present results from two case studies using operational data collected at a large commercial IP network operated by a Tier-1 ISP: customer care call logs and set-top box crash logs. By comparing with a reference set verified by the ISP's operational group, we validate that Tiresias can achieve >94% accuracy in locating anomalies. Tiresias also discovered several previously unknown anomalies in the ISP's customer care cases, demonstrating its effectiveness.

    Environmental Controls on Multi-Scale Dynamics of Net Carbon Dioxide Exchange From an Alpine Peatland on the Eastern Qinghai-Tibet Plateau

    Peatlands are characterized by their large carbon storage capacity and play an essential role in the global carbon cycle. However, the future of the carbon stored in peatland ecosystems under a changing climate remains unclear. In this study, based on the eddy covariance technique, we investigated the net ecosystem CO2 exchange (NEE) and its controlling factors at the Hongyuan peatland, which is part of the Ruoergai peatland on the eastern Qinghai-Tibet Plateau (QTP). Our results show that the Hongyuan alpine peatland was a CO2 sink, with an annual NEE of -226.61 and -185.35 g C m^-2 in 2014 and 2015, respectively. However, the non-growing season NEE was 53.35 and 75.08 g C m^-2 in 2014 and 2015, suggesting that non-growing-season carbon emissions should not be neglected. Clear diurnal variation in NEE was observed during the observation period, with the maximum CO2 uptake appearing at 12:30 (Beijing time, UTC+8). The Q10 value of the non-growing season in 2014 and 2015 was significantly higher than that of the growing season, which suggests that the CO2 flux in the non-growing season was more sensitive to warming than that in the growing season. We investigated the multi-scale temporal variations in NEE during the growing season using wavelet analysis. On daily timescales, photosynthetically active radiation was the primary driver of NEE. Seasonal variation in NEE was mainly driven by soil temperature. Annual variation in NEE was attributed mainly to the amount of precipitation, and an increasing number of precipitation events was associated with increasing annual carbon uptake. This study highlights the need for continuous eddy covariance measurements and time series analysis approaches to deepen our understanding of the temporal variability in NEE and the multi-scale correlation between NEE and environmental factors.
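
    The temperature sensitivity reported above can be illustrated with a small sketch. The abstract does not say how Q10 was estimated; one common convention, assumed here, fits an exponential model R = a * exp(b * T) to respiration against temperature and reports Q10 = exp(10 * b). The function name and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np

def q10_from_exponential_fit(temperature_c, respiration):
    """Estimate Q10 by fitting ln(R) = ln(a) + b * T with least squares.

    Assumes respiration values are strictly positive (hypothetical helper,
    not the authors' procedure).
    """
    t = np.asarray(temperature_c, dtype=float)
    r = np.asarray(respiration, dtype=float)
    # polyfit with degree 1 returns [slope, intercept]
    b, _ln_a = np.polyfit(t, np.log(r), 1)
    # Q10 is the factor by which R changes for a 10 degree C increase
    return float(np.exp(10.0 * b))

# Synthetic data in which the flux doubles for every 10 degrees C
temps = np.linspace(-10.0, 15.0, 26)
flux = 0.5 * np.exp(np.log(2.0) / 10.0 * temps)
q10 = q10_from_exponential_fit(temps, flux)
```

    Under this convention, a higher Q10 in one season means the fitted flux rises more steeply with temperature in that season.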

    Improving Multi-Task Generalization via Regularizing Spurious Correlation

    Multi-Task Learning (MTL) is a powerful learning paradigm that improves generalization performance via knowledge sharing. However, existing studies find that MTL can sometimes hurt generalization, especially when two tasks are less correlated. One possible cause is spurious correlation, i.e., some knowledge is spurious and not causally related to task labels, but the model could mistakenly utilize it and thus fail when such correlation changes. In the MTL setup, spurious correlation poses several unique challenges. First, the risk of having non-causal knowledge is higher, as the shared MTL model needs to encode all knowledge from different tasks, and causal knowledge for one task could be spurious for another. Second, the confounder between task labels brings a different type of spurious correlation to MTL. We theoretically prove that MTL is more prone to taking non-causal knowledge from other tasks than single-task learning, and thus generalizes worse. To solve this problem, we propose the Multi-Task Causal Representation Learning framework, which aims to represent multi-task knowledge via disentangled neural modules and to learn which module is causally related to each task via MTL-specific invariant regularization. Experiments show that it can enhance an MTL model's performance by 5.5% on average over Multi-MNIST, MovieLens, Taskonomy, CityScape, and NYUv2 by alleviating the spurious correlation problem. Comment: Published on NeurIPS 202

    Decoupled Contrastive Learning

    Contrastive learning (CL) is one of the most successful paradigms for self-supervised learning (SSL). In a principled way, it considers two augmented "views" of the same image as positive to be pulled closer, and all other images as negative to be pushed further apart. However, behind the impressive success of CL-based techniques, their formulation often relies on heavy-computation settings, including large sample batches, extensive training epochs, etc. We are thus motivated to tackle these issues and establish a simple, efficient, yet competitive baseline of contrastive learning. Specifically, we identify, from theoretical and empirical studies, a noticeable negative-positive-coupling (NPC) effect in the widely used InfoNCE loss, leading to poor learning efficiency with respect to the batch size. To remove the NPC effect, we propose the decoupled contrastive learning (DCL) loss, which removes the positive term from the denominator and significantly improves the learning efficiency. DCL achieves competitive performance with less sensitivity to sub-optimal hyperparameters, requiring neither the large batches of SimCLR, the momentum encoding of MoCo, nor a large number of epochs. We demonstrate this on various benchmarks, where DCL also proves robust to suboptimal hyperparameters. Notably, SimCLR with DCL achieves 68.2% ImageNet-1K top-1 accuracy using batch size 256 within 200 epochs of pre-training, outperforming its SimCLR baseline by 6.4%. Further, DCL can be combined with the SOTA contrastive learning method, NNCLR, to achieve 72.3% ImageNet-1K top-1 accuracy with batch size 512 in 400 epochs, which represents a new SOTA in contrastive learning. We believe DCL provides a valuable baseline for future contrastive SSL studies. Comment: Accepted by ECCV202
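
    The difference between the two losses can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a SimCLR-style batch of two views, cosine similarity, and a temperature of 0.1, and it omits any extra weighting the paper may apply. The function and variable names are hypothetical.

```python
import numpy as np

def _logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis; -inf entries contribute 0
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def contrastive_losses(z1, z2, temperature=0.1):
    """Return (infonce, dcl) mean losses for two views of shape (N, D), N >= 2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    z = np.concatenate([z1, z2], axis=0)        # (2N, D)
    n = z1.shape[0]
    sim = z @ z.T / temperature                 # scaled cosine similarities
    idx = np.arange(2 * n)
    pos_idx = (idx + n) % (2 * n)               # index of each sample's other view
    pos = sim[idx, pos_idx]
    self_mask = np.eye(2 * n, dtype=bool)
    pos_mask = np.zeros_like(self_mask)
    pos_mask[idx, pos_idx] = True
    # InfoNCE: the denominator sums over every other sample, positive included
    infonce_logits = np.where(self_mask, -np.inf, sim)
    # DCL: the positive term is also dropped from the denominator
    dcl_logits = np.where(self_mask | pos_mask, -np.inf, sim)
    infonce = np.mean(-pos + _logsumexp(infonce_logits, axis=1))
    dcl = np.mean(-pos + _logsumexp(dcl_logits, axis=1))
    return infonce, dcl

rng = np.random.default_rng(0)
view1 = rng.normal(size=(8, 16))
view2 = view1 + 0.1 * rng.normal(size=(8, 16))  # a noisy second "view"
infonce, dcl = contrastive_losses(view1, view2)
```

    Because the DCL denominator is the InfoNCE denominator minus one positive term, the DCL value is always strictly smaller for the same batch; the only change is which terms enter the log-sum-exp.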

    Empowering Long-tail Item Recommendation through Cross Decoupling Network (CDN)

    Industry recommender systems usually suffer from highly skewed long-tail item distributions, where a small fraction of the items receives most of the user feedback. This skew hurts recommender quality, especially for the item slices without much user feedback. While many research advances have been made in academia, deploying these methods in production is very difficult, and very few improvements have been made in industry. One challenge is that these methods often hurt overall performance; additionally, they can be complex and expensive to train and serve. In this work, we aim to improve tail item recommendations while maintaining the overall performance with less training and serving cost. We first find that the predictions of user preferences are biased under long-tail distributions. The bias comes from the differences between training and serving data in two respects: 1) the item distributions, and 2) the user's preference given an item. Most existing methods mainly attempt to reduce the bias from the item distribution perspective, ignoring the discrepancy in user preference given an item. This leads to a severe forgetting issue and results in sub-optimal performance. To address the problem, we design a novel Cross Decoupling Network (CDN) that (i) decouples the learning process of memorization and generalization on the item side through a mixture-of-expert architecture, and (ii) decouples the user samples from different distributions through a regularized bilateral branch network. Finally, a new adapter is introduced to aggregate the decoupled vectors and softly shift the training attention to tail items. Extensive experimental results show that CDN significantly outperforms state-of-the-art approaches on benchmark datasets. We also demonstrate its effectiveness through a case study of CDN in a large-scale recommendation system at Google. Comment: Accepted by KDD 2023 Applied Data Science (ADS) trac

    Juvenile Dermatomyositis: A 20-year Retrospective Analysis of Treatment and Clinical Outcomes

    Background: Juvenile dermatomyositis is a rare childhood multisystem autoimmune disease involving primarily the skin and muscles, and it may lead to long-term disability. This study aimed to describe the clinical course of juvenile dermatomyositis and determine if any early clinical or laboratory features could predict outcome. Methods: Medical charts of patients aged ≤18 years and diagnosed with juvenile dermatomyositis (according to the criteria of Bohan and Peter) at the Pediatric Department, National Taiwan University Hospital, between 1989 and 2009 were reviewed. The endpoints for disease assessment were complete clinical response and complete clinical remission. Cox's proportional hazards model was fitted to identify important predictors of complete clinical remission. Results: A total of 39 patients with juvenile dermatomyositis were reviewed. Two-thirds were females, and the mean age at disease onset was 81.97 ± 46.63 months. The most common initial presentations were Gottron's papule (82.1%) and muscle weakness (82.1%). After excluding one patient with an incomplete record, the remaining 31 patients who had muscle weakness were analyzed; among them, 22 (70.97%) achieved complete clinical response, but only six (19.4%) achieved complete clinical remission. Multivariate analysis showed that female sex, negative Gowers' sign at disease onset, and positive photosensitivity at disease onset were favorable factors to achieve complete clinical remission. Moreover, covariate-adjusted survival curves were drawn for making predictions of complete clinical remission. Only 13 (33.33%) patients were symptom free at the end of follow up, whereas the other 26 suffered from different kinds of complications. None of them developed malignancy, but two (5.13%) patients died during the follow-up period. Conclusion: Factors such as male sex and Gowers' sign were unlikely to favor the achievement of complete clinical remission in juvenile dermatomyositis. Certain complications cannot be avoided, and thus more effective treatments and monitoring strategies are needed for better control of juvenile dermatomyositis.