
    Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data

    Recently, as the amount of data in society has increased, data stream mining targeting large-scale data has attracted attention. Data mining is a technology for discovering new knowledge and patterns from massive amounts of data; data stream mining applies it to data that arrive as a stream. In this paper, we propose feature selection with an online decision tree. First, we construct an online decision tree, treating credit card transaction data as a data stream. Second, we select the attributes thought to be important for detecting illegal use. We apply the VFDT (Very Fast Decision Tree learner) algorithm to construct the online decision tree.
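The abstract does not spell out how VFDT decides when to split a leaf. As generally described, VFDT uses the Hoeffding bound to judge whether enough stream examples have been seen to trust the best attribute over the runner-up. The following is a minimal sketch of that test only, not the paper's implementation; the function names, the gain values, and the choice of `delta` are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon for a variable with the given range,
    after n observations, holding with probability 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Hypothetical helper: split when the observed gap between the two
    best attributes exceeds the Hoeffding bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Assumed values for illustration: gains in [0, 1], 1000 examples at the leaf.
    print(hoeffding_bound(1.0, 1e-7, 1000))
    print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
    print(should_split(0.30, 0.25, 1.0, 1e-7, 1000))
```

Because the bound shrinks as `n` grows, a leaf with a small gap between its top two attributes simply waits for more examples before splitting, which is what lets the tree be built in a single pass over the stream.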

    Hyper-IgE syndrome, 2021 update

    Clinically and pathologically, patients with hyper-IgE syndrome present skin manifestations similar to those of common atopic dermatitis. The original hyper-IgE syndrome is characterized by a diminished inflammatory response, in combination with Staphylococcus aureus skin abscesses and pneumonia followed by pneumatocele formation. These immunological manifestations are frequently associated with skeletal and connective tissue abnormalities. We previously identified dominant negative variants in STAT3 as the major causal variants of hyper-IgE syndrome. Since then, new causative variants have been identified for disorders similar to the original hyper-IgE syndrome, as have causative variants for new types of hyper-IgE syndrome that center only on atopy, high serum IgE levels, and susceptibility to infection, and are not associated with a diminished inflammatory response, pneumatocele formation, or connective tissue manifestations. A recent discovery identified a novel zinc finger protein that regulates STAT3 transcription. Investigation of IL6ST variants disclosed that the IL6ST/IL6R cytokine receptor plays a crucial role in signal transduction upstream of STAT3 in the pathogenesis of the original hyper-IgE syndrome. Even though the same IL6ST variants are used for the signal transduction of IL-6 family cytokines, the signaling defect is more severe for IL-6/IL-11 and milder for LIF. The fact that the non-immune manifestations of gain-of-function mutations of TGFBR1 and TGFBR2 are similar to those of dominant negative mutations of STAT3 provides a clue to the molecular mechanisms of the non-immune manifestations of hyper-IgE syndrome. Research on this hereditary atopic syndrome is being actively conducted to elucidate the molecular mechanisms and to develop new therapeutic approaches.

    Grokking Tickets: Lottery Tickets Accelerate Grokking

    Grokking is one of the most surprising puzzles in neural network generalization: a network first reaches a memorization solution with perfect training accuracy and poor generalization, but with further training, it reaches a perfectly generalized solution. We aim to analyze the mechanism of grokking through the lottery ticket hypothesis, identifying the process of finding lottery tickets (good sparse subnetworks) as the key to describing the transitional phase between memorization and generalization. We refer to these subnetworks as "Grokking tickets", which are identified via magnitude pruning after perfect generalization. First, using Grokking tickets, we show that lottery tickets drastically accelerate grokking compared to dense networks on various configurations (MLP and Transformer, on arithmetic and image classification tasks). Additionally, to verify that Grokking tickets are a more critical factor than weight norms, we compared the "good" subnetworks with a dense network having the same L1 and L2 norms. Results show that the subnetworks generalize faster than the controlled dense model. In further investigations, we discovered that at an appropriate pruning rate, grokking can be achieved even without weight decay. We also show that the speedup does not happen when using tickets identified at the memorization solution or during the transition between memorization and generalization, or when pruning networks at initialization (random pruning, GraSP, SNIP, and SynFlow). The results indicate that the weight norm of network parameters is not enough to explain the process of grokking, and that finding good subnetworks is important for describing the transition from memorization to generalization. The implementation code is available at https://github.com/gouki510/Grokking-Tickets
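The abstract states that the tickets are identified via magnitude pruning after perfect generalization. A minimal sketch of that general technique on a flat list of weights is given below, as an illustration rather than the authors' code (which is at the repository linked above). The function names, the example weights, and the step of applying the mask to a separate set of weights are assumptions made here.

```python
def magnitude_prune_mask(weights, pruning_rate):
    """Return a 0/1 mask that drops the given fraction of weights
    with the smallest absolute value (global magnitude pruning)."""
    n_prune = int(len(weights) * pruning_rate)
    # Indices sorted from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask


def apply_mask(weights, mask):
    """Zero out the pruned positions of a weight vector."""
    return [w if m else 0.0 for w, m in zip(weights, mask)]


if __name__ == "__main__":
    # Hypothetical weights of a network that has already generalized.
    generalized = [0.5, -0.1, 0.03, -0.9]
    mask = magnitude_prune_mask(generalized, pruning_rate=0.5)
    # Assumption: the mask is then applied to another set of weights
    # before training the sparse subnetwork.
    other_weights = [0.2, 0.4, -0.6, 0.8]
    print(mask)
    print(apply_mask(other_weights, mask))
```

The mask depends only on which weights are large in the generalized network, so the same mask can be reused on weights with any norm. That separation of structure from norm is the distinction the abstract draws.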