139 research outputs found

    NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization

    We study the problem of large-scale network embedding, which aims to learn latent representations for network mining applications. Previous research shows that 1) popular network embedding benchmarks, such as DeepWalk, are in essence implicitly factorizing a matrix with a closed form, and 2) the explicit factorization of such a matrix generates more powerful embeddings than existing methods. However, directly constructing and factorizing this matrix, which is dense, is prohibitively expensive in terms of both time and space, making it not scalable for large networks. In this work, we present the algorithm of large-scale network embedding as sparse matrix factorization (NetSMF). NetSMF leverages theories from spectral sparsification to efficiently sparsify the aforementioned dense matrix, enabling significantly improved efficiency in embedding learning. The sparsified matrix is spectrally close to the original dense one with a theoretically bounded approximation error, which helps maintain the representation power of the learned embeddings. We conduct experiments on networks of various scales and types. Results show that among both popular benchmarks and factorization-based methods, NetSMF is the only method that achieves both high efficiency and effectiveness. We show that NetSMF requires only 24 hours to generate effective embeddings for a large-scale academic collaboration network with tens of millions of nodes, while it would cost DeepWalk months and is computationally infeasible for the dense matrix factorization solution. The source code of NetSMF is publicly available (https://github.com/xptree/NetSMF). Comment: 11 pages, in Proceedings of the Web Conference 2019 (WWW 19).

    Potential of Core-Collapse Supernova Neutrino Detection at JUNO

    JUNO is an underground neutrino observatory under construction in Jiangmen, China. It uses 20 kton of liquid scintillator as its target, which enables it to detect supernova burst neutrinos with large statistics from the next galactic core-collapse supernova (CCSN), as well as pre-supernova neutrinos from nearby CCSN progenitors. All flavors of supernova burst neutrinos can be detected by JUNO via several interaction channels, including inverse beta decay, elastic scattering on electrons and protons, and interactions on 12C nuclei. This gives JUNO the possibility of reconstructing the energy spectra of supernova burst neutrinos of all flavors. Real-time monitoring systems based on FPGA and DAQ are under development in JUNO; they allow prompt alerts and trigger-less data acquisition of CCSN events. The alert performance of both monitoring systems has been thoroughly studied using simulations. Moreover, once a CCSN is tagged, the system can give fast characterizations, such as directionality and light curve.

    Detection of the Diffuse Supernova Neutrino Background with JUNO

    As an underground multi-purpose neutrino detector with 20 kton of liquid scintillator, the Jiangmen Underground Neutrino Observatory (JUNO) is competitive with and complementary to the water-Cherenkov detectors in the search for the diffuse supernova neutrino background (DSNB). Typical supernova models predict 2-4 events per year within the optimal observation window in the JUNO detector. The dominant background is from the neutral-current (NC) interaction of atmospheric neutrinos with 12C nuclei, which surpasses the DSNB by more than one order of magnitude. We evaluated the systematic uncertainty of the NC background from the spread of a variety of data-driven models and further developed a method to determine the NC background within 15% with in situ measurements after ten years of running. In addition, the NC-like backgrounds can be effectively suppressed by the intrinsic pulse-shape discrimination (PSD) capabilities of liquid scintillators. In this talk, I will present in detail the improvements on NC background uncertainty evaluation, PSD discriminator development, and finally, the potential of DSNB sensitivity in JUNO.

    Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

    A core-collapse supernova (CCSN) is one of the most energetic astrophysical events in the Universe. The early and prompt detection of neutrinos before (pre-SN) and during the SN burst is a unique opportunity to realize the multi-messenger observation of CCSN events. In this work, we describe the monitoring concept and present the sensitivity of the system to the pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), which is a 20 kton liquid scintillator detector under construction in South China. The real-time monitoring system is designed with both prompt monitors on the electronic board and online monitors at the data acquisition stage, in order to ensure both the alert speed and the alert coverage of progenitor stars. Assuming a false alert rate of 1 per year, this monitoring system can be sensitive to pre-SN neutrinos up to a distance of about 1.6 (0.9) kpc and to SN neutrinos up to about 370 (360) kpc for a progenitor mass of 30 $M_{\odot}$ in the case of normal (inverted) mass ordering. The pointing ability for the CCSN is evaluated using the accumulated event anisotropy of the inverse beta decay interactions from pre-SN or SN neutrinos, which, along with the early alert, can play an important role in the follow-up multi-messenger observations of the next Galactic or nearby extragalactic CCSN. Comment: 24 pages, 9 figures.

    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell. This allows direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.

    Working Set Selection Using Second Order Information for Training Support Vector Machines

    Working set selection is an important step in decomposition methods for training support vector machines (SVMs). This paper develops a new technique for working set selection in SMO-type decomposition methods. It uses second order information to achieve fast convergence. Theoretical properties such as linear convergence are established. Experiments demonstrate that the proposed method is faster than existing selection methods using first order information.

    A Study on Threshold Selection for Multi-label Classification

    Evaluation Criteria for Multi-label Classification

    Multi-label classification has become increasingly popular in recent years. It is used in, for example, text categorization and multimedia retrieval systems. Many evaluation criteria have been proposed to meet different application needs. A commonly used approach for multi-label classification is the binary method, which constructs a decision function per label. For some applications, adjusting the thresholds in the decision functions improves performance. This thesis gives a comprehensive study of threshold selection. Experiments on several real-world data sets demonstrate the usefulness of some simple selection strategies.

    Contents:
    Oral Examination Committee Approval; Chinese Abstract; Abstract; List of Tables
    I. Introduction
    II. Binary Method and Evaluation Measures
        2.1 The Binary Method
        2.2 Evaluation Criteria
            2.2.1 Exact Match Ratio
            2.2.2 Macro-average and Micro-average F-measure
            2.2.3 Ranking Based Measures
        2.3 Issues on Optimizing Different Measures
    III. Optimize Measures via Supervised Threshold Setting
        3.1 Supervised Threshold Setting in Binary Method
            3.1.1 The SVM.1-type Methods
        3.2 Real-World Data Sets
            3.2.1 Yahoo!
            3.2.2 scene
            3.2.3 yeast
            3.2.4 OHSUMED
            3.2.5 RCV1-V2
        3.3 Experiments
            3.3.1 Experimental Settings
            3.3.2 Optimizing Macro-average F-measure
            3.3.3 Optimizing Micro-average F-measure
            3.3.4 Optimizing Exact Match Ratio
            3.3.5 Discussion and Conclusion
    IV. Conclusions
    Bibliography
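The threshold-adjustment idea in the abstract above (one decision function per label, with a tuned cut-off) can be illustrated with a minimal sketch. This is not the thesis's own procedure: the function names `f1_score`, `best_threshold`, and `macro_f1`, the candidate-threshold scheme, and the toy scores are all assumptions made for illustration. In practice the thresholds would presumably be tuned on held-out or cross-validation data rather than on the data used for evaluation.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure for one binary label; returns 0.0 when it is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores: Sequence[float], y_true: Sequence[int]) -> float:
    """Pick the cut-off on decision values that maximizes F1 for one label.

    Candidates (an illustrative choice): one value below all scores,
    the midpoints between consecutive distinct scores, and one value
    above all scores. An instance is predicted positive if score > threshold.
    """
    distinct = sorted(set(scores))
    candidates = [distinct[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    candidates.append(distinct[-1] + 1.0)

    best_t, best_f = candidates[0], -1.0
    for t in candidates:
        pred = [1 if s > t else 0 for s in scores]
        f = f1_score(y_true, pred)
        if f > best_f:  # keep the first candidate reaching the best F1
            best_t, best_f = t, f
    return best_t


def macro_f1(score_rows: List[Sequence[float]],
             label_rows: List[Sequence[int]],
             thresholds: Sequence[float]) -> float:
    """Average the per-label F1 values (macro-average F-measure)."""
    per_label = []
    for scores, y_true, t in zip(score_rows, label_rows, thresholds):
        pred = [1 if s > t else 0 for s in scores]
        per_label.append(f1_score(y_true, pred))
    return sum(per_label) / len(per_label)


# Toy data (hypothetical): one row of decision values per label.
scores = [[-1.2, -0.4, 0.3, 0.9], [0.5, 0.2, -0.1, -0.7]]
labels = [[0, 1, 1, 1], [1, 0, 0, 0]]

default_thresholds = [0.0, 0.0]  # the usual sign-of-decision-value rule
tuned_thresholds = [best_threshold(s, y) for s, y in zip(scores, labels)]

print("macro-F1 at threshold 0:", macro_f1(scores, labels, default_thresholds))
print("macro-F1 after tuning:  ", macro_f1(scores, labels, tuned_thresholds))
```

Tuning each label separately, as here, targets the macro-average F-measure directly because that measure is a plain average of per-label values. Micro-average F-measure and exact match ratio couple the labels together, so a per-label search like this one is only a heuristic for them.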