54 research outputs found

    A Splicing Approach to Best Subset of Groups Selection

    Best subset of groups selection (BSGS) is the process of selecting a small number of non-overlapping groups to achieve the best interpretability of the response variable. It has attracted increasing attention and has far-reaching applications in practice. However, due to the computational intractability of BSGS in high-dimensional settings, developing efficient algorithms for solving BSGS remains a research hotspot. In this paper, we propose a group-splicing algorithm that iteratively detects the relevant groups and excludes the irrelevant ones. Moreover, coupled with a novel group information criterion, we develop an adaptive algorithm to determine the optimal model size. Under mild conditions, it can be certified that our algorithm identifies the optimal subset of groups in polynomial time with high probability. Finally, we demonstrate the efficiency and accuracy of our methods by comparing them with several state-of-the-art algorithms on both synthetic and real-world datasets. Comment: 49 pages, 7 figures

    A SIMPLE Approach to Provably Reconstruct Ising Model with Global Optimality

    Reconstructing the interaction network between random events is a critical problem arising in fields from statistical physics and politics to sociology, biology, psychology, and beyond. The Ising model lays the foundation for this reconstruction process, but finding the underlying Ising model from the fewest observed samples in a computationally efficient manner has been challenging for half a century. Using the idea of sparsity learning, we present an approach named SIMPLE whose sample complexity is dominant relative to the theoretical limit. Furthermore, a tuning-free algorithm is developed to give a statistically consistent solution of SIMPLE in polynomial time with high probability. On extensive benchmark cases, the SIMPLE approach provably reconstructs the underlying Ising models with global optimality. An application to U.S. senators' voting in the last six congresses reveals that both Republicans and Democrats noticeably assemble in each congress; interestingly, the assembling of Democrats is particularly pronounced in the latest congress.

    Sciences for The 2.5-meter Wide Field Survey Telescope (WFST)

    The Wide Field Survey Telescope (WFST) is a dedicated photometric survey facility being constructed jointly by the University of Science and Technology of China and Purple Mountain Observatory. It is equipped with a primary mirror 2.5 m in diameter, an active optical system, and a mosaic CCD camera of 0.73 Gpix on the main focal plane to achieve high-quality imaging over a field of view of 6.5 square degrees. The installation of WFST at the Lenghu observing site is planned for the summer of 2023, and operation is scheduled to commence within three months afterward. WFST will scan the northern sky in four optical bands (u, g, r, and i) at cadences from hourly/daily to semi-weekly in the deep high-cadence survey (DHS) and the wide field survey (WFS) programs, respectively. WFS reaches a depth of 22.27, 23.32, 22.84, and 22.31 in AB magnitudes in a nominal 30-second exposure in the four bands during a photometric night, respectively, enabling us to search for a tremendous number of transients in the low-z universe and to systematically investigate the variability of Galactic and extragalactic objects. Intranight 90 s exposures as deep as 23 and 24 mag in the u and g bands via DHS provide a unique opportunity to explore energetic transients that demand high sensitivity, including the electromagnetic counterparts of gravitational-wave events detected by the second/third-generation GW detectors, supernovae within a few hours of their explosions, tidal disruption events, and luminous fast optical transients even beyond a redshift of 1. Meanwhile, the final 6-year co-added images, anticipated to reach g about 25.5 mag in WFS or even deeper by 1.5 mag in DHS, will be of significant value to general Galactic and extragalactic sciences. The highly uniform legacy surveys of WFST will also serve as an indispensable complement to those of LSST, which monitors the southern sky. Comment: 46 pages, submitted to SCMP

    BTOB: Extending the Biased GWAS to Bivariate GWAS

    DOI: 10.3389/fgene.2021.654821; Frontiers in Genetics; 1265482

    The effects of the E3 ubiquitin–protein ligase UBR7 of Frankliniella occidentalis on the ability of insects to acquire and transmit TSWV

    The interactions between plant viruses and insect vectors are very complex. In recent years, RNA sequencing data have been used to elucidate critical genes of Tomato spotted wilt orthotospovirus (TSWV) and Frankliniella occidentalis (F. occidentalis). However, very little is known about the essential genes involved in thrips acquisition and transmission of TSWV. Based on transcriptome data of F. occidentalis infected with TSWV, we verified the complete sequence of the E3 ubiquitin-protein ligase UBR7 gene (UBR7), which is closely related to virus transmission. Additionally, we found that UBR7 belongs to the E3 ubiquitin-protein ligase family and is highly expressed in adulthood in F. occidentalis. UBR7 could interfere with virus replication and thus affect the transmission efficiency of F. occidentalis. With low UBR7 expression, TSWV transmission efficiency decreased, while TSWV acquisition efficiency was unaffected. Moreover, the direct interaction between UBR7 and the nucleocapsid (N) protein of TSWV was investigated through surface plasmon resonance and GST pull-down. In conclusion, we found that UBR7 is a crucial protein for TSWV transmission by F. occidentalis, as it directly interacts with TSWV N. This study provides a new direction for developing green pesticides targeting E3 ubiquitin to control TSWV and F. occidentalis.

    Blockchain-based incentives for secure and collaborative data sharing in multiple clouds

    The prosperity of cloud computing has driven an increasing number of enterprises and organizations to store their data on private or public cloud platforms. Because individual data owners are limited in data volume and diversity, data sharing across different cloud platforms would enable third parties to apply big data analysis techniques to provide value-added services, such as providing healthcare services for customers by gathering medical data from multiple hospitals. However, it remains a challenging task to design effective incentives that encourage secure and collaborative data sharing in multiple clouds. In this paper, we propose a reliable collaboration model consisting of three types of participants (data owners, miners, and third parties), where the data is shared via blockchain and recorded by a smart contract. In general, these participants may acquire and store the shared data using their private or public clouds. We analyze the topological relationships between the participants and develop several Shapley value models, from simple to complex, for the process of revenue distribution. We also discuss the incentive effect of sharing security data and the rationality of the designed solution through an analysis of the distribution rules. This work is partially supported by the Beijing Natural Science Foundation under Grant 4192050, and in part by the National Natural Science Foundation of China under Grants 61972039 and 61872041.
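
    The revenue distribution idea in the abstract above rests on the Shapley value, which pays each participant its average marginal contribution over all joining orders. The following is a minimal sketch of that general definition, not the paper's models: the three player names mirror the participant types, but the revenue table and the function names (`shapley_values`, `revenue`) are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import factorial

# Hypothetical coalition revenues (illustrative numbers, not from the paper):
# data is only monetized when the owner and a third party cooperate, and a
# miner recording the exchange raises the total further.
REVENUE = {
    frozenset(): 0,
    frozenset({"owner"}): 0,
    frozenset({"miner"}): 0,
    frozenset({"third_party"}): 0,
    frozenset({"owner", "miner"}): 0,
    frozenset({"miner", "third_party"}): 0,
    frozenset({"owner", "third_party"}): 60,
    frozenset({"owner", "miner", "third_party"}): 90,
}


def revenue(coalition):
    """Characteristic function: total revenue a coalition can earn."""
    return REVENUE[frozenset(coalition)]


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += value(joined) - value(coalition)
            coalition = joined
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


shares = shapley_values(["owner", "miner", "third_party"], revenue)
print(shares)
```

    In this toy game the shares add up to the revenue of the full coalition, which is the efficiency property that makes the Shapley value a natural candidate for splitting revenue. Enumerating every ordering is only practical for a handful of participants.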

    abess: A Fast Best Subset Selection Library in Python and R

    We introduce a new library named abess that implements a unified framework of best-subset selection for solving diverse machine learning problems, e.g., linear regression, classification, and principal component analysis. In particular, abess certifiably attains the optimal solution in polynomial time with high probability under the linear model. Our efficient implementation allows abess to solve best-subset selection problems as fast as, or even 20x faster than, existing competing variable (model) selection toolboxes. Furthermore, it supports common variants such as best group subset selection and $\ell_2$-regularized best-subset selection. The core of the library is programmed in C++. For ease of use, a Python library is designed for convenient integration with scikit-learn, and it can be installed from the Python Package Index. In addition, a user-friendly R library is available at the Comprehensive R Archive Network. The source code is available at: https://github.com/abess-team/abess
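
    To make the best-subset selection problem concrete, here is a minimal brute-force sketch in pure Python. It is not the abess algorithm and does not use the abess API: it enumerates every subset of a fixed size and keeps the one with the smallest residual sum of squares, which is exactly the exponential search that polynomial-time methods aim to avoid. The helper names and the toy data are assumptions made for illustration, and the solver assumes the selected columns are linearly independent.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss_for_subset(X, y, cols):
    """Least-squares fit on the chosen columns (no intercept); return the RSS."""
    gram = [[sum(row[i] * row[j] for row in X) for j in cols] for i in cols]
    rhs = [sum(row[i] * yi for row, yi in zip(X, y)) for i in cols]
    coef = solve_linear(gram, rhs)  # normal equations
    return sum(
        (yi - sum(c * row[j] for c, j in zip(coef, cols))) ** 2
        for row, yi in zip(X, y)
    )


def best_subset(X, y, k):
    """Exhaustively search all size-k column subsets for the smallest RSS."""
    p = len(X[0])
    return min(combinations(range(p), k), key=lambda cols: rss_for_subset(X, y, cols))


# Toy data (hypothetical): the response depends only on features 0 and 2.
X = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
]
y = [2 * row[0] - 3 * row[2] for row in X]

selected = best_subset(X, y, k=2)
print(selected, rss_for_subset(X, y, selected))
```

    The number of candidate subsets grows combinatorially with the number of features, so this enumeration is only feasible for very small problems.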

    Distributed vibration sensor with a high strain dynamic range by harmonics analysis

    Distributed vibration sensors (DVSs) have important applications in industrial production. A large strain dynamic range is very important for DVSs but difficult to achieve, as it often requires a complex sensing system or cumbersome data processing. This study shows that the strain dynamic range of a DVS can be improved by analyzing the harmonic numbers in the vibration response spectrum, and that the vibration amplitude can be quantitatively measured with a strain resolution of 0.78 με for each additional harmonic. The system and data analysis method can significantly improve the strain dynamic range in comparison with traditional DVSs based on intensity or polarization information.