813 research outputs found

    Faster Random Walks By Rewiring Online Social Networks On-The-Fly

    Many online social networks feature restrictive web interfaces which only allow the query of a user's local neighborhood. To enable analytics over such an online social network through its restrictive web interface, many recent efforts reuse existing Markov Chain Monte Carlo methods such as random walks to sample the social network and support analytics based on the samples. The problem with such an approach, however, is the large number of queries often required (i.e., a long "mixing time") for a random walk to reach a desired (stationary) sampling distribution. In this paper, we consider a novel problem of enabling a faster random walk over online social networks by "rewiring" the social network on-the-fly. Specifically, we develop Modified TOpology (MTO)-Sampler which, by using only information exposed by the restrictive web interface, constructs a "virtual" overlay topology of the social network while performing a random walk, and ensures that the random walk follows the modified overlay topology rather than the original one. We show that MTO-Sampler not only provably enhances the efficiency of sampling, but also achieves significant savings on query cost over real-world online social networks such as Google Plus, Epinion, etc. Comment: 15 pages, 14 figures, technical report for ICDE2013 paper. Appendix has all the theorems' proofs; ICDE'201
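The query-cost setting described in this abstract can be illustrated with a minimal sketch of a plain simple random walk, which is the baseline being improved on, not MTO-Sampler itself. The `neighbors` callable, the toy `GRAPH`, and the step and burn-in values are all hypothetical stand-ins for a restrictive interface that only returns a user's neighbor list.

```python
import random
from collections import Counter


def random_walk_sample(neighbors, start, steps, burn_in, rng):
    """Simple random walk that only ever asks for one node's neighbor list.

    `neighbors` is a hypothetical stand-in for the restrictive web interface.
    Returns the visited nodes after burn-in and the number of distinct
    nodes queried (a rough proxy for query cost).
    """
    cache = {}  # neighbor lists already fetched, so repeat visits cost nothing
    current = start
    samples = []
    for step in range(steps):
        if current not in cache:
            cache[current] = list(neighbors(current))  # one interface query
        current = rng.choice(cache[current])  # move to a uniform random neighbor
        if step >= burn_in:
            samples.append(current)
    return samples, len(cache)


# Toy undirected graph (hypothetical data, not from the paper).
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

samples, query_cost = random_walk_sample(
    GRAPH.__getitem__, "a", steps=5000, burn_in=500, rng=random.Random(0)
)
counts = Counter(samples)
```

On an undirected graph, the stationary distribution of this walk is proportional to node degree, so `counts` should favor node `c` over node `d`. The burn-in length is what a long mixing time inflates, which is the cost the abstract targets.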

    Effectiveness and chemical pest control of Bt-cotton in the Yangtze River Valley, China

    The sustainability of Bt-cotton in China, at least along the Yellow River Valley, has been questioned, so this paper examines its effectiveness along the Yangtze River Valley, where Bt-cotton is also widely sown, to determine which factors might underlie the limited or reduced effectiveness being observed. The analysis is based on data collected over several years from many locations in the Yangtze River Valley Varietal Experiment Network to provide information on the varieties and their agronomic performance, on the control of their GM characteristic, on the expression of the Bt-gene and on chemical control practices against cotton pests. All varieties declared to be Bt-cotton were confirmed to have the Bt-gene, the expression of which was assessed in three ways: through the analysis of Bt-protein production and through indoor and outdoor bioassays. Gene expression varied substantially between varieties and between years for the few varieties which were tested in two subsequent years. The Bt-cotton varieties being sown cannot control bollworms totally even early in the growing season, so surviving larvae could inevitably be observed, and this led farmers (or professionals in charge of supplying technical assistance to farmers) to spray chemicals regardless of the real infestation level. This demonstrates behaviour aimed at eradication of the pests, as bollworms seem to be treated chemically more often than is required and far earlier than necessary on the first two generations of H. armigera. The chemical control of Bt-cotton in the Yangtze River Valley is hence not optimal: farmers are paying high prices for varieties which are not totally resistant to bollworms, and pest control costs are not reduced to the extent that they might expect, lowering the profitability of cotton production. Chemical protection costs are also not decreasing because those pests unaffected by the Bt-gene, mainly but not exclusively sucking ones, require more control. This is illustrative of a phenomenon of pest complex shift which deserves more attention in following up Bt-cotton use.
    Keywords: China; Bt; cotton; gene expression; chemical control; effectiveness

    GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification

    Graph neural networks (GNNs) have achieved great success in node classification tasks. However, existing GNNs are naturally biased towards the majority classes with more labelled data and ignore minority classes with relatively few labelled nodes. Traditional techniques often resort to over-sampling methods, but these may cause overfitting. More recently, some works propose to synthesize additional nodes for minority classes from the labelled nodes; however, there is no guarantee that the generated nodes really represent the corresponding minority classes. In fact, improperly synthesized nodes may result in insufficient generalization of the algorithm. To resolve the problem, in this paper we seek to automatically augment the minority classes from the massive unlabelled nodes of the graph. Specifically, we propose GraphSR, a novel self-training strategy to augment the minority classes with significant diversity of unlabelled nodes, which is based on a Similarity-based selection module and a Reinforcement Learning (RL) selection module. The first module finds a subset of unlabelled nodes which are most similar to the labelled minority nodes, and the second one further determines the representative and reliable nodes from the subset via an RL technique. Furthermore, the RL-based module can adaptively determine the sampling scale according to the current training data. This strategy is general and can be easily combined with different GNN models. Our experiments demonstrate that the proposed approach outperforms the state-of-the-art baselines on various class-imbalanced datasets. Comment: Accepted by AAAI202
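As a rough illustration of the first (similarity-based) selection step only, the sketch below ranks unlabelled nodes by cosine similarity to the centroid of the labelled minority embeddings and keeps the top `k`. The abstract does not say how similarity is measured, so the cosine measure, the centroid, the function names, and the toy embeddings are all assumptions; the RL-based module is not modelled here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_candidates(minority_embeddings, unlabelled, k):
    """Return the k unlabelled node ids closest to the minority-class centroid.

    Assumed design: similarity is cosine similarity to the mean embedding of
    the labelled minority nodes. The paper may define it differently.
    """
    dim = len(minority_embeddings[0])
    count = len(minority_embeddings)
    centroid = [sum(vec[i] for vec in minority_embeddings) / count for i in range(dim)]
    ranked = sorted(
        unlabelled,
        key=lambda node: cosine(unlabelled[node], centroid),
        reverse=True,  # most similar first
    )
    return ranked[:k]


# Hypothetical 2-D embeddings, not taken from the paper.
minority = [[1.0, 0.1], [0.9, 0.2]]
unlabelled = {
    "u1": [0.8, 0.1],
    "u2": [0.0, 1.0],
    "u3": [1.0, 0.3],
    "u4": [-1.0, 0.0],
}
picked = select_candidates(minority, unlabelled, k=2)
```

In this toy data, `u1` and `u3` point in nearly the same direction as the minority centroid, so they form the candidate subset that a second-stage selector would then filter.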

    Open Banking: Credit Market Competition When Borrowers Own the Data

    Open banking facilitates data sharing consented to by customers who generate the data, with the regulatory goal of promoting competition between traditional banks and challenger fintech entrants. We study lending market competition when sharing banks’ customer transaction data enables better borrower screening. Open banking can make the entire financial industry better off yet leave all borrowers worse off, even if borrowers have the control of whether to share their banking data. We highlight the importance of the equilibrium credit quality inference from borrowers’ endogenous sign-up decisions. We also study extensions with fintech affinities and data sharing on borrower preferences

    Open Banking: Credit Market Competition When Borrowers Own the Data

    Open banking facilitates data sharing consented to by customers who generate the data, with a regulatory goal of promoting competition between traditional banks and challenger fintech entrants. We study lending market competition when sharing banks’ customer data enables better borrower screening or targeting by fintech lenders. Open banking could make the entire financial industry better off yet leave all borrowers worse off, even if borrowers could choose whether to share their data. We highlight the importance of equilibrium credit quality inference from borrowers’ endogenous sign-up decisions. When data sharing triggers privacy concerns by facilitating exploitative targeted loans, the equilibrium sign-up population can grow with the degree of privacy concerns