4,204 research outputs found

    PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs

    SimRank is a classic measure of the similarity of nodes in a graph. Given a node u in graph G = (V, E), a single-source SimRank query returns the SimRank similarities s(u, v) between node u and each node v ∈ V. This type of query has numerous applications in web search and social network analysis, such as link prediction, web mining, and spam detection. Existing methods for single-source SimRank queries, however, incur a query cost at least linear in the number of nodes n, which renders them inapplicable to real-time and interactive analysis. This paper proposes PRSim, an algorithm that exploits the structure of graphs to efficiently answer single-source SimRank queries. PRSim uses an index of size O(m), where m is the number of edges in the graph, and guarantees a query time that depends on the reverse PageRank distribution of the input graph. In particular, we prove that PRSim runs in sub-linear time if the degree distribution of the input graph follows the power-law distribution, a property possessed by many real-world graphs. Based on the theoretical analysis, we show that the empirical query time of all existing SimRank algorithms also depends on the reverse PageRank distribution of the graph. Finally, we present the first experimental study that evaluates the absolute errors of various SimRank algorithms on large graphs, and we show that PRSim outperforms the state of the art in terms of query time, accuracy, index size, and scalability. Comment: ACM SIGMOD 201
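To make the notion of a single-source SimRank query concrete, the following is a minimal sketch of the textbook iterative SimRank definition, in which s(u, u) = 1 and s(u, v) is a decayed average of the similarities between the in-neighbors of u and v. It is not the PRSim algorithm from the abstract: it computes all pairs and then reads off one row, so it illustrates the baseline cost that PRSim is designed to avoid. The function names, the toy graph, the decay factor c = 0.6, and the iteration count are all assumptions chosen for illustration.

```python
from itertools import product


def simrank(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by fixed-point iteration.

    in_neighbors maps every node to the list of its in-neighbors.
    c is the decay factor (0.6 is an assumed, illustrative value).
    """
    nodes = list(in_neighbors)
    # Initialisation: a node is fully similar to itself and to nothing else.
    sim = {(u, v): 1.0 if u == v else 0.0 for u, v in product(nodes, repeat=2)}
    for _ in range(iterations):
        new_sim = {}
        for u, v in product(nodes, repeat=2):
            if u == v:
                new_sim[(u, v)] = 1.0
                continue
            iu, iv = in_neighbors[u], in_neighbors[v]
            if not iu or not iv:
                # A node without in-neighbors is similar only to itself.
                new_sim[(u, v)] = 0.0
                continue
            total = sum(sim[(a, b)] for a in iu for b in iv)
            new_sim[(u, v)] = c * total / (len(iu) * len(iv))
        sim = new_sim
    return sim


def single_source(in_neighbors, u, **kwargs):
    """Return s(u, v) for every node v (hypothetical helper)."""
    sim = simrank(in_neighbors, **kwargs)
    return {v: sim[(u, v)] for v in in_neighbors}


# Toy graph (assumed): a -> b, a -> c, b -> d, c -> d.
graph = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
scores = single_source(graph, "b")
```

In this toy graph, nodes b and c share their only in-neighbor a, so they receive the highest non-trivial similarity to each other. Each iteration visits every pair of nodes, which is far more than the linear-in-n query cost that the abstract already describes as too slow for interactive use.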

    Rapid glycation with D-ribose induces globular amyloid-like aggregations of BSA with high cytotoxicity to SH-SY5Y cells

    Background: D-ribose in cells and human serum participates in glycation of proteins resulting in advanced glycation end products (AGEs) that affect cell metabolism and induce cell death. However, the mechanism by which D-ribose-glycated proteins induce cell death is still unclear. Results: Here, we incubated D-ribose with bovine serum albumin (BSA) and observed changes in the intensity of fluorescence at 410 nm and 425 nm to monitor the formation of D-ribose-glycated BSA. Comparing glycation of BSA with xylose (a control for furanose), glucose and fructose (controls for pyranose), the rate of glycation with D-ribose was the most rapid. Protein intrinsic fluorescence (335 nm), Nitroblue tetrazolium (NBT) assays and Western blotting with anti-AGEs showed that glycation of BSA incubated with D-ribose occurred faster than for the other reducing sugars. Protein intrinsic fluorescence showed marked conformational changes when BSA was incubated with D-ribose. Importantly, observations with atomic force microscopy showed that D-ribose-glycated BSA appeared in globular polymers. Furthermore, a fluorescent assay with Thioflavin T (ThT) showed a remarkable increase in fluorescence at 485 nm in the presence of D-ribose-glycated BSA. However, ThT fluorescence did not show the same marked increase in the presence of xylose or glucose. This suggests that glycation with D-ribose induced BSA to aggregate into globular amyloid-like deposits. As observed by Hoechst 33258 staining, 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide (MTT) and cell counting kit-8 (CCK-8) assay, lactate dehydrogenase (LDH) activity assay, flow cytometry using Annexin V and Propidium Iodide staining and reactive oxygen species (ROS) measurements, the amyloid-like aggregation of glycated BSA induced apoptosis in the neurotypic cell line SH-SY5Y. Conclusion: Glycation with D-ribose induces BSA to misfold rapidly and form globular amyloid-like aggregations which play an important role in cytotoxicity to neural cells.

    Automatically learning topics and difficulty levels of problems in online judge systems

    Online Judge (OJ) systems have been widely used in many areas, including programming, mathematical problem solving, and job interviews. Unlike other online learning systems, such as Massive Open Online Courses, most OJ systems are designed for self-directed learning without the intervention of teachers. Also, in most OJ systems, problems are simply listed in volumes and there is no clear organization of them by topics or difficulty levels. As such, problems in the same volume are mixed in terms of topics and difficulty levels. By analyzing large-scale users’ learning traces, we observe that there are two major learning modes (or patterns). Users either practice problems sequentially from the same volume regardless of their topics, or they attempt problems about the same topic, which may be spread across multiple volumes. Our observation is consistent with the findings in classic educational psychology. Based on our observation, we propose a novel two-mode Markov topic model to automatically detect the topics of online problems by jointly characterizing the two learning modes. To further predict the difficulty level of online problems, we propose a competition-based expertise model using the learned topic information. Extensive experiments on three large OJ datasets have demonstrated the effectiveness of our approach in three different tasks, including skill topic extraction, expertise competition prediction and problem recommendation.

    Improving multi-hop knowledge base question answering by learning intermediate supervision signals

    National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative. The code is available at https://github.com/RichardHGL/WSDM2021_NSM

    Characterizing and Predicting Early Reviewers for Effective Product Marketing on E-Commerce Websites

    Online reviews have become an important source of information for users before making an informed purchase decision. Early reviews of a product tend to have a high impact on the subsequent product sales. In this paper, we take the initiative to study the behavior characteristics of early reviewers through their posted reviews on two real-world large e-commerce platforms, i.e., Amazon and Yelp. Specifically, we divide product lifetime into three consecutive stages, namely early, majority and laggards. A user who has posted a review in the early stage is considered an early reviewer. We quantitatively characterize early reviewers based on their rating behaviors, the helpfulness scores received from others and the correlation of their reviews with product popularity. We have found that (1) an early reviewer tends to assign a higher average rating score; and (2) an early reviewer tends to post more helpful reviews. Our analysis of product reviews also indicates that early reviewers' ratings and their received helpfulness scores are likely to influence product popularity. By viewing the review posting process as a multiplayer competition game, we propose a novel margin-based embedding model for early reviewer prediction. Extensive experiments on two different e-commerce datasets have shown that our proposed approach outperforms a number of competitive baselines.

    A computational approach to measuring the correlation between expertise and social media influence for celebrities on microblogs

    Social media influence analysis, sometimes also called authority detection, aims to rank users based on their influence scores in social media. Existing approaches to social influence analysis usually focus on how to develop effective algorithms to quantify users’ influence scores. They rarely consider a person’s expertise levels, which are arguably important to influence measures. In this paper, we propose a computational approach to measuring the correlation between expertise and social media influence, and we take a new perspective to understand social media influence by incorporating expertise into influence analysis. We carefully constructed a large dataset of 13,684 Chinese celebrities from Sina Weibo (literally “Sina microblogging”). We found that there is a strong correlation between expertise levels and social media influence scores. Our analysis gave a good explanation of the phenomenon of “top across-domain influencers”. In addition, different expertise levels showed influence variation patterns: e.g., (1) high-expertise celebrities have stronger influence on the “audience” in their expertise domains; (2) expertise seems to be more important than relevance and participation for social media influence; (3) the audiences of top expertise celebrities are more likely to forward tweets on topics outside the expertise domains from high-expertise celebrities.

    Complex Knowledge Base Question Answering: A Survey

    Knowledge base question answering (KBQA) aims to answer a question over a knowledge base (KB). Early studies mainly focused on answering simple questions over KBs and achieved great success. However, their performance on complex questions is still far from satisfactory. Therefore, in recent years, researchers have proposed a large number of novel methods that address the challenges of answering complex questions. In this survey, we review recent advances in KBQA with a focus on solving complex questions, which usually contain multiple subjects, express compound relations, or involve numerical operations. In detail, we begin by introducing the complex KBQA task and relevant background. Then, we describe benchmark datasets for the complex KBQA task and introduce the construction process of these datasets. Next, we present two mainstream categories of methods for complex KBQA, namely semantic parsing-based (SP-based) methods and information retrieval-based (IR-based) methods. Specifically, we illustrate their procedures with flow designs and discuss their major differences and similarities. After that, we summarize the challenges that these two categories of methods encounter when answering complex questions, and explicate advanced solutions and techniques used in existing work. Finally, we conclude and discuss several promising directions related to complex KBQA for future research. Comment: 20 pages, 4 tables, 7 figures. arXiv admin note: text overlap with arXiv:2105.1164

    Over-expression of human cytomegalovirus miR-US25-2-3p downregulates eIF4A1 and inhibits HCMV replication

    It has been reported that human cytomegalovirus (HCMV) miR-US25-2 reduces the replication of DNA viruses, including HCMV. However, the mechanism remains unknown. In our study, eukaryotic translation initiation factor 4A1 (eIF4A1) was identified to be a direct target of miR-US25-2-3p. Small interfering RNA (siRNA) and miR-US25-2-3p mediated eIF4A1 knockdown experiments revealed that a high level of miR-US25-2-3p in MRC-5 cells decreased HCMV and host genomic DNA synthesis, and inhibited cap-dependent translation and host cell proliferation. However, eIF4A1 up-regulation induced by a miR-US25-2-3p inhibitor increased HCMV copy number. Therefore, the over-expression of miR-US25-2-3p and consequent lower expression of eIF4A1 may contribute to the inhibition of HCMV replication.

    Consistent responses of soil microbial taxonomic and functional attributes to mercury pollution across China

    Background: The ecological consequences of mercury (Hg) pollution—one of the major pollutants worldwide—on microbial taxonomic and functional attributes remain poorly understood and largely unexplored. Using soils from two typical Hg-impacted regions across China, here, we evaluated the role of Hg pollution in regulating bacterial abundance, diversity, and co-occurrence network. We also investigated the associations between Hg contents and the relative abundance of microbial functional genes by analyzing the soil metagenomes from a subset of those sites. Results: We found that soil Hg largely influenced the taxonomic and functional attributes of microbial communities in the two studied regions. In general, Hg pollution was negatively related to bacterial abundance, but positively related to the diversity of bacteria in two separate regions. We also found some consistent associations between soil Hg contents and the community composition of bacteria. For example, soil total Hg content was positively related to the relative abundance of Firmicutes and Bacteroidetes in both paddy and upland soils. In contrast, the methylmercury (MeHg) concentration was negatively correlated to the relative abundance of Nitrospirae in the two types of soils. Increases in soil Hg pollution correlated with drastic changes in the relative abundance of ecological clusters within the co-occurrence network of bacterial communities for the two regions. Using metagenomic data, we were also able to detect the effect of Hg pollution on multiple functional genes relevant to key soil processes such as element cycles and Hg transformations (e.g., methylation and reduction). Conclusions: Together, our study provides solid evidence that Hg pollution has predictable and significant effects on multiple taxonomic and functional attributes including bacterial abundance, diversity, and the relative abundance of ecological clusters and functional genes. 
Our results suggest that an increase in soil Hg pollution linked to human activities will lead to predictable shifts in the taxonomic and functional attributes in the Hg-impacted areas, with potential implications for sustainable management of agricultural ecosystems and elsewhere.