
    TopCom: Index for Shortest Distance Query in Directed Graph

    Finding the shortest distance between two vertices in a graph is an important problem due to its numerous applications in diverse domains, including geo-spatial databases, social network analysis, and information retrieval. Classical algorithms (such as Dijkstra's) solve this problem in polynomial time, but they cannot provide real-time responses for a large number of bursty queries on a large graph. Indexing-based solutions, which pre-process the graph so that a large number of distance queries can be answered (exactly or approximately) in real time, are therefore becoming increasingly popular. Existing solutions vary in index size, index building time, query time, and accuracy. In this work, we propose TopCom, a novel indexing-based solution for exactly answering distance queries. Our experiments with two existing state-of-the-art methods (IS-Label and TreeMap) show the superiority of TopCom over these two methods in scalability and query time. Besides, the indexing of TopCom exploits the DAG (directed acyclic graph) structure in the graph, which makes it significantly faster than the existing methods if the SCCs (strongly connected components) of the input graph are relatively small.
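
For context, the classical baseline that the abstract contrasts with indexing can be sketched as follows. This is a minimal illustration of Dijkstra's algorithm on a directed graph with non-negative edge weights, not TopCom's index; the adjacency-list layout and the names `shortest_distance` and `example_graph` are assumptions made for this example.

```python
import heapq


def shortest_distance(graph, source, target):
    """Dijkstra's algorithm: shortest distance from source to target.

    graph maps each vertex to a list of (neighbor, weight) pairs, with
    non-negative weights. Returns float("inf") if target is unreachable.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


# Hypothetical toy graph: a -> b -> c -> d, plus a direct but longer edge a -> c.
example_graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1)],
    "c": [("d", 3)],
    "d": [],
}

print(shortest_distance(example_graph, "a", "d"))  # 6, via a -> b -> c -> d
```

Each query here explores the graph from scratch, which is the per-query cost that an index built ahead of time is meant to avoid.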

    FS^3: A Sampling based method for top-k Frequent Subgraph Mining

    Mining labeled subgraphs is a popular research task in data mining because of its potential application in many different scientific domains. All the existing methods for this task explicitly or implicitly solve the subgraph isomorphism task, which is computationally expensive, so they scale poorly when the graphs in the input database are large. In this work, we propose FS^3, which is a sampling-based method. It mines a small collection of subgraphs that are most frequent in the probabilistic sense. FS^3 performs Markov Chain Monte Carlo (MCMC) sampling over the space of fixed-size subgraphs such that the potentially frequent subgraphs are sampled more often. Besides, FS^3 is equipped with an innovative queue manager. It stores the sampled subgraphs in a finite queue over the course of mining in such a manner that the top-k positions in the queue contain the most frequent subgraphs. Our experiments on databases of large graphs show that FS^3 is efficient, and that it obtains subgraphs that are the most frequent amongst the subgraphs of a given size.

    Con-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec

    We present a novel approach to learn distributed representations of sentences from unlabeled data by modeling both the content and the context of a sentence. The content model learns a sentence representation by predicting its words. The context model, on the other hand, comprises a neighbor prediction component and a regularizer to model the distributional and proximity hypotheses, respectively. We propose an online algorithm to train the model components jointly. We evaluate the models in a setup where contextual information is available. The experimental results on tasks involving classification, clustering, and ranking of sentences show that our model outperforms the best existing models by a wide margin across multiple datasets.

    Name Disambiguation from link data in a collaboration graph using temporal and topological features

    In a social community, multiple persons may share the same name, phone number, or some other identifying attribute. This, along with other phenomena such as name abbreviation, name misspelling, and human error, leads to erroneous aggregation of records of multiple persons under a single reference. Such mistakes affect the performance of document retrieval, web search, and database integration and, more importantly, cause improper attribution of credit (or blame). The task of entity disambiguation partitions the records belonging to multiple persons with the objective that each partition is composed of the records of a unique person. Existing solutions to this task use either biographical attributes or auxiliary features that are collected from external sources, such as Wikipedia. However, in many scenarios such auxiliary features are not available, or they are costly to obtain. Besides, collecting biographical or external data carries the risk of privacy violation. In this work, we propose a method for solving the entity disambiguation task using link information obtained from a collaboration network. Our method does not intrude on privacy, as it uses only the time-stamped graph topology of an anonymized network. Experimental results on two real-life academic collaboration networks show that the proposed method has satisfactory performance. Comment: The short version of this paper has been accepted to ASONAM 201

    Incremental eigenpair computation for graph Laplacian matrices: theory and applications

    The smallest eigenvalues and the associated eigenvectors (i.e., eigenpairs) of a graph Laplacian matrix have been widely used for spectral clustering and community detection. However, in real-life applications, the number of clusters or communities (say, K) is generally unknown a priori. Consequently, the majority of the existing methods either choose K heuristically or repeat the clustering method with different choices of K and accept the best clustering result. The first option often yields suboptimal results, while the second option is computationally expensive. In this work, we propose an incremental method for constructing the eigenspectrum of the graph Laplacian matrix. This method leverages the eigenstructure of the graph Laplacian matrix to obtain the Kth smallest eigenpair of the Laplacian matrix given a collection of all previously compute
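
To make the objects in this abstract concrete, the sketch below builds the unnormalized graph Laplacian L = D - A of a small undirected graph and checks two of its eigenpairs by direct multiplication. It illustrates only the definition of an eigenpair of a Laplacian, not the incremental method the paper proposes; the helper names `laplacian` and `matvec` and the three-vertex path graph are assumptions chosen for illustration.

```python
def laplacian(adjacency):
    """Return L = D - A for a symmetric 0/1 adjacency matrix without self-loops."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    return [
        [(degrees[i] if i == j else 0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Hypothetical example: the path graph 0 - 1 - 2.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
L = laplacian(path_graph)

# The all-ones vector is an eigenvector with eigenvalue 0 (the smallest one).
print(matvec(L, [1, 1, 1]))   # [0, 0, 0]
# For this graph, (1, 0, -1) is an eigenvector with eigenvalue 1.
print(matvec(L, [1, 0, -1]))  # [1, 0, -1]
```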

    Honey bee foraging: persistence to non-rewarding feeding locations and waggle dance communication

    The honey bee, Apis mellifera, is important in agriculture and also as a model species in scientific research. This Master’s thesis is focused on honey bee foraging behaviour. It contains two independent experiments, each on a different subject within the area of foraging. Both use a behavioural ecology approach, with one investigating foraging behaviour and the other foraging communication. These form chapters 2 and 3 of the thesis, after an introductory chapter. Chapter 2. Experiment 1: Persistence to unrewarding feeding locations by forager honey bees (Apis mellifera): the effects of experience, resource profitability, and season. This study shows that the persistence of honey bee foragers to unrewarding food sources, measured both in duration and number of visits, was greater to locations that previously offered sucrose solution of higher concentration (2 versus 1 molar) or were closer to the hive (20 versus 450 m). Persistence was also greater in bees which had longer access to the feeder before the syrup was terminated (2 versus 0.5 h). These results indicate that persistence is greater for more rewarding locations. However, persistence was not higher in the season of lowest nectar availability in the environment. Chapter 3. Experiment 2: Honey bee waggle dance communication: signal meaning and signal noise affect dance follower behaviour. This study shows that honey bee foragers follow fewer waggle runs as the distance to the food source advertised by the dance increases, but invest more time in following these dances. This is because waggle run duration increases with increasing foraging distance. The number of waggle runs followed for distant food sources was further reduced by increased angular noise among waggle runs within a dance. The number of dance followers per dancing bee was affected by the time of year and varied among colonies. Both noise in the message (that is, variation in the direction component) and the message itself (that is, the distance of the advertised food location) affect dance following. These results indicate that dance followers pay attention to the costs and benefits associated with using dance information.

    The Effectiveness of Educational Games on Scientific Concepts Acquisition in First Grade Students in Science

    This study aimed to investigate the effectiveness of educational games on the acquisition of scientific concepts by first grade students. The sample of the study consisted of 53 male and female students distributed into two groups: an experimental group (n=26), which was taught using educational games, and a control group (n=27), which was taught using the traditional method. To achieve the purpose of the study, the researcher developed a teaching guide that included eight educational games, and a test to measure scientific concepts acquisition. Results showed that there were statistically significant differences in students’ scientific concepts acquisition due to the method of teaching, in favor of the experimental group. Also, there were no statistically significant differences in students’ scientific concepts acquisition due to gender or the interaction between method of teaching and gender. The study recommended using educational games in teaching science in primary education. Keywords: Educational Games, Scientific Concepts, Science

    Neural-Brane: Neural Bayesian Personalized Ranking for Attributed Network Embedding

    Network embedding methodologies, which learn a distributed vector representation for each vertex in a network, have attracted considerable interest in recent years. Existing works have demonstrated that vertex representations learned through an embedding method provide superior performance in many real-world applications, such as node classification, link prediction, and community detection. However, most of the existing methods for network embedding only utilize the topological information of a vertex, ignoring a rich set of nodal attributes (such as user profiles of an online social network, or textual contents of a citation network), which are abundant in all real-life networks. A joint network embedding that takes into account both attributional and relational information captures more complete network information and could further enrich the learned vector representations. In this work, we present Neural-Brane, a novel Neural Bayesian Personalized Ranking based Attributed Network Embedding. For a given network, Neural-Brane extracts latent feature representations of its vertices using a specially designed neural network model that unifies network topological information and nodal attributes. Besides, it utilizes a Bayesian personalized ranking objective, which exploits the proximity ordering between a similar node pair and a dissimilar node pair. We evaluate the quality of the vertex embeddings produced by Neural-Brane by solving node classification and clustering tasks on four real-world datasets. Experimental results demonstrate the superiority of our proposed method over the existing state-of-the-art methods.
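
The ranking objective mentioned in this abstract can be illustrated with the standard Bayesian personalized ranking loss for a single triple: an anchor vertex, a similar vertex, and a dissimilar vertex. The sketch below assumes plain dot-product scores between embedding vectors, which is a simplification for illustration and not Neural-Brane's neural architecture; the names `dot` and `bpr_loss` and the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bpr_loss(anchor, similar, dissimilar):
    """BPR loss for one triple: -log(sigmoid(score_similar - score_dissimilar)).

    The loss is small when the anchor scores the similar vertex higher than
    the dissimilar one, and large when the ordering is violated.
    """
    margin = dot(anchor, similar) - dot(anchor, dissimilar)
    # -log(sigmoid(m)) equals log(1 + exp(-m)).
    return math.log1p(math.exp(-margin))


# Hypothetical 2-dimensional embeddings.
anchor = [1.0, 0.0]
close_vertex = [1.0, 0.0]
far_vertex = [0.0, 1.0]

correct_order = bpr_loss(anchor, close_vertex, far_vertex)
wrong_order = bpr_loss(anchor, far_vertex, close_vertex)
print(correct_order < wrong_order)  # True
```

Minimizing this loss over many sampled triples pushes each vertex embedding closer to its similar vertices than to its dissimilar ones.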