395 research outputs found

    Performance Evaluation and Optimization of Math-Similarity Search

    Full text link
    Similarity search in math is to find mathematical expressions that are similar to a user's query. We conceptualized the similarity factors between mathematical expressions, and proposed an approach to math similarity search (MSS) by defining metrics based on those similarity factors [11]. Our preliminary implementation indicated the advantage of MSS compared to non-similarity based search. In order to more effectively and efficiently search similar math expressions, MSS is further optimized. This paper focuses on performance evaluation and optimization of MSS. Our results show that the proposed optimization process significantly improved the performance of MSS with respect to both relevance ranking and recall.Comment: 15 pages, 8 figure

    A Markov chain model for changes in users’ assessment of search results

    Get PDF
    Previous research shows that users tend to change their assessment of search results over time. This is a first study that investigates the factors and reasons for these changes, and describes a stochastic model of user behaviour that may explain these changes. In particular, we hypothesise that most of the changes are local, i.e. between results with similar or close relevance to the query, and thus belong to the same ”coarse” relevance category. According to the theory of coarse beliefs and categorical thinking, humans tend to divide the range of values under consideration into coarse categories, and are thus able to distinguish only between cross-category values but not within them. To test this hypothesis we conducted five experiments with about 120 subjects divided into 3 groups. Each student in every group was asked to rank and assign relevance scores to the same set of search results over two or three rounds, with a period of three to nine weeks between each round. The subjects of the last three-round experiment were then exposed to the differences in their judgements and were asked to explain them. We make use of a Markov chain model to measure change in users’ judgments between the different rounds. The Markov chain demonstrates that the changes converge, and that a majority of the changes are local to a neighbouring relevance category. We found that most of the subjects were satisfied with their changes, and did not perceive them as mistakes but rather as a legitimate phenomenon, since they believe that time has influenced their relevance assessment. Both our quantitative analysis and user comments support the hypothesis of the existence of coarse relevance categories resulting from categorical thinking in the context of user evaluation of search results

    Notions of Connectivity in Overlay Networks

    Get PDF
    International audience" How well connected is the network? " This is one of the most fundamental questions one would ask when facing the challenge of designing a communication network. Three major notions of connectivity have been considered in the literature, but in the context of traditional (single-layer) networks, they turn out to be equivalent. This paper introduces a model for studying the three notions of connectivity in multi-layer networks. Using this model, it is easy to demonstrate that in multi-layer networks the three notions may differ dramatically. Unfortunately, in contrast to the single-layer case, where the values of the three connectivity notions can be computed efficiently, it has been recently shown in the context of WDM networks (results that can be easily translated to our model) that the values of two of these notions of connectivity are hard to compute or even approximate in multi-layer networks. The current paper shed some positive light into the multi-layer connectivity topic: we show that the value of the third connectivity notion can be computed in polynomial time and develop an approximation for the construction of well connected overlay networks

    Reconstruction of Network Evolutionary History from Extant Network Topology and Duplication History

    Full text link
    Genome-wide protein-protein interaction (PPI) data are readily available thanks to recent breakthroughs in biotechnology. However, PPI networks of extant organisms are only snapshots of the network evolution. How to infer the whole evolution history becomes a challenging problem in computational biology. In this paper, we present a likelihood-based approach to inferring network evolution history from the topology of PPI networks and the duplication relationship among the paralogs. Simulations show that our approach outperforms the existing ones in terms of the accuracy of reconstruction. Moreover, the growth parameters of several real PPI networks estimated by our method are more consistent with the ones predicted in literature.Comment: 15 pages, 5 figures, submitted to ISBRA 201

    The hw-rank: an h-index variant for ranking web pages

    Get PDF
    We introduce a novel ranking of search results based on a variant of the h-index for directed information networks such as the Web. The h-index was originally introduced to measure an individual researcher’s scientific output and influence, but here a variant of it is applied to assess the ‘‘importance’’ of web pages. Like PageRank, the‘‘importance’’ of a page is defined by the ‘‘importance’’ of the pages linking to it. However, unlike the computation of PageRank which involves the whole web graph, computing the h-index for web pages (the hw-rank) is based on a local computation and only the neighbors of the neighbors of the given node are considered. Preliminary results show a strong correlation between ranking with the hw-rank and PageRank, and moreover its computation is simpler and less complex than computation of the PageRank. Further, larger scale experiments are needed in order to assess the applicability of the method

    Publication and patent analysis of European researchers in the field of production technology and manufacturing systems

    Get PDF
    This paper develops a structured comparison among a sample of European researchers in the field of Production Technology and Manufacturing Systems, on the basis of scientific publications and patents. Researchers are evaluated and compared by a variegated set of indicators concerning (1) the output of individual researchers and (2) that of groups of researchers from the same country. While not claiming to be exhaustive, the results of this preliminary study provide a rough indication of the publishing and patenting activity of researchers in the field of interest, identifying (dis)similarities between different countries. Of particular interest is a proposal for aggregating analysis results by means of maps based on publication and patent indicators. A large amount of empirical data are presented and discusse

    Relationship among research collaboration, number of documents and number of citations. A case study in Spanish computer science production in 2000-2009.

    Get PDF
    This paper analyzes the relationship among research collaboration, number of documents and number of citations of computer science research activity. It analyzes the number of documents and citations and how they vary by number of authors. They are also analyzed (according to author set cardinality) under different circumstances, that is, when documents are written in different types of collaboration, when documents are published in different document types, when documents are published in different computer science subdisciplines, and, finally, when documents are published by journals with different impact factor quartiles. To investigate the above relationships, this paper analyzes the publications listed in the Web of Science and produced by active Spanish university professors between 2000 and 2009, working in the computer science field. Analyzing all documents, we show that the highest percentage of documents are published by three authors, whereas single-authored documents account for the lowest percentage. By number of citations, there is no positive association between the author cardinality and citation impact. Statistical tests show that documents written by two authors receive more citations per document and year than documents published by more authors. In contrast, results do not show statistically significant differences between documents published by two authors and one author. The research findings suggest that international collaboration results on average in publications with higher citation rates than national and institutional collaborations. We also find differences regarding citation rates between journals and conferences, across different computer science subdisciplines and journal quartiles as expected. Finally, our impression is that the collaborative level (number of authors per document) will increase in the coming years, and documents published by three or four authors will be the trend in computer science literature

    Algebraic Comparison of Partial Lists in Bioinformatics

    Get PDF
    The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or just within a meta-analysis comparison, instead of one list it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset

    Network Archaeology: Uncovering Ancient Networks from Present-day Interactions

    Get PDF
    Often questions arise about old or extinct networks. What proteins interacted in a long-extinct ancestor species of yeast? Who were the central players in the Last.fm social network 3 years ago? Our ability to answer such questions has been limited by the unavailability of past versions of networks. To overcome these limitations, we propose several algorithms for reconstructing a network's history of growth given only the network as it exists today and a generative model by which the network is believed to have evolved. Our likelihood-based method finds a probable previous state of the network by reversing the forward growth model. This approach retains node identities so that the history of individual nodes can be tracked. We apply these algorithms to uncover older, non-extant biological and social networks believed to have grown via several models, including duplication-mutation with complementarity, forest fire, and preferential attachment. Through experiments on both synthetic and real-world data, we find that our algorithms can estimate node arrival times, identify anchor nodes from which new nodes copy links, and can reveal significant features of networks that have long since disappeared.Comment: 16 pages, 10 figure

    The role of secondary char oxidation in the transition from smoldering to flaming

    Get PDF
    The transition from forward smoldering to flaming in polyurethane foam is observed using in-depth thermocouples and ultrasound probing. The experiments are conducted with small parallelepiped samples vertically placed in an upward wind tunnel. Three of the vertical sample sides are maintained at elevated temperature and the fourth is exposed to an upward oxidizer flow and a radiant heat flux. An ultrasound probing technique is used to measure the line-of-sight average permeability of the sample at the same heights as the thermocouples. The smolder front propagation is tracked by both the thermocouples and ultrasound data, which show an increase in temperature and permeability upon passage of the smolder front. The permeability data also show that the transition to flaming is preceded by rapid fluctuations in permeability in the char region below the smolder front, indicating the formation of pores by secondary char oxidation. The pores provide locations for the onset of gas-phase ignition (i.e. transition to flaming). The results from all the tests indicate that the formation of pores is a necessary but not sufficient condition for the transition to flaming. Two novel measures of the intensity of the secondary char oxidation are introduced: the time derivative of permeability, and the secondary char oxidation velocity. The time derivative of permeability, which provides a measure of the pore formation rate, is found to increase as the oxygen concentration and/or radiant heat flux increase, and to indicate the likelihood of the transition to flaming. The permeability data offers a means to track the propagation of the secondary char oxidation, and to calculate the secondary char oxidation velocity, which is found to be strongly correlated to the transition to flaming. A simplified energy balance model is able to predict the dependence of the secondary char oxidation velocity on oxygen concentration and radiant heat flux
    corecore