
    Performance Evaluation and Optimization of Math-Similarity Search

    Similarity search in math finds mathematical expressions that are similar to a user's query. We conceptualized the similarity factors between mathematical expressions and proposed an approach to math similarity search (MSS) that defines metrics based on those factors [11]. Our preliminary implementation indicated the advantage of MSS over non-similarity-based search. To search for similar math expressions more effectively and efficiently, MSS is further optimized. This paper focuses on the performance evaluation and optimization of MSS. Our results show that the proposed optimization process significantly improved the performance of MSS with respect to both relevance ranking and recall.
    Comment: 15 pages, 8 figures

    Web citations in patents: Evidence of technological impact?

    This is an accepted manuscript of an article published by Wiley Blackwell in Journal of the Association for Information Science and Technology on 17/07/2017, available online: https://doi.org/10.1002/asi.23821 The accepted version of the publication may differ from the final published version.
    Patents sometimes cite web pages either as general background to the problem being addressed or to identify prior publications that will limit the scope of the patent granted. Counts of the number of patents citing an organisation’s website may therefore provide an indicator of its technological capacity or relevance. This article introduces methods to extract URL citations from patents and evaluates the usefulness of counts of patent web citations as a technology indicator. An analysis of patents citing 200 US universities or 177 UK universities found computer science and engineering departments to be frequently cited, as well as research-related web pages, such as Wikipedia, YouTube or Internet Archive. Overall, however, patent URL citations seem to be frequent enough to be useful for ranking major US and the top few UK universities if popular hosted subdomains are filtered out, but the hit count estimates on the first search engine results page should not be relied upon for accuracy.

    Reconstruction of Network Evolutionary History from Extant Network Topology and Duplication History

    Genome-wide protein-protein interaction (PPI) data are readily available thanks to recent breakthroughs in biotechnology. However, PPI networks of extant organisms are only snapshots of the network evolution. Inferring the whole evolutionary history is a challenging problem in computational biology. In this paper, we present a likelihood-based approach to inferring network evolution history from the topology of PPI networks and the duplication relationship among the paralogs. Simulations show that our approach outperforms existing ones in terms of reconstruction accuracy. Moreover, the growth parameters of several real PPI networks estimated by our method are more consistent with those predicted in the literature.
    Comment: 15 pages, 5 figures, submitted to ISBRA 201

    Search Engine Similarity Analysis: A Combined Content and Rankings Approach

    How different are search engines? The search engine wars are a favorite topic of on-line analysts, as two of the biggest companies in the world, Google and Microsoft, battle for dominance of the web search space. Differences in search engine popularity can be explained by their effectiveness or by other factors, such as familiarity with the most popular first engine, peer imitation, or force of habit. In this work we present a thorough analysis of the affinity of the two major search engines, Google and Bing, along with DuckDuckGo, which goes to great lengths to emphasize its privacy-friendly credentials. To do so, we collected search results using a comprehensive set of 300 unique queries for two time periods in 2016 and 2019, and developed a new similarity metric that leverages both the content and the ranking of search responses. We evaluated the characteristics of the metric against other metrics and approaches that have been proposed in the literature, and used it to investigate (1) the similarities of search engine results, (2) the evolution of their affinity over time, (3) which aspects of the results influence similarity, and (4) how the metric differs over different kinds of search services. We found that Google stands apart, but Bing and DuckDuckGo are largely indistinguishable from each other.
    Comment: Shorter version of this paper was accepted in the 21st International Conference on Web Information Systems Engineering (WISE 2020). The final authenticated version is available online at https://doi.org/10.1007/978-3-030-62008-0_
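The abstract does not define its metric, so the following is only an illustrative sketch of how result overlap and ranking can be combined into one score. It is not the authors' metric. The function name `rank_weighted_overlap`, the use of URLs as a stand-in for "content", and the 1/rank weighting are all assumptions made for this example.

```python
def rank_weighted_overlap(results_a, results_b):
    """Illustrative similarity of two ranked result lists, in [0, 1].

    Assumptions (not taken from the paper): results are compared by URL,
    a result at rank r carries weight 1/r, and a shared URL contributes
    the smaller of its two weights.
    """
    # Record the first (best) rank at which each URL appears.
    rank_a, rank_b = {}, {}
    for rank, url in enumerate(results_a, start=1):
        rank_a.setdefault(url, rank)
    for rank, url in enumerate(results_b, start=1):
        rank_b.setdefault(url, rank)

    longest = max(len(results_a), len(results_b))
    if longest == 0:
        return 0.0  # no results on either side: treat as no evidence of similarity

    shared = rank_a.keys() & rank_b.keys()
    # A shared URL counts fully only if both engines rank it equally high.
    numerator = sum(min(1.0 / rank_a[u], 1.0 / rank_b[u]) for u in shared)
    # Best achievable score: identical lists of the longer length.
    denominator = sum(1.0 / r for r in range(1, longest + 1))
    return numerator / denominator


if __name__ == "__main__":
    engine_one = ["u1", "u2", "u3"]
    engine_two = ["u2", "u1", "u4"]
    print(rank_weighted_overlap(engine_one, engine_two))
```

Under these assumptions, identical lists score 1, disjoint lists score 0, and disagreements near the top of the lists cost more than disagreements further down.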

    Publication and patent analysis of European researchers in the field of production technology and manufacturing systems

    This paper develops a structured comparison among a sample of European researchers in the field of Production Technology and Manufacturing Systems, on the basis of scientific publications and patents. Researchers are evaluated and compared using a varied set of indicators concerning (1) the output of individual researchers and (2) that of groups of researchers from the same country. While not claiming to be exhaustive, the results of this preliminary study provide a rough indication of the publishing and patenting activity of researchers in the field of interest, identifying (dis)similarities between different countries. Of particular interest is a proposal for aggregating analysis results by means of maps based on publication and patent indicators. A large amount of empirical data is presented and discussed.

    The hw-rank: an h-index variant for ranking web pages

    We introduce a novel ranking of search results based on a variant of the h-index for directed information networks such as the Web. The h-index was originally introduced to measure an individual researcher’s scientific output and influence, but here a variant of it is applied to assess the “importance” of web pages. As with PageRank, the “importance” of a page is defined by the “importance” of the pages linking to it. However, unlike the computation of PageRank, which involves the whole web graph, computing the h-index for web pages (the hw-rank) is a local computation in which only the neighbors of the neighbors of the given node are considered. Preliminary results show a strong correlation between ranking with the hw-rank and with PageRank; moreover, the hw-rank is simpler to compute than PageRank. Further, larger-scale experiments are needed in order to assess the applicability of the method.
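The abstract gives no formula, so the sketch below shows only one plausible reading of a local, h-index-style score: a page scores h if at least h of the pages linking to it are themselves linked to by at least h pages. The names `h_index` and `local_h_rank` and the toy graph are hypothetical, and the paper's exact hw-rank definition may differ.

```python
def h_index(values):
    """Largest h such that at least h of the values are >= h."""
    ranked = sorted(values, reverse=True)
    h = 0
    for position, value in enumerate(ranked, start=1):
        if value >= position:
            h = position
        else:
            break
    return h


def local_h_rank(in_links, page):
    """h-index over the in-degrees of the pages that link to `page`.

    `in_links` maps each page to the set of pages linking to it.
    Assumption: this is an illustrative reading of the idea, not
    necessarily the hw-rank as defined in the paper.
    """
    # Only the linking pages and their own in-links are inspected,
    # so the computation stays local to the node's neighborhood.
    scores = [len(in_links.get(source, ())) for source in in_links.get(page, ())]
    return h_index(scores)


if __name__ == "__main__":
    # Hypothetical toy graph: page -> pages linking to it.
    in_links = {
        "a": {"b", "c", "d"},
        "b": {"c", "d"},
        "c": {"d", "a"},
        "d": {"a"},
    }
    print(local_h_rank(in_links, "a"))
```

Unlike an iterative PageRank computation over the whole graph, this score needs only the in-link sets of a page and of the pages that link to it.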

    Relationship among research collaboration, number of documents and number of citations. A case study in Spanish computer science production in 2000-2009.

    This paper analyzes the relationship among research collaboration, number of documents and number of citations in computer science research activity. It analyzes the number of documents and citations and how they vary by number of authors. They are also analyzed (according to author set cardinality) under different circumstances, that is, when documents are written in different types of collaboration, when documents are published in different document types, when documents are published in different computer science subdisciplines, and, finally, when documents are published by journals with different impact factor quartiles. To investigate the above relationships, this paper analyzes the publications listed in the Web of Science and produced between 2000 and 2009 by active Spanish university professors working in the computer science field. Analyzing all documents, we show that the highest percentage of documents is published by three authors, whereas single-authored documents account for the lowest percentage. In terms of citations, there is no positive association between author cardinality and citation impact. Statistical tests show that documents written by two authors receive more citations per document and year than documents published by more authors. In contrast, results do not show statistically significant differences between documents published by two authors and by one author. The research findings suggest that international collaboration results on average in publications with higher citation rates than national and institutional collaborations. We also find differences in citation rates between journals and conferences, and across different computer science subdisciplines and journal quartiles, as expected. Finally, our impression is that the collaborative level (number of authors per document) will increase in the coming years, and documents published by three or four authors will be the trend in computer science literature.