9,977 research outputs found

    Effective and Efficient Similarity Index for Link Prediction of Complex Networks

    Get PDF
    Predictions of missing links of incomplete networks like protein-protein interaction networks or very likely but not yet existent links in evolutionary networks like friendship networks in web society can be considered as a guideline for further experiments or valuable information for web users. In this paper, we introduce a local path index to estimate the likelihood of the existence of a link between two nodes. We propose a network model with controllable density and noise strength in generating links, as well as collect data of six real networks. Extensive numerical simulations on both modeled networks and real networks demonstrated the high effectiveness and efficiency of the local path index compared with two well-known and widely used indices, the common neighbors and the Katz index. Indeed, the local path index provides competitively accurate predictions as the Katz index while requires much less CPU time and memory space, which is therefore a strong candidate for potential practical applications in data mining of huge-size networks.Comment: 8 pages, 5 figures, 3 table

    COSMOS-7: Video-oriented MPEG-7 scheme for modelling and filtering of semantic content

    Get PDF
    MPEG-7 prescribes a format for semantic content models for multimedia to ensure interoperability across a multitude of platforms and application domains. However, the standard leaves it open as to how the models should be used and how their content should be filtered. Filtering is a technique used to retrieve only content relevant to user requirements, thereby reducing the necessary content-sifting effort of the user. This paper proposes an MPEG-7 scheme that can be deployed for semantic content modelling and filtering of digital video. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user

    iRefScape. A Cytoscape plug-in for visualization and data mining of protein interaction data from iRefIndex

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The iRefIndex consolidates protein interaction data from ten databases in a rigorous manner using sequence-based hash keys. Working with consolidated interaction data comes with distinct challenges: data are redundant, overlapping, highly interconnected and may be collected and represented using different curation practices. These phenomena were quantified in our previous studies.</p> <p>Results</p> <p>The iRefScape plug-in for the Cytoscape graphical viewer addresses these challenges. We show how these factors impact on data-mining tasks and how our solutions resolve them in a simple and efficient manner. A uniform accession space is used to limit redundancy and support search expansion and searching on multiple accession types. Multiple node and edge features support data filtering and mining. Node colours and features supply information about search result provenance. Overlapping evidence is presented using a multi-graph and a bi-partite representation is used to distinguish binary and n-ary source data. Searching for interactions between sets of proteins is supported and specifically includes searches on disease-related genes found in OMIM. Finally, a synchronized adjacency-matrix view facilitates visualization of relationships between sets of user defined groups.</p> <p>Conclusions</p> <p>The iRefScape plug-in will be of interest to advanced users of interaction data. The plug-in provides access to a consolidated data set in a uniform accession space while remaining faithful to the underlying source data. Tools are provided to facilitate a range of tasks from a simple search to knowledge discovery. The plug-in uses a number of strategies that will be of interest to other plug-in developers.</p

    The Stellar Abundances for Galactic Archeology (SAGA) Database - Compilation of the Characteristics of Known Extremely Metal-Poor Stars

    Full text link
    We describe the construction of a database of extremely metal-poor (EMP) stars in the Galactic halo whose elemental abundances have been determined. Our database contains detailed elemental abundances, reported equivalent widths, atmospheric parameters, photometry, and binarity status, compiled from papers in the recent literature that report studies of EMP halo stars with [Fe/H] < -2.5. The compilation procedures for this database have been designed to assemble the data effectively from electronic tables available from online journals. We have also developed a data retrieval system that enables data searches by various criteria, and permits the user to explore relationships between the stored variables graphically. Currently, our sample includes 1212 unique stars (many of which are studied by more than one group) with more than 15000 individual reported elemental abundances, covering all of the relevant papers published by December 2007. We discuss the global characteristics of the present database, as revealed by the EMP stars observed to date. For stars with [Fe/H] < -2.5, the number of giants with reported abundances is larger than that of dwarfs by a factor of two. The fraction of carbon-rich stars (among the sample for which the carbon abundance is reported) amount to ~30 % for [Fe/H] < -2.5. We find that known binaries exhibit different distributions of orbital period, according to whether they are giants or dwarfs, and also as a function of metallicity, although the total sample of such stars is still quite small.Comment: 24 pages, 10 figures, accepted by PASJ, final version. The SAGA database is available at http://saga.sci.hokudai.ac.j

    Distributed memory compiler methods for irregular problems: Data copy reuse and runtime partitioning

    Get PDF
    Outlined here are two methods which we believe will play an important role in any distributed memory compiler able to handle sparse and unstructured problems. We describe how to link runtime partitioners to distributed memory compilers. In our scheme, programmers can implicitly specify how data and loop iterations are to be distributed between processors. This insulates users from having to deal explicitly with potentially complex algorithms that carry out work and data partitioning. We also describe a viable mechanism for tracking and reusing copies of off-processor data. In many programs, several loops access the same off-processor memory locations. As long as it can be verified that the values assigned to off-processor memory locations remain unmodified, we show that we can effectively reuse stored off-processor data. We present experimental data from a 3-D unstructured Euler solver run on iPSC/860 to demonstrate the usefulness of our methods

    Recommender Systems

    Get PDF
    The ongoing rapid expansion of the Internet greatly increases the necessity of effective recommender systems for filtering the abundant information. Extensive research for recommender systems is conducted by a broad range of communities including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in the future developments. In addition to algorithms, physical aspects are described to illustrate macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has a great scientific depth and combines diverse research fields which makes it of interests for physicists as well as interdisciplinary researchers.Comment: 97 pages, 20 figures (To appear in Physics Reports
    corecore