Effective and Efficient Similarity Index for Link Prediction of Complex Networks
Predictions of missing links of incomplete networks like protein-protein
interaction networks or very likely but not yet existent links in evolutionary
networks like friendship networks in web society can be considered as a
guideline for further experiments or valuable information for web users. In
this paper, we introduce a local path index to estimate the likelihood of the
existence of a link between two nodes. We propose a network model with
controllable density and noise strength in link generation, and we collect
data on six real networks. Extensive numerical simulations on both modeled
networks and real networks demonstrated the high effectiveness and efficiency
of the local path index compared with two well-known and widely used indices,
the common neighbors and the Katz index. Indeed, the local path index provides
predictions about as accurate as those of the Katz index while requiring much
less CPU time and memory, which makes it a strong candidate for practical
applications in data mining of very large networks.
Comment: 8 pages, 5 figures, 3 tables
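The abstract names the indices but does not define them. As a rough illustration, the sketch below assumes the forms commonly given in the link-prediction literature: common neighbors as A^2 and the local path index as A^2 + eps * A^3, where A is the adjacency matrix and eps is a free parameter. The helper names, the toy graph, and the value of eps are all assumptions chosen for illustration, not taken from the paper.

```python
# Illustrative sketch only; formulas and names are assumptions (see lead-in).
#   common neighbors:  CN = A^2
#   local path index:  LP = A^2 + eps * A^3

def matmul(x, y):
    """Multiply two square matrices stored as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def common_neighbors(adj):
    """Entry (i, j) counts the neighbors shared by nodes i and j."""
    return matmul(adj, adj)

def local_path_index(adj, eps=0.01):
    """Add a small contribution from length-3 walks to the CN score."""
    a2 = matmul(adj, adj)
    a3 = matmul(a2, adj)
    n = len(adj)
    return [[a2[i][j] + eps * a3[i][j] for j in range(n)] for i in range(n)]

def rank_missing_links(adj, score):
    """Sort unconnected node pairs by descending score."""
    n = len(adj)
    pairs = [(score[i][j], i, j)
             for i in range(n) for j in range(i + 1, n) if adj[i][j] == 0]
    return sorted(pairs, reverse=True)

# Toy graph: a path 0-1-2-3-4 (hypothetical data).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
adj = [[0] * 5 for _ in range(5)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

cn = common_neighbors(adj)
lp = local_path_index(adj, eps=0.01)
ranking = rank_missing_links(adj, lp)
```

On this toy graph, common neighbors scores the pairs (0, 3) and (0, 4) identically at zero, while the assumed local path form gives (0, 3) a small positive score because a length-3 path connects them. The Katz index would sum over paths of every length, which is where the extra cost mentioned in the abstract comes from.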
COSMOS-7: Video-oriented MPEG-7 scheme for modelling and filtering of semantic content
MPEG-7 prescribes a format for semantic content models for multimedia to ensure interoperability across a multitude of platforms and application domains. However, the standard leaves open how the models should be used and how their content should be filtered. Filtering is a technique used to retrieve only content relevant to user requirements, thereby reducing the content-sifting effort required of the user. This paper proposes an MPEG-7 scheme that can be deployed for semantic content modelling and filtering of digital video. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user.
iRefScape. A Cytoscape plug-in for visualization and data mining of protein interaction data from iRefIndex
Background: The iRefIndex consolidates protein interaction data from ten databases in a rigorous manner using sequence-based hash keys. Working with consolidated interaction data comes with distinct challenges: data are redundant, overlapping, highly interconnected and may be collected and represented using different curation practices. These phenomena were quantified in our previous studies.
Results: The iRefScape plug-in for the Cytoscape graphical viewer addresses these challenges. We show how these factors impact on data-mining tasks and how our solutions resolve them in a simple and efficient manner. A uniform accession space is used to limit redundancy and support search expansion and searching on multiple accession types. Multiple node and edge features support data filtering and mining. Node colours and features supply information about search result provenance. Overlapping evidence is presented using a multi-graph and a bi-partite representation is used to distinguish binary and n-ary source data. Searching for interactions between sets of proteins is supported and specifically includes searches on disease-related genes found in OMIM. Finally, a synchronized adjacency-matrix view facilitates visualization of relationships between sets of user defined groups.
Conclusions: The iRefScape plug-in will be of interest to advanced users of interaction data. The plug-in provides access to a consolidated data set in a uniform accession space while remaining faithful to the underlying source data. Tools are provided to facilitate a range of tasks from a simple search to knowledge discovery. The plug-in uses a number of strategies that will be of interest to other plug-in developers.
The Stellar Abundances for Galactic Archeology (SAGA) Database - Compilation of the Characteristics of Known Extremely Metal-Poor Stars
We describe the construction of a database of extremely metal-poor (EMP)
stars in the Galactic halo whose elemental abundances have been determined. Our
database contains detailed elemental abundances, reported equivalent widths,
atmospheric parameters, photometry, and binarity status, compiled from papers
in the recent literature that report studies of EMP halo stars with [Fe/H] <
-2.5. The compilation procedures for this database have been designed to
assemble the data effectively from electronic tables available from online
journals. We have also developed a data retrieval system that enables data
searches by various criteria, and permits the user to explore relationships
between the stored variables graphically. Currently, our sample includes 1212
unique stars (many of which are studied by more than one group) with more than
15000 individual reported elemental abundances, covering all of the relevant
papers published by December 2007. We discuss the global characteristics of the
present database, as revealed by the EMP stars observed to date. For stars with
[Fe/H] < -2.5, the number of giants with reported abundances is larger than
that of dwarfs by a factor of two. The fraction of carbon-rich stars (among the
sample for which the carbon abundance is reported) amounts to ~30% for [Fe/H] <
-2.5. We find that known binaries exhibit different distributions of orbital
period, according to whether they are giants or dwarfs, and also as a function
of metallicity, although the total sample of such stars is still quite small.
Comment: 24 pages, 10 figures, accepted by PASJ, final version. The SAGA
database is available at http://saga.sci.hokudai.ac.j
Distributed memory compiler methods for irregular problems: Data copy reuse and runtime partitioning
Outlined here are two methods that we believe will play an important role in any distributed memory compiler able to handle sparse and unstructured problems. We describe how to link runtime partitioners to distributed memory compilers. In our scheme, programmers can implicitly specify how data and loop iterations are to be distributed between processors. This insulates users from having to deal explicitly with potentially complex algorithms that carry out work and data partitioning. We also describe a viable mechanism for tracking and reusing copies of off-processor data. In many programs, several loops access the same off-processor memory locations. As long as it can be verified that the values assigned to off-processor memory locations remain unmodified, we show that we can effectively reuse stored off-processor data. We present experimental data from a 3-D unstructured Euler solver run on an iPSC/860 to demonstrate the usefulness of our methods.
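The copy-reuse idea in this abstract can be pictured with a small single-process sketch. It is not the mechanism from the paper: it assumes a simple version counter that is bumped on every write, and it reuses gathered copies only while that counter is unchanged. All class and method names here (`DistributedArray`, `CopyCache`, `gather`) are hypothetical, and the "off-processor fetch" is simulated with a local list.

```python
# Hypothetical sketch: version-stamped reuse of gathered off-processor copies.
# Names and design are assumptions, not the runtime described in the abstract.

class DistributedArray:
    """Stand-in for an array whose elements live on other processors."""

    def __init__(self, values):
        self._values = list(values)
        self.version = 0        # incremented on every write
        self.fetch_count = 0    # number of simulated communication rounds

    def write(self, index, value):
        self._values[index] = value
        self.version += 1

    def gather(self, indices):
        """Simulate an expensive fetch of off-processor elements."""
        self.fetch_count += 1
        return [self._values[i] for i in indices]


class CopyCache:
    """Reuse stored copies while the source array remains unmodified."""

    def __init__(self):
        self._store = {}  # (array name, index tuple) -> (version, copies)

    def get(self, name, array, indices):
        key = (name, tuple(indices))
        hit = self._store.get(key)
        if hit is not None and hit[0] == array.version:
            return hit[1]                   # unmodified since last fetch
        copies = array.gather(indices)      # stale or missing: fetch again
        self._store[key] = (array.version, copies)
        return copies


x = DistributedArray([10, 20, 30, 40])
cache = CopyCache()
first = cache.get("x", x, [1, 3])    # first loop: fetches
second = cache.get("x", x, [1, 3])   # second loop: reuses the stored copy
fetches_before_write = x.fetch_count
x.write(3, 99)                       # modification invalidates the copy
third = cache.get("x", x, [1, 3])    # fetches again
```

A whole-array counter is the coarsest possible check: any write forces a refetch, even one that touches an element no loop has copied. A finer-grained scheme would track modifications per index set, at the cost of more bookkeeping.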
Component processes of early reading, spelling, and narrative writing skills in Turkish: a longitudinal study
The study examined: (a) the role of phonological, grammatical, and rapid automatized naming (RAN) skills in reading and spelling development; and (b) the component processes of early narrative writing skills. Fifty-seven Turkish-speaking children were followed from Grade 1 to Grade 2. RAN was the most powerful longitudinal predictor of reading speed, and its effect was evident even when previous reading skills were taken into account. Broadly, the phonological and grammatical skills made reliable contributions to spelling performance, but their effects were completely mediated by previous spelling skills. Different aspects of narrative writing skills were related to different processing skills. While handwriting speed predicted writing fluency, spelling accuracy predicted spelling error rate. Vocabulary and working memory were the only reliable longitudinal predictors of the quality of composition content. The overall model, however, failed to explain any reliable variance in the structural quality of the composition.
Recommender Systems
The ongoing rapid expansion of the Internet greatly increases the necessity
of effective recommender systems for filtering the abundant information.
Extensive research on recommender systems is conducted by a broad range of
communities including social and computer scientists, physicists, and
interdisciplinary researchers. Despite substantial theoretical and practical
achievements, unification and comparison of different approaches are lacking,
which impedes further advances. In this article, we review recent developments
in recommender systems and discuss the major challenges. We compare and
evaluate available algorithms and examine their roles in future
developments. In addition to algorithms, physical aspects are described to
illustrate the macroscopic behavior of recommender systems. Potential impacts and
future directions are discussed. We emphasize that recommendation has great
scientific depth and combines diverse research fields, which makes it of
interest to physicists as well as interdisciplinary researchers.
Comment: 97 pages, 20 figures (To appear in Physics Reports)