Existence and approximation of Hunt processes associated with generalized Dirichlet forms
We show that any strictly quasi-regular generalized Dirichlet form that
satisfies the mild structural condition D3 is associated to a Hunt process, and
that the associated Hunt process can be approximated by a sequence of
multivariate Poisson processes. This also gives a new proof for the existence
of a Hunt process associated to a strictly quasi-regular generalized Dirichlet
form that satisfies SD3 and extends all previous results. Comment: Revised, shortened and improved versio
Proposing Ties in a Dense Hypergraph of Academics
Nearly all personal relationships exhibit a multiplexity where people relate to one another in many different ways. Using a set of faculty CVs from multiple research institutions, we mined a hypergraph of researchers connected by co-occurring named entities (people, places and organizations). This results in an edge-sparse, link-dense structure with weighted connections that accurately encodes faculty department structure. We introduce a novel model that generates dyadic proposals of how well two nodes should be connected based on both the mass and distributional similarity of links through shared neighbors. Similar link prediction tasks have been primarily explored in unipartite settings, but for hypergraphs where hyper-edges out-number nodes 25-to-1, accounting for link similarity is crucial. Our model is tested by using its proposals to recover link strengths from four systematically lesioned versions of the graph. The model is also compared to other link prediction methods in a static setting. Our results show the model is able to recover a majority of link mass in various settings and that it out-performs other link prediction methods. Overall, the results support the descriptive fidelity of our text-mined, named entity hypergraph of multi-faceted relationships and underscore the importance of link similarity in analyzing link-dense multiplexitous relationships
Features generated for computational splice-site prediction correspond to functional elements
Background: Accurate selection of splice sites during the splicing of precursors to messenger RNA requires both relatively well-characterized signals at the splice sites and auxiliary signals in the adjacent exons and introns. We previously described a feature generation algorithm (FGA) that is capable of achieving high classification accuracy on human 3' splice sites. In this paper, we extend the splice-site prediction to 5' splice sites and explore the generated features for biologically meaningful splicing signals.
Results: We present examples from the observed features that correspond to known signals, both core signals (including the branch site and pyrimidine tract) and auxiliary signals (including GGG triplets and exon splicing enhancers). We present evidence that features identified by FGA include splicing signals not found by other methods.
Conclusion: Our generated features capture known biological signals in the expected sequence interval flanking splice sites. The method can be easily applied to other species and to similar classification problems, such as tissue-specific regulatory elements, polyadenylation sites, promoters, etc.
Link prediction in complex networks: a local naïve Bayes model
The common-neighbor-based method is a simple yet effective way to predict missing
links; it assumes that two nodes are more likely to be connected if they have
more common neighbors. In this method, each common neighbor of two nodes
contributes equally to the connection likelihood. In this Letter, we argue that
different common neighbors may play different roles and thus make different
contributions, and we propose a local naïve Bayes model accordingly.
Extensive experiments were carried out on eight real networks. Compared with
the common-neighbor-based methods, the present method provides more accurate
predictions. Finally, we give a detailed case study on the US air
transportation network. Comment: 6 pages, 2 figures, 2 table
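As a rough illustration of the common-neighbor baseline this abstract describes (not the paper's local naïve Bayes model), the sketch below scores each unlinked node pair by the number of neighbors the two nodes share, so every common neighbor contributes equally. The function name `common_neighbor_scores` and the toy graph are assumptions made for this example only.

```python
from itertools import combinations


def common_neighbor_scores(adjacency):
    """Score every unlinked node pair by its count of shared neighbors.

    `adjacency` maps each node to the set of its neighbors (undirected graph).
    Each common neighbor adds exactly 1 to the score, i.e. equal contribution.
    """
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # the pair is already linked, nothing to predict
        scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return scores


# Hypothetical toy graph, for illustration only.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

scores = common_neighbor_scores(graph)
# Rank candidate links from most to fewest shared neighbors.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A model along the lines the abstract argues for would replace the constant weight of 1 per shared neighbor with a neighbor-specific weight; how those weights are defined is not stated in the abstract, so it is not sketched here.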
Data Sciences Technology for Homeland Security Information Management and Knowledge Discovery
The Department of Homeland Security (DHS) has vast amounts of data available, but its ultimate value cannot be realized without powerful technologies for knowledge discovery to enable better decision making by analysts. Past evidence has shown that terrorist activities leave detectable footprints, but these footprints generally have not been discovered until the opportunity for maximum benefit has passed. The challenge faced by the DHS is to discover the money transfers, border crossings, and other activities in advance of an attack and use that information to identify potential threats and vulnerabilities. The data to be analyzed by DHS comes from many sources ranging from news feeds, to raw sensors, to intelligence reports, and more. The amount of data is staggering; some estimates place the number of entities to be processed at 10^15. The uses for the data are varied as well, including entity tracking over space and time, identifying complex and evolving relationships between entities, and identifying organization structure, to name a few. Because they are ideal for representing relationship and linkage information, semantic graphs have emerged as a key technology for fusing and organizing DHS data. A semantic graph organizes relational data by using nodes to represent entities and edges to connect related entities. Hidden relationships in the data are then uncovered by examining the structure and properties of the semantic graph.
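The semantic-graph idea in this abstract (nodes for entities, edges for relations between them) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system the report describes: the class `SemanticGraph`, its method names, and the sample entities and relation labels are all hypothetical.

```python
from collections import defaultdict


class SemanticGraph:
    """Toy semantic graph: typed nodes joined by labeled, undirected edges."""

    def __init__(self):
        self.node_types = {}            # entity name -> entity type
        self.edges = defaultdict(set)   # entity name -> {(relation, neighbor)}

    def add_entity(self, name, kind):
        self.node_types[name] = kind

    def relate(self, source, relation, target):
        # Store the edge in both directions so either endpoint can find it.
        self.edges[source].add((relation, target))
        self.edges[target].add((relation, source))

    def shared_neighbors(self, a, b):
        """Entities linked to both a and b: one simple structural property
        that can hint at an indirect relationship between them."""
        neighbors_a = {n for _, n in self.edges[a]}
        neighbors_b = {n for _, n in self.edges[b]}
        return neighbors_a & neighbors_b


# Hypothetical entities and relations, for illustration only.
g = SemanticGraph()
g.add_entity("person_1", "person")
g.add_entity("person_2", "person")
g.add_entity("account_x", "bank_account")
g.relate("person_1", "transfers_to", "account_x")
g.relate("person_2", "transfers_to", "account_x")

indirect_link = g.shared_neighbors("person_1", "person_2")
```

Here the two person nodes have no direct edge, but the graph structure exposes the account they both touch.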