Two-fold Semantic Web service matchmaking – applying ontology mapping for service discovery
Semantic Web Services (SWS) aim at the automated discovery and orchestration of Web services on the basis of comprehensive, machine-interpretable semantic descriptions. Because SWS annotations are usually created by distinct SWS providers, semantic-level mediation, i.e. mediation between concurrent semantic representations, is a key requirement for SWS discovery. As semantic-level mediation aims at enabling interoperability across heterogeneous semantic representations, it can be regarded as a particular instance of the ontology mapping problem. While recent SWS matchmakers usually rely on manual alignments or subscription to a common ontology, we propose a two-fold SWS matchmaking approach, consisting of (a) a general-purpose semantic-level mediator and (b) comparison and matchmaking of SWS capabilities. Our semantic-level mediation approach enables the implicit representation of similarities across distinct SWS by grounding service descriptions in so-called Mediation Spaces (MS). Given a set of SWS and their respective groundings, an SWS matchmaker automatically computes instance similarities across distinct SWS ontologies and matches the request to the most suitable SWS. A prototypical application illustrates our approach.
When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors
Finding similar user pairs is a fundamental task in social networks, with
numerous applications in ranking and personalization tasks such as link
prediction and tie strength detection. A common manifestation of user
similarity is based upon network structure: each user is represented by a
vector encoding the user's network connections, where pairwise cosine
similarity among these vectors defines user similarity. The predominant task
for user similarity applications is to discover all similar pairs that have a
pairwise cosine similarity value larger than a given threshold τ. In
contrast to previous work where τ is assumed to be quite close to 1, we
focus on recommendation applications where τ is small, but still
meaningful. The all-pairs cosine similarity problem is computationally
challenging on networks with billions of edges, and especially so for settings
with small τ. To the best of our knowledge, there is no practical solution
for computing all user pairs at such small values of τ on large social networks,
even using the power of distributed algorithms.
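
To make the problem statement concrete, the following is a minimal brute-force sketch of all-pairs cosine similarity over binary connection vectors. It is not the algorithm proposed in the abstract; it is the quadratic baseline that becomes impractical at scale. The toy graph, the function names, and the threshold value 0.3 are illustrative assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two binary vectors stored as sets of neighbor IDs."""
    if not u or not v:
        return 0.0
    # For 0/1 vectors the dot product is the size of the intersection,
    # and each norm is the square root of the set size.
    return len(u & v) / math.sqrt(len(u) * len(v))


def all_pairs_above(neighbors, tau):
    """Return every user pair whose cosine similarity exceeds tau (O(n^2) pairs)."""
    result = []
    for a, b in combinations(sorted(neighbors), 2):
        s = cosine(neighbors[a], neighbors[b])
        if s > tau:
            result.append((a, b, s))
    return result


# Hypothetical toy graph: user -> set of accounts the user is connected to.
neighbors = {
    "alice": {1, 2, 3, 4},
    "bob": {3, 4, 5},
    "carol": {6, 7},
}
pairs = all_pairs_above(neighbors, tau=0.3)  # tau chosen arbitrarily
```

The loop examines every pair of users, which is why this approach does not extend to networks with billions of edges.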
Our work directly addresses this challenge by introducing a new algorithm,
WHIMP, that solves this problem efficiently in the MapReduce model. The key
insight in WHIMP is to combine the "wedge-sampling" approach of Cohen-Lewis for
approximate matrix multiplication with the SimHash random projection techniques
of Charikar. We provide a theoretical analysis of WHIMP, proving that it has
near optimal communication costs while maintaining computation cost comparable
with the state of the art. We also empirically demonstrate WHIMP's scalability
by computing all highly similar pairs on four massive data sets, and show that
it accurately finds high similarity pairs. In particular, we note that WHIMP
successfully processes the entire Twitter network, which has tens of billions
of edges.
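
As an illustration of one ingredient named above, here is a minimal sketch of SimHash-style random projection for estimating cosine similarity. It covers only the sign-of-random-projection idea; it does not implement wedge sampling, the MapReduce layout, or WHIMP itself. The function names, the dense-list vector representation, and the choice of 2048 bits are assumptions made for the example.

```python
import math
import random


def random_planes(dim, bits, seed=0):
    """Draw `bits` random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def simhash(vec, planes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return [
        1 if sum(p_i * x_i for p_i, x_i in zip(plane, vec)) >= 0 else 0
        for plane in planes
    ]


def estimated_cosine(sig_a, sig_b):
    """Estimate cosine similarity from the fraction of differing bits.

    For a random hyperplane, two vectors at angle theta receive different
    bits with probability theta / pi, so theta is roughly pi * mismatch rate.
    """
    mismatches = sum(a != b for a, b in zip(sig_a, sig_b))
    theta = math.pi * mismatches / len(sig_a)
    return math.cos(theta)


planes = random_planes(dim=4, bits=2048)
u = [1, 1, 0, 0]
v = [1, 0, 1, 0]  # true cosine similarity with u is 0.5
estimate = estimated_cosine(simhash(u, planes), simhash(v, planes))
```

The signatures are short bit strings, so comparing them is cheaper than comparing the original high-dimensional vectors; the estimate becomes more accurate as the number of bits grows.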