SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases
The Internet has enabled the creation of a growing number of large-scale
knowledge bases in a variety of domains containing complementary information.
Tools for automatically aligning these knowledge bases would make it possible
to unify many sources of structured knowledge and answer complex queries.
However, the efficient alignment of large-scale knowledge bases still poses a
considerable challenge. Here, we present Simple Greedy Matching (SiGMa), a
simple algorithm for aligning knowledge bases with millions of entities and
facts. SiGMa is an iterative propagation algorithm which leverages both the
structural information from the relationship graph as well as flexible
similarity measures between entity properties in a greedy local search, thus
making it scalable. Despite its greedy nature, our experiments indicate that
SiGMa can efficiently match some of the world's largest knowledge bases with
high precision. We provide additional experiments on benchmark datasets which
demonstrate that SiGMa can outperform state-of-the-art approaches both in
accuracy and efficiency.
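The iterative greedy propagation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the data layout (property-token sets and adjacency dicts), the Jaccard property similarity, and the weighting parameter `alpha` are all illustrative assumptions.

```python
import heapq

def sigma_align(props1, props2, graph1, graph2, seed_pairs, alpha=0.5):
    """Toy sketch of SiGMa-style greedy alignment between two KBs.

    props*: dict entity -> set of property tokens
    graph*: dict entity -> set of neighbour entities
    seed_pairs: initial matched pairs [(e1, e2), ...]
    alpha: assumed weight between graph evidence and property similarity
    """
    def prop_sim(a, b):
        # Jaccard similarity of property tokens (one of many possible measures)
        pa, pb = props1[a], props2[b]
        return len(pa & pb) / len(pa | pb) if pa | pb else 0.0

    matched = dict(seed_pairs)                  # e1 -> e2
    matched_rev = {b: a for a, b in seed_pairs}
    heap = []                                   # max-heap via negated scores

    def push_candidates(a, b):
        # propagation step: neighbours of a newly matched pair become candidates
        for na in graph1[a]:
            for nb in graph2[b]:
                if na not in matched and nb not in matched_rev:
                    graph_score = sum(1 for x in graph1[na]
                                      if matched.get(x) in graph2[nb])
                    score = alpha * graph_score + (1 - alpha) * prop_sim(na, nb)
                    heapq.heappush(heap, (-score, na, nb))

    for a, b in seed_pairs:
        push_candidates(a, b)

    while heap:
        neg_score, a, b = heapq.heappop(heap)
        if a in matched or b in matched_rev or -neg_score <= 0:
            continue                            # greedy: committed matches are final
        matched[a], matched_rev[b] = b, a
        push_candidates(a, b)
    return matched
```

The greedy local search is what makes the approach scale: each entity is committed at most once, and only neighbourhoods of committed pairs are ever scored.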
Soft topographic map for clustering and classification of bacteria
In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests the maps show clusters of homologous type strains and present some singular cases, potentially due to incorrect classification or erroneous annotations in the database.
Confidential Boosting with Random Linear Classifiers for Outsourced User-generated Data
User-generated data is crucial to predictive modeling in many applications.
With a web/mobile/wearable interface, a data owner can continuously record data
generated by distributed users and build various predictive models from the
data to improve their operations, services, and revenue. Due to the large size
and evolving nature of user data, data owners may rely on public cloud service
providers (Cloud) for storage and computation scalability. Exposing sensitive
user-generated data and advanced analytic models to Cloud raises privacy
concerns. We present a confidential learning framework, SecureBoost, for data
owners that want to learn predictive models from aggregated user-generated data
but offload the storage and computational burden to Cloud without having to
worry about protecting the sensitive data. SecureBoost allows users to submit
encrypted or randomly masked data to designated Cloud directly. Our framework
utilizes random linear classifiers (RLCs) as the base classifiers in the
boosting framework to dramatically simplify the design of the proposed
confidential boosting protocols, yet still preserve the model quality. A
Cryptographic Service Provider (CSP) is used to assist the Cloud's processing,
reducing the complexity of the protocol constructions. We present two
constructions of SecureBoost: HE+GC and SecSh+GC, using combinations of
homomorphic encryption, garbled circuits, and random masking to achieve both
security and efficiency. For a boosted model, Cloud learns only the RLCs and
the CSP learns only the weights of the RLCs. Finally, the data owner collects
the two parts to get the complete model. We conduct extensive experiments to
understand the quality of the RLC-based boosting and the cost distribution of
the constructions. Our results show that SecureBoost can efficiently learn
high-quality boosting models from protected user-generated data.
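To illustrate why random linear classifiers can serve as base learners, here is a minimal plaintext AdaBoost-style sketch. It is an illustration only: the actual SecureBoost protocol performs the equivalent steps under homomorphic encryption or secret sharing with garbled circuits, and the function names, candidate count, and sampling scheme below are assumptions.

```python
import numpy as np

def rlc_boost(X, y, rounds=15, candidates=40, rng=None):
    """Boosting with random linear classifiers (RLCs), in the clear.

    X: (n, d) features; y: labels in {-1, +1}.
    RLCs are drawn at random rather than trained, which is what makes the
    encrypted protocol simple: only evaluation, not fitting, is needed.
    """
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    w = np.full(n, 1.0 / n)              # example weights
    model = []                           # list of (hyperplane, alpha)
    for _ in range(rounds):
        best = None
        for _ in range(candidates):
            h = rng.standard_normal(d + 1)          # random hyperplane + bias
            pred = np.sign(X @ h[:d] + h[d])
            pred[pred == 0] = 1
            err = w @ (pred != y)
            if err > 0.5:                # flipping a bad RLC yields a good one
                h, pred, err = -h, -pred, 1 - err
            if best is None or err < best[2]:
                best = (h, pred, err)
        h, pred, err = best
        err = min(max(err, 1e-12), 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)       # standard AdaBoost weight
        w = w * np.exp(-alpha * y * pred)
        w /= w.sum()
        model.append((h, alpha))
    return model

def rlc_predict(model, X):
    score = sum(a * np.sign(X @ h[:-1] + h[-1]) for h, a in model)
    return np.where(score >= 0, 1, -1)
```

In the protocol, the Cloud would see only the hyperplanes `h` while the CSP sees only the weights `alpha`, matching the split described in the abstract.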
Managing Risk of Bidding in Display Advertising
In this paper, we deal with the uncertainty of bidding for display
advertising. Similar to the financial market trading, real-time bidding (RTB)
based display advertising employs an auction mechanism to automate the
impression level media buying; and running a campaign is no different than an
investment of acquiring new customers in return for obtaining additional
converted sales. Thus, how to optimally bid on an ad impression to drive the
profit and return-on-investment becomes essential. However, the large
randomness of the user behaviors and the cost uncertainty caused by the auction
competition may result in a significant risk from the campaign performance
estimation. In this paper, we explicitly model the uncertainty of user
click-through rate estimation and auction competition to capture the risk. We
borrow an idea from finance and derive the value at risk for each ad display
opportunity. Our formulation results in two risk-aware bidding strategies that
penalize risky ad impressions and focus more on the ones with higher expected
return and lower risk. The empirical study on real-world data demonstrates the
effectiveness of our proposed risk-aware bidding strategies: yielding profit
gains of 15.4% in offline experiments and up to 17.5% in an online A/B test on
a commercial RTB platform over widely applied bidding strategies.
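The penalization idea behind such strategies can be shown with a minimal mean-variance sketch. This is not the paper's value-at-risk derivation; the function name, the linear value-times-CTR bid, and the penalty parameter `lam` are illustrative assumptions.

```python
def risk_aware_bid(ctr_mean, ctr_std, value_per_click, lam=1.0):
    """Minimal risk-penalized bid sketch (illustrative only).

    ctr_mean, ctr_std: mean and std of the CTR estimate for this impression
    value_per_click:   value of a converted click
    lam:               risk-aversion parameter; lam = 0 recovers the
                       risk-neutral truthful bid value * E[CTR]
    """
    # discount the expected CTR by lam standard deviations, floored at zero,
    # so risky (high-variance) impressions receive lower bids
    risk_adjusted_ctr = max(ctr_mean - lam * ctr_std, 0.0)
    return value_per_click * risk_adjusted_ctr
```

The effect matches the abstract's description: impressions with higher expected return and lower estimation uncertainty attract relatively higher bids.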
Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems
In nature, group behaviours such as flocking as well as cross-species symbiotic partnerships are observed in vastly different forms and circumstances. We hypothesize that such strategies can arise in response to generic predator-prey pressures in a spatial environment with range-limited sensation and action. We evaluate whether these forms of coordination can emerge through independent multi-agent reinforcement learning in simple multiple-species ecosystems. In contrast to prior work, we avoid hand-crafted shaping rewards, specific actions, or dynamics that would directly encourage coordination across agents. Instead, we test whether coordination, which has only indirect benefit, emerges as a consequence of adaptation. Our simulated ecosystems consist of a generic food chain involving three trophic levels: apex predator, mid-level predator, and prey. We conduct experiments on two different platforms: a 3D physics engine with tens of agents and a 2D grid world with up to thousands. The results clearly confirm our hypothesis and show substantial coordination both within and across species. To obtain these results, we leverage and adapt recent advances in deep reinforcement learning within an ecosystem training protocol featuring homogeneous groups of independent agents from different species (sets of policies), acting in many different random combinations in parallel habitats. The policies utilize neural network architectures that are invariant to agent individuality but not type (species) and that generalize across varying numbers of observed agents. While the emergence of complexity in artificial ecosystems has long been studied in the artificial life community, the focus has been more on individual complexity, genetic algorithms, and explicit modelling, and less on the group complexity and reinforcement learning emphasized in this article.
Unlike what the name and intuition suggest, reinforcement learning here adapts over evolutionary history rather than a lifetime, addressing the sequential optimization of fitness that is usually approached with genetic algorithms in the artificial life community. We utilize a shift from procedures to objectives, allowing us to bring powerful new machinery to bear, and we observe the emergence of complex behaviour from a sequence of simple optimization problems.
The endonasal midline inferior intercavernous approach to the cavernous sinus: technical note, cadaveric step-by-step illustration, and case presentation
Purpose: Traditional endoscopic endonasal approaches to the cavernous sinus (CS) open the anterior CS wall just medial to the internal carotid artery (ICA), posing a risk of vascular injury. This work describes a potentially safer midline sellar entry point for accessing the CS, utilizing its connection with the inferior intercavernous sinus (IICS) when anatomically present. Methods: The technique for the midline intercavernous dural access is described and depicted with cadaveric dissections and a clinical case. Results: An endoscopic endonasal approach exposed the periosteal dural layer of the anterior sella and CS. The IICS was opened sharply in the midline through its periosteal layer. The feather knife was inserted and advanced laterally within the IICS toward the anterior CS wall, thereby gradually incising the periosteal layer of the IICS. The knife was then turned superiorly and inferiorly in a vertical direction to open the anterior CS wall. This provided excellent access to the CS compartments, maintained the meningeal layer of the IICS and the medial CS wall, and avoided an initial dural incision immediately adjacent to the ICA. Conclusion: The midline intercavernous dural access to the CS, assisted by a 90-degree dissector-blade, is an effective modification of previously described techniques, with potentially lower risk to the ICA.
Order-Revealing Encryption and the Hardness of Private Learning
An order-revealing encryption scheme gives a public procedure by which two
ciphertexts can be compared to reveal the ordering of their underlying
plaintexts. We show how to use order-revealing encryption to separate
computationally efficient PAC learning from efficient differentially private PAC learning. That is, we construct a concept
class that is efficiently PAC learnable, but for which every efficient learner
fails to be differentially private. This answers a question of Kasiviswanathan
et al. (FOCS '08, SIAM J. Comput. '11).
To prove our result, we give a generic transformation from an order-revealing
encryption scheme into one with strongly correct comparison, which enables the
consistent comparison of ciphertexts that are not obtained as the valid
encryption of any message. We believe this construction may be of independent
interest.
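The interface being described can be illustrated with a deliberately insecure toy. This sketch shows only the *shape* of order-revealing encryption: a keyed encryption procedure plus a public, keyless comparison. A real scheme hides everything except the order; this toy is merely order-preserving and hides nothing, and the key value is an arbitrary illustrative choice.

```python
import hashlib

KEY = 7919  # assumed secret key (toy only; real schemes use proper keys)

def toy_enc(key, m):
    """Insecure toy encryption that is strictly increasing in m.

    Consecutive plaintexts map to ciphertexts at least `key` apart, while
    the hash-derived offset stays below `key`, so plaintext order survives.
    """
    offset = int(hashlib.sha256(f"{key}:{m}".encode()).hexdigest(), 16) % key
    return m * key + offset

def toy_compare(c1, c2):
    """Public comparison procedure: reveals plaintext order without the key."""
    return (c1 > c2) - (c1 < c2)   # -1, 0, or 1
```

The separation result hinges on exactly this property: anyone holding two ciphertexts can learn the ordering of the underlying plaintexts, which an efficient PAC learner can exploit but a differentially private one cannot.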
Differentiable Game Mechanics
Deep learning is built on the foundational guarantee that gradient descent on an objective function converges to local minima. Unfortunately, this guarantee fails in settings, such as generative adversarial nets, that exhibit multiple interacting losses. The behavior of gradient-based methods in games is not well understood and is becoming increasingly important as adversarial and multi-objective architectures proliferate. In this paper, we develop new tools to understand and control the dynamics in n-player differentiable games. The key result is to decompose the game Jacobian into two components. The first, symmetric, component is related to potential games, which reduce to gradient descent on an implicit function. The second, antisymmetric, component relates to Hamiltonian games, a new class of games that obey a conservation law akin to conservation laws in classical mechanical systems. The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in differentiable games. Basic experiments show SGA is competitive with recently proposed algorithms for finding stable fixed points in GANs, while at the same time being applicable to, and having guarantees in, much more general cases.
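The decomposition and adjustment can be sketched numerically. The update follows the published form xi + lam * A^T xi, where xi is the stacked simultaneous gradient and A the antisymmetric part of the game Jacobian; the function signatures and the bilinear test game below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sga_step(grads_fn, jac_fn, params, lr=0.05, lam=1.0):
    """One step of Symplectic Gradient Adjustment (illustrative sketch).

    grads_fn(params) -> xi, the stacked gradient of each player's loss
                        with respect to that player's parameters
    jac_fn(params)   -> J, the game Jacobian d(xi)/d(params)
    """
    xi = grads_fn(params)
    J = jac_fn(params)
    A = 0.5 * (J - J.T)               # antisymmetric (Hamiltonian) component
    adjusted = xi + lam * (A.T @ xi)  # symplectic adjustment of the dynamics
    return params - lr * adjusted
```

On the bilinear game with losses l1 = xy and l2 = -xy, plain simultaneous gradient descent cycles forever around the origin; the adjustment term rotates the vector field so the iterates spiral in to the stable fixed point at (0, 0).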
A Phase I/IIA Clinical Study With A Chimeric Mouse-Human Monoclonal Antibody To The V3 Loop Of Human Immunodeficiency Virus Type 1 Gp120
A phase I/IIA clinical trial with the chimeric mouse-human monoclonal antibody CGP 47 439 to the principal neutralization determinant in the V3 region of human immunodeficiency virus type 1 (HIV-1) strain IIIB envelope protein gp120 is reported. The trial was an uncontrolled, single-center, open-label, multidose tolerability, immunogenicity, and pharmacokinetic study in homosexual men with advanced HIV disease. Patient groups were formed on the basis of the reactivity of the antibody with the gp120 of their HIV-1 isolates. Intravenous infusions of 1, 10, and 25 mg of antibody were followed by seven escalated doses of 50, 100, and 200 mg every 3 weeks. The antibody was well tolerated; no toxicity was observed. Some patients showed a transient but insignificant antibody response to the antibody, with no apparent adverse reactions or accelerated elimination of it. Substantial serum levels of the antibody were maintained, with a mean t1/2β of 8-16 days. A virus burden reduction was observed in some patients.