Resisting Backdoor Attacks in Federated Learning via Bidirectional Elections and Individual Perspective
Existing approaches defend against backdoor attacks in federated learning
(FL) mainly through a) mitigating the impact of infected models, or b)
excluding infected models. The former negatively impacts model accuracy, while
the latter usually relies on globally clear boundaries between benign and
infected model updates. However, in reality model updates are easily mixed and
scattered because of the diverse distributions of local data.
This work focuses on excluding infected models in FL. Unlike previous
perspectives from a global view, we propose Snowball, a novel anti-backdoor FL
framework through bidirectional elections from an individual perspective
inspired by one principle deduced by us and two principles in FL and deep
learning. It is characterized by a) bottom-up election, where each candidate
model update votes for several of its peers so that a few model updates are
elected as selectees for aggregation; and b) top-down election, where the
selectees progressively enlarge their set by picking up further updates from
the candidates. We compare Snowball with state-of-the-art defenses against
backdoor attacks in FL on five real-world datasets, demonstrating its superior
resistance to backdoor attacks and only a slight impact on the accuracy of the
global model.
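The bottom-up election described above can be illustrated with a minimal sketch. It is not the Snowball implementation: the voting rule (each update votes for its nearest peers by Euclidean distance), the function name `bottom_up_election`, and the parameters `votes_per_model` and `num_selectees` are all assumptions made for illustration.

```python
import numpy as np


def bottom_up_election(updates, votes_per_model=2, num_selectees=2):
    """Hypothetical voting rule: each update votes for its nearest peers.

    Returns the indices of the elected selectees and the vote tally.
    """
    updates = np.asarray(updates, dtype=float)
    n = len(updates)
    # Pairwise Euclidean distances between flattened model updates.
    dists = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # an update cannot vote for itself
    tally = np.zeros(n, dtype=int)
    for i in range(n):
        nearest = np.argsort(dists[i], kind="stable")[:votes_per_model]
        tally[nearest] += 1
    # Updates with the most votes become selectees (ties go to lower index).
    order = np.argsort(-tally, kind="stable")
    return sorted(order[:num_selectees].tolist()), tally


# Three similar updates and one distant update (index 3).
updates = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
selectees, tally = bottom_up_election(updates)
```

In this toy input the distant update receives no votes and is therefore not elected. A top-down step would then grow the selectee set from the remaining candidates.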
An Empirical Study of the Landscape of Open Source Projects in Baidu, Alibaba, and Tencent
Open source software has drawn more and more attention from researchers,
developers and companies nowadays. Meanwhile, many Chinese technology companies
are embracing open source and choosing to open source their projects.
Nevertheless, most previous studies are concentrated on international companies
such as Microsoft or Google, while the practical values of open source projects
of Chinese technology companies remain unclear. To address this issue, we
conduct a mixed-method study to investigate the landscape of projects open
sourced by three large Chinese technology companies, namely Baidu, Alibaba, and
Tencent (BAT). We study the categories and characteristics of open source
projects, developers' perceptions of the open sourcing efforts of these
companies, and the internationalization efforts of their open source projects.
We collected 1,000 projects that were open sourced by BAT on GitHub
and performed an online survey that received 101 responses from developers of
these projects. Some key findings include: 1) BAT prefer to open source
frontend development projects, 2) 88% of the respondents are positive towards
open sourcing software projects in their respective companies, 3) 64% of the
respondents reveal that the most common motivations for BAT to open source
their projects are the desire to gain fame, expand their influence and gain
recruitment advantage, 4) respondents believe that the most common
internationalization effort is "providing an English version of readme files",
5) projects with more internationalization effort (i.e., those that include an
English readme file) are more popular. Our findings provide directions for
software engineering researchers and practical suggestions for software
developers and Chinese technology companies.
Learning to Dispatch Multi-Server Jobs in Bipartite Graphs with Unknown Service Rates
Multi-server jobs are imperative in modern cloud computing systems. A
multi-server job has multiple components and requests multiple servers in order
to be served. How to allocate limited computing devices to jobs is a topic
of great concern, which has driven the development of job scheduling and load
balancing algorithms. However, current job dispatching algorithms require the
service rates to be constant and known, which is difficult to realize in
production systems. Besides, for multi-server jobs, the dispatching decision
for each job component follows the All-or-Nothing property under service
locality constraints and resource capacity limits, which is not well supported
by mainstream algorithms. In this paper, we propose a dispatching algorithm for
multi-server jobs that learns the unknown service rates and simultaneously
maximizes the expected Accumulative Social Welfare (Asw). We formulate the Asw
as the sum of utilities of jobs and servers achieved over each time slot. The
utility of a job is proportional to the valuation for being served, which is
mainly impacted by the fluctuating but unknown service rates. We maximize the
Asw without knowing the exact valuations, approximating them instead through
exploration-exploitation. To this end, we introduce several evolving statistics
and maximize the statistical Asw with dynamic programming. The proposed
algorithm is proved to have polynomial complexity and state-of-the-art
regret. We validate it with extensive simulations, and the results show that the
proposed algorithm outperforms several benchmark policies with improvements of
up to 73%, 36%, and 28%, respectively.
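The idea of learning unknown service rates through exploration-exploitation can be sketched with a standard UCB1-style estimator. This is a generic bandit technique swapped in for illustration, not the algorithm proposed in the paper; the class name `ServiceRateEstimator` and its methods are hypothetical.

```python
import math


class ServiceRateEstimator:
    """UCB1-style estimates of per-server service rates (illustrative only)."""

    def __init__(self, num_servers):
        self.counts = [0] * num_servers   # observations per server
        self.means = [0.0] * num_servers  # running mean of observed rates
        self.t = 0                        # total number of observations

    def update(self, server, observed_rate):
        # Incremental mean update after observing one service rate sample.
        self.t += 1
        self.counts[server] += 1
        n = self.counts[server]
        self.means[server] += (observed_rate - self.means[server]) / n

    def index(self, server):
        # Unobserved servers get an infinite index so they are tried first.
        n = self.counts[server]
        if n == 0:
            return math.inf
        return self.means[server] + math.sqrt(2.0 * math.log(self.t) / n)

    def select(self):
        # Pick the server with the largest optimistic estimate.
        return max(range(len(self.counts)), key=self.index)


est = ServiceRateEstimator(3)
first = est.select()    # nothing observed yet: the first server is chosen
est.update(0, 1.0)
est.update(1, 5.0)
est.update(2, 1.0)
best = est.select()     # server 1 has the highest optimistic estimate
```

The confidence bonus shrinks as a server is observed more often, so the estimator gradually shifts from exploring all servers to exploiting the ones with high estimated rates.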
Towards Semantic e-Science for Traditional Chinese Medicine
Background: Recent advances in Web and information technologies, together with the increasing decentralization of organizational structures, have resulted in massive amounts of information resources and domain-specific services in Traditional Chinese Medicine (TCM). The massive volume and diversity of the available information and services have made it difficult to achieve seamless and interoperable e-Science for knowledge-intensive disciplines like TCM. Therefore, information integration and service coordination are two major challenges in e-Science for TCM. We still lack sophisticated approaches to integrate scientific data and services for TCM e-Science. Results: We present a comprehensive approach to building dynamic and extendable e-Science applications for knowledge-intensive disciplines like TCM based on semantic and knowledge-based techniques. The semantic e-Science infrastructure for TCM supports large-scale database integration and service coordination in a virtual organization. We use domain ontologies to integrate TCM database resources and services in a semantic cyberspace and deliver a semantically superior experience, including browsing, searching, querying and knowledge discovery, to users. We have developed a collection of semantic-based toolkits to help TCM scientists and researchers with information sharing and collaborative research. Conclusion: Semantic and knowledge-based techniques are suitable for knowledge-intensive disciplines like TCM. It is possible to build an on-demand e-Science system for TCM based on existing semantic and knowledge-based techniques. The approach presented in the paper integrates heterogeneous distributed TCM databases and services, and provides scientists with a semantically superior experience to support collaborative research in the TCM discipline.
Trust-based Service Recommendation in Social Network
As the number of Web services on the Internet keeps increasing, recommending personalized Web services to users has become more and more important. Some existing service recommendation systems use influence ranking and collaborative filtering algorithms. However, they neither consider trust relationships among users nor deal well with the cold start problem. Fortunately, the current popularity of social networks offers a good alternative for service recommendation that avoids these shortcomings. In this study, we propose a social network-based service recommendation method, which considers users' historical service invocation behaviors, users' preferences, trust relationships among users implied in the social network, and users' comments/reviews on services. We have applied this method to a data set extracted from www.epinions.com. A series of experiments on 86,719 users, 604,190 user trust relationships and 963,591 reviews on 292,713 services/products show that this recommendation method achieves better recall, precision, F-measure and rank score.
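For readers unfamiliar with the evaluation metrics named above, the following sketch shows how precision, recall and F-measure are commonly computed for a single recommendation list. The function name `precision_recall_f1` and the sample item identifiers are hypothetical and are not taken from the study.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 of a recommendation list against ground truth."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that are relevant
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall (0 when there are no hits).
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Four recommended services, three relevant ones, two in common.
precision, recall, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
```

Here two of the four recommendations are relevant (precision 1/2) and two of the three relevant items are recovered (recall 2/3), giving an F1 of 4/7.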
Virtual Workflow Management System in Grid Environment
Building a workflow management system (WFMS) is a large project. However, with the development of grid technology, it has become easier to do. In the Open Grid Services Architecture (OGSA), everything is a service. Thus, a workflow management system can be easily built upon a set of services. This paper proposes some workflow management services and important components, and presents the definition of a virtual workflow management system (VWFMS). Related issues such as service registration and discovery are discussed in detail.