Examination of optimizing information flow in networks
The central role of the Internet and the World-Wide-Web in global communications has refocused much attention on problems involving optimizing information flow through networks. The most basic formulation of the question is called the "max flow" optimization problem: given a set of channels with prescribed capacities that connect a set of nodes in a network, how should the materials or information be distributed among the various routes to maximize the total flow rate from the source to the destination? Linear programming theory is well developed for solving the classic max flow problem. Modern contexts have demanded the examination of more complicated variations of the max flow problem that take new factors or constraints into consideration; these changes lead to more difficult problems for which linear programming is insufficient.
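As an illustration of the classic max flow problem described above, the following is a minimal sketch that uses the Edmonds-Karp augmenting-path algorithm rather than a linear programming solver. The function name `max_flow`, the node labels, and the capacities of the small diamond-shaped example are hypothetical and are not taken from the workshop report.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest residual paths.

    capacity maps a directed channel (u, v) to its capacity.
    """
    # Residual graph: every channel gets a reverse edge of capacity 0.
    residual = {source: {}, sink: {}}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: the flow is maximal

        # Bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical diamond-shaped network with one cross channel a -> b.
channels = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 2,
            ("b", "t"): 3, ("a", "b"): 1}
print(max_flow(channels, "s", "t"))
```

In this example the two channels leaving the source have total capacity 5, and the cross channel lets the algorithm use all of it, so the computed flow equals that cut.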
In the workshop we examined models for information flow on networks that considered trade-offs between the overall network utility (or flow rate) and path diversity to ensure balanced usage of all parts of the network (and to ensure stability and robustness against local disruptions in parts of the network).
While the linear programming solution of the basic max flow problem cannot handle the current problem, the primal/dual approach to formulating constrained optimization problems can be applied to the current generation of problems, called network utility maximization (NUM) problems. In particular, primal/dual formulations have been used extensively in studies of such networks.
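For orientation, a commonly used textbook form of the NUM problem and its dual is sketched below. This is an assumption about the general setting, not the workshop's exact model (which also includes a path-diversity term); the symbols $x_s$ (rate of flow $s$), $U_s$ (concave utility), $c_l$ (capacity of channel $l$), $r(s)$ (set of channels used by flow $s$), and $\lambda_l$ (price of channel $l$) are introduced here for illustration.

```latex
% Primal: maximize total utility subject to channel capacities
\max_{x \ge 0} \; \sum_{s} U_s(x_s)
\quad \text{subject to} \quad
\sum_{s \,:\, l \in r(s)} x_s \le c_l \quad \text{for every channel } l .

% Lagrangian with one multiplier (price) per channel
L(x, \lambda) = \sum_{s} U_s(x_s)
  - \sum_{l} \lambda_l \Bigl( \sum_{s \,:\, l \in r(s)} x_s - c_l \Bigr) .

% Dual: minimize over nonnegative prices
\min_{\lambda \ge 0} \; D(\lambda),
\qquad
D(\lambda) = \max_{x \ge 0} L(x, \lambda) .
```

Because the Lagrangian separates across flows, each source can choose its rate using only the sum of the prices along its own route, which is what makes distributed algorithms possible in this setting.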
A key feature of the traffic-routing model we are considering is its formulation as an economic system, governed by principles of supply and demand. Considering channel capacities as a commodity of limited supply, we might suspect that a system that regulates traffic via a pricing scheme would assign prices to channels in a manner inversely proportional to their respective capacities.
Once an appropriate network optimization problem has been formulated, it remains to solve the optimization problem; this will need to be done numerically, but the process can greatly benefit from simplifications and reductions that follow from analysis of the problem. Ideally the form of the numerical solution scheme can give insight on the design of a distributed algorithm for a Transmission Control Protocol (TCP) that can be directly implemented on the network.
At the workshop we considered the optimization problems for two small prototype network topologies: the two-link network and the diamond network. These examples are small enough to be tractable during the workshop, but retain some of the key features relevant to larger networks (competing routes with different capacities from the source to the destination, and routes with overlapping channels, respectively). We studied a gradient descent method for obtaining the optimal solution via the dual problem. The numerical method was implemented in MATLAB, and further analysis of the dual problem and of the properties of the gradient method was carried out. Another thrust of the group's work was direct Monte Carlo simulation of information flow in these small networks as a means of directly testing the efficiency of various allocation strategies.
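To make the idea of a gradient method on the dual problem concrete, here is a minimal sketch in Python (the workshop implementation was in MATLAB and is not reproduced here). It assumes a standard NUM setting with logarithmic utilities and no path-diversity term, so it is not the workshop's model; the function name `dual_gradient_num`, the step size, and the example capacities are hypothetical.

```python
def dual_gradient_num(routes, capacity, step=0.05, iters=2000):
    """Projected gradient descent on the dual of a log-utility NUM problem.

    routes[s] lists the channel indices used by flow s;
    capacity[l] is the capacity of channel l.
    Returns (rates, prices).
    """
    prices = [1.0] * len(capacity)  # one price (multiplier) per channel
    rates = [0.0] * len(routes)
    for _ in range(iters):
        # Source step: maximizing log(x) - x * path_price gives x = 1 / path_price.
        for s, links in enumerate(routes):
            path_price = sum(prices[l] for l in links)
            rates[s] = 1.0 / max(path_price, 1e-9)  # guard against a zero price

        # Channel load implied by the current rates.
        load = [0.0] * len(capacity)
        for s, links in enumerate(routes):
            for l in links:
                load[l] += rates[s]

        # Price step: the dual gradient is (capacity - load), so descent
        # raises the price of overloaded channels; project onto prices >= 0.
        prices = [max(0.0, p + step * (y - c))
                  for p, y, c in zip(prices, load, capacity)]
    return rates, prices


# Hypothetical two-link example: two parallel routes, one flow on each.
rates, prices = dual_gradient_num(routes=[[0], [1]], capacity=[1.0, 2.0])
print(rates, prices)
```

Under these assumptions the rates approach the channel capacities and each price approaches the reciprocal of its channel's capacity. The price update uses only locally observable quantities (a channel's own load and capacity), which is the property that makes this kind of scheme a candidate for a distributed implementation.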
Federated Class-Incremental Learning with Prompting
As Web technology continues to develop, it has become increasingly common to
use data stored on different clients. At the same time, federated learning has
received widespread attention due to its ability to protect data privacy while
letting models learn from data distributed across various clients.
However, most existing works assume that the client's data are fixed. In
real-world scenarios, such an assumption is most likely not true as data may be
continuously generated and new classes may also appear. To this end, we focus
on the practical and challenging federated class-incremental learning (FCIL)
problem. For FCIL, the local and global models may suffer from catastrophic
forgetting on old classes caused by the arrival of new classes and the data
distributions of clients are non-independent and identically distributed
(non-iid).
In this paper, we propose a novel method called Federated Class-Incremental
Learning with PrompTing (FCILPT). Given the privacy and limited memory, FCILPT
does not use a rehearsal-based buffer to keep exemplars of old data. We choose
to use prompts to ease the catastrophic forgetting of the old classes.
Specifically, we encode the task-relevant and task-irrelevant knowledge into
prompts, preserving the old and new knowledge of the local clients and solving
the problem of catastrophic forgetting. We first sort the task information in
the prompt pool in the local clients to align the task information on different
clients before global aggregation. This ensures that the same task's knowledge
is fully integrated, solving the non-iid problem caused by the lack of
classes among different clients in the same incremental task. Experiments on
CIFAR-100, Mini-ImageNet, and Tiny-ImageNet demonstrate that FCILPT achieves
significant accuracy improvements over the state-of-the-art methods.
Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach
We design a new technique for the distributional semantic modeling with a
neural network-based approach to learn distributed term representations (or
term embeddings) - term vector space models as a result, inspired by the recent
ontology-related approach (using different types of contextual knowledge such
as syntactic knowledge, terminological knowledge, semantic knowledge, etc.) to
the identification of terms (term extraction) and relations between them
(relation extraction) called semantic pre-processing technology - SPT. Our
method relies on automatic term extraction from the natural language texts and
subsequent formation of the problem-oriented or application-oriented (also
deeply annotated) text corpora where the fundamental entity is the term
(including non-compositional and compositional terms). This gives us an
opportunity to change over from distributed word representations (or word
embeddings) to distributed term representations (or term embeddings). This
transition will allow us to generate more accurate semantic maps of different
subject domains (also, of relations between input terms - it is useful to
explore clusters and oppositions, or to test your hypotheses about them). The
semantic map can be represented as a graph using Vec2graph - a Python library
for visualizing word embeddings (term embeddings in our case) as dynamic and
interactive graphs. The Vec2graph library coupled with term embeddings will not
only improve accuracy in solving standard NLP tasks, but also update the
conventional concept of automated ontology development. The main practical
result of our work is the development kit (set of toolkits represented as web
service APIs and web application), which provides all necessary routines for
the basic linguistic pre-processing and the semantic pre-processing of the
natural language texts in Ukrainian for future training of term vector space
models.
Comment: In English, 9 pages, 2 figures. Not published yet. Prepared for
special issue (UkrPROG 2020 conference) of the scientific journal "Problems
in programming" (Founder: National Academy of Sciences of Ukraine, Institute
of Software Systems of NAS Ukraine).
Leveraging legacy codes to distributed problem solving environments: A web service approach
This paper describes techniques used to bring high performance legacy codes into a distributed problem solving environment as CORBA components. It first briefly introduces the software architecture adopted by the environment. Then it presents a CORBA oriented wrapper generator (COWG) which can be used to automatically wrap high performance legacy codes as CORBA components. Two legacy codes have been wrapped with COWG. One is an MPI-based molecular dynamic simulation (MDS) code; the other is a finite element based computational fluid dynamics (CFD) code for simulating incompressible Navier-Stokes flows. Performance comparisons between runs of the MDS CORBA component and the original MDS legacy code on a cluster of workstations and on a parallel computer are also presented. Wrapped as CORBA components, these legacy codes can be reused in a distributed computing environment. The first case shows that high performance can be maintained with the wrapped MDS component. The second case shows that a Web user can submit a task to the wrapped CFD component through a Web page without knowing the exact implementation of the component. In this way, a user’s desktop computing environment can be extended to a high performance computing environment using a cluster of workstations or a parallel computer.
A Framework for Design and Composition of Semantic Web Services
Semantic Web Services (SWS) are Web Services (WS)
whose description is semantically enhanced with markup
languages (e.g., OWL-S). This semantic description will enable external agents and programs to discover, compose and
invoke SWSs. However, as a step prior to the specification of an SWS in a language, it must be designed at a conceptual level to guarantee its correctness and avoid
inconsistencies among its internal components. In this
paper, we present a framework for design and (semi)
automatic composition of SWSs at a language-independent
and knowledge level. This framework is based on a stack of
ontologies that (1) describe the different parts of a SWS;
and (2) contain a set of axioms that serve as design rules to be verified by the ontology instances. Based on these ontologies, design and composition of SWSs can be viewed as the correct instantiation of the ontologies themselves. Once these instances have been created, they will be exported to SWS languages such as OWL-S.
Distributed human computation framework for linked data co-reference resolution
Distributed Human Computation (DHC) is a technique used to solve computational problems by incorporating the collaborative effort of a large number of humans. It is also a solution to AI-complete problems such as natural language processing. The Semantic Web, with its roots in AI, is envisioned to be a decentralised world-wide information space for sharing machine-readable data with minimal integration costs. Many research problems in the Semantic Web are considered AI-complete. An example is co-reference resolution, which involves determining whether different URIs refer to the same entity. This is considered a significant hurdle to overcome in the realisation of large-scale Semantic Web applications. In this paper, we propose a framework for building a DHC system on top of the Linked Data Cloud to solve various computational problems. To demonstrate the concept, we focus on handling co-reference resolution in the Semantic Web when integrating distributed datasets. The traditional way to solve this problem is to design machine-learning algorithms. However, they are often computationally expensive, error-prone and do not scale. We designed a DHC system named iamResearcher, which solves the scientific publication author identity co-reference problem when integrating distributed bibliographic datasets. In our system, we aggregated 6 million bibliographic records from various publication repositories. Users can sign up to the system to audit and align their own publications, thus solving the co-reference problem in a distributed manner. The aggregated results are published to the Linked Data Cloud.