5,408 research outputs found
Congestion control in multi-serviced heterogeneous wireless networks using dynamic pricing
Includes bibliographical references. Service providers (or operators) employ pricing schemes to help provide the desired QoS to subscribers and to maintain profitability among competitors. An economically efficient pricing scheme, which will seamlessly integrate users' preferences as well as service providers' preferences, is therefore needed. Otherwise, pricing schemes can be viewed as promoting social unfairness in the dynamically priced network. However, earlier investigations have shown that existing dynamic pricing schemes do not consider the users' willingness to pay (WTP) before the price of services is determined. WTP is the amount a user is willing to pay based on the worth attached to the service requested. Different subscribers have different WTP levels because of differences in demographics and in the value attached to the services requested. This research has addressed congestion control in the heterogeneous wireless network (HWN) by developing a dynamic pricing scheme that efficiently incentivises users to utilize radio resources. The proposed Collaborative Dynamic Pricing Scheme (CDPS), which identifies the users' and operators' preferences in determining the price of services, uses an intelligent approach for controlling congestion and enhancing both the users' and operators' utility. The CDPS addresses the congestion problem by, firstly, obtaining the users' WTP from their historical response to price changes and incorporating the WTP factor to evaluate the service price. Secondly, it uses a reinforcement learning technique to illustrate how a price policy can be obtained that enhances both users' and operators' utility, as the total utility reward obtained increases towards a defined "goal state".
Cooperation in Multi-Agent Reinforcement Learning
As progress in reinforcement learning (RL) gives rise to increasingly general and powerful artificial intelligence, society needs to anticipate a possible future in which multiple RL agents must learn and interact in a shared multi-agent environment. When a single principal has oversight of the multi-agent system, how should agents learn to cooperate via centralized training to achieve individual and global objectives? When agents belong to self-interested principals with imperfectly-aligned objectives, how can cooperation emerge from fully-decentralized learning? This dissertation addresses both questions by proposing novel methods for multi-agent reinforcement learning (MARL) and demonstrating the empirical effectiveness of these methods in high-dimensional simulated environments.
To address the first case, we propose new algorithms for fully-cooperative MARL in the paradigm of centralized training with decentralized execution. Firstly, we propose a method based on multi-agent curriculum learning and multi-agent credit assignment to address the setting where global optimality is defined as the attainment of all individual goals. Secondly, we propose a hierarchical MARL algorithm to discover and learn interpretable and useful skills for a multi-agent team to optimize a single team objective. Extensive experiments with ablations show the strengths of our approaches over state-of-the-art baselines.
To address the second case, we propose learning algorithms to attain cooperation within a population of self-interested RL agents. We propose the design of a new agent that is equipped with the ability to incentivize other RL agents and to explicitly account for the other agents' learning process. This agent overcomes the challenging limitation of fully-decentralized training and generates emergent cooperation in difficult social dilemmas. Then, we extend and apply this technique to the problem of incentive design, where a central incentive designer explicitly optimizes a global objective only by intervening on the rewards of a population of independent RL agents. Experiments on the problem of optimal taxation in a simulated market economy demonstrate the effectiveness of this approach. Ph.D.
Utilizing Reinforcement Learning and Computer Vision in a Pick-And-Place Operation for Sorting Objects in Motion
This master's thesis studies the implementation of advanced machine learning (ML) techniques in industrial automation systems, focusing on applying machine learning to enable and evolve autonomous sorting capabilities in robotic manipulators. In particular, Inverse Kinematics (IK) and Reinforcement Learning (RL) are investigated as methods for controlling a UR10e robotic arm for pick-and-place of moving objects on a conveyor belt within a small-scale sorting facility. A camera-based computer vision system applying YOLOv8 is used for real-time object detection and instance segmentation. Perception data is utilized to ascertain optimal grip points, specifically through an implemented algorithm that outputs optimal grip position, angle, and width. As the implemented system includes testing and evaluation on a physical system, the intricacies of hardware control, specifically the reverse engineering of an OnRobot RG6 gripper is elaborated as part of this study.
The system is implemented on the Robot Operating System (ROS), and its design emphasizes high modularity and scalability. The camera-based vision system serves as the primary input, while the robot control is the output. The implemented system design allows for the evaluation of motion control employing both IK and RL. Computation of IK is conducted via MoveIt2, while the RL model is trained and computed in NVIDIA Isaac Sim.
The high-level control of the robotic manipulator was accomplished using Proximal Policy Optimization (PPO). The main result of the research is a novel reward function for the pick-and-place operation that takes into account distance and orientation from the target object. In addition, the provided system administers task control by independently initializing pick-and-place operation phases for each environment. The findings demonstrate that PPO was able to significantly enhance the velocity, accuracy, and adaptability of industrial automation. Our research shows that accurate control of the robot arm can be reached by training the PPO model purely in a digital twin simulation.
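As a rough illustration of what a reward that "takes into account distance and orientation from the target object" might look like, the sketch below penalizes the Euclidean distance and the wrapped yaw error between gripper and target. The function name `pick_reward`, the single-angle orientation model, and the weights are assumptions made for illustration; the abstract does not give the thesis's actual reward function.

```python
import math


def pick_reward(gripper_pos, target_pos, gripper_yaw, target_yaw,
                w_dist=1.0, w_orient=0.5):
    """Hypothetical shaped reward (not the thesis's formula).

    Returns a non-positive value that approaches zero as the gripper
    gets closer to the target and better aligned with it.
    """
    # Euclidean distance between gripper and target positions.
    dist = math.dist(gripper_pos, target_pos)
    # Wrap the yaw difference into [-pi, pi] before taking its magnitude,
    # so that angles differing by a full turn count as aligned.
    diff = gripper_yaw - target_yaw
    yaw_err = abs(math.atan2(math.sin(diff), math.cos(diff)))
    # Illustrative weights; a real system would tune these.
    return -(w_dist * dist + w_orient * yaw_err)
```

A reward of this shape gives the policy a dense signal at every step, which is one common reason to combine distance and orientation terms instead of rewarding only a successful grasp.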
An Emergent Architecture for Scaling Decentralized Communication Systems (DCS)
With recent technological advancements now accelerating the mobile and wireless Internet solution space, a ubiquitous computing Internet is well within the research and industrial community's design reach - a decentralized system design, which is not solely driven by static physical models and sound engineering principles, but more dynamically, perhaps sub-optimally at initial deployment and socially-influenced in its evolution. To complement today's Internet system, this thesis proposes a Decentralized Communication System (DCS) architecture with the following characteristics: flat physical topologies with numerous compute oriented and communication intensive nodes in the network with many of these nodes operating in multiple functional roles; self-organizing virtual structures formed through alternative mobility scenarios and capable of serving ad hoc networking formations; emergent operations and control with limited dependency on centralized control and management administration. Today, decentralized systems are not commercially scalable or viable for broad adoption in the same way we have come to rely on the Internet or telephony systems. The premise in this thesis is that DCS can reach the high levels of resilience, usefulness, and scale that the industry has come to experience with traditional centralized systems by exploiting the following properties: (i.) network density and topological diversity; (ii.) self-organization and emergent attributes; (iii.) cooperative and dynamic infrastructure; and (iv.) node role diversity. This thesis delivers key contributions towards advancing the current state of the art in decentralized systems. First, we present the vision and a conceptual framework for DCS. Second, the thesis demonstrates that such a framework and concept architecture is feasible by prototyping a DCS platform that exhibits the above properties or, minimally, demonstrates that these properties are feasible through prototyped network services.
Third, this work expands on an alternative approach to network clustering using hierarchical virtual clusters (HVC) to facilitate self-organizing network structures. With increasing network complexity, decentralized systems can generally lead to unreliable and irregular service quality, especially given unpredictable node mobility and traffic dynamics. The HVC framework is an architectural strategy to address organizational disorder associated with traditional decentralized systems. The proposed HVC architecture along with the associated promotional methodology organizes distributed control and management services by leveraging alternative organizational models (e.g., peer-to-peer (P2P), centralized or tiered) in hierarchical and virtual fashion. Through simulation and analytical modeling, we demonstrate HVC efficiencies in DCS structural scalability and resilience by comparing static and dynamic HVC node configurations against traditional physical configurations based on P2P, centralized or tiered structures. Next, an emergent management architecture for DCS exploiting HVC for self-organization, introduces emergence as an operational approach to scaling DCS services for state management and policy control. In this thesis, emergence scales in hierarchical fashion using virtual clustering to create multiple tiers of local and global separation for aggregation, distribution and network control. Emergence is an architectural objective, which HVC introduces into the proposed self-management design for scaling and stability purposes. Since HVC expands the clustering model hierarchically and virtually, a clusterhead (CH) node, positioned as a proxy for a specific cluster or grouped DCS nodes, can also operate in a micro-capacity as a peer member of an organized cluster in a higher tier. As the HVC promotional process continues through the hierarchy, each tier of the hierarchy exhibits emergent behavior. 
With HVC as the self-organizing structural framework, a multi-tiered, emergent architecture enables the decentralized management strategy to improve scaling objectives that traditionally challenge decentralized systems. The HVC organizational concept and the emergence properties align with the view of the human brain's neocortex layering structure of sensory storage, prediction and intelligence. It is the position in this thesis that, for DCS to scale and maintain broad stability, network control and management must strive towards an emergent or natural approach. While today's models for network control and management have proven to lack scalability and responsiveness based on pure centralized models, it is unlikely that singular organizational models can withstand the operational complexities associated with DCS. In this work, we integrate emergence and learning-based methods in a cooperative computing manner towards realizing DCS self-management. However, unlike much existing work in these areas, which breaks down with increased network complexity and dynamics, the proposed HVC framework is utilized to offset these issues through effective separation, aggregation and asynchronous processing of both distributed state and policy. Using modeling techniques, we demonstrate that such an architecture is feasible and can improve the operational robustness of DCS. The modeling emphasis focuses on demonstrating the operational advantages of an HVC-based organizational strategy for emergent management services (i.e., reachability, availability or performance). By integrating the two approaches, the DCS architecture forms a scalable system to address the challenges associated with traditional decentralized systems. The hypothesis is that the emergent management system architecture will improve the operational scaling properties of DCS-based applications and services.
Additionally, we demonstrate structural flexibility of HVC as an underlying service infrastructure to build and deploy DCS applications and layered services. The modeling results demonstrate that an HVC-based emergent management and control system operationally outperforms traditional structural organizational models. In summary, this thesis brings together the above contributions towards delivering a scalable, decentralized system for Internet mobile computing and communications
Machine Learning Techniques and Stochastic Modeling in Mathematical Oncology
The cancer stem cell hypothesis claims that tumor growth and progression are driven by a (typically) small niche of the total cancer cell population called cancer stem cells (CSCs). These CSCs can go through symmetric or asymmetric divisions to differentiate into specialised, progenitor cells or reproduce new CSCs. While it was once held that this differentiation pathway was unidirectional, recent research has demonstrated that differentiated cells are more plastic than initially considered. In particular, differentiated cells can de-differentiate and recover their stem-like capacity. Two recent papers have considered how this rate of plasticity affects the evolutionary dynamic of an invasive, malignant population of stem cells and differentiated cells into existing tissue [64, 109]. These papers arrive at seemingly opposing conclusions, one claiming that increased plasticity results in increased invasive potential, and the other that increased plasticity decreases invasive potential. Here, we show that what is most important, when determining the effect on invasive potential, is how one distributes this increased plasticity between the compartments of resident and mutant-type cells. We also demonstrate how these results vary, producing non-monotone fixation probability curves, as inter-compartmental plasticity changes when differentiated cell compartments are allowed to continue proliferating, highlighting a fundamental difference between the two models. We conclude by demonstrating the stability of these qualitative results over various parameter ranges.
Imaging flow cytometry is a tool that uses the high-throughput capabilities of conventional flow cytometry for the purposes of producing single cell images. We demonstrate the label-free prediction of mitotic cell cycle phases in Jurkat cells by utilizing brightfield and darkfield images from an imaging flow cytometer. The method is non-destructive: it relies upon images only and does not introduce (potentially confounding) dyes or biomarkers to the cell cycles. By utilizing deep convolutional neural networks regularized by generated, synthetic images in the presence of severe class imbalance, we are able to produce an estimator that outperforms the previous state of the art on the dataset by 10-15%.
The in-silico development of a chemotherapeutic dosing schedule for treating cancer relies upon a parameterization of a particular tumour growth model to describe the dynamics of the cancer in response to the dose of the drug. In practice, it is often prohibitively difficult to ensure the validity of patient-specific parameterizations of these models for any particular patient. As a result, sensitivities to these particular parameters can result in therapeutic dosing schedules that are optimal in principle not performing well on particular patients. In this study, we demonstrate that chemotherapeutic dosing strategies learned via reinforcement learning methods are more robust to perturbations in patient-specific parameter values than those learned via classical optimal control methods. By training a reinforcement learning agent on mean-value parameters and allowing the agent periodic access to a more easily measurable metric, relative bone marrow density, for the purpose of optimizing dose schedule while reducing drug toxicity, we are able to develop drug dosing schedules that outperform schedules learned via classical optimal control methods, even when such methods are allowed to leverage the same bone marrow measurements.
Marketing Orientation, Customer Satisfaction and Retention: the Case of the Telecommunications Services Market in Jordan
A great deal of attention has been devoted by researchers to examine different aspects of the relationship between marketing orientation (MO) and competitive advantage, mostly within causal relationship style research. However, the mechanisms and intermediate variables underlying this relationship remain vague and poorly investigated.
Drawing upon mixed-method research utilising qualitative and quantitative techniques, this study aims to offer further insight into this relationship within Jordan's telecommunications market, focusing on customer satisfaction and customer retention as two prominent performance indicators in this market. Hence, this research set out to investigate the mechanisms and interrelationships that link marketing orientation and organisational performance, an issue that seems highly justified in a mature and competitive market where consumers have more choices, switching costs are decreasing and retention of the market base is becoming more and more difficult.
As a case study undertaken in Jordan's telecommunications market, the four main telecommunications operators in the market were represented. Quantitative data analysis was used to determine the variations between the main operators in the market regarding their adopted levels of marketing orientation. On the other hand, the qualitative technique - namely semi-structured interviews - represents the main instrument the study utilises to gain an in-depth insight into the relationship between marketing orientation (MO) and organisational performance. This qualitative tool enabled the researcher to construct a rich picture of the mechanisms and ways by which firms manage the different attitudinal dimensions of customer satisfaction and the behavioural dimensions of customer retention.
Results of the research confirm significant variations between high- and low-marketing orientation telecommunications operators with regard to the approaches, drivers and mechanisms by which firms manage their capabilities to achieve customer satisfaction and customer retention. Thus, two different patterns were indicated which were associated with the adopted level of marketing orientation of these firms.
The most important finding to come out of the research was that genuine marketing orientation is an integrated attitudinal-behavioural perspective. Hence, any deficiency in, or neglect of, any aspect will weaken a firm's overall value creation capability, the main mission of the marketing-oriented firm. In addition, internal culture emerged as a critical success factor for marketing-oriented firms. It serves as the glue that ensures a firm's values are adhered to, and also allows a clearer understanding of a firm's vision and mission, which in turn makes these firms better able to translate their attitudes into practice on the ground.
Moreover, the role of marketing orientation was substantial, as it worked as a supportive environment that stimulates a firm's capabilities to integrate and coordinate its resources and competencies into new ones in such a way as to enhance its overall performance as well as to achieve congruence with the changing business environment.
The importance of this research stems from its nature and approach in studying the relationship between marketing orientation and organisational performance. The main issue being evaluated is different from the bulk of marketing orientation works that have focused on examining different aspects of marketing orientation and organisational performance within causal relationship-style research, and mostly within a short-run view. In contrast, this study is concerned with gaining in-depth understanding of this relationship through evaluating its mechanisms and interrelationships, the aspect that was treated as a black box in prior research
A Reinforcement Learning based Cognitive Approach for Quality of Experience Management in the Future Internet
This thesis aims at providing an innovative contribution to the definition of the Future Internet Core Platform, in the frame of the "La Sapienza" University research activities on the EU FP7 FI-WARE project. The thesis introduces and designs an innovative "Cognitive Application Interface" in charge of deriving key parameters driving the Network Control elements to meet personalised Application Quality of Experience Requirements. The thesis proposes the innovative concept of a dynamic association between Applications and Classes of Service. A Reinforcement Learning based approach is followed. A solution based on a standard Q-learning algorithm is proposed. Simulation results obtained using the OPNET simulation tool are described. Preliminary work on an alternative solution based on a Foe Q-Learning algorithm is also illustrated. The proposed framework is very flexible, allows QoE personalization, requires low processing capabilities and entails a very limited signalling overhead
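For readers unfamiliar with the "standard Q-learning algorithm" mentioned above, the following is a minimal tabular sketch of the update rule and an epsilon-greedy action choice. Treating states as observed QoE conditions and actions as Classes of Service is an assumption made for illustration, as are the names `q_update` and `choose` and the parameter values; the abstract does not specify the thesis's state, action, or reward definitions.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update to the table `q` in place."""
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move Q(s, a) a fraction alpha towards the bootstrapped target.
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


def choose(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy selection: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical usage: actions stand in for Classes of Service.
classes = ["gold", "silver", "bronze"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
q_update(q_table, "qoe_low", "gold", 1.0, "qoe_ok", classes)
```

A table keyed by (state, action) pairs keeps per-step computation small, which is consistent with the low processing requirements the abstract claims, although the actual design may differ.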