97 research outputs found

Soft Merging of Experts with Adaptive Routing

Authors: Liu Haokun; Muqeeth Mohammed; Raffel Colin
Publication date: 06/06/2023

Sparsely activated neural networks with conditional computation learn to route their inputs through different "expert" subnetworks, providing a form of modularity that densely activated models lack. Despite their possible benefits, models with learned routing often underperform their parameter-matched densely activated counterparts as well as models that use non-learned heuristic routing strategies. In this paper, we hypothesize that these shortcomings stem from the gradient estimation techniques used to train sparsely activated models that use non-differentiable discrete routing decisions. To address this issue, we introduce Soft Merging of Experts with Adaptive Routing (SMEAR), which avoids discrete routing by using a single "merged" expert constructed via a weighted average of all of the experts' parameters. By routing activations through a single merged expert, SMEAR does not incur a significant increase in computational costs and enables standard gradient-based training. We empirically validate that models using SMEAR outperform models that route based on metadata or learn sparse routing through gradient estimation. Furthermore, we provide qualitative analysis demonstrating that the experts learned via SMEAR exhibit a significant amount of specialization. All of the code used in our experiments is publicly available.
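The merging step described in this abstract can be illustrated with a small sketch. It is not the authors' code: it assumes each expert is a single weight matrix, that the router is a linear map followed by a softmax, and the names `smear_layer`, `merge_experts`, and `router` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def merge_experts(experts, probs):
    """Weighted average of the experts' parameters (one matrix per expert)."""
    rows, cols = len(experts[0]), len(experts[0][0])
    return [
        [sum(p * E[r][c] for p, E in zip(probs, experts)) for c in range(cols)]
        for r in range(rows)
    ]

def smear_layer(x, router, experts):
    """Route x through a single merged expert (assumed form, for illustration)."""
    probs = softmax(matvec(router, x))      # one routing weight per expert
    merged = merge_experts(experts, probs)  # average parameters, not outputs
    return matvec(merged, x), probs

# Toy setup: two 2x2 experts and a router that weights them equally.
experts = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity
    [[0.0, 1.0], [1.0, 0.0]],  # swaps the two coordinates
]
router = [[0.0, 0.0], [0.0, 0.0]]  # zero logits give uniform routing
y, probs = smear_layer([2.0, 4.0], router, experts)
```

Because every step is differentiable, the router could be trained with ordinary backpropagation in an autodiff framework. With the purely linear experts used here, merging parameters gives the same output as averaging the experts' outputs; the two differ once experts contain non-linearities.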

arXiv.org e-Print Archive

Socio-technical transitions in the logistics sector: how companies manage their innovation in the era of digitalisation

Author: Liu Haokun

There is a growing theoretical interest in the role of firms in socio-technical transitions studies. Accordingly, transitions scholars have started to explore business model innovation in the context of sustainability transitions. However, due to a lack of a firm-level perspective in transitions theory, it is still unclear how business model innovation unfolds in a socio-technical system context over time. This thesis adopts an S-D logic, service ecosystems view on the role of firms to extend the transitions theory. Besides, logistics innovations have not been widely explored in the transitions literature. As one of the first studies to explore logistics innovations from a socio-technical transitions perspective, this thesis also contributes to understanding the system reconfiguration of production and consumption systems from a logistics perspective. This thesis employs a multiple-case study design to investigate firms’ innovation activities and their impact at the micro-, meso-, and macro-levels of service ecosystems. Qualitative research methods have been used to investigate firms’ innovation projects. The thesis then examines the interrelation between innovation activities, logistics strategies, and logistics trends to explain not only why firms have to balance their innovation efforts at the micro-level of the service ecosystem, but also how logistics innovations unfold at the meso- and macro-levels by the enactment of these value propositions. In light of the investigation, this thesis adopts a balanced view of the relationship between firms and socio-technical systems, which helps overcome the ‘niche-actor’ and ‘regime-actor’ dichotomy in transitions studies. Also, this thesis provides a conceptual framework that bridges production and consumption by using logistics innovations as an empirical lens.
Finally, due to the exploratory nature of this thesis, these findings may provide valuable insights for future research that integrates these two research streams for a better understanding of the firm-level innovation activities in socio-technical transitions.

Online Research @ Cardiff

Beyond Ensemble Averages: Leveraging Climate Model Ensembles for Subseasonal Forecasting

Authors: Cash Benjamin; Liu Haokun; Orlova Elena; Rossellini Raphael; Willett Rebecca
Publication date: 02/11/2023

Producing high-quality forecasts of key climate variables such as temperature and precipitation on subseasonal time scales has long been a gap in operational forecasting. Recent studies have shown promising results using machine learning (ML) models to advance subseasonal forecasting (SSF), but several open questions remain. First, several past approaches use the average of an ensemble of physics-based forecasts as an input feature of these models. However, ensemble forecasts contain information that can aid prediction beyond only the ensemble mean. Second, past methods have focused on average performance, whereas forecasts of extreme events are far more important for planning and mitigation purposes. Third, climate forecasts correspond to a spatially-varying collection of forecasts, and different methods account for spatial variability in the response differently. Trade-offs between different approaches may be mitigated with model stacking. This paper describes the application of a variety of ML methods used to predict monthly average precipitation and two-meter temperature using physics-based predictions (ensemble forecasts) and observational data such as relative humidity, pressure at sea level, or geopotential height, two weeks in advance for the whole continental United States. Regression, quantile regression, and tercile classification tasks using linear models, random forests, convolutional neural networks, and stacked models are considered. The proposed models outperform common baselines such as historical averages (or quantiles) and ensemble averages (or quantiles). This paper further includes an investigation of feature importance, trade-offs between using the full ensemble or only the ensemble average, and different modes of accounting for spatial variability.
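Two of the task types named in this abstract, quantile regression and tercile classification, can be made concrete with a short sketch. It is an illustration only, not the paper's pipeline: the pinball loss is the standard objective for quantile regression, while the tercile thresholds below use a crude empirical rule chosen for brevity. The function names are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def tercile_labels(values, history):
    """Label each value 0 (below normal), 1 (near normal), or 2 (above normal).

    Thresholds are picked by index from the sorted history, which is a
    simplifying assumption rather than a careful quantile estimate.
    """
    s = sorted(history)
    n = len(s)
    lower, upper = s[n // 3], s[(2 * n) // 3]
    labels = []
    for v in values:
        if v < lower:
            labels.append(0)
        elif v < upper:
            labels.append(1)
        else:
            labels.append(2)
    return labels

# For a high quantile, under-predicting costs more than over-predicting.
under = pinball_loss([10.0], [8.0], 0.9)
over = pinball_loss([10.0], [12.0], 0.9)
labels = tercile_labels([2, 5, 8], list(range(1, 10)))
```

The asymmetry in the pinball loss is what pushes a model toward the requested quantile, which is one way to target extremes rather than average behavior.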

arXiv.org e-Print Archive

LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks

Authors: Aoyama Tadayoshi; Hasegawa Yasuhisa; Kato Kenji; Kondo Izumi; Liu Haokun; Zhu Yaonan
Publication date: 28/08/2023

This paper presents a novel approach to enhance autonomous robotic manipulation using a Large Language Model (LLM) for logical inference, converting high-level language commands into sequences of executable motion functions. The proposed system combines the advantages of the LLM with YOLO-based environmental perception to enable robots to autonomously make reasonable decisions and plan tasks based on the given commands. Additionally, to address the potential inaccuracies or illogical actions arising from the LLM, a combination of teleoperation and Dynamic Movement Primitives (DMP) is employed for action correction. This integration aims to improve the practicality and generalizability of the LLM-based human-robot collaboration system.
Comment: IEEE MHS 202
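The idea of turning an LLM's plan into a sequence of executable motion functions can be sketched as a small dispatcher. Everything here is assumed for illustration: the JSON plan format, the motion function names (`move_to`, `grasp`, `release`), and the set of detected objects standing in for the perception output. The paper's actual interfaces are not given in this listing.

```python
import json

# Hypothetical motion primitives; a real system would command the robot.
def move_to(target):
    return f"move_to({target})"

def grasp(target):
    return f"grasp({target})"

def release():
    return "release()"

# Whitelist of functions the planner is allowed to call.
MOTION_FUNCTIONS = {"move_to": move_to, "grasp": grasp, "release": release}

def execute_plan(plan_json, detected_objects):
    """Run each planned step, setting aside steps that cannot be trusted.

    A step is set aside when it names an unknown function or targets an
    object that perception did not report. In this sketch such steps are
    only collected so that a human operator could correct them.
    """
    log, needs_correction = [], []
    for step in json.loads(plan_json):
        fn = MOTION_FUNCTIONS.get(step.get("name"))
        args = step.get("args", {})
        target = args.get("target")
        if fn is None or (target is not None and target not in detected_objects):
            needs_correction.append(step)
            continue
        log.append(fn(**args))
    return log, needs_correction

# Assumed plan text, as if returned by the language model.
plan = json.dumps([
    {"name": "move_to", "args": {"target": "cup"}},
    {"name": "grasp", "args": {"target": "cup"}},
    {"name": "fly", "args": {}},
    {"name": "release", "args": {}},
])
log, needs_correction = execute_plan(plan, {"cup"})
```

Validating each step against a whitelist and against what perception reports keeps an illogical plan from being executed blindly, which mirrors the role the abstract gives to action correction.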

arXiv.org e-Print Archive

Developing logistics value propositions: drawing insights from a distributed manufacturing solution

Authors: Liu Haokun; Mason Robert; Purvis Laura; Wells Peter
Publication venue: Elsevier BV
Publication date: 01/08/2020

With a focus on supply chains as ecosystems of service exchange, our paper aims to explore how value propositions are developed and evolve via combinations of service innovation. A single longitudinal case study is presented. The units of analysis are different projects along a logistics service provider (LSP)’s innovation journey. The study explores how the case company identified innovation in logistics as a gap and developed a distributed manufacturing strategy with a unique business model involving a reallocation of production functions across a global supply network. Our contribution is two-fold. In terms of theory, we adopt a service-dominant logic perspective to investigate how companies' value propositions evolve over time. In terms of managerial contributions, our paper provides insights into how service providers can strategically integrate their resources with service ecosystem partners to provide competitive business propositions.

Roehampton University Research Repository

Online Research @ Cardiff