Soft Merging of Experts with Adaptive Routing
Sparsely activated neural networks with conditional computation learn to
route their inputs through different "expert" subnetworks, providing a form of
modularity that densely activated models lack. Despite their possible benefits,
models with learned routing often underperform their parameter-matched densely
activated counterparts as well as models that use non-learned heuristic routing
strategies. In this paper, we hypothesize that these shortcomings stem from the
gradient estimation techniques used to train sparsely activated models that use
non-differentiable discrete routing decisions. To address this issue, we
introduce Soft Merging of Experts with Adaptive Routing (SMEAR), which avoids
discrete routing by using a single "merged" expert constructed via a weighted
average of all of the experts' parameters. By routing activations through a
single merged expert, SMEAR does not incur a significant increase in
computational costs and enables standard gradient-based training. We
empirically validate that models using SMEAR outperform models that route based
on metadata or learn sparse routing through gradient estimation. Furthermore,
we provide qualitative analysis demonstrating that the experts learned via
SMEAR exhibit a significant amount of specialization. All of the code used in
our experiments is publicly available.
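The merging step described in this abstract can be illustrated with a minimal sketch. It assumes each expert is a single linear map stored as a weight matrix and that the router is a linear layer followed by a softmax; the names `smear_layer`, `merge_experts`, and `router` are hypothetical and are not taken from the authors' code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def merge_experts(experts, probs):
    """Weighted average of the experts' parameters (assumed: one matrix each)."""
    rows, cols = len(experts[0]), len(experts[0][0])
    return [
        [sum(p * e[i][j] for p, e in zip(probs, experts)) for j in range(cols)]
        for i in range(rows)
    ]

def smear_layer(x, router, experts):
    """Route x through a single merged expert instead of picking one expert."""
    probs = softmax(matvec(router, x))      # soft routing distribution
    merged = merge_experts(experts, probs)  # one merged weight matrix
    return matvec(merged, x), probs

# Two toy experts: identity and 2 * identity; a zero router gives equal weights.
experts = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]
router = [[0.0, 0.0], [0.0, 0.0]]
y, probs = smear_layer([1.0, 2.0], router, experts)
```

Because every operation here is differentiable, no gradient estimator is needed for the routing decision. For purely linear experts, merging parameters gives the same output as averaging expert outputs; with nonlinear experts the two differ, and only the merged expert is evaluated.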
Visualizing the equilibrium manifold
The characterisation of the competitive equilibrium set as a smooth manifold using a differentiable approach is a major contribution in economic theory. While there is a detailed characterisation of the equilibrium manifold using differentiable topology and geometry, there are no worked examples to illustrate the various properties that have been derived. This paper presents several examples, one with a globally unique equilibrium, and three where there can be multiplicity of equilibria. The algorithm to compute the equilibrium is given and can be used for other economies of interest.
Socio-technical transitions in the logistics sector: how companies manage their innovation in the era of digitalisation
There is a growing theoretical interest in the role of firms in socio-technical transitions studies. In doing so, transitions scholars start to explore business model innovation in the context of sustainability transitions. However, due to a lack of a firm-level perspective in transitions theory, it is still unclear how business model innovation unfolds in a socio-technical system context over time. This thesis adopts an S-D logic, service ecosystems view on the role of firms to extend the transitions theory. Besides, logistics innovations have not been widely explored in the transitions literature. As one of the first studies to explore logistics innovations from a socio-technical transitions perspective, this thesis also contributes to understanding the system reconfiguration of production and consumption systems from a logistics perspective.
This thesis employs a multiple-case study design to investigate firms’ innovation activities and their impact at the micro-, meso-, and macro-levels of service ecosystems. Qualitative research methods have been used to investigate firms’ innovation projects. The thesis then examines the interrelation between innovation activities, logistics strategies, and logistics trends to explain not only why firms have to balance their innovation efforts at the micro-level of the service ecosystem, but also how logistics innovations unfold at the meso- and macro-levels by the enactment of these value propositions.
In light of the investigation, this thesis adopts a balanced view of the relationship between firms and socio-technical systems, which helps overcome the ‘niche-actor’ and ‘regime-actor’ dichotomy in transitions studies. Also, this thesis provides a conceptual framework that bridges production and consumption by using logistics innovations as an empirical lens. Finally, due to the exploratory nature of this thesis, these findings may provide valuable insights for future research that integrates these two research streams for a better understanding of firm-level innovation activities in socio-technical transitions.
Beyond Ensemble Averages: Leveraging Climate Model Ensembles for Subseasonal Forecasting
Producing high-quality forecasts of key climate variables such as temperature
and precipitation on subseasonal time scales has long been a gap in operational
forecasting. Recent studies have shown promising results using machine learning
(ML) models to advance subseasonal forecasting (SSF), but several open
questions remain. First, several past approaches use the average of an ensemble
of physics-based forecasts as an input feature of these models. However,
ensemble forecasts contain information that can aid prediction beyond only the
ensemble mean. Second, past methods have focused on average performance,
whereas forecasts of extreme events are far more important for planning and
mitigation purposes. Third, climate forecasts correspond to a spatially-varying
collection of forecasts, and different methods account for spatial variability
in the response differently. Trade-offs between different approaches may be
mitigated with model stacking. This paper describes the application of a
variety of ML methods used to predict monthly average precipitation and
two-meter temperature using physics-based predictions (ensemble forecasts) and
observational data such as relative humidity, pressure at sea level, or
geopotential height, two weeks in advance for the whole continental United
States. Regression, quantile regression, and tercile classification tasks using
linear models, random forests, convolutional neural networks, and stacked
models are considered. The proposed models outperform common baselines such as
historical averages (or quantiles) and ensemble averages (or quantiles). This
paper further includes an investigation of feature importance, trade-offs
between using the full ensemble or only the ensemble average, and different
modes of accounting for spatial variability.
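The quantile regression and tercile classification tasks named in this abstract can be made concrete with a small sketch. It assumes the standard pinball (quantile) loss and a three-way split at two given thresholds; the function names and the threshold arguments `lower` and `upper` are hypothetical and do not come from the paper.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q (0 < q < 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)

def tercile_label(value, lower, upper):
    """Map a value to 0 (below), 1 (near), or 2 (above) the given thresholds."""
    if value < lower:
        return 0
    if value > upper:
        return 2
    return 1

# At q = 0.9, missing low costs more than missing high by the same margin.
under = pinball_loss([1.0], [0.0], 0.9)
over = pinball_loss([0.0], [1.0], 0.9)
```

The asymmetric penalty is what lets a quantile model target the tails of the distribution, which is relevant to the abstract's point about extreme events.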
LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks
This paper presents a novel approach to enhancing autonomous robotic
manipulation that uses a Large Language Model (LLM) for logical inference,
converting high-level language commands into sequences of executable motion
functions. The proposed system combines the advantages of the LLM with YOLO-based
environmental perception to enable robots to autonomously make reasonable
decisions and plan tasks based on the given commands. Additionally, to
address potential inaccuracies or illogical actions arising from the LLM, a
combination of teleoperation and Dynamic Movement Primitives (DMP) is employed
for action correction. This integration aims to improve the practicality and
generalizability of the LLM-based human-robot collaboration system.
Comment: IEEE MHS 202