
12 research outputs found

Least singular value and condition number of a square random matrix with i.i.d. rows

Author: Gregoratti Matteo
Maran Davide
Publication date: 14/04/2020

We consider a square random matrix made of i.i.d. rows with any distribution and prove that, for any given dimension, the probability that the least singular value lies in $[0, \epsilon)$ is at least of order $\epsilon$. This allows us to generalize a result about the expectation of the condition number that was proved in the case of centered Gaussian i.i.d. entries: such an expectation is always infinite. Moreover, we obtain additional results for some well-known random matrix ensembles, in particular for the isotropic log-concave case, which is proved to have the best behavior in terms of well-conditioning.
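A short sketch of why a lower bound of this kind forces a divergent expectation, written here only for $1/\sigma_{\min}$. The constant $c > 0$ and the threshold $\epsilon_0 > 0$ are assumptions introduced for illustration, not quantities taken from the paper, and passing from $1/\sigma_{\min}$ to the condition number needs an additional argument about the largest singular value that is not reproduced here.

```latex
% Assumption (illustrative): P(\sigma_{\min} < \epsilon) \ge c\,\epsilon
% for every \epsilon \in (0, \epsilon_0].
\begin{align*}
\mathbb{E}\!\left[\frac{1}{\sigma_{\min}}\right]
  &= \int_0^{\infty} \mathbb{P}\!\left(\frac{1}{\sigma_{\min}} > t\right) dt
   = \int_0^{\infty} \mathbb{P}\!\left(\sigma_{\min} < \frac{1}{t}\right) dt \\
  &\ge \int_{1/\epsilon_0}^{\infty} \frac{c}{t}\, dt
   = \infty .
\end{align*}
```

The first equality is the tail-integral formula for a nonnegative random variable; the inequality applies the assumed bound with $\epsilon = 1/t$, which is valid once $t \ge 1/\epsilon_0$.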

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Tight Performance Guarantees of Imitator Policies with Continuous Actions

Author: Maran Davide
Metelli Alberto Maria
Restelli Marcello
Publication date: 01/01/2023

Behavioral Cloning (BC) aims at learning a policy that mimics the behavior demonstrated by an expert. The current theoretical understanding of BC is limited to the case of finite actions. In this paper, we study BC with the goal of providing theoretical guarantees on the performance of the imitator policy in the case of continuous actions. We start by deriving a novel bound on the performance gap based on Wasserstein distance, applicable for continuous-action experts, holding under the assumption that the value function is Lipschitz continuous. Since this latter condition is hardly fulfilled in practice, even for Lipschitz Markov Decision Processes and policies, we propose a relaxed setting, proving that the value function is always Hölder continuous. This result is of independent interest and allows obtaining in BC a general bound for the performance of the imitator policy. Finally, we analyze noise injection, a common practice in which the expert’s action is executed in the environment after the application of a noise kernel. We show that this practice allows deriving stronger performance guarantees, at the price of a bias due to the noise addition.

Archivio istituzionale della ricerca - Politecnico di Milano

Autoregressive Bandits

Author: Bacchiocchi Francesco
Gatti Nicola
Genalti Gianmarco
Maran Davide
Metelli Alberto Maria
Mussi Marco
Restelli Marcello
Publication date: 12/12/2022

Autoregressive processes naturally arise in a large variety of real-world scenarios, including, e.g., stock markets, sales forecasting, weather prediction, advertising, and pricing. When addressing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations should be properly accounted for in order to converge to the optimal decision policy. In this work, we propose a novel online learning setting, named Autoregressive Bandits (ARBs), in which the observed reward follows an autoregressive process of order $k$, whose parameters depend on the action the agent chooses, within a finite set of $n$ actions. Then, we devise an optimistic regret minimization algorithm, AutoRegressive Upper Confidence Bounds (AR-UCB), that suffers regret of order $\widetilde{\mathcal{O}} \left( \frac{(k+1)^{3/2}\sqrt{nT}}{(1-\Gamma)^2} \right)$, where $T$ is the optimization horizon and $\Gamma < 1$ is an index of the stability of the system. Finally, we present a numerical validation in several synthetic settings and one real-world setting, in comparison with general and specific purpose bandit baselines, showing the advantages of the proposed approach.
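To make the setting concrete, the following is a minimal sketch of a reward stream that follows an autoregressive process of order k whose parameters depend on the chosen action. It is not the AR-UCB algorithm and is not taken from the paper: the function name `simulate_ar_rewards`, the `params` layout (an intercept plus k coefficients per action), the Gaussian noise, and the all-zero initial history are assumptions made for illustration.

```python
import random


def simulate_ar_rewards(params, actions, k, noise_std=0.0, seed=0):
    """Simulate rewards from an action-dependent AR(k) process.

    params[a] is a hypothetical pair (intercept, [g_1, ..., g_k]) for action a.
    The reward at each step is
        intercept + sum_i g_i * (reward observed i steps ago) + Gaussian noise.
    Assumes k >= 1 and an all-zero initial history.
    """
    rng = random.Random(seed)
    history = [0.0] * k  # most recent reward first
    rewards = []
    for a in actions:
        intercept, coeffs = params[a]
        x = intercept + sum(g * h for g, h in zip(coeffs, history))
        x += rng.gauss(0.0, noise_std)
        rewards.append(x)
        # Shift the window: newest reward in front, oldest one dropped.
        history = [x] + history[: k - 1]
    return rewards


if __name__ == "__main__":
    # Two hypothetical actions with different AR(2) parameters.
    params = {
        0: (1.0, [0.5, 0.2]),
        1: (0.5, [0.3, 0.1]),
    }
    print(simulate_ar_rewards(params, [0, 0, 1, 1, 0], k=2, noise_std=0.1))
```

Because each reward feeds into later ones, the value of playing an action depends on the recent reward history as well as on the action itself, which is the temporal dependence the abstract refers to.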

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Tight Performance Guarantees of Imitator Policies with Continuous Actions

Author: Maran Davide
Metelli Alberto Maria
Restelli Marcello
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 26/06/2023

Behavioral Cloning (BC) aims at learning a policy that mimics the behavior demonstrated by an expert. The current theoretical understanding of BC is limited to the case of finite actions. In this paper, we study BC with the goal of providing theoretical guarantees on the performance of the imitator policy in the case of continuous actions. We start by deriving a novel bound on the performance gap based on Wasserstein distance, applicable for continuous-action experts, holding under the assumption that the value function is Lipschitz continuous. Since this latter condition is hardly fulfilled in practice, even for Lipschitz Markov Decision Processes and policies, we propose a relaxed setting, proving that the value function is always Hölder continuous. This result is of independent interest and allows obtaining in BC a general bound for the performance of the imitator policy. Finally, we analyze noise injection, a common practice in which the expert's action is executed in the environment after the application of a noise kernel. We show that this practice allows deriving stronger performance guarantees, at the price of a bias due to the noise addition.

Association for the Advancement of Artificial Intelligence: AAAI Publications

Delayed Reinforcement Learning by Imitation

Author: Bisi Lorenzo
Liotet Pierre
Maran Davide
Restelli Marcello
Publication date: 01/01/2022

When the agent's observations or interactions are delayed, classic reinforcement learning tools usually fail. In this paper, we propose a simple yet new and efficient solution to this problem. We assume that, in the undelayed environment, an efficient policy is known or can be easily learned, but the task may suffer from delays in practice and we thus want to take them into account. We present a novel algorithm, Delayed Imitation with Dataset Aggregation (DIDA), which builds upon imitation learning methods to learn how to act in a delayed environment from undelayed demonstrations. We provide a theoretical analysis of the approach that will guide the practical design of DIDA. These results are also of general interest in the delayed reinforcement learning literature by providing bounds on the performance between delayed and undelayed tasks, under smoothness conditions. We show empirically that DIDA achieves high performance with remarkable sample efficiency on a variety of tasks, including robotic locomotion, classic control, and trading.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Development of a new extraction technique and HPLC method for the analysis of non-psychoactive cannabinoids in fibre-type Cannabis sativa L. (hemp)

Author: Benvenuti Stefania
Brighenti Virginia
Maran Davide
Pellati Federica
Steinbach Marleen
Publication venue: Elsevier BV
Publication date: 01/01/2017

The present work was aimed at the development and validation of a new, efficient and reliable technique for the analysis of the main non-psychoactive cannabinoids in fibre-type Cannabis sativa L. (hemp) inflorescences belonging to different varieties. This study was designed to identify samples with a high content of bioactive compounds, with a view to underscoring the importance of quality control in derived products as well. Different extraction methods, including dynamic maceration (DM), ultrasound-assisted extraction (UAE), microwave-assisted extraction (MAE) and supercritical-fluid extraction (SFE) were applied and compared in order to obtain a high yield of the target analytes from hemp. Dynamic maceration for 45 min with ethanol (EtOH) at room temperature proved to be the most suitable technique for the extraction of cannabinoids in hemp samples. The analysis of the target analytes in hemp extracts was carried out by developing a new reversed-phase high-performance liquid chromatography (HPLC) method coupled with diode array (UV/DAD) and electrospray ionization-mass spectrometry (ESI-MS) detection, by using an ion trap mass analyser. An Ascentis Express C18 column (150 mm × 3.0 mm I.D., 2.7 μm) was selected for the HPLC analysis, with a mobile phase composed of 0.1% formic acid in both water and acetonitrile, under gradient elution. The application of the fused-core technology allowed us to obtain a significant improvement of the HPLC performance compared with that of conventional particulate stationary phases, with a shorter analysis time and a remarkable reduction of solvent usage. The analytical method optimized in this study was fully validated to show compliance with international requirements. Furthermore, it was applied to the characterization of nine hemp samples and six hemp-based pharmaceutical products. As such, it was demonstrated to be a very useful tool for the analysis of cannabinoids in both the plant material and its derivatives for pharmaceutical and nutraceutical applications.

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Autoregressive Bandits

Author: Alberto Maria Metelli
Davide Maran
Francesco Bacchiocchi
Gianmarco Genalti
Marcello Restelli
Marco Mussi
Nicola Gatti
Publication venue: PMLR
Publication date: 01/01/2024

Archivio istituzionale della ricerca - Politecnico di Milano

When Love Just Ends: An Investigation of the Relationship Between Dysfunctional Behaviors, Attachment Styles, Gender, and Education Shortly After a Relationship Dissolution

Author: Cristina Civilotti
Daniela Acquadro Maran
Davide Margola
John Lawrence Dennis
Publication venue: Frontiers Media SA
Publication date: 01/06/2021

Much is known about the long-term consequences of separation and divorce, whereas there is a paucity of studies about the short-term consequences of such experiences. This study investigates the adoption of dysfunctional behaviors (e.g., insistent telephone calls and text messages, verbal threats, and sending unwanted objects) shortly after a relationship dissolution. A total of 136 participants who reported having been left by their former partner in the previous 6 months were included in this study (i.e., females: n = 84; males: n = 52; mean age = 30.38; SD = 4.19). Attachment styles were evaluated as explanatory variables when facing a relationship dissolution, in connection with a set of (1) demographic variables (i.e., gender, education, and current marital/relationship status), (2) dysfunctional behaviors, and (3) motivations underlying those behaviors. Results showed that a secure or dismissing attachment style, a higher education, and currently married (but awaiting separation) status were protective factors against adopting such dysfunctional behaviors, while the preoccupied and fearful-avoidant subjects, especially females, tended to adopt dysfunctional behaviors (i.e., communication attempts and defamation) and reported fear of abandonment and need for attention as underlying motivations. Future research on longitudinal aspects of the relationship dissolution process is required to gain deeper insights into this phenomenon. This study sheds light on the relationship between adult attachment styles and the motivations behind the adoption of dysfunctional behaviors after a relationship dissolution.

Directory of Open Access Journals

Extraction of alginate from Sargassum muticum: process optimization and study of its functional activities

Author: A Jensen
A Zykwinska
Anupriya Mazumder
CDL Martins
D Qiao
Davide De Francisci
DJ McHugh
EV Vinogradov
G Hernández-Carmona
G Klöck
GT Grant
H Grasdalen
H Uludag
H. N. Mishra
Irini Angelidaki
J Dusseault
JP Maran
JY Kim
KI Draget
KS Farvin
L Liu
M Campos
M Murata
M Oyaizu
M Zubia
MA Bezerra
Merlin Alvarado-Morales
MR Torres
N Blanco-Pascual
N González-López
N Heffernan
P Vauchel
Q Zhang
RV Muralidhar
S-L Hii
Susan Løvstad Holdt
T Salomonsen
TA Davis
TA Fenoradosoa
VL Singleton
X Zhao
Y Ge
Y Wang
Publication venue: Springer Science and Business Media LLC

Crossref