116 research outputs found

Recommended from our members

State-based Policy Representation for Deep Policy Learning

Author: Liu Fangchen
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Reinforcement learning has achieved notable success in many fields, such as video game playing, continuous control, and the game of Go. On the other hand, current approaches usually have high sample complexity and transfer poorly to similar tasks. Imitation learning, also known as "learning from demonstrations", can mitigate the former problem by providing successful experiences. However, current methods usually assume that the expert and the imitator are the same, which limits flexibility and robustness when the dynamics change. Generalizability is at the core of artificial intelligence. An agent should be able to apply its knowledge to novel tasks after training in similar environments or after being given related demonstrations. Given the current observation, it should be able to predict what can happen (modeling) and what needs to happen (planning). This raises the challenges of how to represent knowledge and how to use it by learning from interactions or demonstrations. In this thesis, we systematically study two important problems, the universal goal-reaching problem and the cross-morphology imitation learning problem, which are representative challenges in reinforcement learning and imitation learning. Laying out our research on these challenging tasks unfolds our roadmap toward the holy-grail goal: making agents generalizable by learning from observations and modeling the world.

eScholarship - University of California

Make the smell dancing : designing an experimental interactive purifying device near an open fire for frequent users in Finnish Lapland

Author: Dai Fangchen
Publication venue: University of Lapland (Lapin yliopisto)
Publication date: 01/01/2024

In Finnish Lapland, a “laavu” is a lean-to structure that is used by the Finnish people in the open air to get warmth and rest. It is a cultural heritage that evokes the positive feeling of relaxing in a safe place and enjoying nature with family and friends. Often people who use the laavu do not mind the smell of the smoke even though there are a few airborne pollutants generated by burning logs. In my research, I am approaching the dilemma of the cultural and simultaneously pollutant phenomenon of smoke by designing an interactive purifying device using negative ionization technology to cluster the pollutant to the ground for the benefit of the users and the environment. Besides, previous studies show that applying a negative ionizer has a positive effect on depression and stress. The thermoelectric effect is used to power the device as a clean energy source. The research methods are Design-Based Research, Service Design and the Agile process method, and Art-Based Action Research. The data was collected on site by questionnaires and the researcher’s observation diary, and from interviews in the University of Lapland. The artistic production as the result of the research project is presented by several interactive prototypes, the structure of which aims to integrate modern technology and interactive art into the traditional experience of using a laavu. The comparison and conflict between the diverse individual feelings towards fire, smoke, and artistic technology applied to nature and the environment is also probed and discussed

Lauda

Hybrid model and structured sparsity for under-determined convolutive audio source separation

Author: Feng Fangchen; Kowalski Matthieu
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 04/05/2014

We consider the problem of extracting the source signals from an under-determined convolutive mixture, assuming known filters. We start from its formulation as the minimization of a convex functional that combines a classical $\ell_2$ discrepancy term, between the observed mixture and the one reconstructed from the estimated sources, with a sparse regularization term on the source coefficients in a time-frequency domain. We then introduce a first kind of structure, using a hybrid model. Finally, we embed the previously introduced Windowed-Group-Lasso operator into the iterative thresholding/shrinkage algorithm in order to take into account some structure inside each layer of the time-frequency representations. Intensive numerical studies confirm the benefits of such an approach.
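
The iterative thresholding/shrinkage step named in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a plain instantaneous mixture `x = A s` instead of a convolutive one, works directly on the coefficients rather than in a time-frequency domain, and uses ordinary elementwise soft thresholding (an l1 penalty) where the paper uses the Windowed-Group-Lasso operator. The names `soft_threshold`, `ista`, `objective`, and `lam` are hypothetical.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(A, x, s, lam):
    """0.5 * ||x - A s||_2^2 + lam * ||s||_1 for a list-of-rows matrix A."""
    residual = [sum(a * sj for a, sj in zip(row, s)) - xi for row, xi in zip(A, x)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(sj) for sj in s)


def ista(A, x, lam, n_iter=500):
    """Iterative shrinkage/thresholding for the objective above (assumed simplification)."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm is a safe upper bound on the gradient's Lipschitz constant.
    lipschitz = sum(a * a for row in A for a in row)
    s = [0.0] * n
    for _ in range(n_iter):
        residual = [sum(A[i][j] * s[j] for j in range(n)) - x[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step on the l2 term, then shrinkage for the l1 term.
        s = [soft_threshold(s[j] - grad[j] / lipschitz, lam / lipschitz) for j in range(n)]
    return s


if __name__ == "__main__":
    # Two observations, three unknown sources: an under-determined toy problem.
    A = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
    x = [1.0, 1.0]
    print(ista(A, x, lam=0.1))
```

Replacing `soft_threshold` with a group-wise shrinkage operator is where a structured-sparsity variant would differ from this sketch.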

HAL-CentraleSupelec

Crossref

HAL-Rennes 1

Toxicological effects of cadmium and lead during pregnancy -- in tobacco smoke, secondhand smoke, and thirdhand smoke

Author: Yu Fangchen
Publication venue: University Library System, University of Pittsburgh
Publication date: 01/01/2017

Tobacco smoking during pregnancy is a major public health concern. It is well known that smoking before and during pregnancy can cause illness and death to both mothers and infants; furthermore, smoking after delivery can affect infants’ health conditions via secondhand smoke (SHS) and thirdhand smoke (THS) exposure. Total tobacco smoke exposure is the cumulative involuntary exposure to tobacco smoke pollutants during and after smoking, which means that secondhand smoke and thirdhand smoke are also dangerous risk factors that could affect pregnant women and their babies. This essay introduces some basic information about the adverse health effects of tobacco smoke, secondhand smoke, and thirdhand smoke. Furthermore, this essay focuses on two chemical components of tobacco – cadmium and lead, comparing the specific health effects they could lead to in pregnant women and infants through exposure to firsthand (direct) smoke, secondhand smoke, and thirdhand smoke

D-Scholarship@Pitt

A graphical editor for OSU v3.0

Author: Lin Fangchen
Publication venue: Oregon State University

Development of graphical user interface (GUI) applications is difficult since the process can be both complicated and tedious. We propose a solution directed at reducing the programming time and effort required to build a GUI application. Our solution is based on the Petri Network, the Oregon SpeedCode Universe (OSU) Application Framework, and the OSU Browser (v. 3.0). A Petri Network is a visual programming language used to represent the sequencing of objects and messages. The Application Framework provides reusable components in the form of objects. The Browser provides a visual way to examine a system in search of reusable components. A Petri Net editor was constructed which incorporates a code generator and browser. This editor uses direct manipulation to simplify coding tasks, accepting specifications from the developer and generating the internal representation of the Petri Net. The internal representation is input to the Code Generator, which generates an OSU Application Framework-based C++ program as output. Using the Petri Net editor to generate four application programs (a drawing program, a help system, a calculator, and a record query system), it is estimated that programming time has been reduced by 90% and programming effort has been reduced by 79%.
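
As a rough illustration of how a Petri net sequences events, the sketch below models places holding tokens and transitions that consume and produce them. It is a generic textbook-style model, not the OSU editor's internal representation; the dictionary layout and the names `enabled` and `fire` are assumptions made for this example.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= count
               for place, count in transition["inputs"].items())


def fire(marking, transition):
    """Return the new marking after firing; the original marking is not modified."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, count in transition["inputs"].items():
        new_marking[place] -= count  # consume tokens from input places
    for place, count in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + count  # produce tokens
    return new_marking


if __name__ == "__main__":
    # Hypothetical two-place net: a message moves from "ready" to "done".
    marking = {"ready": 1, "done": 0}
    send = {"inputs": {"ready": 1}, "outputs": {"done": 1}}
    print(fire(marking, send))
```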

ScholarsArchive@OSU

Masked Autoencoding for Scalable and Generalizable Decision Making

Author: Abbeel Pieter; Grover Aditya; Liu Fangchen; Liu Hao
Publication date: 27/05/2023

We are interested in learning scalable agents for reinforcement learning that can learn from large-scale, diverse sequential data, similar to current large vision and language models. To this end, this paper presents masked decision prediction (MaskDP), a simple and scalable self-supervised pretraining method for reinforcement learning (RL) and behavioral cloning (BC). In our MaskDP approach, we apply a masked autoencoder (MAE) to state-action trajectories, wherein we randomly mask state and action tokens and reconstruct the missing data. By doing so, the model is required to infer masked-out states and actions and extract information about dynamics. We find that masking different proportions of the input sequence significantly helps with learning a better model that generalizes well to multiple downstream tasks. In our empirical study, we find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single and multiple goal reaching, and it can zero-shot infer skills from a few example transitions. In addition, MaskDP transfers well to offline RL and shows promising scaling behavior with respect to model size. It is amenable to data-efficient finetuning, achieving competitive results with prior methods based on autoregressive pretraining.
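
The random masking of state and action tokens described in this abstract can be sketched as follows. This only shows the data preparation step under simple assumptions (interleaved state/action tokens, a uniform random mask, a `None` placeholder); it is not the MaskDP implementation, and `mask_trajectory`, `MASK`, and `mask_ratio` are hypothetical names.

```python
import random

MASK = None  # hypothetical placeholder standing in for a learned mask token


def mask_trajectory(states, actions, mask_ratio, rng):
    """Interleave states and actions, then hide a random fraction of the tokens.

    Returns the masked input sequence and a dict mapping each masked position
    to the original value the model would be asked to reconstruct.
    """
    tokens = []
    for state, action in zip(states, actions):
        tokens.append(("state", state))
        tokens.append(("action", action))
    n_mask = round(mask_ratio * len(tokens))
    masked_positions = set(rng.sample(range(len(tokens)), n_mask))
    inputs = [(kind, MASK) if i in masked_positions else (kind, value)
              for i, (kind, value) in enumerate(tokens)]
    targets = {i: tokens[i][1] for i in masked_positions}
    return inputs, targets


if __name__ == "__main__":
    rng = random.Random(0)
    states = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]]
    actions = [[1.0], [0.5], [-0.5], [-1.0]]
    inputs, targets = mask_trajectory(states, actions, mask_ratio=0.5, rng=rng)
    print(inputs)
    print(targets)
```

Drawing a different `mask_ratio` per batch would be one way to vary the masked proportion, which the abstract reports as helpful.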

arXiv.org e-Print Archive