L0-ARM: Network Sparsification via Stochastic Binary Optimization
We consider network sparsification as an L0-norm regularized binary
optimization problem, where each unit of a neural network (e.g., weight,
neuron, or channel) has a stochastic binary gate attached, whose
parameters are jointly optimized with the original network parameters. The
Augment-Reinforce-Merge (ARM), a recently proposed unbiased gradient estimator,
is investigated for this binary optimization problem. Compared to the hard
concrete gradient estimator from Louizos et al., ARM demonstrates superior
performance in pruning network architectures while retaining almost the same
accuracies as the baseline methods. Similar to the hard concrete estimator, ARM
also enables conditional computation during model training but with improved
effectiveness due to the exact binary stochasticity. Thanks to the flexibility
of ARM, many smooth or non-smooth parametric functions, such as scaled sigmoid
or hard sigmoid, can be used to parameterize this binary optimization problem
and the unbiasedness of the ARM estimator is retained, while the hard concrete
estimator has to rely on the hard sigmoid function to achieve conditional
computation and thus accelerated training. Extensive experiments on multiple
public datasets demonstrate state-of-the-art pruning rates with almost the same
accuracies as the baseline methods. The resulting algorithm L0-ARM sparsifies
the Wide-ResNet models on CIFAR-10 and CIFAR-100 while the hard concrete
estimator cannot. The code is publicly available at
https://github.com/leo-yangli/l0-arm
Comment: Published as a conference paper at ECML 2019
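As a rough illustration of the mechanism described above, the following is a minimal NumPy sketch of an ARM-style update for per-unit binary gates under an L0 penalty. It assumes a plain sigmoid gate parameterization and a toy regression loss; it is not the released l0-arm code and omits many details of the paper.
```python
# Minimal sketch: ARM estimator for L0 gates on a toy masked-regression loss.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_loss(z, w, x, y):
    # Toy regression loss with binary gates z masking the weights w.
    pred = x @ (w * z)
    return np.mean((pred - y) ** 2)

n, d = 256, 20
x = rng.normal(size=(n, d))
w = rng.normal(size=d)
y = x[:, :5] @ w[:5]            # only the first 5 weights are useful
phi = np.zeros(d)               # gate logits, g(phi) = sigmoid(phi)
lam, lr = 0.05, 0.5             # L0 penalty weight and learning rate

for step in range(500):
    u = rng.uniform(size=d)
    # ARM: evaluate the loss at two antithetic pseudo-actions.
    z1 = (u > sigmoid(-phi)).astype(float)
    z2 = (u < sigmoid(phi)).astype(float)
    f1, f2 = task_loss(z1, w, x, y), task_loss(z2, w, x, y)
    grad_task = (f1 - f2) * (u - 0.5)                    # unbiased grad of E[f(z)] wrt phi
    grad_l0 = lam * sigmoid(phi) * (1.0 - sigmoid(phi))  # grad of lam * sum(g(phi))
    phi -= lr * (grad_task + grad_l0)

print("kept units:", (sigmoid(phi) > 0.5).astype(int))
```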
Approaches to violent extremist offenders and countering radicalisation in prisons and probation
This paper aims to provide policy-makers, prison governors and prison and
probation staff with information on current practice and issues relevant to
managing Violent Extremist Offenders (VEOs) and individuals considered at risk
of engaging in violent extremism in a prison and probation context. The paper
is structured around these two contexts. While in practice this distinction may
not exist in some EU Member States, it serves to identify key issues: prison
conditions and reintegration strategies, risk assessment, prison regime
choices, rehabilitation and reintegration initiatives, and staff training.
European Commission
Expertise and (In)Security: Lessons from Prison and Probation Contexts on Counter-terrorism, Trust, and Citizenship
With the revelations that many ISIS recruits are ex-offenders (Cottee 2016), prison and probation settings are on the frontline of counter-terrorism practice. The latest policy developments in Europe on managing radicalization and convicted terrorist-offenders in prison and post-release settings show some perhaps surprising recommendations: those built on a foundation of seeking to build trust, to recognize human dignity and equality, and a broader vision to reform offenders as citizens (Council of Europe 2016; United Nations 2016; Williams 2017). Data collection was supported by the Economic and Social Research Council [Ref #: ES/L003120/1]. The writing of this article was generously supported by the Social Sciences and Humanities Research Council of Canada through a post-doctoral fellowship [Award Number 756-2014-0647] and the Centre of Islamic Studies, University of Cambridge.
Gastric perforation and pancreatitis manifesting after an inadvertent Nissen fundoplication in a patient with superior mesenteric artery syndrome.
Superior mesenteric artery (SMA) syndrome is an uncommon but well-recognized clinical entity. It can lead to proximal small bowel obstruction and, in the event of late diagnosis and concomitant comorbidities, severe morbidity and mortality. We report a 54-year-old female with SMA syndrome that manifested after a Nissen fundoplication, along with two major complications. The diagnosis of SMA syndrome was established on the basis of clinical symptoms and radiological findings.
Insulin response and changes in composition of non-esterified fatty acids in blood plasma of middle-aged men following isoenergetic fatty and carbohydrate breakfasts
It was previously shown that a high plasma concentration of non-esterified fatty acids (NEFA) persisted after a fatty breakfast, but not after an isoenergetic carbohydrate breakfast, adversely affecting glucose tolerance. The higher concentration after the fatty breakfast may in part have been a result of different mobilization rates of fatty acids. This factor can be investigated because NEFA mobilized from tissues are monounsaturated to a greater extent than those deposited from a typical meal. Twenty-four middle-aged healthy Caucasian men were given oral glucose tolerance tests (OGTT), and for 28 d isoenergetic breakfasts of similar fat composition but of low (L) or moderate (M) fat content. The composition of NEFA in fasting and postprandial plasma was determined on days 1 and 29. No significant treatment differences in fasting NEFA composition occurred on day 29. During the OGTT and 0-1 h following breakfast there was an increase in plasma long-chain saturated NEFA but a decrease in monounsaturated NEFA (µg/100 µg total NEFA; P<0.05). Postprandial NEFA composition also differed between treatments (µg/100 µg total NEFA; P<0.05), expressed as an increase in 18:1 and decreases in 16:0 and 17:0 in treatment M relative to treatment L (P<0.05). Serum insulin attained 35 and 65 mU/l in treatments M and L respectively during this period. Negative correlations were found between 16:0 in fasting plasma and both waist:hip circumference (P=0.0009) and insulin response curve area during OGTT (within treatment M, P=0.0001). It is concluded that a normal postprandial insulin response is associated with a rapid change in plasma saturated:monounsaturated NEFA. It is proposed that this change is the result of a variable suppression of fat mobilization, which may partly account for a large difference in postprandial total plasma NEFA between fatty and carbohydrate meals.
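For readers unfamiliar with the derived quantities, the sketch below shows, with entirely invented numbers, how an insulin response curve area during an OGTT (trapezoidal rule) and its correlation with fasting 16:0 are typically computed; none of the values or time points are taken from the study.
```python
# Hypothetical illustration of two quantities mentioned in the abstract.
import numpy as np

# Invented sampling times (min) and serum insulin (mU/l) for one subject.
t = np.array([0.0, 30.0, 60.0, 90.0, 120.0])
insulin = np.array([8.0, 55.0, 65.0, 40.0, 20.0])

# Incremental area under the insulin curve above the fasting value (trapezoids).
incr = insulin - insulin[0]
auc = np.sum(np.diff(t) * (incr[1:] + incr[:-1]) / 2.0)
print(f"insulin response curve area: {auc:.0f} mU/l x min")

# Correlation of fasting 16:0 (µg/100 µg total NEFA) with AUC across subjects
# (both columns invented purely for illustration).
palmitic = np.array([22.0, 25.1, 19.8, 27.3, 24.0, 21.5])
auc_all = np.array([3900.0, 3300.0, 4600.0, 2800.0, 3400.0, 4100.0])
r = np.corrcoef(palmitic, auc_all)[0, 1]
print(f"Pearson correlation: {r:.2f}")
```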
Visual Rationalizations in Deep Reinforcement Learning for Atari Games
Due to the capability of deep learning to handle high-dimensional
problems, deep reinforcement learning agents perform well in challenging tasks
such as Atari 2600 games. However, clearly explaining why a certain action is
taken by the agent can be as important as the decision itself. Deep
reinforcement learning models, like other deep learning models, tend to be opaque
in their decision-making process. In this work, we propose to make deep
reinforcement learning more transparent by visualizing the evidence on which
the agent bases its decision. In particular, we emphasize the importance of
producing a justification for an observed action, which could be applied to a
black-box decision agent.
Comment: presented as an oral talk at BNAIC 201
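One way to picture such an evidence map is a simple gradient-based saliency over the stacked input frames, sketched below for a toy Atari-style policy network; the architecture, input shape, and saliency choice are assumptions for illustration and do not reproduce the paper's exact method.
```python
# Gradient-based saliency sketch for a toy Atari-style policy network.
import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    def __init__(self, n_actions=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, 8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
        )
        self.head = nn.Linear(32 * 9 * 9, n_actions)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

policy = TinyPolicy()
frames = torch.rand(1, 4, 84, 84, requires_grad=True)  # stacked greyscale frames

logits = policy(frames)
action = logits.argmax(dim=1)

# Gradient of the chosen action's logit with respect to the input pixels.
logits[0, action.item()].backward()
saliency = frames.grad.abs().max(dim=1).values  # (1, 84, 84) evidence map
print(saliency.shape)
```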
The ultimate wearable: Connecting prosthetic limbs to the IoPH
A new wearable device called the 'Ubi-Sleeve' is currently being developed that enables prosthesis wearers and other stakeholders to review temperature, humidity and prosthesis slippage behavior during everyday prosthesis wear. A combination of custom 3D-printed strain sensors and off-the-shelf temperature and humidity sensors will be integrated into an unobtrusive sleeve to create a device that enables a deeper level of understanding of heat and sweat issues. To create the device, a series of experiments is in progress that will quantify changes in heat, humidity and slippage that negatively affect the prosthesis experience. Interviews and focus groups are also being conducted to gain a deeper understanding of the human side of prosthesis wear and also to ensure that data are presented in a way that is effective, useful and easy to understand.
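As a rough sketch of how such combined readings might be reviewed, the example below logs hypothetical temperature, humidity and strain samples and flags sharp strain jumps as possible slippage; the field names, thresholds and heuristic are invented and are not the Ubi-Sleeve's actual firmware or analysis.
```python
# Hypothetical sensor log and slippage heuristic for a sleeve-style device.
from dataclasses import dataclass

@dataclass
class Reading:
    t_s: float           # seconds since donning
    temp_c: float        # skin-socket temperature, degrees C
    humidity_pct: float  # relative humidity at the liner
    strain: float        # normalised strain-sensor output, 0..1

def flag_slippage(readings, window=3, jump=0.1):
    """Flag times where strain rises sharply within a short window (heuristic)."""
    flagged = []
    for i in range(window, len(readings)):
        if readings[i].strain - readings[i - window].strain > jump:
            flagged.append(readings[i].t_s)
    return flagged

log = [
    Reading(0, 31.2, 45.0, 0.10),
    Reading(60, 32.0, 52.0, 0.11),
    Reading(120, 33.1, 60.0, 0.12),
    Reading(180, 33.8, 66.0, 0.14),
    Reading(240, 34.2, 71.0, 0.18),
    Reading(300, 34.5, 75.0, 0.31),  # sharp strain increase
]
print("possible slippage at t =", flag_slippage(log), "s")
```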
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Value-based reinforcement-learning algorithms provide state-of-the-art
results in model-free discrete-action settings, and tend to outperform
actor-critic algorithms. We argue that actor-critic algorithms are limited by
their need for an on-policy critic. We propose Bootstrapped Dual Policy
Iteration (BDPI), a novel model-free reinforcement-learning algorithm for
continuous states and discrete actions, with an actor and several off-policy
critics. Off-policy critics are compatible with experience replay, ensuring
high sample-efficiency, without the need for off-policy corrections. The actor,
by slowly imitating the average greedy policy of the critics, leads to
high-quality and state-specific exploration, which we compare to Thompson
sampling. Because the actor and critics are fully decoupled, BDPI is remarkably
stable, and unusually robust to its hyper-parameters. BDPI is significantly
more sample-efficient than Bootstrapped DQN, PPO, and ACKTR, on discrete,
continuous and pixel-based tasks. Source code:
https://github.com/vub-ai-lab/bdpi
Comment: Accepted at the European Conference on Machine Learning 2019 (ECML)
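The actor update described above, slow imitation of the critics' average greedy policy, can be pictured with a small tabular sketch; the Q-tables, mixing rate and setting below are illustrative assumptions rather than the released BDPI implementation.
```python
# Tabular sketch: actor slowly imitates the average greedy policy of several critics.
import numpy as np

rng = np.random.default_rng(1)
n_states, n_actions, n_critics = 4, 3, 5

# Hypothetical Q-tables, one per bootstrapped off-policy critic.
critics = [rng.normal(size=(n_states, n_actions)) for _ in range(n_critics)]

# Actor starts uniform over actions in every state.
actor = np.full((n_states, n_actions), 1.0 / n_actions)
mix = 0.05  # slow imitation rate

def greedy(q):
    """One-hot greedy policy of a single critic."""
    g = np.zeros_like(q)
    g[np.arange(q.shape[0]), q.argmax(axis=1)] = 1.0
    return g

for _ in range(200):
    target = np.mean([greedy(q) for q in critics], axis=0)  # average greedy policy
    actor = (1.0 - mix) * actor + mix * target              # slow imitation step

print(np.round(actor, 2))  # actor concentrates on actions the critics agree on
```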