Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors
We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The on-line version is very efficient computationally and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second-derivative matrix (Hessian), which does not even require computing the Hessian itself. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters.
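The technique can be sketched in a few lines: power iteration in which the Hessian-vector product comes from a finite difference of two gradient evaluations, so the Hessian is never formed. This is a minimal illustration of the general idea, not the paper's exact on-line procedure; `grad` is an assumed user-supplied gradient function.

```python
import numpy as np

def principal_hessian_eigenpair(grad, w, n_iter=100, eps=1e-4, seed=0):
    """Estimate the largest eigenvalue/eigenvector of the Hessian of a loss
    at parameters w, by power iteration using only gradient evaluations.

    The Hessian-vector product is approximated by a finite difference:
    H v ~= (grad(w + eps * v) - grad(w)) / eps, so H is never formed.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    g0 = grad(w)
    lam = 0.0
    for _ in range(n_iter):
        hv = (grad(w + eps * v) - g0) / eps  # Hessian-vector product
        lam = np.linalg.norm(hv)             # eigenvalue magnitude estimate
        v = hv / lam                         # next search direction
    return lam, v

# Toy check on a quadratic loss 0.5 * w.T @ A @ w, whose Hessian is A:
A = np.diag([3.0, 1.0, 0.5])
lam, v = principal_hessian_eigenpair(lambda w: A @ w, np.zeros(3))
# lam ~= 3.0; a step size of about 1 / lam is then the largest stable one.
```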
Avoiding overfitting of multilayer perceptrons by training derivatives
Resistance to overfitting is observed for neural networks trained with an extended backpropagation algorithm. In addition to target values, its cost function uses derivatives of those targets up to a given order. For common applications of neural networks, high-order derivatives are not readily available, so simpler cases are considered: training a network to approximate an analytical function inside 2D and 5D domains, and solving the Poisson equation inside a 2D circle. For function approximation, the cost is a sum of squared differences between output and target, as well as between their derivatives with respect to the input. Differential equations are usually solved by putting a multilayer perceptron in place of the unknown function and training its weights so that the equation holds within some margin of error. The commonly used cost is the equation's squared residual; the added terms are squared derivatives of that residual with respect to the independent variables. To investigate overfitting, the cost is minimized on points of regular grids with various spacings, and its root mean is compared with its value on a much denser test set. Fully connected perceptrons with six hidden layers and varying total numbers of weights are trained with Rprop until the cost changes by less than 10% over the last 1000 epochs, or until a maximum epoch count is reached. Training a network to represent a simple 2D function using 10 points with 8 extra derivatives at each produces a far smaller test-to-train cost ratio than classical backpropagation does under comparable conditions.
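As a concrete illustration of such a cost, the sketch below adds first-order derivative terms to an ordinary squared-error loss. It is a toy reconstruction, not the authors' code: the model's input derivatives are taken by finite differences here, whereas a real implementation would obtain them analytically or by automatic differentiation.

```python
import numpy as np

def derivative_cost(f, X, y, dy, h=1e-4):
    """Squared error on outputs plus squared error on d(output)/d(input),
    i.e. a cost with first-order derivative terms added.

    f  : callable mapping (n, d) inputs to (n,) outputs
    X  : (n, d) inputs;  y : (n,) targets;  dy : (n, d) derivative targets
    Model derivatives are approximated by central finite differences here;
    in practice they would come from analytic or automatic differentiation.
    """
    cost = np.mean((f(X) - y) ** 2)
    for i in range(X.shape[1]):
        e = np.zeros(X.shape[1])
        e[i] = h
        dfdxi = (f(X + e) - f(X - e)) / (2 * h)  # d f / d x_i at each point
        cost += np.mean((dfdxi - dy[:, i]) ** 2)
    return cost

# Toy check with a "model" that already matches target values and slopes:
X = np.random.default_rng(0).uniform(-1, 1, (10, 2))
f = lambda Z: np.sin(Z.sum(axis=1))
y, dy = f(X), np.tile(np.cos(X.sum(axis=1))[:, None], (1, 2))
print(derivative_cost(f, X, y, dy))  # ~0: no output or derivative error
```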
Improving Sparse Representation-Based Classification Using Local Principal Component Analysis
Sparse representation-based classification (SRC), proposed by Wright et al.,
seeks the sparsest decomposition of a test sample over the dictionary of
training samples, with classification to the most-contributing class. Because
it assumes test samples can be written as linear combinations of their
same-class training samples, the success of SRC depends on the size and
representativeness of the training set. Our proposed classification algorithm
enlarges the training set by using local principal component analysis to
approximate the basis vectors of the tangent hyperplane of the class manifold
at each training sample. The dictionary in SRC is replaced by a local
dictionary that adapts to the test sample and includes training samples and
their corresponding tangent basis vectors. We use a synthetic data set and
three face databases to demonstrate that this method can achieve higher
classification accuracy than SRC in cases of sparse sampling, nonlinear class
manifolds, and stringent dimension reduction.
Comment: Published in "Computational Intelligence for Pattern Recognition," editors Shyi-Ming Chen and Witold Pedrycz. The original publication is available at http://www.springerlink.co
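A minimal sketch of the tangent-basis step may make the construction concrete. The function below is an illustrative assumption (the names and the neighbourhood size k are invented for this example, not taken from the paper): it finds the k nearest neighbours of one training sample within its class and takes the top principal directions of the centred neighbourhood as an approximate tangent basis.

```python
import numpy as np

def tangent_basis(samples, anchor_idx, k=5, r=2):
    """Approximate a basis of the tangent hyperplane of a class manifold at
    one training sample, via local PCA over its k nearest neighbours.

    samples : (n, d) training samples of a single class
    returns : (r, d) top-r principal directions of the centred neighbourhood
    """
    x = samples[anchor_idx]
    dists = np.linalg.norm(samples - x, axis=1)
    nbrs = samples[np.argsort(dists)[1:k + 1]]   # k nearest, excluding x itself
    _, _, Vt = np.linalg.svd(nbrs - x, full_matrices=False)
    return Vt[:r]

# The SRC dictionary could then be enlarged by appending, for each training
# sample x_j, vectors such as x_j + t for each tangent direction t, letting
# the sparse decomposition follow the local geometry of the class manifold.
```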
Owl Eyes: Spotting UI Display Issues via Visual Understanding
The graphical user interface (GUI) provides a visual bridge between a software application and its end users, through which they can interact with each other. With the development of technology and aesthetics, the visual effects of GUIs have become more and more attractive. However, this growing complexity poses a great challenge to GUI implementation. According to our pilot study of crowdtesting bug reports, display issues such as text overlap, blurred screens, and missing images often occur during GUI rendering on different devices, owing to software or hardware compatibility problems. They negatively influence app usability, resulting in a poor user experience. To detect these issues, we propose a novel approach, OwlEye, based on deep learning for modelling the visual information of GUI screenshots. OwlEye can therefore detect GUIs with display issues and also locate the detailed region of the issue in a given GUI, guiding developers to fix the bug. We manually construct a large-scale labelled dataset of 4,470 GUI screenshots with UI display issues and develop a heuristics-based data augmentation method to boost the performance of OwlEye. The evaluation demonstrates that OwlEye achieves 85% precision and 84% recall in detecting UI display issues, and 90% accuracy in localizing these issues. We also evaluate OwlEye with popular Android apps on Google Play and F-droid, and successfully uncover 57 previously undetected UI display issues, 26 of which have been confirmed or fixed so far.
Comment: Accepted to the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE 2020)
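To make the idea of heuristics-based augmentation concrete, here is a hypothetical sketch of one such heuristic: synthesizing a "text overlap" bug by stamping two slightly offset copies of a text string onto a clean screenshot. The function and its behaviour are illustrative assumptions, not the authors' pipeline.

```python
import random
from PIL import Image, ImageDraw

def synthesize_text_overlap(screenshot_path, out_path, seed=None):
    """Fabricate a 'text overlap' display bug from a clean screenshot by
    stamping two slightly offset copies of a text string onto it.

    Returns the top-left corner of the stamped region, usable as a crude
    localization label when training a display-issue detector.
    """
    rng = random.Random(seed)
    img = Image.open(screenshot_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    w, h = img.size
    x, y = rng.randrange(max(1, w // 2)), rng.randrange(max(1, h - 20))
    draw.text((x, y), "Overlapping label text", fill=(20, 20, 20))
    draw.text((x + 6, y + 3), "Overlapping label text", fill=(20, 20, 20))
    img.save(out_path)
    return x, y
```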
Crowdsourcing the Perception of Machine Teaching
Teachable interfaces can empower end-users to attune machine learning systems
to their idiosyncratic characteristics and environment by explicitly providing
pertinent training examples. While facilitating control, their effectiveness can be hindered by users' lack of expertise or misconceptions. We investigate how
users may conceptualize, experience, and reflect on their engagement in machine
teaching by deploying a mobile teachable testbed in Amazon Mechanical Turk.
Using a performance-based payment scheme, Mechanical Turkers (N = 100) are
called to train, test, and re-train a robust recognition model in real-time
with a few snapshots taken in their environment. We find that participants
incorporate diversity in their examples, drawing parallels to how humans recognize objects independent of size, viewpoint, location, and illumination.
Many of their misconceptions relate to consistency and model capabilities for
reasoning. With limited variation and edge cases in testing, the majority of
them do not change strategies on a second training attempt.
Comment: 10 pages, 8 figures, 5 tables, CHI 2020 conference
Global, regional, and national life expectancy, all-cause mortality, and cause-specific mortality for 249 causes of death, 1980-2015: a systematic analysis for the Global Burden of Disease Study 2015
Background Improving survival and extending the longevity of life for all populations requires timely, robust evidence on local mortality levels and trends. The Global Burden of Disease 2015 Study (GBD 2015) provides a comprehensive assessment of all-cause and cause-specific mortality for 249 causes in 195 countries and territories from 1980 to 2015. These results informed an in-depth investigation of observed and expected mortality patterns based on sociodemographic measures.

Methods We estimated all-cause mortality by age, sex, geography, and year using an improved analytical approach originally developed for GBD 2013 and GBD 2010. Improvements included refinements to the estimation of child and adult mortality and corresponding uncertainty, parameter selection for under-5 mortality synthesis by spatiotemporal Gaussian process regression, and sibling history data processing. We also expanded the database of vital registration, survey, and census data to 14 294 geography-year datapoints. For GBD 2015, eight causes, including Ebola virus disease, were added to the previous GBD cause list for mortality. We used six modelling approaches to assess cause-specific mortality, with the Cause of Death Ensemble Model (CODEm) generating estimates for most causes. We used a series of novel analyses to systematically quantify the drivers of trends in mortality across geographies. First, we assessed observed and expected levels and trends of cause-specific mortality as they relate to the Socio-demographic Index (SDI), a summary indicator derived from measures of income per capita, educational attainment, and fertility. Second, we examined factors affecting total mortality patterns through a series of counterfactual scenarios, testing the magnitude by which population growth, population age structures, and epidemiological changes contributed to shifts in mortality. Finally, we attributed changes in life expectancy to changes in cause of death. We documented each step of the GBD 2015 estimation processes, as well as data sources, in accordance with Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER).

Findings Globally, life expectancy from birth increased from 61.7 years (95% uncertainty interval 61.4-61.9) in 1980 to 71.8 years (71.5-72.2) in 2015. Several countries in sub-Saharan Africa had very large gains in life expectancy from 2005 to 2015, rebounding from an era of exceedingly high loss of life due to HIV/AIDS. At the same time, many geographies saw life expectancy stagnate or decline, particularly for men and in countries with rising mortality from war or interpersonal violence. From 2005 to 2015, male life expectancy in Syria dropped by 11.3 years (3.7-17.4), to 62.6 years (56.5-70.2). Total deaths increased by 4.1% (2.6-5.6) from 2005 to 2015, rising to 55.8 million (54.9 million to 56.6 million) in 2015, but age-standardised death rates fell by 17.0% (15.8-18.1) during this time, underscoring changes in population growth and shifts in global age structures. The result was similar for non-communicable diseases (NCDs), with total deaths from these causes increasing by 14.1% (12.6-16.0) to 39.8 million (39.2 million to 40.5 million) in 2015, whereas age-standardised rates decreased by 13.1% (11.9-14.3). Globally, this mortality pattern emerged for several NCDs, including several types of cancer, ischaemic heart disease, cirrhosis, and Alzheimer's disease and other dementias.
By contrast, both total deaths and age-standardised death rates due to communicable, maternal, neonatal, and nutritional conditions significantly declined from 2005 to 2015, gains largely attributable to decreases in mortality rates due to HIV/AIDS (42.1%, 39.1-44.6), malaria (43.1%, 34.7-51.8), neonatal preterm birth complications (29.8%, 24.8-34.9), and maternal disorders (29.1%, 19.3-37.1). Progress was slower for several causes, such as lower respiratory infections and nutritional deficiencies, whereas deaths increased for others, including dengue and drug use disorders. Age-standardised death rates due to injuries significantly declined from 2005 to 2015, yet interpersonal violence and war claimed increasingly more lives in some regions, particularly in the Middle East. In 2015, rotaviral enteritis (rotavirus) was the leading cause of under-5 deaths due to diarrhoea (146 000 deaths, 118 000-183 000) and pneumococcal pneumonia was the leading cause of under-5 deaths due to lower respiratory infections (393 000 deaths, 228 000-532 000), although pathogen-specific mortality varied by region. Globally, the effects of population growth, ageing, and changes in age-standardised death rates substantially differed by cause. Our analyses of the expected associations between cause-specific mortality and SDI show the regular shifts in cause-of-death composition and population age structure with rising SDI. Country patterns of premature mortality (measured as years of life lost [YLLs]) and how they differ from the level expected on the basis of SDI alone revealed distinct but highly heterogeneous patterns by region and country or territory. Ischaemic heart disease, stroke, and diabetes were among the leading causes of YLLs in most regions, but in many cases, intraregional results sharply diverged for ratios of observed and expected YLLs based on SDI. Communicable, maternal, neonatal, and nutritional diseases caused the most YLLs throughout sub-Saharan Africa, with observed YLLs far exceeding expected YLLs for countries in which malaria or HIV/AIDS remained the leading causes of early death.

Interpretation At the global scale, age-specific mortality has steadily improved over the past 35 years; this pattern of general progress continued in the past decade. Progress has been faster in most countries than expected on the basis of development measured by the SDI. Against this background of progress, some countries have seen falls in life expectancy, and age-standardised death rates for some causes are increasing. Despite progress in reducing age-standardised death rates, population growth and ageing mean that the number of deaths from most non-communicable causes is increasing in most countries, putting increased demands on health systems.

Copyright (C) The Author(s). Published by Elsevier Ltd.
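Since the findings lean heavily on the distinction between total deaths and age-standardised death rates, a small worked example may help. The sketch below (illustrative numbers only, not GBD data) shows direct age standardisation and how total deaths can rise while the age-standardised rate falls.

```python
import numpy as np

def age_standardised_rate(deaths, population, std_weights):
    """Direct age standardisation: weight each age group's death rate by a
    fixed standard population so that comparisons across years or places
    are not confounded by differing age structures."""
    return float(np.dot(std_weights, deaths / population))

# Illustrative numbers only: total deaths rise (150 -> 186) while the
# age-standardised rate falls, because the population grows and ages but
# every age group individually becomes safer.
w = np.array([0.6, 0.3, 0.1])                     # standard age weights
r2005 = age_standardised_rate(np.array([10., 40., 100.]),
                              np.array([1000., 500., 100.]), w)
r2015 = age_standardised_rate(np.array([11., 45., 130.]),
                              np.array([1400., 700., 180.]), w)
assert r2015 < r2005
```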
Using machine learning to break visual human interaction proofs (HIPs)
Machine learning is often used to automatically solve human tasks. In this paper, we look for tasks where machine learning algorithms are not as good as humans, with the hope of gaining insight into their current limitations. We studied various Human Interactive Proofs (HIPs) on the market, because they are systems designed to tell computers and humans apart by posing challenges presumably too hard for computers. We found that most HIPs are pure recognition tasks, which can easily be broken using machine learning. The harder HIPs use a combination of segmentation and recognition tasks. From this observation, we found that building segmentation tasks is the most effective way to confuse machine learning algorithms. This has enabled us to build effective HIPs (which we deployed in MSN Passport), as well as to design challenging segmentation tasks for machine learning algorithms.
Learning State Space Dynamics in Recurrent Networks
Ph.D. Thesis, Computer Science Dept., U. Rochester; Dana H. Ballard, thesis advisor; simultaneously published in the Technical Report series.
Fully recurrent (asymmetrical) networks can be used to learn temporal trajectories. The network is unfolded in time, and backpropagation is used to train the weights. The presence of recurrent connections creates internal states in the system which vary as a function of time. The resulting dynamics can provide interesting additional computing power, but learning is made more difficult by the existence of internal memories. This study first exhibits the properties of recurrent networks in terms of convergence when the internal states of the system are unknown. A new energy functional is provided to change the weights of the units in order to control the stability of the fixed points of the network's dynamics. The power of the resulting algorithm is illustrated with the simulation of a content-addressable memory. Next, the more general case of time trajectories on a recurrent network is studied. An application is proposed in which trajectories are generated to draw letters as a function of an input. In another application of recurrent systems, a neural network models certain temporal properties observed in human callosally sectioned brains. Finally, the proposed algorithm for stabilizing dynamics around fixed points is extended to one for stabilizing dynamics around time trajectories. Its effects are illustrated on a network which generates Lissajous curves.
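The unfold-in-time training scheme is easy to state in code. Below is a bare-bones sketch (not the thesis's algorithm or its energy functional) of one backpropagation-through-time update for a fully recurrent network x_{t+1} = tanh(W x_t) driven toward a target trajectory.

```python
import numpy as np

def bptt_trajectory(W, x0, targets, lr=0.1):
    """One update of backpropagation-through-time for a fully recurrent
    (asymmetric) network x_{t+1} = tanh(W x_t), trained so the state
    follows a target trajectory. No inputs or biases, for brevity."""
    T = len(targets)
    xs = [x0]
    for _ in range(T):                       # forward: unfold T steps in time
        xs.append(np.tanh(W @ xs[-1]))
    grad_W = np.zeros_like(W)
    delta = np.zeros_like(x0)                # error carried from later steps
    for t in reversed(range(T)):
        err = (xs[t + 1] - targets[t]) + delta
        pre = err * (1.0 - xs[t + 1] ** 2)   # backprop through tanh
        grad_W += np.outer(pre, xs[t])       # credit for this unfolded step
        delta = W.T @ pre                    # pass error to the earlier state
    return W - lr * grad_W

rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((3, 3))
targets = [np.tanh(rng.standard_normal(3)) for _ in range(5)]
W = bptt_trajectory(W, np.zeros(3), targets)
```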
Boxlets: a Fast Convolution Algorithm for Signal Processing and Neural Networks
Signal processing and pattern recognition algorithms make extensive use of convolution. In many cases, computational accuracy is not as important as computational speed. In feature extraction, for instance, the features of interest in a signal are usually quite distorted. This form of noise justifies some level of quantization in order to achieve faster feature extraction. Our approach consists of approximating regions of the signal with low-degree polynomials, and then differentiating the resulting signals in order to obtain impulse functions (or derivatives of impulse functions). With this representation, convolution becomes extremely simple and can be implemented quite effectively. The true convolution can be recovered by integrating the result of the convolution. This method yields substantial speed-up in feature extraction and is applicable to convolutional neural networks.
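The degree-0 (piecewise-constant) case of this idea fits in a few lines: differencing turns each signal into a sparse train of impulses at its jumps, the impulse trains are convolved, and two cumulative sums integrate the result back. The sketch below only demonstrates the identity; a real implementation would exploit the sparsity of the difference sequences instead of calling a dense convolution.

```python
import numpy as np

def boxlet_conv(f, g):
    """Convolve two signals by differentiating both, convolving the (sparse)
    difference sequences, then integrating the result back twice.

    Demonstrates the identity  f * g = double-integral of (Df * Dg)  in the
    degree-0 (piecewise-constant) case.
    """
    df = np.diff(f, prepend=0, append=0)     # impulses at the jumps of f
    dg = np.diff(g, prepend=0, append=0)
    h = np.convolve(df, dg)                  # convolution of impulse trains
    h = np.cumsum(np.cumsum(h))              # integrate twice to undo the diffs
    return h[:len(f) + len(g) - 1]

f = np.repeat([2.0, 5.0, 3.0], 50)           # piecewise-constant signal
g = np.ones(20)                              # box filter
assert np.allclose(boxlet_conv(f, g), np.convolve(f, g))
```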
Hardware implementation of the backpropagation without multiplication
The backpropagation algorithm has been modified to work without any multiplications and to tolerate computations with low resolution, which makes it more attractive for a hardware implementation. Numbers are represented in floating-point format with a 1-bit mantissa and 2 bits in the exponent for the states, and a 1-bit mantissa and 4-bit exponent for the gradients, while the weights are 16-bit fixed-point numbers. In this way, all the computations can be executed with shift and add operations. Large networks with over 100,000 weights were trained and demonstrated the same performance as networks computed with full precision. An estimate of a circuit implementation shows that a large network can be placed on a single chip, reaching more than 1 billion weight updates per second. A speedup is also obtained on any machine where a multiplication is slower than a shift operation.
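A small sketch shows why a 1-bit mantissa removes multiplications: a value quantized to a signed power of two multiplies another value via a sign flip plus an exponent addition, which in hardware is a shift. This illustrates the arithmetic only, not the paper's circuit; numpy's ldexp stands in for the shift.

```python
import numpy as np

def to_pow2(x):
    """Quantize to sign * 2**exponent (a 1-bit mantissa): keep only the
    sign and the nearest power of two of the magnitude."""
    sign = np.sign(x)
    exp = np.round(np.log2(np.abs(x) + 1e-30))  # tiny offset avoids log2(0)
    return sign, exp.astype(int)

def pow2_multiply(x, w):
    """Multiply x by a power-of-two-quantized weight w using only a sign
    flip and an exponent addition, i.e. what a hardware shift would do."""
    s, e = to_pow2(w)
    return s * np.ldexp(x, e)                # x * 2**e, computed as a shift

x = np.array([0.5, -1.2, 3.0])
w = 0.26                                     # quantizes to 2**-2 = 0.25
print(pow2_multiply(x, w))                   # ~= x * 0.25, no multiply needed
```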