Foothill: A Quasiconvex Regularization for Edge Computing of Deep Neural Networks
Deep neural networks (DNNs) have demonstrated success for many supervised
learning tasks, ranging from voice recognition and object detection to image
classification. However, their increasing complexity can yield poor
generalization error and makes them hard to deploy on edge devices.
Quantization is an effective approach to compress DNNs in order to meet these
constraints. Using a quasiconvex base function to construct a binary
quantizer helps in training binary neural networks (BNNs), and adding noise to the
input data or using a concrete regularization function helps to improve
generalization error. Here we introduce the foothill function, an infinitely
differentiable quasiconvex function. This regularizer is flexible enough to
deform towards L1 and L2 penalties. Foothill can be used as a binary
quantizer, as a regularizer, or as a loss. In particular, we show this
regularizer reduces the accuracy gap between BNNs and their full-precision
counterpart for image classification on ImageNet.
Comment: Accepted in the 16th International Conference of Image Analysis and
Recognition (ICIAR 2019)
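The abstract does not give the formula for the foothill function. A minimal sketch, assuming the form alpha * x * tanh(beta * x / 2) (the function name and the parameter names `alpha` and `beta` are assumptions, not taken from the listing), shows how one smooth function can behave like an absolute-value penalty for large `beta` and like a quadratic penalty for small `beta`:

```python
import math


def foothill(x: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Assumed foothill form: alpha * x * tanh(beta * x / 2).

    The formula is an assumption for illustration; the abstract only states
    that the function is smooth, quasiconvex, and deformable.
    """
    return alpha * x * math.tanh(beta * x / 2.0)


# Large beta: tanh saturates to sign(x), so the value approaches alpha * |x|.
l1_like = foothill(0.5, alpha=1.0, beta=200.0)

# Small beta: tanh(z) is close to z, so the value approaches alpha * beta * x**2 / 2.
l2_like = foothill(0.5, alpha=1.0, beta=1e-3)
```

Under this assumed form the function is even in `x` and zero at the origin, which is what a penalty on weights would need.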
Generative discriminative models for multivariate inference and statistical mapping in medical imaging
This paper presents a general framework for obtaining interpretable
multivariate discriminative models that allow efficient statistical inference
for neuroimage analysis. The framework, termed generative discriminative
machine (GDM), augments discriminative models with a generative regularization
term. We demonstrate that the proposed formulation can be optimized in closed
form and in dual space, allowing efficient computation for high dimensional
neuroimaging datasets. Furthermore, we provide an analytic estimation of the
null distribution of the model parameters, which enables efficient statistical
inference and p-value computation without the need for permutation testing. We
compared the proposed method with both purely generative and discriminative
learning methods in two large structural magnetic resonance imaging (sMRI)
datasets of Alzheimer's disease (AD) (n=415) and Schizophrenia (n=853). Using
the AD dataset, we demonstrated the ability of GDM to robustly handle
confounding variations. Using the Schizophrenia dataset, we demonstrated the
ability of GDM to handle multi-site studies. Taken together, the results
underline the potential of the proposed approach for neuroimaging analyses.
Comment: To appear in MICCAI 2018 proceedings
Evolving Spatially Aggregated Features from Satellite Imagery for Regional Modeling
Satellite imagery and remote sensing provide explanatory variables at
relatively high resolutions for modeling geospatial phenomena, yet regional
summaries are often desirable for analysis and actionable insight. In this
paper, we propose a novel method of inducing spatial aggregations as a
component of the machine learning process, yielding regional model features
whose construction is driven by model prediction performance rather than prior
assumptions. Our results demonstrate that Genetic Programming is particularly
well suited to this type of feature construction because it can automatically
synthesize appropriate aggregations, as well as better incorporate them into
predictive models compared to other regression methods we tested. In our
experiments we consider a specific problem instance and real-world dataset
relevant to predicting snow properties in high-mountain Asia.
VIP-STB farm: scale-up village to county/province level to support science and technology at backyard (STB) program.
In this paper, we introduce a new concept in VIP-STB, a project funded through Agri-Tech in China: Newton Network+ (ATCNN), for developing feasible solutions towards scaling up STB from the village level to higher levels via generic models and systems. There are three tasks in this project: normalized difference vegetation index (NDVI) estimation, household-based small farm (HBSF) engagement, and wheat density estimation. In the first task, several machine learning models are evaluated for NDVI estimation. In the second task, integrated software built with Python and Twilio is developed to improve communication services and engagement for HBSFs and to provide technical capabilities. In the third task, crop density/population is predicted by conventional image processing techniques. The objectives and strategy for VIP-STB are described, experimental results for each task are presented, and more details on each implemented model are provided along with guidance for future development.
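The abstract does not say how the NDVI target is computed. For orientation, the standard definition is (NIR - Red) / (NIR + Red), which can be sketched as follows (the function name and the zero-denominator convention are assumptions made for this illustration):

```python
def ndvi(nir: float, red: float) -> float:
    """Standard NDVI from near-infrared and red reflectance values."""
    total = nir + red
    if total == 0:
        # Hypothetical convention: report 0.0 when both bands are zero.
        return 0.0
    return (nir - red) / total


# Dense vegetation reflects strongly in NIR and weakly in red.
vegetated = ndvi(nir=0.5, red=0.1)
bare = ndvi(nir=0.2, red=0.2)
```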
Classification tools for carotenoid content estimation in Manihot esculenta via metabolomics and machine learning
Cassava genotypes (Manihot esculenta Crantz) with high pro-vitamin A activity have been identified as a strategy to reduce the prevalence of deficiency of this vitamin. The color variability of cassava roots, which can vary from white to red, is related to the presence of several carotenoid pigments. The present study shows how CIELAB color measurement on cassava root tissue can be used as a non-destructive and very fast technique to quantify the levels of carotenoids in cassava root samples, avoiding the use of more expensive analytical techniques for compound quantification, such as UV-visible spectrophotometry and HPLC. For this, we used machine learning techniques, associating the colorimetric data (CIELAB) with the data obtained by UV-vis and HPLC, to obtain prediction models for carotenoids in this type of biomass. The best values of R2 (above 90%) were observed for the predictive variable TCC determined by UV-vis spectrophotometry. When we tested the machine learning models using the CIELAB values as inputs, for the total carotenoid contents quantified by HPLC, the Partial Least Squares (PLS), Support Vector Machines, and Elastic Net models presented the best values of R2 (above 40%) and Root-Mean-Square Error (RMSE). For the carotenoid quantification by UV-vis spectrophotometry, the R2 (around 60%) and RMSE values (around 6.5) are more satisfactory. Ridge regression and Elastic Net showed the best results. It can be concluded that the use of the colorimetric technique (CIELAB) associated with UV-vis/HPLC and statistical techniques of prognostic analysis through machine learning can predict the content of total carotenoids in these samples with good precision and accuracy.
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (407323/2013-9)
Clothes size prediction from dressed-human silhouettes
© 2017, Springer International Publishing AG. We propose an effective and efficient way to automatically predict clothes size for users buying clothes online. We take human height and dressed-human silhouettes in front and side views as input, and estimate 3D body sizes with a data-driven method. We adopt 20 body sizes which are closely related to clothes size, and use these 3D body sizes to get the clothes size by searching the corresponding size chart. Previous image-based methods need to calibrate the camera to estimate 3D information from 2D images, because the silhouettes of the same person differ in appearance (e.g. size and shape) when the camera configuration (intrinsic and extrinsic parameters) is different. Our method avoids camera calibration, which is much more convenient. We set up our virtual camera and train the relationship between human height and silhouette size under this camera configuration. After estimating silhouette size, we regress the positions of 2D body landmarks. We define 2D body sizes as the distances between corresponding 2D body landmarks. Finally, we learn the relationship between 2D body sizes and 3D body sizes. The training samples for each regression process come from a database of 3D naked and dressed bodies created by previous work. We evaluate the whole procedure and each process of our framework. We also compare the performance with several regression models. The total time consumption for clothes size prediction is less than 0.1 s and the average estimation error of body sizes is 0.824 cm, which can satisfy the tolerance for customers to shop for clothes online.
Selection of tuning parameters in bridge regression models via Bayesian information criterion
We consider bridge linear regression modeling, which can produce a sparse
or non-sparse model. A crucial point in the model building process is the
selection of adjusted parameters including a regularization parameter and a
tuning parameter in bridge regression models. The choice of the adjusted
parameters can be viewed as a model selection and evaluation problem. We
propose a model selection criterion for evaluating bridge regression models from
a Bayesian viewpoint. This selection criterion enables us to select the
adjusted parameters objectively. We investigate the effectiveness of our
proposed modeling strategy through some numerical examples.
Comment: 20 pages, 5 figures
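To make the two adjusted parameters concrete, here is a minimal sketch of the usual bridge objective: residual sum of squares plus lam * sum(|b_j| ** q). The names `lam` (regularization parameter) and `q` (tuning parameter) are assumptions for illustration, and the Bayesian selection criterion proposed in the paper is not reproduced here.

```python
def bridge_objective(X, y, coef, lam, q):
    """Residual sum of squares plus a bridge penalty lam * sum(|b_j| ** q)."""
    rss = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_ij * b_j for x_ij, b_j in zip(row, coef))
        rss += (target - prediction) ** 2
    penalty = lam * sum(abs(b_j) ** q for b_j in coef)
    return rss + penalty


# Tiny illustrative data set (hypothetical values).
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
coef = [1.0, -2.0]

lasso_like = bridge_objective(X, y, coef, lam=0.5, q=1)  # q = 1: lasso-type penalty
ridge_like = bridge_objective(X, y, coef, lam=0.5, q=2)  # q = 2: ridge-type penalty
```

With q = 1 the penalty matches the lasso and tends to give sparse fits, while q = 2 matches ridge and gives non-sparse fits, which is why both parameters have to be chosen together.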
Differential expression analysis with global network adjustment
<p>Background: Large-scale chromosomal deletions or other non-specific perturbations of the transcriptome can alter the expression of hundreds or thousands of genes, and it is of biological interest to understand which genes are most profoundly affected. We present a method for predicting a gene’s expression as a function of other genes thereby accounting for the effect of transcriptional regulation that confounds the identification of genes differentially expressed relative to a regulatory network. The challenge in constructing such models is that the number of possible regulator transcripts within a global network is on the order of thousands, and the number of biological samples is typically on the order of 10. Nevertheless, there are large gene expression databases that can be used to construct networks that could be helpful in modeling transcriptional regulation in smaller experiments.</p>
<p>Results: We demonstrate a type of penalized regression model that can be estimated from large gene expression databases, and then applied to smaller experiments. The ridge parameter is selected by minimizing the cross-validation error of the predictions in the independent out-sample. This tends to increase the model stability and leads to a much greater degree of parameter shrinkage, but the resulting biased estimation is mitigated by a second round of regression. Nevertheless, the proposed computationally efficient “over-shrinkage” method outperforms previously used LASSO-based techniques. In two independent datasets, we find that the median proportion of explained variability in expression is approximately 25%, and this results in a substantial increase in the signal-to-noise ratio allowing more powerful inferences on differential gene expression leading to biologically intuitive findings. We also show that a large proportion of gene dependencies are conditional on the biological state, which would be impossible with standard differential expression methods.</p>
<p>Conclusions: By adjusting for the effects of the global network on individual genes, both the sensitivity and reliability of differential expression measures are greatly improved.</p>
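The selection rule described in the Results paragraph, choosing the ridge parameter that minimizes prediction error on an independent out-sample, can be sketched for a single predictor without an intercept. This is a simplified illustration with hypothetical data and function names, not the authors' implementation, and it omits the second round of regression.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for one predictor and no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def out_sample_error(xs, ys, slope):
    """Mean squared prediction error on held-out data."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_ridge_parameter(train, out_sample, candidates):
    """Pick the candidate with the lowest error on the independent out-sample."""
    train_x, train_y = train
    out_x, out_y = out_sample
    return min(
        candidates,
        key=lambda lam: out_sample_error(out_x, out_y, ridge_slope(train_x, train_y, lam)),
    )


# Hypothetical data: the out-sample follows a weaker relationship than the
# training data, so a heavily shrunk slope predicts it best.
train = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
out_sample = ([1.0, 2.0], [1.0, 2.0])
best_lam = select_ridge_parameter(train, out_sample, [0.0, 1.0, 14.0, 100.0])
```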
A Regularized Graph Layout Framework for Dynamic Network Visualization
Many real-world networks, including social and information networks, are
dynamic structures that evolve over time. Such dynamic networks are typically
visualized using a sequence of static graph layouts. In addition to providing a
visual representation of the network structure at each time step, the sequence
should preserve the mental map between layouts of consecutive time steps to
allow a human to interpret the temporal evolution of the network. In this
paper, we propose a framework for dynamic network visualization in the on-line
setting where only present and past graph snapshots are available to create the
present layout. The proposed framework creates regularized graph layouts by
augmenting the cost function of a static graph layout algorithm with a grouping
penalty, which discourages nodes from deviating too far from other nodes
belonging to the same group, and a temporal penalty, which discourages large
node movements between consecutive time steps. The penalties increase the
stability of the layout sequence, thus preserving the mental map. We introduce
two dynamic layout algorithms within the proposed framework, namely dynamic
multidimensional scaling (DMDS) and dynamic graph Laplacian layout (DGLL). We
apply these algorithms to several data sets to illustrate the importance of
both grouping and temporal regularization for producing interpretable
visualizations of dynamic networks.
Comment: To appear in Data Mining and Knowledge Discovery, supporting material
(animations and MATLAB toolbox) available at
http://tbayes.eecs.umich.edu/xukevin/visualization_dmkd_201
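The regularized cost described in the abstract (a static layout cost plus a grouping penalty and a temporal penalty) can be sketched as below. This is an illustration under stated assumptions: the static term is an MDS-style stress, the grouping penalty is measured against each group's centroid, and the weights `alpha` and `beta` are hypothetical names; the paper's exact penalties may differ.

```python
import math


def regularized_layout_cost(pos, prev_pos, target, groups, alpha, beta):
    """Static stress + alpha * grouping penalty + beta * temporal penalty.

    pos, prev_pos: lists of (x, y) positions at the current and previous step.
    target: matrix of desired pairwise distances for the current snapshot.
    groups: group label for each node.
    """
    n = len(pos)
    # Static term: squared mismatch between layout and target distances.
    stress = sum(
        (math.dist(pos[i], pos[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    # Grouping penalty (assumed form): squared distance to the group centroid.
    members = {}
    for node, label in enumerate(groups):
        members.setdefault(label, []).append(node)
    grouping = 0.0
    for nodes in members.values():
        cx = sum(pos[k][0] for k in nodes) / len(nodes)
        cy = sum(pos[k][1] for k in nodes) / len(nodes)
        grouping += sum(math.dist(pos[k], (cx, cy)) ** 2 for k in nodes)
    # Temporal penalty: squared movement of each node since the previous step.
    temporal = sum(math.dist(p, q) ** 2 for p, q in zip(pos, prev_pos))
    return stress + alpha * grouping + beta * temporal
```

Setting `alpha` and `beta` to zero recovers the static cost, so the two weights control how much layout quality at one time step is traded for stability across steps.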
Significance testing in ridge regression for genetic data.