1,494 research outputs found
Real-time human detection for electricity conservation using pruned-SSD and arduino
Electricity conservation techniques have gained importance in recent years. Many smart techniques have been invented to save electricity with the help of assistive devices such as sensors. Though they save electricity, they add the cost of extra sensors to the system. This work aims to develop a system that supplies electric power only when it is actually needed, i.e., the system enables the power supply when a human is present in the location and disables it otherwise. The system avoids any additional cost by using closed-circuit television, which is already installed in most places for security reasons. Human detection is done by a modified single shot detection (SSD) model with a specific hyperparameter tuning method. The model is further pruned to reduce the computational cost of the framework, which in turn reduces the processing time of the network drastically. The model sends its output to the Arduino micro-controller, which enables the power supply in and around the location only when a human is detected and disables it when the human exits. The model is evaluated on the CHOKEPOINT dataset and on real-time video surveillance footage. Experimental results show an average accuracy of 85.82% with 2.1 seconds of processing time per frame.
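The enable/disable rule described in this abstract can be sketched as a small state machine. This is a minimal illustration, not the authors' implementation: the `PowerController` class, the `b"1"` and `b"0"` command bytes, and the choice to emit a command only when the state changes are all assumptions made for the example. The detector output is reduced to a per-frame count of detected humans.

```python
class PowerController:
    """Hypothetical controller: turns per-frame human counts into power commands."""

    ON = b"1"   # assumed command byte meaning "enable power"
    OFF = b"0"  # assumed command byte meaning "disable power"

    def __init__(self):
        self.powered = False  # power starts disabled

    def update(self, humans_in_frame):
        """Return a command byte when the power state must change, else None."""
        should_power = humans_in_frame > 0
        if should_power == self.powered:
            return None  # no change, nothing to send
        self.powered = should_power
        return self.ON if should_power else self.OFF


def commands_for(frame_counts):
    """Run the controller over a sequence of per-frame human counts."""
    controller = PowerController()
    commands = []
    for count in frame_counts:
        command = controller.update(count)
        if command is not None:
            commands.append(command)
    return commands


# Two empty frames, a person enters, leaves, and another person enters.
print(commands_for([0, 0, 1, 2, 0, 0, 1]))
```

In a deployment, each returned byte would be written to the micro-controller over whatever link is used; that transport is outside this sketch.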
BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees
The rising volume of datasets has made training machine learning (ML) models
a major computational cost in the enterprise. Given the iterative nature of
model and parameter tuning, many analysts use a small sample of their entire
data during their initial stage of analysis to make quick decisions (e.g., what
features or hyperparameters to use) and use the entire dataset only in later
stages (i.e., when they have converged to a specific model). This sampling,
however, is performed in an ad-hoc fashion. Most practitioners cannot precisely
capture the effect of sampling on the quality of their model, and eventually on
their decision-making process during the tuning phase. Moreover, without
systematic support for sampling operators, many optimizations and reuse
opportunities are lost.
In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML
training. BlinkML allows users to make error-computation tradeoffs: instead of
training a model on their full data (i.e., full model), BlinkML can quickly
train an approximate model with quality guarantees using a sample. The quality
guarantees ensure that, with high probability, the approximate model makes the
same predictions as the full model. BlinkML currently supports any ML model
that relies on maximum likelihood estimation (MLE), which includes Generalized
Linear Models (e.g., linear regression, logistic regression, max entropy
classifier, Poisson regression) as well as PPCA (Probabilistic Principal
Component Analysis). Our experiments show that BlinkML can speed up the
training of large-scale ML tasks by 6.26x-629x while guaranteeing the same
predictions, with 95% probability, as the full model.
Comment: 22 pages, SIGMOD 201
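The quantity behind the guarantee, how often a sample-trained model agrees with the full model, can be illustrated with a small sketch. This is not BlinkML's procedure, which bounds the difference without first training the full model. Here both models are trained so the agreement can simply be measured. The synthetic data, the sample size of 2,000, and the `tolerance` of 0.1 are assumptions chosen for the example. Linear regression is used because least squares is the maximum likelihood estimate under Gaussian noise.

```python
import random


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept (the MLE under Gaussian noise)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def prediction_agreement(full_model, approx_model, xs, tolerance):
    """Fraction of inputs where the two predictions differ by at most tolerance."""
    (a1, b1), (a2, b2) = full_model, approx_model
    close = sum(1 for x in xs if abs((a1 * x + b1) - (a2 * x + b2)) <= tolerance)
    return close / len(xs)


# Synthetic data (an assumption for this sketch): y = 2x + 1 plus unit Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(20000):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)))

full_model = fit_line(data)                      # trained on all 20,000 points
approx_model = fit_line(rng.sample(data, 2000))  # trained on a 10% sample
agreement = prediction_agreement(
    full_model, approx_model, [x for x, _ in data], tolerance=0.1
)
print(f"full={full_model}, approx={approx_model}, agreement={agreement:.3f}")
```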
Linear-scaling kernels for protein sequences and small molecules outperform deep learning while providing uncertainty quantitation and improved interpretability
Gaussian processes (GPs) are Bayesian models which provide several advantages
for regression tasks in machine learning such as reliable quantitation of
uncertainty and improved interpretability. Their adoption has been precluded by
their excessive computational cost and by the difficulty in adapting them for
analyzing sequences (e.g. amino acid and nucleotide sequences) and graphs (e.g.
ones representing small molecules). In this study, we develop efficient and
scalable approaches for fitting GP models as well as fast convolution kernels
which scale linearly with graph or sequence size. We implement these
improvements by building an open-source Python library called xGPR. We compare
the performance of xGPR with the reported performance of various deep learning
models on 20 benchmarks, including small molecule, protein sequence and tabular
data. We show that xGPR achieves highly competitive performance with much
shorter training time. Furthermore, we also develop new kernels for sequence
and graph data and show that xGPR generally outperforms convolutional neural
networks on predicting key properties of proteins and small molecules.
Importantly, xGPR provides uncertainty information not available from typical
deep learning models. Additionally, xGPR provides a representation of the input
data that can be used for clustering and data visualization. These results
demonstrate that xGPR provides a powerful and generic tool that can be broadly
useful in protein engineering and drug discovery.
Comment: This is a revised version of the original manuscript with additional
experiment
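The uncertainty information mentioned in this abstract can be illustrated with a tiny exact GP regression. This sketch is not xGPR and does not use its API. It is the textbook exact formulation, whose cost grows cubically with the number of training points, which is the cost the abstract says has hindered adoption. The RBF kernel, the `noise_var` value, and the toy data are assumptions made for the example.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_predict(train_x, train_y, query, noise_var=1e-2):
    """Return the predictive mean and variance of the latent function at query."""
    n = len(train_x)
    gram = [
        [rbf(train_x[i], train_x[j]) + (noise_var if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    k_star = [rbf(x, query) for x in train_x]
    alpha = solve(gram, train_y)  # (K + noise * I)^-1 y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    weights = solve(gram, k_star)  # (K + noise * I)^-1 k_star
    variance = rbf(query, query) - sum(k * w for k, w in zip(k_star, weights))
    return mean, variance


train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [math.sin(x) for x in train_x]
mean_near, var_near = gp_predict(train_x, train_y, 1.5)   # between training points
mean_far, var_far = gp_predict(train_x, train_y, 10.0)    # far from all training points
print(var_near, var_far)
```

The variance is small between training points and returns to the prior variance far from the data, which is the kind of signal a point-estimate model does not provide.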
Machine Learning Model Optimization with Hyper Parameter Tuning Approach
Hyper-parameter tuning is a key step in finding the optimal machine learning parameters. Determining the best hyper-parameters takes a good deal of time, especially when the objective functions are costly to evaluate or a large number of parameters must be tuned. In contrast to conventional machine learning algorithms, a neural network requires more hyper-parameter tuning because it has to process many parameters together, and depending on the fine-tuning, the accuracy of the model can vary between 25% and 90%. A few of the most effective techniques for tuning hyper-parameters in deep learning methods are grid search, random forest, Bayesian optimization, etc. Every method has advantages and disadvantages compared with the others. For example, grid search has proven to be an effective technique for tuning hyper-parameters, but it has drawbacks such as trying too many combinations and performing poorly when many parameters must be tuned at a time. In our work, we determine, show and analyze the efficiencies of different parameters and tuning methods on a real-world synthetic polymer dataset.
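The "too many combinations" drawback of grid search can be made concrete with a short sketch. The hyper-parameter names and candidate values below are hypothetical and are not taken from the paper. The point is only that the grid size is the product of the number of candidates per hyper-parameter.

```python
import itertools

# Hypothetical search space: each hyper-parameter lists its candidate values.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64, 128],
    "hidden_units": [32, 64, 128, 256],
    "dropout": [0.0, 0.2, 0.5],
}


def grid(space):
    """Yield every combination of candidate values as a configuration dict."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


grid_size = sum(1 for _ in grid(search_space))
print(grid_size)  # 3 * 4 * 4 * 3 = 144 configurations

# Adding one more hyper-parameter with five candidates multiplies the grid by five.
larger_space = dict(search_space, weight_decay=[0.0, 1e-5, 1e-4, 1e-3, 1e-2])
larger_grid_size = sum(1 for _ in grid(larger_space))
print(larger_grid_size)  # 144 * 5 = 720 configurations
```

Each configuration requires a full training run, so the multiplicative growth is what makes exhaustive grids impractical once many hyper-parameters are tuned together.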