2,944 research outputs found
Energy efficient mining on a quantum-enabled blockchain using light
We outline a quantum-enabled blockchain architecture based on a consortium of
quantum servers. The network is hybridised, utilising digital systems for
sharing and processing classical information combined with a fibre-optic
infrastructure and quantum devices for transmitting and processing quantum
information. We deliver an energy efficient interactive mining protocol enacted
between clients and servers which uses quantum information encoded in light and
removes the need for trust in network infrastructure. Instead, clients on the
network need only trust the transparent network code, and that their devices
adhere to the rules of quantum physics. To demonstrate the energy efficiency of
the mining protocol, we elaborate upon the results of two previous experiments
(one performed over 1km of optical fibre) as applied to this work. Finally, we
address some key vulnerabilities, explore open questions, and observe
forward-compatibility with the quantum internet and quantum computing
technologies.
Comment: 25 pages, 5 figures
Learning-based Automatic Parameter Tuning for Big Data Analytics Frameworks
Big data analytics frameworks (BDAFs) have been widely used for data
processing applications. These frameworks provide a large number of
configuration parameters to users, which leads to a tuning issue that
overwhelms users. To address this issue, many automatic tuning approaches have
been proposed. However, it remains a critical challenge to generate enough
samples in a high-dimensional parameter space within a time constraint. In this
paper, we present AutoTune, an automatic parameter tuning system that aims to
optimize application execution time on BDAFs. AutoTune first constructs a
smaller-scale testbed from the production system so that it can generate more
samples, and thus train a better prediction model, under a given time
constraint. Furthermore, the AutoTune algorithm produces a set of samples that
can provide a wide coverage over the high-dimensional parameter space, and
searches for more promising configurations using the trained prediction model.
AutoTune is implemented and evaluated using the Spark framework and HiBench
benchmark deployed on a public cloud. Extensive experimental results illustrate
that AutoTune improves on default configurations by 63.70% on average, and on
the five state-of-the-art tuning algorithms by 6%-23%.
Comment: 12 pages, submitted to IEEE BigData 201
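The two-stage idea described in this abstract (wide-coverage sampling of a high-dimensional parameter space, then a model-guided search over the samples) can be sketched roughly as follows. This is a minimal illustration, not AutoTune's actual algorithm: the Spark-style parameter names, their ranges, and the toy prediction model are all invented for the example.

```python
import random

# Hypothetical Spark-style parameters; these names and ranges are
# invented for illustration, not taken from AutoTune.
PARAM_SPACE = {
    "executor_cores": (1, 8),
    "executor_memory_gb": (1, 16),
    "shuffle_partitions": (8, 512),
}

def latin_hypercube_sample(space, n, seed=0):
    """Wide-coverage sampling: split each parameter's range into n
    strata and use every stratum exactly once, so no region of the
    high-dimensional space is left unsampled along any axis."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in space.items():
        strata = list(range(n))
        rng.shuffle(strata)
        width = (hi - lo) / n
        columns[name] = [lo + (s + rng.random()) * width for s in strata]
    return [{name: columns[name][i] for name in space} for i in range(n)]

def toy_model(cfg):
    """Stand-in for a trained prediction model of execution time."""
    return cfg["shuffle_partitions"] / cfg["executor_cores"]

samples = latin_hypercube_sample(PARAM_SPACE, 20)
# Model-guided search: keep the configuration predicted to run fastest.
best = min(samples, key=toy_model)
```

In this sketch the prediction model is a fixed toy function; in a real setting it would be trained on the measured samples and the search would propose configurations beyond the initial sample.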
Meta-learning Performance Prediction of Highly Configurable Systems: A Cost-oriented Approach
A key challenge of the development and maintenance of configurable systems is to predict the performance of individual system variants based on the features selected. It is usually infeasible to measure the performance of all possible variants, due to feature combinatorics. Previous approaches predict performance based on small samples of measured variants, but it is still open how to dynamically determine an ideal sample that balances prediction accuracy and measurement effort. In this work, we adapt two widely-used sampling strategies for performance prediction to the domain of configurable systems and evaluate them in terms of sampling cost, which considers prediction accuracy and measurement effort simultaneously. To generate an initial sample, we develop two sampling algorithms: one based on the traditional method of t-way feature coverage, and another based on a new heuristic of feature frequencies. Using empirical data from six real-world systems, we evaluate the two sampling algorithms and discuss trade-offs. Furthermore, we conduct an extensive sensitivity analysis of the cost-model metric we use for evaluation, and analyze the stability of the learning behavior of the subject systems.
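To make the t-way coverage idea concrete, a greedy 2-way (pairwise) sampler over boolean features might look like the sketch below. This is a standard covering-array-style heuristic, assumed for illustration, not the paper's implementation.

```python
import random
from itertools import combinations, product

def pairwise_sample(features, candidates=50, seed=0):
    """Greedily build a sample of boolean configurations until every
    pair of features has been observed in all four on/off combinations
    (2-way coverage)."""
    rng = random.Random(seed)
    uncovered = {(f, g, a, b)
                 for f, g in combinations(features, 2)
                 for a, b in product([False, True], repeat=2)}
    sample = []
    while uncovered:
        # Greedy step: among random candidate configurations, keep the
        # one covering the most still-uncovered feature pairs.
        best_cfg, best_gain = None, -1
        for _ in range(candidates):
            cfg = {f: rng.random() < 0.5 for f in features}
            gain = sum(cfg[f] == a and cfg[g] == b
                       for f, g, a, b in uncovered)
            if gain > best_gain:
                best_cfg, best_gain = cfg, gain
        if best_gain == 0:  # fallback: cover one remaining pair directly
            f, g, a, b = next(iter(uncovered))
            best_cfg[f], best_cfg[g] = a, b
        sample.append(best_cfg)
        uncovered = {(f, g, a, b) for f, g, a, b in uncovered
                     if not (best_cfg[f] == a and best_cfg[g] == b)}
    return sample
```

The sample size grows roughly logarithmically in the number of features, which is what makes t-way coverage attractive when measuring every variant is infeasible; real configurable systems would additionally need feature-model constraints, which this sketch omits.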
Early stopping by correlating online indicators in neural networks
In order to minimize the generalization error in neural networks, a novel technique to identify
overfitting phenomena when training the learner is formally introduced. This enables support of a
reliable and trustworthy early stopping condition, thus improving the predictive power of that type
of modeling. Our proposal exploits the correlation over time in a collection of online indicators,
namely characteristic functions indicating whether a set of hypotheses is met, associated with a range of
independent stopping conditions built from a canary judgment to evaluate the presence of overfitting.
That way, we provide a formal basis for decision making in terms of interrupting the learning process.
As opposed to previous approaches focused on a single criterion, we take advantage of subsidiarities
between independent assessments, thus seeking both a wider operating range and greater diagnostic
reliability. With a view to illustrating the effectiveness of the halting condition described, we choose
to work in the sphere of natural language processing, an operational continuum increasingly based on
machine learning. As a case study, we focus on parser generation, one of the most demanding and
complex tasks in the domain. The selection of cross-validation as a canary function enables an actual
comparison with the most representative early stopping conditions based on overfitting identification,
pointing to a promising start toward optimal bias and variance control.
Agencia Estatal de Investigación | Ref. TIN2017-85160-C2-2-R
Agencia Estatal de Investigación | Ref. PID2020-113230RB-C22
Xunta de Galicia | Ref. ED431C 2018/5
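The idea of combining several independent online overfitting indicators into one stopping decision can be sketched as follows. The two indicators here (generalization loss and consecutive validation-loss increases, in the spirit of Prechelt's classic early-stopping criteria), the voting rule, and the simulated loss curve are illustrative assumptions, not the indicators or canary function the paper actually uses.

```python
def gl_indicator(val_losses, alpha=0.1):
    """Generalization loss: current validation loss exceeds the best
    loss seen so far by more than a relative margin alpha."""
    best = min(val_losses)
    return val_losses[-1] / best - 1.0 > alpha

def up_indicator(val_losses, strips=3):
    """Validation loss has risen for `strips` consecutive epochs."""
    if len(val_losses) <= strips:
        return False
    recent = val_losses[-(strips + 1):]
    return all(recent[i] < recent[i + 1] for i in range(strips))

def should_stop(val_losses, votes_needed=2):
    """Stop only when enough independent indicators agree, instead of
    trusting any single criterion."""
    votes = [gl_indicator(val_losses), up_indicator(val_losses)]
    return sum(votes) >= votes_needed

# Simulated validation-loss curve: improves, then starts to overfit.
curve = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.8]
stop_epoch = next(i for i in range(1, len(curve) + 1)
                  if should_stop(curve[:i]))
```

On this curve, the generalization-loss indicator fires at epoch 6 but training continues until epoch 7, when the consecutive-increase indicator agrees; requiring agreement is what trades a single criterion's sensitivity for diagnostic reliability.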
Adaptive scheduling for adaptive sampling in POS taggers construction
We introduce an adaptive scheduling for adaptive sampling as a novel way of machine learning in the construction of part-of-speech taggers. The goal is to speed up the training on large data sets, without significant loss of performance with regard to an optimal configuration. In contrast to previous methods using a random, fixed, or regularly rising spacing between the instances, ours analyzes the shape of the learning curve geometrically, in conjunction with a functional model, to increase or decrease the spacing at any time. The algorithm proves to be formally correct regarding our working hypotheses: given a case, the next one selected is the nearest that ensures a net gain in learning ability over the former, and the level of requirement for this condition can be modulated. We also improve the robustness of sampling by paying greater attention to those regions of the training database subject to a temporary inflation in performance, thus preventing the learning from stopping prematurely. The proposal has been evaluated on the basis of its reliability in identifying the convergence of models, corroborating our expectations. While a concrete halting condition is used for testing, users can choose any condition whatsoever to suit their own specific needs.
Agencia Estatal de Investigación | Ref. TIN2017-85160-C2-1-R
Agencia Estatal de Investigación | Ref. TIN2017-85160-C2-2-R
Xunta de Galicia | Ref. ED431C 2018/50
Xunta de Galicia | Ref. ED431D 2017/1
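A rough sketch of a spacing rule driven by the learning curve's local slope, in the spirit of (but not identical to) the adaptive scheduling described above: the step-size update rule, its thresholds, and the toy saturating accuracy curve are all assumptions made for illustration.

```python
import math

def adaptive_schedule(evaluate, start=100, step=100, total=5000,
                      gain_threshold=5e-4, grow=2.0, shrink=0.5,
                      min_step=50):
    """Pick the next training-set size from the learning curve's local
    slope: a steep curve (still learning fast) leads to finer spacing,
    a flat curve to larger jumps. `evaluate(n)` returns the accuracy
    obtained after training on n instances."""
    sizes, accs = [start], [evaluate(start)]
    n = start
    while n < total:
        n = min(int(n + step), total)
        acc = evaluate(n)
        slope = (acc - accs[-1]) / (n - sizes[-1])
        # Shrink the spacing while accuracy gain per instance is high,
        # grow it once the curve flattens; floor keeps progress forward.
        step = max(step * (shrink if slope > gain_threshold else grow),
                   min_step)
        sizes.append(n)
        accs.append(acc)
    return sizes, accs

# Toy saturating learning curve (accuracy flattens as data grows).
sizes, accs = adaptive_schedule(lambda n: 1.0 - math.exp(-n / 800.0))
```

With this toy curve, the schedule samples densely in the steep early region and takes exponentially growing jumps once gains flatten, so far fewer model trainings are needed than with fixed spacing.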
Utilizing Skylab data in on-going resources management programs in the state of Ohio
The author has identified the following significant results. The use of Skylab imagery for total-area woodland surveys was found to be more accurate and cheaper than conventional surveys using aerial photo-plot techniques. Machine-aided (primarily density-slicing) analyses of Skylab 190A and 190B color and infrared color photography demonstrated the feasibility of using such data for differentiating major timber classes, including pines, hardwoods, mixed, cut, and brushland, provided such analyses are made at scales of 1:24,000 and larger. Manual and machine-assisted image analysis indicated that the spectral and spatial capabilities of Skylab EREP photography are adequate to distinguish most parameters of current coal surface mining concern associated with: (1) active mining, (2) orphan lands, (3) reclaimed lands, and (4) active reclamation. Excellent results were achieved when comparing Skylab and aerial photographic interpretations of detailed surface mining features. Skylab photographs, when combined with other data bases (e.g., census, agricultural land productivity, and transportation networks), provide a comprehensive, meaningful, and integrated view of the major elements involved in the urbanization/encroachment process.