Lie and conditional symmetries of a class of nonlinear (1+2)-dimensional boundary value problems
A new definition of conditional invariance for boundary value problems
involving a wide range of boundary conditions (including initial value problems
as a special case) is proposed. It is shown that other definitions worked out
in order to find Lie symmetries of boundary value problems with standard
boundary conditions follow as particular cases of our definition. Simple
examples of direct applicability to the nonlinear problems arising in
applications are demonstrated. Moreover, the successful application of the
definition for the Lie and conditional symmetry classification of a class of
(1+2)-dimensional nonlinear boundary value problems governed by the nonlinear
diffusion equation in a semi-infinite domain is realised. In particular, it is
proved that there is a special exponent for the power diffusivity at which
the problem in question with non-vanishing flux on the boundary
admits additional Lie symmetry operators compared to other exponents. In
order to demonstrate the applicability of the symmetries derived, they are used
for reducing the nonlinear problems with power diffusivity and a constant
non-zero flux on the boundary (such problems are common in applications and
describe a wide range of phenomena) to (1+1)-dimensional problems. The
structure and properties of the problems obtained are briefly analysed.
Finally, some results demonstrating how Lie invariance of the boundary value
problem in question depends on the geometry of the domain are presented.
Comment: 25 pages; the main results were presented at the Conference Symmetry,
Methods, Applications and Related Fields, Vancouver, Canada, May 13-16, 201
Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches
state-of-the-art models while training on less than 1% of the raw data.
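
To make the count featurization idea mentioned in this abstract concrete, here is a minimal generic sketch of the technique. It is not Pyramid's implementation: the function names, the binary-label assumption, the Laplace smoothing, and the toy ad/country data are all illustrative assumptions. Each raw categorical value is replaced by a statistic derived from per-value label counts, so a model can train on small count tables plus a few featurized rows instead of the full raw history.

```python
from collections import defaultdict

def build_count_tables(rows, labels):
    """For each feature position, map value -> [count(label=0), count(label=1)]."""
    n_features = len(rows[0])
    tables = [defaultdict(lambda: [0, 0]) for _ in range(n_features)]
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            tables[j][value][label] += 1
    return tables

def featurize(row, tables):
    """Replace each raw value with a smoothed estimate of P(label=1 | value)."""
    features = []
    for j, value in enumerate(row):
        # .get() avoids inserting unseen values into the count table.
        neg, pos = tables[j].get(value, (0, 0))
        features.append((pos + 1) / (neg + pos + 2))  # Laplace smoothing (assumption)
    return features

# Hypothetical toy data: (ad id, country) with a binary click label.
rows = [("ad1", "us"), ("ad1", "fr"), ("ad2", "us"), ("ad2", "us")]
labels = [1, 0, 0, 1]
tables = build_count_tables(rows, labels)

seen = featurize(("ad1", "us"), tables)
unseen = featurize(("ad3", "fr"), tables)
```

In this sketch, values never seen during counting fall back to the smoothed prior of 0.5, which is one simple way to handle new categories.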
The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch
Recent and forthcoming advances in instrumentation, and giant new surveys,
are creating astronomical data sets that are not amenable to the methods of
analysis familiar to astronomers. Traditional methods are often inadequate not
merely because of the size in bytes of the data sets, but also because of the
complexity of modern data sets. Mathematical limitations of familiar algorithms
and techniques in dealing with such data sets create a critical need for new
paradigms for the representation, analysis and scientific visualization (as
opposed to illustrative visualization) of heterogeneous, multiresolution data
across application domains. Some of the problems presented by the new data sets
have been addressed by other disciplines such as applied mathematics,
statistics and machine learning and have been utilized by other sciences such
as space-based geosciences. Unfortunately, valuable results pertaining to these
problems are mostly to be found only in publications outside of astronomy. Here
we offer brief overviews of a number of concepts, techniques and developments,
some "old" and some new. These are generally unknown to most of the
astronomical community, but are vital to the analysis and visualization of
complex datasets and images. In order for astronomers to take advantage of the
richness and complexity of the new era of data, and to be able to identify,
adopt, and apply new solutions, the astronomical community needs a certain
degree of awareness and understanding of the new concepts. One of the goals of
this paper is to help bridge the gap between applied mathematics, artificial
intelligence and computer science on the one side and astronomy on the other.
Comment: 24 pages, 8 Figures, 1 Table. Accepted for publication in "Advances in
Astronomy", special issue "Robotic Astronomy"
Evolving artificial datasets to improve interpretable classifiers
Differential Evolution can be used to construct effective and compact artificial training datasets for machine learning algorithms. In this paper, a series of comparative experiments are performed in which two simple interpretable supervised classifiers (specifically, Naive Bayes and linear Support Vector Machines) are trained (i) directly on “real” data, as would be the normal case, and (ii) indirectly, using special artificial datasets derived from real data via evolutionary optimization. The results across several challenging test problems show that supervised classifiers trained indirectly using our novel evolution-based approach produce models with superior predictive classification performance. Besides presenting the accuracy of the learned models, we also analyze the sensitivity of our artificial data optimization process to Differential Evolution's parameters, and then we examine the statistical characteristics of the artificial data that is evolved.
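
The indirect training scheme described in this abstract can be sketched roughly as follows. This is an illustrative toy, not the authors' method: it swaps in a nearest-example classifier for Naive Bayes or a linear SVM, scores candidates by accuracy on the real training data, and uses a plain rand/1/bin Differential Evolution loop. All names, parameter values, and the synthetic two-class data are assumptions.

```python
import random

def nearest_label(point, artificial):
    """Classify a point by its nearest artificial example (stand-in classifier)."""
    nearest = min(artificial,
                  key=lambda ex: sum((p - q) ** 2 for p, q in zip(point, ex[0])))
    return nearest[1]

def decode(vector, n_per_class, dim):
    """Turn a flat DE vector into labelled artificial examples for classes 0 and 1."""
    return [(vector[i * dim:(i + 1) * dim], i % 2) for i in range(2 * n_per_class)]

def fitness(vector, real_x, real_y, n_per_class, dim):
    """Accuracy on the real data of a classifier built from the artificial data."""
    artificial = decode(vector, n_per_class, dim)
    hits = sum(nearest_label(x, artificial) == y for x, y in zip(real_x, real_y))
    return hits / len(real_x)

def evolve(real_x, real_y, n_per_class=1, dim=2, pop_size=20, generations=30,
           F=0.8, CR=0.9, seed=0):
    """rand/1/bin Differential Evolution over flattened artificial datasets."""
    rng = random.Random(seed)
    size = 2 * n_per_class * dim
    pop = [[rng.uniform(-3, 3) for _ in range(size)] for _ in range(pop_size)]
    scores = [fitness(v, real_x, real_y, n_per_class, dim) for v in pop]
    initial_best = max(scores)
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(size)  # guarantees one mutated coordinate
            trial = [
                pop[a][k] + F * (pop[b][k] - pop[c][k])
                if (rng.random() < CR or k == forced) else pop[i][k]
                for k in range(size)
            ]
            trial_score = fitness(trial, real_x, real_y, n_per_class, dim)
            if trial_score >= scores[i]:  # greedy selection keeps the better dataset
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best], initial_best

# Hypothetical "real" data: two overlapping Gaussian clusters in 2D.
data_rng = random.Random(1)
real_x = ([[data_rng.gauss(-1, 1), data_rng.gauss(-1, 1)] for _ in range(40)]
          + [[data_rng.gauss(1, 1), data_rng.gauss(1, 1)] for _ in range(40)])
real_y = [0] * 40 + [1] * 40

best_vector, best_score, initial_best = evolve(real_x, real_y)
```

Here the evolved dataset holds a single artificial example per class, which illustrates the "compact" aspect: the classifier is fitted to two evolved points rather than to the 80 real ones. A faithful replication would evaluate on held-out data rather than on the training set.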