Composability of Markov Models for Processing Sensor Data
We show that it is possible to apply the divide-and-conquer principle in constructing a Markov model for sensor data from available sensor logs. The state space can be partitioned into clusters, for which the required transition counts or probabilities can be acquired locally. The combination of these local parameters into a global model takes the form of a system of linear equations with a confined solution space. Expected advantages of this approach lie, for example, in reduced (wireless) communication costs.
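The idea of acquiring transition counts locally and combining them afterwards can be illustrated with a minimal sketch. It assumes the simplest possible setting, in which each log is counted on its own and the count tables are then added; it is not the linear-system formulation the abstract describes, and the function names and the example logs are hypothetical.

```python
from collections import Counter


def count_transitions(log):
    """Count observed (state, next_state) pairs in one sensor log."""
    return Counter(zip(log, log[1:]))


def merge_counts(local_counts):
    """Add up locally collected count tables into one global table."""
    total = Counter()
    for counts in local_counts:
        total.update(counts)
    return total


def to_probabilities(counts):
    """Normalize each source state's counts into transition probabilities."""
    row_totals = Counter()
    for (src, _dst), n in counts.items():
        row_totals[src] += n
    return {(src, dst): n / row_totals[src] for (src, dst), n in counts.items()}


# Hypothetical logs from two sensors (illustrative data only).
log_a = ["idle", "idle", "active", "idle"]
log_b = ["active", "active", "idle", "active"]

merged = merge_counts([count_transitions(log_a), count_transitions(log_b)])
probs = to_probabilities(merged)
```

Only the small count tables need to be exchanged in this sketch, not the raw logs, which is where a saving in communication could come from.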
A Nice Labelling for Tree-Like Event Structures of Degree 3 (Extended Version)
We address the problem of finding nice labellings for event structures of
degree 3. We develop a minimum theory by which we prove that the labelling
number of an event structure of degree 3 is bounded by a linear function of the
height. The main theorem we present in this paper states that event structures
of degree 3 whose causality order is a tree have a nice labelling with 3
colors. Finally, we exemplify how to use this theorem to construct upper bounds
for the labelling number of other event structures of degree 3.
The impact of shifting cultivation in the forestry ecosystems of Timor-Leste
Every year thousands of hectares of forest are destroyed as a result of the practice of swidden agriculture, shifting cultivation or "slash and burn", causing changes in forest ecosystems. In Timor-Leste shifting cultivation is still practiced today as a form of subsistence agriculture.
Swidden agriculture is characterized by slash and burn clearing, by a rotation of fields rather than of
crops, and by short periods of cropping (1-3 years) alternating with long fallow periods.
Based on the characterization of shifting cultivation in two Sucos of Bobonaro district, a reflection is
made on the impact of this practice on the sustainable development of forest ecosystems of Timor-Leste.
Primary data collection was performed using a questionnaire survey of farmers practicing shifting cultivation. The questionnaire characterized shifting cultivation, and asked for farmers' opinions on the slashing and burning of forest areas and on the importance of forests.
According to the results obtained, in most situations the vegetation existing before the slash was dense forest, the slash is carried out by the family group, the majority of farmers have been practicing slash and burn for more than ten years, and the plots cleared are smaller than 2 hectares. The materials resulting from the slash are used for firewood, building materials and fencing. The burning of plant residues is done before planting, and soil preparation and sowing are done with a lever. Land and forest, despite being used individually, have a tenure regime of ownership and access in which their nature as a common-pool good prevails.
Let's Make Block Coordinate Descent Go Fast: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence
Block coordinate descent (BCD) methods are widely used for large-scale
numerical optimization because of their cheap iteration costs, low memory
requirements, amenability to parallelization, and ability to exploit problem
structure. Three main algorithmic choices influence the performance of BCD
methods: the block partitioning strategy, the block selection rule, and the
block update rule. In this paper we explore all three of these building blocks
and propose variations for each that can lead to significantly faster BCD
methods. We (i) propose new greedy block-selection strategies that guarantee
more progress per iteration than the Gauss-Southwell rule; (ii) explore
practical issues like how to implement the new rules when using "variable"
blocks; (iii) explore the use of message-passing to compute matrix or Newton
updates efficiently on huge blocks for problems with a sparse dependency
between variables; and (iv) consider optimal active manifold identification,
which leads to bounds on the "active set complexity" of BCD methods and leads
to superlinear convergence for certain problems with sparse solutions (and in
some cases finite termination at an optimal solution). We support all of our
findings with numerical results for the classic machine learning problems of
least squares, logistic regression, multi-class logistic regression, label
propagation, and L1-regularization.
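For reference, the Gauss-Southwell rule that the abstract uses as its baseline can be sketched on a least-squares problem. This is a minimal illustration with single-coordinate blocks and exact coordinate minimization, not one of the paper's new rules; the function name, the iteration limit, the tolerance, and the test matrix are assumptions made for the example.

```python
def gauss_southwell_least_squares(A, b, iterations=200):
    """Minimize 0.5 * ||A x - b||^2 by greedy coordinate descent.

    Each step updates the single coordinate whose partial derivative has
    the largest magnitude (Gauss-Southwell selection), then minimizes
    exactly along that coordinate. Assumes A has no all-zero column.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = [-bi for bi in b]  # residual = A x - b, starting from x = 0
    col_sq_norm = [sum(A[r][j] ** 2 for r in range(m)) for j in range(n)]

    for _ in range(iterations):
        # Gradient of the objective: A^T (A x - b).
        grad = [sum(A[r][j] * residual[r] for r in range(m)) for j in range(n)]
        j = max(range(n), key=lambda k: abs(grad[k]))
        if abs(grad[j]) < 1e-12:  # assumed stopping tolerance
            break
        step = grad[j] / col_sq_norm[j]  # exact minimizer along coordinate j
        x[j] -= step
        for r in range(m):
            residual[r] += -step * A[r][j]
    return x


# Illustrative consistent system whose solution is x = [1, 1].
A = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
b = [2.0, 3.0, 2.0]
x = gauss_southwell_least_squares(A, b)
```

Recomputing the full gradient at every step keeps the sketch short; a practical implementation would maintain it incrementally, since cheap iterations are the main appeal of BCD methods.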
A general guide to applying machine learning to computer architecture
The resurgence of machine learning since the late 1990s has been enabled by significant advances in computing performance and the growth of big data. The ability of these algorithms to detect complex patterns in data, which are extremely difficult to find manually, helps to produce effective predictive models. Whilst computer architects have been accelerating machine learning algorithms with GPUs and custom hardware, there have been few implementations leveraging these algorithms to improve computer system performance. The work that has been conducted, however, has produced very promising results.
The purpose of this paper is to serve as a foundational base and guide to future computer
architecture research seeking to make use of machine learning models for improving system efficiency.
We describe a method that highlights when, why, and how to utilize machine learning
models for improving system performance and provide a relevant example showcasing the effectiveness of applying machine learning in computer architecture. We describe a process of data
generation every execution quantum and parameter engineering. This is followed by a survey of a
set of popular machine learning models. We discuss their strengths and weaknesses and provide
an evaluation of implementations for the purpose of creating a workload performance predictor
for different core types in an x86 processor. The predictions can then be exploited by a scheduler
for heterogeneous processors to improve the system throughput. The algorithms of focus are
stochastic gradient descent based linear regression, decision trees, random forests, artificial neural
networks, and k-nearest neighbors. This work has been supported by the European Research Council (ERC) Advanced Grant RoMoL (Grant Agreement 321253) and by the Spanish Ministry of Science and Innovation (contract TIN 2015-65316P). Peer Reviewed. Postprint (published version).
Biological control of the chestnut gall wasp with \emph{T. sinensis}: a mathematical model
The Asian chestnut gall wasp \emph{Dryocosmus kuriphilus}, native to China,
became a pest when it appeared in Japan, Korea, and the United States. In
Europe it was first found in Italy, in 2002. In 1982 the host-specific
parasitoid \emph{Torymus sinensis} was introduced in Japan, in an attempt to
achieve a biological control of the pest. After an apparent initial success,
the two species seem to have locked into predator-prey cycles of decadal length.
We have developed a spatially explicit mathematical model that describes the
seasonal time evolution of the adult insect populations, and the competition
for finding egg deposition sites. In a spatially homogeneous situation the
model reduces to an iterated map for the egg density of the two species. While
the map would suggest, for realistic parameters, that both species should
become locally extinct (somewhat corroborating the hypothesis of biological
control), the full model, for the same parameters, shows that the introduction
of \emph{T. sinensis} sparks a traveling wave of the parasitoid population that
destroys the pest on its passage. Depending on the value of the diffusion
coefficients of the two species, the pest may later re-colonize the
empty area left behind the wave. When this occurs the two populations do not
seem to attain a state of spatial homogeneity, but produce an ever-changing
pattern of traveling waves.
Large induced subgraphs via triangulations and CMSO
We obtain an algorithmic meta-theorem for the following optimization problem.
Let \phi\ be a Counting Monadic Second Order Logic (CMSO) formula and t be an
integer. For a given graph G, the task is to maximize |X| subject to the
following: there is a set of vertices F of G, containing X, such that the
subgraph G[F] induced by F is of treewidth at most t, and structure (G[F],X)
models \phi.
Some special cases of this optimization problem are the following generic
examples. Each of these cases contains various problems as a special subcase:
1) "Maximum induced subgraph with at most l copies of cycles of length 0
modulo m", where for fixed nonnegative integers m and l, the task is to find a
maximum induced subgraph of a given graph with at most l vertex-disjoint cycles
of length 0 modulo m.
2) "Minimum \Gamma-deletion", where for a fixed finite set of graphs \Gamma\
containing a planar graph, the task is to find a maximum induced subgraph of a
given graph containing no graph from \Gamma\ as a minor.
3) "Independent \Pi-packing", where for a fixed finite set of connected
graphs \Pi, the task is to find an induced subgraph G[F] of a given graph G
with the maximum number of connected components, such that each connected
component of G[F] is isomorphic to some graph from \Pi.
We give an algorithm solving the optimization problem on an n-vertex graph G
in time O(#pmc n^{t+4} f(t,\phi)), where #pmc is the number of all potential
maximal cliques in G and f is a function depending on t and \phi\ only. We also
show how a similar running time can be obtained for the weighted version of the
problem. Pipelined with known bounds on the number of potential maximal
cliques, we deduce that our optimization problem can be solved in time
O(1.7347^n) for arbitrary graphs, and in polynomial time for graph classes with
a polynomial number of minimal separators.
Analysis and Detection of Information Types of Open Source Software Issue Discussions
Most modern Issue Tracking Systems (ITSs) for open source software (OSS)
projects allow users to add comments to issues. Over time, these comments
accumulate into discussion threads embedded with rich information about the
software project, which can potentially satisfy the diverse needs of OSS
stakeholders. However, discovering and retrieving relevant information from the
discussion threads is a challenging task, especially when the discussions are
lengthy and the number of issues in ITSs is vast. In this paper, we address
this challenge by identifying the information types presented in OSS issue
discussions. Through qualitative content analysis of 15 complex issue threads
across three projects hosted on GitHub, we uncovered 16 information types and
created a labeled corpus containing 4656 sentences. Our investigation of
supervised, automated classification techniques indicated that, when prior
knowledge about the issue is available, Random Forest can effectively detect
most sentence types using conversational features such as the sentence length
and its position. When classifying sentences from new issues, Logistic
Regression can yield satisfactory performance using textual features for
certain information types, while falling short on others. Our work represents a
nontrivial first step towards tools and techniques for identifying and
obtaining the rich information recorded in the ITSs to support various software
engineering activities and to satisfy the diverse needs of OSS stakeholders.
Comment: 41st ACM/IEEE International Conference on Software Engineering
(ICSE2019)
Multiplicity estimates, analytic cycles and Newton polytopes
We consider the problem of estimating the multiplicity of a polynomial when
restricted to the smooth analytic trajectory of a (possibly singular)
polynomial vector field at a given point or points, under an assumption known
as the D-property. Nesterenko has developed an elimination theoretic approach
to this problem which has been widely used in transcendental number theory.
We propose an alternative approach to this problem based on more local
analytic considerations. In particular we obtain simpler proofs to many of the
best known estimates, and give more general formulations in terms of Newton
polytopes, analogous to the Bernstein-Kushnirenko theorem. We also improve the
estimate's dependence on the ambient dimension from doubly-exponential to an
essentially optimal single-exponential.
Comment: Some editorial modifications to improve readability; No essential
mathematical change.
Cover-Encodings of Fitness Landscapes
The traditional way of tackling discrete optimization problems is by using
local search on suitably defined cost or fitness landscapes. Such approaches
are however limited by the slowing down that occurs when the local minima that
are a feature of the typically rugged landscapes encountered arrest the
progress of the search process. Another way of tackling optimization problems
is by the use of heuristic approximations to estimate a global cost minimum.
Here we present a combination of these two approaches by using cover-encoding
maps which map processes from a larger search space to subsets of the original
search space. The key idea is to construct cover-encoding maps with the help of
suitable heuristics that single out near-optimal solutions and result in
landscapes on the larger search space that no longer exhibit trapping local
minima. We present cover-encoding maps for the problems of the traveling
salesman, number partitioning, maximum matching and maximum clique; the
practical feasibility of our method is demonstrated by simulations of adaptive
walks on the corresponding encoded landscapes which find the global minima for
these problems.
Comment: 15 pages, 4 figures