zoo: S3 Infrastructure for Regular and Irregular Time Series
zoo is an R package providing an S3 class with methods for indexed totally
ordered observations, such as discrete irregular time series. Its key design
goals are independence of a particular index/time/date class and consistency
with base R and the "ts" class for regular time series. This paper describes
how these are achieved within zoo and provides several illustrations of the
available methods for "zoo" objects which include plotting, merging and
binding, several mathematical operations, extracting and replacing data and
index, coercion and NA handling. A subclass "zooreg" embeds regular time series
into the "zoo" framework and thus bridges the gap between regular and irregular
time series classes in R. Comment: 24 pages, 5 figures
Implementing a Class of Permutation Tests: The coin Package
The R package coin implements a unified approach to permutation tests providing a huge class of independence tests for nominal, ordered, numeric, and censored data as well as multivariate data at mixed scales. Based on a rich and flexible conceptual framework that embeds different permutation test procedures into a common theory, a computational framework is established in coin that likewise embeds the corresponding R functionality in a common S4 class structure with associated generic functions. As a consequence, the computational tools in coin inherit the flexibility of the underlying theory and conditional inference functions for important special cases can be set up easily. Conditional versions of classical tests---such as tests for location and scale problems in two or more samples, independence in two- or three-way contingency tables, or association problems for censored, ordered categorical or multivariate data---can easily be implemented as special cases using this computational toolbox by choosing appropriate transformations of the observations. The paper gives a detailed exposition of both the internal structure of the package and the provided user interfaces along with examples on how to extend the implemented functionality.
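The idea of a conditional (permutation) test for a two-sample location problem can be illustrated with a minimal sketch. This is not the coin API (coin is an R package); the function name `permutation_test_mean_diff` and the choice of the absolute difference in means as the test statistic are assumptions made here for illustration only. The sketch enumerates every reassignment of the pooled observations to the two groups, so it is only practical for small samples.

```python
from itertools import combinations


def permutation_test_mean_diff(x, y):
    """Exact two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration; not part of any package.
    """
    pooled = list(x) + list(y)
    n = len(x)
    m = len(pooled) - n
    total = sum(pooled)

    def stat(indices):
        # Absolute difference between the mean of the chosen group
        # and the mean of the remaining observations.
        sum_x = sum(pooled[i] for i in indices)
        return abs(sum_x / n - (total - sum_x) / m)

    observed = stat(range(n))
    extreme = 0
    splits = 0
    # Enumerate every way of assigning n of the pooled values to group x.
    for indices in combinations(range(len(pooled)), n):
        splits += 1
        # Small tolerance guards against floating-point ties.
        if stat(indices) >= observed - 1e-12:
            extreme += 1
    return extreme / splits


if __name__ == "__main__":
    print(permutation_test_mean_diff([1, 2, 3], [4, 5, 6]))
```

Other conditional tests would follow the same pattern with a different transformation of the observations feeding the statistic, which is the design idea the abstract attributes to the package.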
Structural Change in (Economic) Time Series
Methods for detecting structural changes, or change points, in time series
data are widely used in many fields of science and engineering. This chapter
sketches some basic methods for the analysis of structural changes in time
series data. The exposition is confined to retrospective methods for univariate
time series. Several recent methods for dating structural changes are compared
using a time series of oil prices spanning more than 60 years. The methods
broadly agree for the first part of the series up to the mid-1980s, for which
changes are associated with major historical events, but provide somewhat
different solutions thereafter, reflecting a gradual increase in oil prices
that is not well described by a step function. As a further illustration, 1990s
data on the volatility of the Hang Seng stock market index are reanalyzed. Comment: 12 pages, 6 figures
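As a rough illustration of retrospective dating with a step function, the sketch below locates a single break in the mean of a univariate series by minimising the residual sum of squares over all admissible split points. It is a simplified stand-in, not one of the methods compared in the chapter; the function name `date_single_break` and the `min_segment` parameter are hypothetical.

```python
def date_single_break(y, min_segment=2):
    """Return (k, rss) where splitting y into y[:k] and y[k:] minimises
    the residual sum of squares of a two-level step function.

    Hypothetical helper; handles exactly one break in the mean.
    """

    def rss(segment):
        # Residual sum of squares around the segment mean.
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_k, best_rss = None, float("inf")
    # Each segment must contain at least min_segment observations.
    for k in range(min_segment, len(y) - min_segment + 1):
        candidate = rss(y[:k]) + rss(y[k:])
        if candidate < best_rss:
            best_k, best_rss = k, candidate
    return best_k, best_rss


if __name__ == "__main__":
    series = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]
    print(date_single_break(series))
```

A gradual drift, such as the one the abstract describes for later oil prices, is exactly the case where a step-function fit of this kind is a poor description.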
Electrical Power Fluctuations in a Network of DC/AC inverters in a Large PV Plant: relationship between correlation, distance and time scale
This paper analyzes the correlation between the fluctuations of the electrical power generated
by the ensemble of 70 DC/AC inverters from a 45.6 MW PV plant. The use of real electrical
power time series from a large collection of photovoltaic inverters of the same plant is an important
contribution in the context of models built upon simplified assumptions to overcome the
absence of such data.
This data set is divided into three different fluctuation categories with a clustering procedure
which performs correctly with the clearness index and the wavelet variances. Afterwards,
the time dependent correlation between the electrical power time series of the inverters is estimated
with the wavelet transform. The wavelet correlation depends on the distance between
the inverters, the wavelet time scales and the daily fluctuation level. Correlation values for time
scales below one minute are low without dependence on the daily fluctuation level. For time
scales above 20 minutes, positive high correlation values are obtained, and the decay rate with
the distance depends on the daily fluctuation level. At intermediate time scales the correlation
depends strongly on the daily fluctuation level.
The proposed methods have been implemented using free software. Source code is available
as supplementary material.
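A scale-dependent correlation of the kind described above can be sketched with a plain Haar transform. This is an assumption-laden simplification: the abstract does not say which wavelet or transform variant the authors used, and the function names `haar_details`, `pearson` and `wavelet_correlation` are hypothetical. The sketch computes non-overlapping Haar detail coefficients at one level (block length 2**level samples) for two series and then takes their Pearson correlation.

```python
import math


def haar_details(series, level):
    """Haar detail coefficients at the given level (level >= 1),
    computed on non-overlapping blocks of length 2**level."""
    block = 2 ** level
    half = block // 2
    coeffs = []
    for start in range(0, len(series) - block + 1, block):
        first = sum(series[start:start + half])
        second = sum(series[start + half:start + block])
        # Orthonormal Haar scaling: difference of half-block sums.
        coeffs.append((second - first) / math.sqrt(block))
    return coeffs


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def wavelet_correlation(x, y, level):
    """Correlation between two series at one Haar time scale."""
    return pearson(haar_details(x, level), haar_details(y, level))


if __name__ == "__main__":
    x = [1, 3, 2, 5, 4, 4, 6, 1]
    y = [2 * v + 1 for v in x]
    print(wavelet_correlation(x, y, 1))
```

Repeating the call for several levels and for inverter pairs at different distances would give the kind of correlation-versus-scale-and-distance picture the abstract summarises.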
The geography of recent genetic ancestry across Europe
The recent genealogical history of human populations is a complex mosaic
formed by individual migration, large-scale population movements, and other
demographic events. Population genomics datasets can provide a window into this
recent history, as rare traces of recent shared genetic ancestry are detectable
due to long segments of shared genomic material. We make use of genomic data
for 2,257 Europeans (the POPRES dataset) to conduct one of the first surveys of
recent genealogical ancestry over the past three thousand years at a
continental scale. We detected 1.9 million shared genomic segments, and used
the lengths of these to infer the distribution of shared ancestors across time
and geography. We find that a pair of modern Europeans living in neighboring
populations share around 10-50 genetic common ancestors from the last 1500
years, and upwards of 500 genetic ancestors from the previous 1000 years. These
numbers drop off exponentially with geographic distance, but since genetic
ancestry is rare, individuals from opposite ends of Europe are still expected
to share millions of common genealogical ancestors over the last 1000 years.
There is substantial regional variation in the number of shared genetic
ancestors: especially high numbers of common ancestors between many eastern
populations likely date to the Slavic and/or Hunnic expansions, while much
lower levels of common ancestry in the Italian and Iberian peninsulas may
indicate weaker demographic effects of Germanic expansions into these areas
and/or more stably structured populations. Recent shared ancestry in modern
Europeans is ubiquitous, and clearly shows the impact of both small-scale
migration and large historical events. Population genomic datasets have
considerable power to uncover recent demographic history, and will allow a much
fuller picture of the close genealogical kinship of individuals across the
world. Comment: Full size figures available from
http://www.eve.ucdavis.edu/~plralph/research.html; or html version at
http://ralphlab.usc.edu/ibd/ibd-paper/ibd-writeup.xhtm
How many crowdsourced workers should a requester hire?
Recent years have seen an increased interest in crowdsourcing as a way of obtaining information from a potentially large group of workers at a reduced cost. The crowdsourcing process, as we consider in this paper, is as follows: a requester hires a number of workers to work on a set of similar tasks. After completing the tasks, each worker reports back outputs. The requester then aggregates the reported outputs to obtain aggregate outputs. A crucial question that arises during this process is: how many crowd workers should a requester hire? In this paper, we investigate from an empirical perspective the optimal number of workers a requester should hire when crowdsourcing tasks, with a particular focus on the crowdsourcing platform Amazon Mechanical Turk. Specifically, we report the results of three studies involving different tasks and payment schemes. We find that both the expected error in the aggregate outputs and the risk of a poor combination of workers decrease as the number of workers increases. Surprisingly, we find that the optimal number of workers a requester should hire for each task is around 10 to 11, no matter the underlying task and payment scheme. To derive such a result, we employ a principled analysis based on bootstrapping and segmented linear regression. Besides the above result, we also find that overall top-performing workers are more consistent across multiple tasks than other workers. Our results thus contribute to a better understanding of, and provide new insights into, how to design more effective crowdsourcing processes.
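The bootstrapping step can be sketched as follows. The abstract does not give the authors' exact procedure, so this is only one plausible reading: worker outputs are resampled with replacement, aggregated by the mean, and the average absolute error of the aggregate is recorded for each crowd size. The function name `bootstrap_error_by_crowd_size`, the mean aggregation rule, and the example data are all assumptions.

```python
import random


def bootstrap_error_by_crowd_size(outputs, truth, sizes, n_boot=2000, seed=0):
    """Estimate the expected absolute error of the mean-aggregated output
    for each crowd size in `sizes` by resampling workers with replacement.

    Hypothetical helper; aggregation by the mean is an assumption.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    expected_error = {}
    for k in sizes:
        errors = []
        for _ in range(n_boot):
            crowd = [rng.choice(outputs) for _ in range(k)]
            aggregate = sum(crowd) / k
            errors.append(abs(aggregate - truth))
        expected_error[k] = sum(errors) / n_boot
    return expected_error


if __name__ == "__main__":
    # Made-up worker outputs for a task whose true answer is 10.
    reported = [8, 9, 10, 11, 12, 7, 13, 10, 10, 9, 11]
    for size, err in bootstrap_error_by_crowd_size(reported, 10, [1, 5, 10]).items():
        print(size, round(err, 3))
```

Fitting a segmented (piecewise) linear regression to the resulting error curve, as the abstract mentions, would then locate the crowd size beyond which additional workers bring little improvement; that step is not shown here.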
Growth and cycles of the Italian economy since 1861: the new evidence
Based on a newly-available large set of historical national accounts, the paper revisits the main features of economic growth and cycles in Italy for the post-Unification period 1861-2011. Alongside the structural changes in growth dynamics, the main sources of output and productivity growth are identified. As regards the analysis of the underlying cyclical component, a business cycle chronology is first established and then both the specific patterns of individual cycles and the co-movements of output with key macroeconomic variables are investigated. In the 150 years since its political Unification, Italy's economic growth was mainly propelled by consumption and investments, whereas on the supply side the industry and services sectors were by far the main contributors, also because of the positive effect of labour reallocation to nonfarm activities. Over the same period, Italy experienced approximately 20 business cycles of varying duration and amplitude. Output fluctuations were dominated by the short-term variability of agricultural production before World War II and by fluctuations of the industry sector thereafter. The cyclical behaviour exhibited by aggregate demand components conforms quite well to that evidenced in the standard international business cycle literature, although some exceptions arise in the pre-World War II years.
Arbuscular mycorrhizal colonisation of roots of grass species differing in invasiveness
Recent research indicates that the soil microbial community, particularly arbuscular mycorrhizal
fungi (AMF), can influence plant invasion in several ways. We tested if 1) invasive species are
colonised by AMF to a lower degree than resident native species, and 2) AMF colonisation of native
plants is lower in a community inhabited by an invasive species than in an uninvaded resident
community. The two tests were run in semiarid temperate grasslands on grass (Poaceae) species,
and the frequency and intensity of mycorrhizal colonisation, and the proportion of arbuscules and
vesicles in plant roots were measured. In the first test, grasses representing three classes of
invasiveness were included: invasive species, resident species becoming abundant upon
disturbance, and non-invasive native species. Each class contained one C3 and one C4 species. The
AMF colonisation of the invasive Calamagrostis epigejos and Cynodon dactylon was consistently
lower than that of the non-invasive native Chrysopogon gryllus and Bromus inermis, and contained
fewer arbuscules than the post-disturbance dominant resident grasses Bothriochloa ischaemum and
Brachypodium pinnatum. The C3 and C4 grasses behaved alike despite their displaced phenologies
in these habitats. The second test compared AMF colonisation for sand grassland dominant grasses
Festuca vaginata and Stipa borysthenica in stands invaded by either C. epigejos or C. dactylon, and
in the uninvaded natural community. Resident grasses showed a lower degree of AMF colonisation in the invaded stands than in the uninvaded natural community, with F. vaginata responding in this way to
both invaders and S. borysthenica responding to C. dactylon only. These results indicate that
invasive grasses supposedly less reliant on AMF symbionts have the capacity of altering the soil
mycorrhizal community in such a way that resident native species can establish a considerably
reduced extent of the beneficial AMF associations, hence their growth, reproduction and ultimately
abundance may decline. Accumulating evidence suggests that such indirect influences of invasive
alien plants on resident native species mediated by AMF or other members of the soil biota are probably more the rule than the exception.
The yield of essential oils in Melaleuca alternifolia (Myrtaceae) is regulated through transcript abundance of genes in the MEP pathway
Medicinal tea tree (Melaleuca alternifolia) leaves contain large amounts of an essential oil, dominated by monoterpenes. Several enzymes of the chloroplastic methylerythritol phosphate (MEP) pathway are hypothesised to act as bottlenecks to the production of monoterpenes. We investigated whether transcript abundance of genes encoding enzymes of the MEP pathway was correlated with foliar terpenes in M. alternifolia, using a population of 48 individuals that ranged in their oil concentration from 39 to 122 mg per g DM. Our study shows that most genes in the MEP pathway are co-regulated and that the expression of multiple genes within the MEP pathway is correlated with oil yield. Using multiple regression analysis, variation in expression of MEP pathway genes explained 87% of variation in foliar monoterpene concentrations. The data also suggest that sesquiterpenes in M. alternifolia are synthesised, at least in part, from isopentenyl pyrophosphate originating from the plastid via the MEP pathway.
