Recursive tree traversal dependence analysis
While there has been much work done on analyzing and transforming regular programs that operate over linear arrays and dense matrices, comparatively little has been done to carry these optimizations over to programs that operate over heap-based data structures using pointers. Previous work has shown that point blocking, a technique similar to loop tiling in regular programs, can help increase the temporal locality of repeated tree traversals. Point blocking, however, has only been shown to work on tree traversals where each traversal is fully independent and would allow parallelization, greatly limiting the types of applications to which this transformation could be applied. The purpose of this study is to develop a new framework for analyzing recursive methods that perform traversals over trees, called tree dependence analysis. This analysis translates dependence analysis techniques for regular programs to the irregular space, identifying the structure of dependences within a recursive method that traverses trees. In this study, a dependence test that exploits the dependence structure of such programs is developed, and is shown to be able to prove the legality of several locality- and parallelism-enhancing transformations, including point blocking. In addition, the analysis is extended with a novel path-dependent, conditional analysis to refine the dependence test and prove the legality of transformations for a wider range of algorithms. These analyses are then used to show that several common algorithms that manipulate trees recursively are amenable to several locality- and parallelism-enhancing transformations. This work shows that classical dependence analysis techniques, which have largely been confined to nested loops over array data structures, can be extended and translated to work for complex, recursive programs that operate over pointer-based data structures.
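The point blocking transformation the abstract refers to can be illustrated with a minimal sketch (hypothetical code, not the authors' framework; the per-node "computation" here is just an equality test): rather than repeating a full tree traversal for every query point, a whole block of points is carried down the tree, so each node is visited once per block instead of once per point, improving temporal locality.

```python
# Minimal sketch of point blocking for repeated tree traversals.
# Hypothetical illustration only, not the paper's framework.

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def traverse(node, point, hits):
    """Baseline: one full tree traversal per query point."""
    if node is None:
        return
    if node.value == point:
        hits.append(node.value)
    traverse(node.left, point, hits)
    traverse(node.right, point, hits)

def traverse_blocked(node, points, hits):
    """Point blocking: carry a block of points down the tree together,
    so each node is loaded once per block instead of once per point."""
    if node is None or not points:
        return
    for p in points:                 # every point visits this node now
        if node.value == p:
            hits.append(node.value)
    traverse_blocked(node.left, points, hits)
    traverse_blocked(node.right, points, hits)
```

Both versions compute the same set of hits; blocking only reorders the node visits, which is exactly why a dependence analysis is needed to show the reordering is legal when the traversals are not fully independent.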
Bayesian model averaging over tree-based dependence structures for multivariate extremes
Describing the complex dependence structure of extreme phenomena is
particularly challenging. To tackle this issue we develop a novel statistical
algorithm that describes extremal dependence taking advantage of the inherent
hierarchical dependence structure of the max-stable nested logistic
distribution and that identifies possible clusters of extreme variables using
reversible jump Markov chain Monte Carlo techniques. Parsimonious
representations are achieved when clusters of extreme variables are found to be
completely independent. Moreover, we significantly decrease the computational
complexity of full likelihood inference by deriving a recursive formula for the
nested logistic model likelihood. The algorithm performance is verified through
extensive simulation experiments which also compare different likelihood
procedures. The new methodology is used to investigate the dependence
relationships between extreme concentration of multiple pollutants in
California and how these pollutants are related to extreme weather conditions.
Overall, we show that our approach allows for the representation of complex
extremal dependence structures and has valid applications in multivariate data
analysis, such as air pollution monitoring, where it can guide policymaking.
Measuring association with recursive rank binning
Pairwise measures of dependence are a common tool to map data in the early
stages of analysis with several modern examples based on maximized partitions
of the pairwise sample space. Following a short survey of modern measures of
dependence, we introduce a new measure which recursively splits the ranks of a pair of variables to partition the sample space and computes the χ² statistic on the resulting bins. Splitting logic is detailed for splits maximizing a score function and randomly selected splits. Simulations indicate that random splitting produces a statistic conservatively approximated by the χ² distribution without a loss of power to detect numerous different data
patterns compared to maximized binning. Though it seems to add no power to
detect dependence, maximized recursive binning is shown to produce a natural
visualization of the data and the measure. Applying maximized recursive rank
binning to S&P 500 constituent data suggests the automatic detection of tail
dependence.
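A rough illustration of the recursive binning idea (with an assumed midpoint-splitting rule for simplicity; the paper's splits maximize a score function or are chosen at random): the rank rectangle is halved recursively, alternating axes, and a χ² statistic compares the observed bin counts with the equal counts expected under independence, since the ranks of independent variables are uniformly spread over the rectangle.

```python
# Rough sketch of recursive rank binning. The midpoint splitting rule
# and fixed depth are illustrative assumptions, not the paper's
# maximized or random splitting logic.

def ranks(values):
    """Map each value to its rank (0 .. n-1)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def recursive_bins(points, rect, depth=0, max_depth=2):
    """Recursively halve the rank rectangle, alternating axes, and
    return the number of points in each equal-area leaf bin."""
    (x0, x1), (y0, y1) = rect
    if depth == max_depth:
        return [len(points)]
    if depth % 2 == 0:                       # split on x ranks
        mid = (x0 + x1) / 2
        lo = [p for p in points if p[0] < mid]
        hi = [p for p in points if p[0] >= mid]
        return (recursive_bins(lo, ((x0, mid), (y0, y1)), depth + 1, max_depth)
                + recursive_bins(hi, ((mid, x1), (y0, y1)), depth + 1, max_depth))
    mid = (y0 + y1) / 2                      # split on y ranks
    lo = [p for p in points if p[1] < mid]
    hi = [p for p in points if p[1] >= mid]
    return (recursive_bins(lo, ((x0, x1), (y0, mid)), depth + 1, max_depth)
            + recursive_bins(hi, ((x0, x1), (mid, y1)), depth + 1, max_depth))

def chi2_stat(counts):
    """Pearson chi-squared of observed bin counts against the equal
    expected counts implied by independent (uniform) ranks."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)
```

Perfectly dependent data (y = x, hence identical ranks) concentrates the counts in the diagonal bins and yields a large statistic, whereas data spread evenly over the rank rectangle yields a statistic near zero.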
XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows many adjunctions or crossings of
data sources, without modifying existing data sets, while allowing possible
structural evolution. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows both exhaustive warehousing and recursive queries over content, with far less dependence on the initial storage. We finally present the possibility of exporting the data stored in the warehouse to commonly-used advanced software devoted to sociological analysis.
US Disposable Personal Income and Housing Price Index: A Fractional Integration Analysis
This paper examines the relationship between US disposable personal income (DPI) and house price index (HPI) during the last twenty years applying fractional integration and long-range dependence techniques to monthly data from January 1991 to July 2010. The empirical findings indicate that the stochastic properties of the two series are such that cointegration cannot hold between them, as mean reversion occurs in the case of DPI but not of HPI. Also, recursive analysis shows that the estimated fractional parameter is relatively stable over time for DPI whilst it increases throughout the sample for HPI. Interestingly, the estimates tend to converge toward the unit root case after 2008 once the bubble had burst. The implications for explaining the recent financial crisis and choosing appropriate policy actions are discussed.
Keywords: personal disposable income, house price index, fractional integration
Parallelizing irregular C codes assisted by interprocedural shape analysis
In the new multicore architecture arena, the problem of improving the performance of a code is more on the software side than on the hardware one. However, optimizing irregular dynamic data structure based codes for such architectures is not easy, either by hand or compiler assisted. Regarding this last approach, shape analysis is a static technique that achieves abstraction of dynamic memory and can help to disambiguate, quite accurately, memory references in programs that create and traverse recursive data structures. This kind of analysis has promising applicability for accurate data dependence tests in loops or recursive functions that traverse dynamic data structures. However, support for interprocedural programs in shape analysis is still a challenge, especially in the presence of recursive functions. In this work we present a novel fully context-sensitive interprocedural shape analysis algorithm that supports recursion and can be used to uncover parallelism. Our approach is based on three key ideas: i) intraprocedural support based on “Coexistent Links Sets” to precisely describe the memory configurations during the abstract interpretation of the C code; ii) interprocedural support based on “Recursive Flow Links” to trace the state of pointers in previous calls; and iii) annotations of the read/written heap locations during the program analysis. We present preliminary experiments that reveal that our technique compares favorably with related work, and obtains precise memory abstractions in a variety of recursive programs that create and manipulate dynamic data structures. We have also implemented a data dependence test over our interprocedural shape analysis. With this test we have obtained promising results, automatically detecting parallelism in three C codes, which have been successfully parallelized.
A MULTIVARIATE I(2) COINTEGRATION ANALYSIS OF GERMAN HYPERINFLATION
This paper re-examines the Cagan model of German hyperinflation during the 1920s under the twin hypotheses that the system contains variables that are I(2) and that a linear trend is required in the cointegrating relations. Using the recently developed I(2) cointegration analysis of Johansen (1992, 1995, 1997), extended by Paruolo (1996) and Rahbek et al. (1999), we find that the linear trend hypothesis is rejected for the sample. However, we provide conclusive evidence that money supply and the price level have a common I(2) component. Then, the validity of Cagan’s model is tested via a transformation of the I(2) model to an I(1) model between real money balances and money growth or inflation. This transformation is not imposed on the data but is shown to satisfy the statistical property of polynomial cointegration. Evidence is obtained in favor of cointegration between the two sets of variables, which is, however, weakened by the sample dependence of the trace test revealed by the application of recursive stability tests for cointegrated VAR models.
Keywords: I(2) analysis, hyperinflation, cointegration, identification, temporal stability
Generalized Points-to Graphs: A New Abstraction of Memory in the Presence of Pointers
Flow- and context-sensitive points-to analysis is difficult to scale; for
top-down approaches, the problem centers on repeated analysis of the same
procedure; for bottom-up approaches, the abstractions used to represent
procedure summaries have not scaled while preserving precision.
We propose a novel abstraction called the Generalized Points-to Graph (GPG)
which views points-to relations as memory updates and generalizes them using
the counts of indirection levels, leaving the unknown pointees implicit. This
allows us to construct GPGs as compact representations of bottom-up procedure
summaries in terms of memory updates and control flow between them. Their
compactness is ensured by the following optimizations: strength reduction
reduces the indirection levels, redundancy elimination removes redundant memory
updates and minimizes control flow (without over-approximating data dependence
between memory updates), and call inlining enhances the opportunities of these
optimizations. We devise novel operations and data flow analyses for these
optimizations.
Our quest for scalability of points-to analysis leads to the following
insight: The real killer of scalability in program analysis is not the amount
of data but the amount of control flow that it may be subjected to in search of
precision. The effectiveness of GPGs lies in the fact that they discard as much
control flow as possible without losing precision (i.e., by preserving data
dependence without over-approximation). This is the reason why the GPGs are
very small even for main procedures that contain the effect of the entire
program. This allows our implementation to scale to 158kLoC for C programs.
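The memory-update view of points-to relations can be sketched roughly as follows; the GPU record and the single substitution rule below are a simplified illustration of the abstraction only, not the paper's actual representation or its full set of GPG operations.

```python
# Simplified sketch of a "generalized points-to update" (GPU): each side
# of an assignment is a variable plus an indirection count, so pointees
# unknown at this program point stay implicit. Illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class GPU:
    lhs: str
    i: int      # indirection level on the left-hand side
    rhs: str
    j: int      # indirection level on the right-hand side

# C statements map to GPUs without naming any pointees:
#   x = &y  ->  GPU("x", 1, "y", 0)
#   x = y   ->  GPU("x", 1, "y", 1)
#   x = *y  ->  GPU("x", 1, "y", 2)
#   *x = y  ->  GPU("x", 2, "y", 1)

def reduce_with(consumer, producer):
    """One strength-reduction step: if the producer defines the
    consumer's source at a lower indirection level, substitute it,
    lowering the consumer's right-hand indirection count."""
    if consumer.rhs == producer.lhs and consumer.j >= producer.i:
        return GPU(consumer.lhs, consumer.i,
                   producer.rhs, producer.j + (consumer.j - producer.i))
    return consumer
```

For example, composing x = y with the earlier update y = &z yields the reduced update x = &z, eliminating one level of indirection without ever naming an intermediate pointee.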