Using MapReduce Streaming for Distributed Life Simulation on the Cloud
Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
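As a rough illustration of how one generation of discrete Life can be phrased as a map phase and a reduce phase, here is a minimal in-memory sketch. It is not the authors' streaming algorithm and does not implement strip partitioning; the function names (`life_map`, `life_reduce`, `life_step`), the unbounded grid, and the in-process grouping that stands in for the shuffle stage are all assumptions made for this example.

```python
from collections import defaultdict


def life_map(cell):
    """Map phase: emit a marker for the live cell and a count of 1 to each neighbor."""
    x, y = cell
    yield (x, y), ("alive", 1)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                yield (x + dx, y + dy), ("neighbor", 1)


def life_reduce(key, values):
    """Reduce phase: apply Conway's rule to all values grouped under one cell key."""
    alive = any(tag == "alive" for tag, _ in values)
    neighbors = sum(count for tag, count in values if tag == "neighbor")
    # Birth on exactly 3 live neighbors; survival on 2 or 3.
    if neighbors == 3 or (alive and neighbors == 2):
        return key
    return None


def life_step(live_cells):
    """Run one generation; the dict grouping stands in for the MR shuffle (assumption)."""
    groups = defaultdict(list)
    for cell in live_cells:
        for key, value in life_map(cell):
            groups[key].append(value)
    return {k for k, vals in groups.items() if life_reduce(k, vals) is not None}


# A horizontal blinker becomes vertical after one step and returns after two.
blinker = {(0, 1), (1, 1), (2, 1)}
next_gen = life_step(blinker)
```

Keying every emitted record by cell coordinate means each reducer call sees everything needed to decide that cell's next state, which is what lets the step run without shared memory.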
Limits of Model Selection, Link Prediction, and Community Detection
Relational data has become increasingly ubiquitous. Networks are rich tools from graph theory that represent real-world interactions through a simple abstract graph of nodes and edges. Network analysis and modeling has attracted wide attention from researchers in various disciplines, such as computer science, social science, biology, economics, electrical engineering, and physics. Network analysis is the study of network topology to answer a variety of application-based questions about the original real-world problem. For example, in social network analysis the questions concern how people interact with each other in online social networks or in collaboration networks, how diseases propagate or how information flows through a network, or how to control a disease or food outbreak. In electric networks like power grids or in internet networks, the questions can concern vulnerability assessment, so that operators are prepared for a power outage or internet blackout. In biological network analysis, the questions concern how different diseases are related to each other, which can be useful in discovering new symptoms of diseases and in developing new medicines. The importance of this interdisciplinary area of science stems from its widespread applications, which involve scientists and researchers with a variety of backgrounds and interests.
Although networks are much simpler than the original complex systems, the interactions among the nodes in a real-world network may seem random, and capturing patterns in these entities is not trivial. Many open questions about inference on networks remain, which makes this topic very attractive for researchers in the field. In this dissertation we answer some of these questions in two lines of study: one focused on experimental analyses and one focused on theoretical limitations.
In Chapter 2 we look at community detection, a common graph mining task in network inference, which seeks an unsupervised decomposition of a network into groups based on statistical regularities in network connectivity. Although many such algorithms exist, community detection’s No Free Lunch theorem implies that no algorithm can be optimal across all inputs. However, little is known in practice about how different algorithms over or underfit to real networks, or how to reliably assess such behavior across algorithms. We present a broad investigation of over and underfitting across 16 state-of-the-art community detection algorithms applied to a novel benchmark corpus of 572 structurally diverse real-world networks. We find that (i) algorithms vary widely in the number and composition of communities they find, given the same input; (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks; (iii) algorithmic differences induce wide variation in accuracy on link-based learning tasks; and, (iv) no algorithm is always the best at such tasks across all inputs. Finally, we quantify each algorithm’s overall tendency to over or underfit to network data using a theoretically principled diagnostic, and discuss the implications for future advances in community detection.
In Chapter 3 we investigate the link prediction problem, another important inference task in complex networks with a wide variety of applications. As we observed in Chapter 2, differences among community detection algorithms induce wide variation in accuracy on link prediction tasks. Moreover, although many link prediction techniques exist in the literature, there is still a lack of methodology for analyzing and comparing them. In Chapter 3, we provide a methodological overview of link prediction techniques and present new results on optimal link prediction and on transfer learning for link prediction. In the former, we investiga
A Statistical Approach to the Alignment of fMRI Data
Multi-subject functional Magnetic Resonance Imaging (fMRI) studies are critical. Because anatomical and functional structure varies across subjects, image alignment is necessary. We define a probabilistic model to describe functional alignment. By imposing a prior distribution, the matrix von Mises-Fisher distribution, on the orthogonal transformation parameter, anatomical information is embedded in the estimation of the parameters, i.e., the combination of spatially distant voxels is penalized. Real applications show an improvement in the classification and interpretability of the results compared to various functional alignment methods.
A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium
When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.
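To make the role of ρ in the CAR structure concrete, the sketch below builds the precision matrix of the widely used proper CAR parameterization, Q = τ(D − ρW), where W is a symmetric 0/1 adjacency matrix and D holds the number of neighbors of each site on its diagonal. The abstract does not state which CAR parameterization the authors use, so this form, the function name `car_precision`, and the three-site path graph are assumptions for illustration only. The DAGAR construction is not shown.

```python
def car_precision(adjacency, rho, tau=1.0):
    """Return Q = tau * (D - rho * W) as a list of rows.

    adjacency: symmetric 0/1 matrix W (list of lists), zero diagonal.
    rho: spatial dependence parameter; tau: precision scale.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]  # diagonal of D
    return [
        [
            tau * ((degrees[i] if i == j else 0.0) - rho * adjacency[i][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical example: three sites on a line, 0 - 1 - 2.
W = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
Q = car_precision(W, rho=0.5)
```

In this form ρ scales the off-diagonal entries of the precision matrix rather than a correlation directly, which is one way to see why it lacks the direct neighbor-pair-correlation reading that the abstract attributes to DAGAR.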
(In)formal perceptions and arguments on tourism governance multifaceted concept
A brief exploratory approach to (in)formal perceptions and arguments on tourism governance multifaceted concept.
An analysis of changing dietary trends and the implications for global health.
Worldwide, obesity has reached epidemic proportions and almost tripled between 1975 and 2016. Acknowledging that weight gain is a complex and multifactorial condition involving changes in both dietary and physical activity patterns, this research focuses on the dietary origin of obesity. Given the dearth of empirical literature on food consumption at the global level, the aim of this research is to explore dietary trends and dietary types around the world and their implications for obesity and global health. To this aim, annual food availability data collated by the Food and Agriculture Organisation of the United Nations (FAO) covering the period from 1961 to 2013 for 118 countries are scrutinised by econometric convergence tests, clustering techniques and spatial analysis. Results indicate that countries with lower levels of initial calories tend to exhibit higher growth rates of calorie consumption. However, this process is not homogeneous across countries. Low-income countries have converged at the fastest pace and the convergence rate reduces as income rises. In addition, the dietary convergence is conditioned by a range of structural indicators including agroecological, demographic and socio-economic variables. Evidence suggests that economic factors have become a more important determinant of the dietary convergence since the Millennium. Applying innovative fuzzy clustering algorithms which allow multiple diets to coexist within a single country, several dietary trends and dietary types are detected. While the identified clusters are all associated with relentlessly increasing calories and deteriorating dietary healthiness over the past half a century, the most calorific cluster has shown signs of stabilising calorie consumption. A notable contribution of this research is the examination of spatial patterns of global food consumption using both traditional and non-traditional measures of spatial proximity.
Differing from the earlier literature emphasising the role of geographical closeness, this research utilises an economic indicator for proximity and finds that countries with similar income levels tend to have similar diets. Spatial convergence analysis reveals a convergence process that is about three times as fast as the non-spatial model; thus, ignoring the spatial relationship leads to biased results. Incorporating the spatial dimension in cluster analysis also affects the clustering results dramatically. Cluster profiling shows that only the segment of more educated and health-aware populations exhibits the behavioural changes towards better diets, hence underlining the importance of improved education and access to knowledge. The finding that dietary evolutions are ‘spatially’ dependent provides a basis for the development of group-specific interventions that target populations at risk of worsening diets. While these policy measures are often place-based, this research lays foundations for the implementation of coherent food policies beyond geographical boundaries. Overall, this research highlights that healthier diets are possible, but we need to act now. As we are living longer but not necessarily healthier, current attempts to improve diets are obviously inadequate and existing efforts need to be redoubled. This is an urgent message for policymakers considering the sobering fact that no country has been able to significantly reverse the rise in obesity.