A model to compare cloud and non-cloud storage of Big Data
When comparing Cloud and non-Cloud Storage it can be difficult to ensure that the comparison is fair. In this paper we examine the process of setting up such a comparison and the metric used. Performance comparisons on Cloud and non-Cloud systems, deployed for biomedical scientists, have been conducted to identify improvements in efficiency and performance. Prior to the experiments, network latency, file size and job failures were identified as factors which degrade performance, and experiments were conducted to understand their impacts. Organizational Sustainability Modeling (OSM) is used before, during and after the experiments to ensure fair comparisons are achieved. OSM defines the actual and expected execution time and risk control rates, and is used to understand key outputs related to both Cloud and non-Cloud experiments. Forty experiments on both Cloud and non-Cloud systems were undertaken with two case studies. The first case study focused on transferring and backing up 10,000 files of 1 GB each, and the second case study focused on transferring and backing up 1,000 files of 10 GB each. Results showed that first, the actual and expected execution time on the Cloud was lower than on the non-Cloud system. Second, there was more than 99% consistency between the actual and expected execution time on the Cloud, while no comparable consistency was found on the non-Cloud system. Third, the improvement in efficiency was higher on the Cloud than on the non-Cloud system. OSM is the metric used to analyze the collected data, and it provided synthesis and insights for the data analysis and visualization of the two case studies.
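As a quick check on the two case studies described above, the sketch below computes the total data volume moved in each. It also includes a hypothetical `consistency` helper, which assumes that consistency means one minus the relative gap between actual and expected execution time. The abstract does not state OSM's formula, so that definition and the sample timings are illustrative assumptions only, not the paper's method.

```python
def total_volume_gb(num_files: int, size_gb: int) -> int:
    """Total data volume, in GB, for a batch of equally sized files."""
    return num_files * size_gb


def consistency(actual: float, expected: float) -> float:
    """Assumed definition (NOT taken from the paper):
    1 - |actual - expected| / expected."""
    if expected <= 0:
        raise ValueError("expected execution time must be positive")
    return 1.0 - abs(actual - expected) / expected


# File counts and sizes are the ones stated in the abstract.
case_1_gb = total_volume_gb(10_000, 1)   # 10,000 files of 1 GB each
case_2_gb = total_volume_gb(1_000, 10)   # 1,000 files of 10 GB each
print(case_1_gb, case_2_gb)              # both case studies total 10000 GB

# Made-up timings, purely to show how the helper would be called.
sample = consistency(actual=99.5, expected=100.0)
print(sample > 0.99)
```

Both case studies move the same total volume, so on these figures they differ only in file count and file size.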
A Methodology for Engineering Collaborative and ad-hoc Mobile Applications using SyD Middleware
Today’s web applications are more collaborative and utilize standard and ubiquitous Internet protocols. We have earlier developed System on Mobile Devices (SyD) middleware to rapidly develop and deploy collaborative applications over heterogeneous and possibly mobile devices hosting web objects. In this paper, we present the software engineering methodology for developing SyD-enabled web applications and illustrate it through a case study on two representative applications: (i) a calendar-of-meetings application, which is a collaborative application, and (ii) a travel application, which is an ad-hoc collaborative application. SyD-enabled web objects allow us to create a collaborative application rapidly with limited coding effort. In this case study, the modular software architecture allowed us to hide the inherent heterogeneity among devices, data stores, and networks by presenting a uniform and persistent object view of mobile objects interacting through XML/SOAP requests and responses. The performance results we obtained show that the application scales well as we increase the group size and adapts well within the constraints of mobile devices.
Improving Understanding of Forest Communities and Biodiversity with Multi-Dimensional Landscape Gradients
This dissertation was motivated by a desire to understand the effects of habitat degradation and urbanization on a single species in a single study system in western Massachusetts, the red-backed salamander (Plethodon cinereus), but along the way unexpected conceptual and methodological hurdles caused the work to grow into a multi-species, multi-region, and multi-scale endeavor. As I designed my dissertation research and began considering approaches to quantifying heterogeneity and human influence in my study landscape, I recognized inconsistencies in methods used to define and quantify landscape metrics, particularly in urban systems. To investigate further, I conducted a critical review of the literature to describe the current practices of landscape quantification in urban systems and to identify any patterns or trends. The review highlighted the fact that variability among definitions of ‘urban’ stems from inconsistent decision making around a set of core principles in landscape ecology, and I used these to establish a standardizing framework for landscape gradient quantification. I then applied this framework to 10 ecologically distinct metro-regions across the United States and revealed a consistent pair of gradients that offer an updated multi-dimensional perspective of landscape heterogeneity that intuitively advances the one-dimensional perspective dominating existing approaches to studying ecological responses across gradients of human influence. Having developed a framework for gradient definition, and extending the single-axis lens through which ecological enquiry is made, I applied these approaches to first investigate environmental drivers of avian community size and structure, and second, to critically evaluate the validity of the red-backed salamander as an indicator for biodiversity in human-dominated landscapes.
Inconsistencies in definitions of “urbanization” are commonly attributed to the lack of general theory describing ecosystem function in urban landscapes. In Chapter 1, I review the literature on urban landscape quantification to identify patterns and best practices that could improve the process by which urban landscape gradients are defined and quantified. This review of 250 research articles revealed striking methodological consistency that aligns with the best practices of gradient definition in landscape ecology: (1) selection of features to represent the urban landscape, (2) identification of associated spatial data to characterize these features, and (3) selection of an ecologically appropriate spatial scale. However, the review also highlighted apparent inconsistencies in urban gradient definition that arise from ad-hoc and ambiguous decision making at each of these stages, and demonstrated that ecologically justified and transparent decision making can standardize gradient definition and contribute to improved understanding of ecosystem processes in human-dominated landscapes (Padilla & Sutherland, 2019). In Chapter 2, I address the lack of standardized heterogeneity metrics that can be used to jointly measure multi-regional ecological responses, a gap that has hindered generalization about the effects of urban stressors on ecological communities. I coupled the transparent methodological framework developed in Chapter 1 with a multivariate statistical analysis of land use data to quantify landscape structure in 10 medium-sized cities representing the dominant ecoregions of the United States to determine whether consistent and biologically meaningful landscape metrics emerge across spatial domains. This work revealed two dominant axes of spatial variation that are intuitively consistent with the characteristics of human-dominated landscape mosaics but are overlooked when defining landscapes along a single axis of variation.
In the context of representative landscapes in the United States, these gradients describe variation in the characteristic physical (soft to hard) and natural (brown to green) structure of landscapes influenced by human activity. To develop the ecological relevance of the dual-axis landscape definition, I explored the response of American robin (Turdus migratorius) occupancy to these gradients across the 10 cities. This case study demonstrated that robins generally respond similarly and strongly to both landscape axes and that a multi-dimensional perspective reveals ecological nuance that may otherwise be overlooked.
In Chapter 3, I apply the concepts developed in the previous two chapters to my study system in western Massachusetts. I tested two leading theories regarding how habitat fragmentation in human-dominated landscapes impacts species communities: island biogeography theory and spatial heterogeneity. In the case of island biogeography, I expected species diversity to decline linearly as the degree of fragmentation and human modification of the landscape increased, whereas under spatial heterogeneity I expected a quadratic response in which species diversity is greatest in moderately disturbed landscape mosaics. These hypotheses were evaluated with data on the bird communities collected at 42 sites in a 3-year field study, which were analyzed using a hierarchical model that allows for estimation of site-specific abundance of each species and species richness while simultaneously accounting for imperfect detection. This analysis revealed a strong non-linear community response to both axes of the multi-dimensional landscape (soft-hard and brown-green), which suggested that increased heterogeneity promotes higher species abundance as well as species richness. At the species level, there was variation that corresponded with variation in known habitat preferences and life history traits. These results suggest that variation in species richness follows expectations of the spatial heterogeneity hypothesis, which predicts greatest diversity in moderately disturbed landscape mosaics. I hypothesize that this process results from a greater diversity of habitat types available in landscape mosaics, and greater structural complexity within forest fragments that are characteristic of heterogeneous mosaics.
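The two competing predictions above differ mainly in where diversity peaks along a disturbance gradient. The sketch below illustrates only that difference in shape. The gradient is rescaled to run from 0 to 1, and the function names and all coefficients are hypothetical values chosen for illustration; none of them come from the dissertation or from its hierarchical model.

```python
def linear_response(x, intercept=20.0, slope=-10.0):
    """Island-biogeography-style prediction: diversity falls steadily
    as disturbance x increases (coefficients are made up)."""
    return intercept + slope * x


def quadratic_response(x, intercept=10.0, linear=40.0, quadratic=-40.0):
    """Spatial-heterogeneity-style prediction: diversity is highest at
    intermediate disturbance (coefficients are made up)."""
    return intercept + linear * x + quadratic * x ** 2


def peak_location(response, steps=100):
    """Disturbance level in [0, 1] at which the response is largest,
    found by scanning an evenly spaced grid."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=response)


print(peak_location(linear_response))     # 0.0: peak at the least disturbed end
print(peak_location(quadratic_response))  # 0.5: peak at intermediate disturbance
```

With real data the two shapes would be compared by fitting both forms and checking the sign and support for the squared term; this sketch only shows what each hypothesis predicts.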
Finally, in Chapter 4 I provide a rare empirical assessment of the indicator species concept. Specifically, I evaluate the red-backed salamander (Plethodon cinereus) as an indicator of forest biodiversity in human-dominated landscapes. During my 3-year field study, in addition to avian community data, I collected occurrence and abundance data for trees, soil invertebrates and red-backed salamanders at each of the 42 sites. These data were analyzed using a joint species distribution model to evaluate the salamander’s indicator potential under the premise that species within a community will generally exhibit a shared response to gradients of human influence, and that an ideal indicator species represents an exemplar of the shared community response. I compared this novel approach to indicator species selection with a commonly used metric for identifying indicator species. Despite the frequency with which salamanders are promoted as indicators of forest condition, my results provided no evidence that they are effective indicators for biodiversity based on established conceptual underpinnings of indicator species. As with the avian community, biodiversity showed a non-linear response to the dual axes of human influence, with richness highest in heterogeneous landscapes. Species that were identified as candidate indicators were species characteristic of edge habitat and dense forests, which are common in human-dominated landscape mosaics.
In summary, my dissertation provides much-needed methodological improvements to landscape gradient quantification in human-dominated systems and demonstrates the applicability of this framework both at a national scale, as demonstrated across the United States, and at the local scale, as demonstrated in my field system in western Massachusetts. This framework results in a multi-dimensional perspective of landscape heterogeneity that does a better job of representing complex landscapes than single-axis measures, which confound two intuitive gradients of human influence. I have demonstrated how such a multi-dimensional perspective sheds light on the processes driving the landscape-scale patterns of biodiversity and can be used to build and evaluate process-based conceptual models for identifying indicator species. In doing so, this work presents a standardizing framework for landscape gradient quantification in human-dominated landscapes, an identification of the existence of unifying measures of human influence, and a demonstration of how coupling this approach with a multi-dimensional perspective offers a general framework for understanding spatial variation in ecological communities that exist in human-dominated landscape mosaics.
Probabilistic Personalized Recommendation Models For Heterogeneous Social Data
Content recommendation has risen to a new dimension with the advent of platforms like Twitter, Facebook, FriendFeed, Dailybooth, and Instagram. Although this surge of data has provided us with a goldmine of real-world information, the problem of information overload has become a major barrier in developing predictive models. Therefore, the objective of this thesis is to propose various recommendation, prediction and information retrieval models that are capable of leveraging such vast heterogeneous content. More specifically, this thesis focuses on proposing models based on probabilistic generative frameworks for the following tasks: (a) recommending backers and projects in the Kickstarter crowdfunding domain and (b) point-of-interest recommendation in Foursquare. Through a comprehensive set of experiments over a variety of datasets, we show that our models are capable of providing practically useful results for recommendation and information retrieval tasks.
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research.
Comment: 29 pages, 15 figures