752 research outputs found

    A Taxonomy of Workflow Management Systems for Grid Computing

    Full text link
    With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure

    On combinatorial optimisation in analysis of protein-protein interaction and protein folding networks

    Get PDF
    Abstract: Protein-protein interaction networks and protein folding networks represent prominent research topics at the intersection of bioinformatics and network science. In this paper, we present a study of these networks from combinatorial optimisation point of view. Using a combination of classical heuristics and stochastic optimisation techniques, we were able to identify several interesting combinatorial properties of biological networks of the COSIN project. We obtained optimal or near-optimal solutions to maximum clique and chromatic number problems for these networks. We also explore patterns of both non-overlapping and overlapping cliques in these networks. Optimal or near-optimal solutions to partitioning of these networks into non-overlapping cliques and to maximum independent set problem were discovered. Maximal cliques are explored by enumerative techniques. Domination in these networks is briefly studied, too. Applications and extensions of our findings are discussed

    Customer profile classification using transactional data

    Get PDF
    Customer profiles are by definition made up of factual and transactional data. It is often the case that due to reasons such as high cost of data acquisition and/or protection, only the transactional data are available for data mining operations. Transactional data, however, tend to be highly sparse and skewed due to a large proportion of customers engaging in very few transactions. This can result in a bias in the prediction accuracy of classifiers built using them towards the larger proportion of customers with fewer transactions. This paper investigates an approach for accurately and confidently grouping and classifying customers in bins on the basis of the number of their transactions. The experiments we conducted on a highly sparse and skewed real-world transactional data show that our proposed approach can be used to identify a critical point at which customer profiles can be more confidently distinguished

    Application of Higher-Order Neural Networks to Financial Time-Series Prediction

    Get PDF
    Financial time series data is characterized by non-linearities, discontinuities and high frequency, multi-polynomial components. Not surprisingly, conventional Artificial Neural Networks (ANNs) have difficulty in modelling such complex data. A more appropriate approach is to apply Higher-Order ANNs, which are capable of extracting higher order polynomial coefficients in the data. Moreover, since there is a one-to-one correspondence between network weights and polynomial coefficients, HONNs (unlike ANNs generally) can be considered open-, rather than 'closed box' solutions, and thus hold more appeal to the financial community. After developing Polynomial and Trigonometric HONNs, we introduce the concept of HONN groups. The latter incorporate piecewise continuous activation functions and thresholds, and as a result are capable of modelling discontinuous (piecewise continuous) data, and what's more to any degree of accuracy. Several other PHONN variants are also described. The performance of P(T)HONNs and HONN groups on representative financial time series is described (credit ratings and exchange rates). In short, HONNs offer roughly twice the performance of MLP/BP on financial time series prediction, and HONN groups around 10% further improvement

    ELVIS: Entertainment-led video summaries

    Get PDF
    © ACM, 2010. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Multimedia Computing, Communications, and Applications, 6(3): Article no. 17 (2010) http://doi.acm.org/10.1145/1823746.1823751Video summaries present the user with a condensed and succinct representation of the content of a video stream. Usually this is achieved by attaching degrees of importance to low-level image, audio and text features. However, video content elicits strong and measurable physiological responses in the user, which are potentially rich indicators of what video content is memorable to or emotionally engaging for an individual user. This article proposes a technique that exploits such physiological responses to a given video stream by a given user to produce Entertainment-Led VIdeo Summaries (ELVIS). ELVIS is made up of five analysis phases which correspond to the analyses of five physiological response measures: electro-dermal response (EDR), heart rate (HR), blood volume pulse (BVP), respiration rate (RR), and respiration amplitude (RA). Through these analyses, the temporal locations of the most entertaining video subsegments, as they occur within the video stream as a whole, are automatically identified. The effectiveness of the ELVIS technique is verified through a statistical analysis of data collected during a set of user trials. Our results show that ELVIS is more consistent than RANDOM, EDR, HR, BVP, RR and RA selections in identifying the most entertaining video subsegments for content in the comedy, horror/comedy, and horror genres. Subjective user reports also reveal that ELVIS video summaries are comparatively easy to understand, enjoyable, and informative

    Enabling Global Price Comparison through Semantic Integration of Web Data

    Get PDF
    “Sell Globally” and “Shop Globally” have been seen as a potential benefit of web-enabled electronic business. One important step toward realizing this benefit is to know how things are selling in various parts of the world. A global price comparison service would address this need. But there have not been many such services. In this paper, we use a case study of global price dispersion to illustrate the need and the value of a global price comparison service. Then we identify and discuss several technology challenges, including semantic heterogeneity, in providing a global price comparison service. We propose a mediation architecture to address the semantic heterogeneity problem, and demonstrate the feasibility of the proposed architecture by implementing a prototype that enables global price comparison using data from web sources in several countries

    Improving National and Homeland Security through a proposed Laboratory for Information Globalization and Harmonization Technologies (LIGHT)

    Get PDF
    A recent National Research Council study found that: "Although there are many private and public databases that contain information potentially relevant to counter terrorism programs, they lack the necessary context definitions (i.e., metadata) and access tools to enable interoperation with other databases and the extraction of meaningful and timely information" [NRC02, p.304, emphasis added] That sentence succinctly describes the objectives of this project. Improved access and use of information are essential to better identify and anticipate threats, protect against and respond to threats, and enhance national and homeland security (NHS), as well as other national priority areas, such as Economic Prosperity and a Vibrant Civil Society (ECS) and Advances in Science and Engineering (ASE). This project focuses on the creation and contributions of a Laboratory for Information Globalization and Harmonization Technologies (LIGHT) with two interrelated goals: (1) Theory and Technologies: To research, design, develop, test, and implement theory and technologies for improving the reliability, quality, and responsiveness of automated mechanisms for reasoning and resolving semantic differences that hinder the rapid and effective integration (int) of systems and data (dmc) across multiple autonomous sources, and the use of that information by public and private agencies involved in national and homeland security and the other national priority areas involving complex and interdependent social systems (soc). This work builds on our research on the COntext INterchange (COIN) project, which focused on the integration of diverse distributed heterogeneous information sources using ontologies, databases, context mediation algorithms, and wrapper technologies to overcome information representational conflicts. The COIN approach makes it substantially easier and more transparent for individual receivers (e.g., applications, users) to access and exploit distributed sources. Receivers specify their desired context to reduce ambiguities in the interpretation of information coming from heterogeneous sources. This approach significantly reduces the overhead involved in the integration of multiple sources, improves data quality, increases the speed of integration, and simplifies maintenance in an environment of changing source and receiver context - which will lead to an effective and novel distributed information grid infrastructure. This research also builds on our Global System for Sustainable Development (GSSD), an Internet platform for information generation, provision, and integration of multiple domains, regions, languages, and epistemologies relevant to international relations and national security. (2) National Priority Studies: To experiment with and test the developed theory and technologies on practical problems of data integration in national priority areas. Particular focus will be on national and homeland security, including data sources about conflict and war, modes of instability and threat, international and regional demographic, economic, and military statistics, money flows, and contextualizing terrorism defense and response. Although LIGHT will leverage the results of our successful prior research projects, this will be the first research effort to simultaneously and effectively address ontological and temporal information conflicts as well as dramatically enhance information quality. Addressing problems of national priorities in such rapidly changing complex environments requires extraction of observations from disparate sources, using different interpretations, at different points in times, for different purposes, with different biases, and for a wide range of different uses and users. This research will focus on integrating information both over individual domains and across multiple domains. Another innovation is the concept and implementation of Collaborative Domain Spaces (CDS), within which applications in a common domain can share, analyze, modify, and develop information. Applications also can span multiple domains via Linked CDSs. The PIs have considerable experience with these research areas and the organization and management of such large scale international and diverse research projects. The PIs come from three different Schools at MIT: Management, Engineering, and Humanities, Arts & Social Sciences. The faculty and graduate students come from about a dozen nationalities and diverse ethnic, racial, and religious backgrounds. The currently identified external collaborators come from over 20 different organizations and many different countries, industrial as well as developing. Specific efforts are proposed to engage even more women, underrepresented minorities, and persons with disabilities. The anticipated results apply to any complex domain that relies on heterogeneous distributed data to address and resolve compelling problems. This initiative is supported by international collaborators from (a) scientific and research institutions, (b) business and industry, and (c) national and international agencies. Research products include: a System for Harmonized Information Processing (SHIP), a software platform, and diverse applications in research and education which are anticipated to significantly impact the way complex organizations, and society in general, understand and manage critical challenges in NHS, ECS, and ASE

    Improving National and Homeland Security through a proposed Laboratory for nformation Globalization and Harmonization Technologies (LIGHT)

    Get PDF
    A recent National Research Council study found that: "Although there are many private and public databases that contain information potentially relevant to counter terrorism programs, they lack the necessary context definitions (i.e., metadata) and access tools to enable interoperation with other databases and the extraction of meaningful and timely information" [NRC02, p.304, emphasis added] That sentence succinctly describes the objectives of this project. Improved access and use of information are essential to better identify and anticipate threats, protect against and respond to threats, and enhance national and homeland security (NHS), as well as other national priority areas, such as Economic Prosperity and a Vibrant Civil Society (ECS) and Advances in Science and Engineering (ASE). This project focuses on the creation and contributions of a Laboratory for Information Globalization and Harmonization Technologies (LIGHT) with two interrelated goals: (1) Theory and Technologies: To research, design, develop, test, and implement theory and technologies for improving the reliability, quality, and responsiveness of automated mechanisms for reasoning and resolving semantic differences that hinder the rapid and effective integration (int) of systems and data (dmc) across multiple autonomous sources, and the use of that information by public and private agencies involved in national and homeland security and the other national priority areas involving complex and interdependent social systems (soc). This work builds on our research on the COntext INterchange (COIN) project, which focused on the integration of diverse distributed heterogeneous information sources using ontologies, databases, context mediation algorithms, and wrapper technologies to overcome information representational conflicts. The COIN approach makes it substantially easier and more transparent for individual receivers (e.g., applications, users) to access and exploit distributed sources. Receivers specify their desired context to reduce ambiguities in the interpretation of information coming from heterogeneous sources. This approach significantly reduces the overhead involved in the integration of multiple sources, improves data quality, increases the speed of integration, and simplifies maintenance in an environment of changing source and receiver context - which will lead to an effective and novel distributed information grid infrastructure. This research also builds on our Global System for Sustainable Development (GSSD), an Internet platform for information generation, provision, and integration of multiple domains, regions, languages, and epistemologies relevant to international relations and national security. (2) National Priority Studies: To experiment with and test the developed theory and technologies on practical problems of data integration in national priority areas. Particular focus will be on national and homeland security, including data sources about conflict and war, modes of instability and threat, international and regional demographic, economic, and military statistics, money flows, and contextualizing terrorism defense and response. Although LIGHT will leverage the results of our successful prior research projects, this will be the first research effort to simultaneously and effectively address ontological and temporal information conflicts as well as dramatically enhance information quality. Addressing problems of national priorities in such rapidly changing complex environments requires extraction of observations from disparate sources, using different interpretations, at different points in times, for different purposes, with different biases, and for a wide range of different uses and users. This research will focus on integrating information both over individual domains and across multiple domains. Another innovation is the concept and implementation of Collaborative Domain Spaces (CDS), within which applications in a common domain can share, analyze, modify, and develop information. Applications also can span multiple domains via Linked CDSs. The PIs have considerable experience with these research areas and the organization and management of such large scale international and diverse research projects. The PIs come from three different Schools at MIT: Management, Engineering, and Humanities, Arts & Social Sciences. The faculty and graduate students come from about a dozen nationalities and diverse ethnic, racial, and religious backgrounds. The currently identified external collaborators come from over 20 different organizations and many different countries, industrial as well as developing. Specific efforts are proposed to engage even more women, underrepresented minorities, and persons with disabilities. The anticipated results apply to any complex domain that relies on heterogeneous distributed data to address and resolve compelling problems. This initiative is supported by international collaborators from (a) scientific and research institutions, (b) business and industry, and (c) national and international agencies. Research products include: a System for Harmonized Information Processing (SHIP), a software platform, and diverse applications in research and education which are anticipated to significantly impact the way complex organizations, and society in general, understand and manage critical challenges in NHS, ECS, and ASE

    Security and Privacy of Protocols and Software with Formal Methods

    Get PDF
    International audienceThe protection of users' data conforming to best practice and legislation is one of the main challenges in computer science. Very often, large-scale data leaks remind us that the state of the art in data privacy and anonymity is severely lacking. The complexity of modern systems make it impossible for software architect to create secure software that correctly implements privacy policies without the help of automated tools. The academic community needs to invest more effort in the formal modelization of security and anonymity properties, providing a deeper understanding of the underlying concepts and challenges and allowing the creation of automated tools to help software architects and developers. This track provides numerous contributions to the formal modeling of security and anonymity properties and the creation of tools to verify them on large-scale software projects
    • 

    corecore