    Tracking and modelling information diffusion across interactive online media

    Information spreads rapidly across Web sites, weblogs and online forums. This paper describes the research framework of the IDIOM Project (Information Diffusion across Interactive Online Media), which analyzes this process by identifying redundant content elements, mapping them to an ontological knowledge structure, and tracking their temporal and geographic distribution. Linguists define an idiom as an expression whose meaning differs from the literal meanings of its component words. Similarly, the study of information diffusion promises insights that cannot be inferred from individual network elements. This paper presents the underlying technology, initial results, and the future roadmap for investigating information diffusion based on ontological knowledge structures. Similar projects often focus on particular media or neglect important aspects of human language. This paper addresses these gaps to reveal fundamental mechanisms of information diffusion across media with distinct interactive characteristics.
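
    The redundancy detection step lends itself to a small illustration. The sketch below, which is not the IDIOM Project's actual pipeline, flags a content element as recurring across media by comparing word n-gram "shingles"; the documents, shingle width and similarity threshold are all invented for the example.

```python
# A minimal sketch, not the IDIOM Project's actual pipeline: flag a content
# element as redundant across media by comparing word n-gram "shingles".
def shingles(text, w=3):
    """Return the set of w-word shingles of a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical snippets harvested from media with distinct characteristics.
corpus = [
    ("newswire", "2007-03-01", "the summit produced a draft climate accord"),
    ("weblog",   "2007-03-02",
     "bloggers react as a draft climate accord is produced at the summit"),
]

seed = shingles(corpus[0][2])
for medium, date, text in corpus[1:]:
    sim = jaccard(seed, shingles(text))
    if sim > 0.1:  # threshold chosen purely for illustration
        print(f"element reappears in {medium} on {date} (similarity {sim:.2f})")
```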

    Policies for Web Services

    Web services are predominantly used to implement service-oriented architectures (SOA). However, there are several areas, such as temporal dimensions, real-time communication, streaming, or efficient and flexible file transfers, where web service functionality should be extended. Such extensions can, for example, be achieved using policies. Since there are often alternative ways to provide functionality (e.g., different protocols can be used to transfer data), the WS-Policy standard is especially useful for extending web services with policies: it makes it possible to state the properties under which a service is provided and to express alternative properties explicitly. To extend the functionality of web services, two policies are introduced in this thesis: the temporal policy and the communication policy. The temporal policy is the foundation for adding temporal dimensions to a WS-Policy. The temporal policy itself is not a WS-Policy but an independent policy language that describes temporal dimensions of, and dependencies between, temporal policies and WS-Policies. Switching of protocol dependencies, pricing of services, quality of service, and security are example areas for using a temporal policy. To describe the protocol dependencies of a service for streaming, real-time communication and file transfers, a communication policy can be used. The communication policy is a concrete WS-Policy. With the communication policy, a service can expose the protocols it depends on for communication after its invocation. A web service client thus knows the protocols required to communicate with the service and can evaluate beforehand whether an invocation of the service is reasonable. On top of the newly introduced policies, novel mechanisms and tools are provided to simplify service use and enable flexible and efficient data handling. Furthermore, the end user can be involved in the development process more easily. The Flex-SwA architecture, the first component in this thesis based on the newly introduced policies, implements the actual file transfer and streaming protocols that are described as dependencies in a communication policy. Several communication patterns support flexible handling of the communication. A reference concept enables seamless message forwarding with reduced data movement. Based on the Flex-SwA implementation and the communication policy, usability can be improved, especially in the area of service-oriented Grids, by integrating data transfers into an automatically generated web and Grid service client. The Web and Grid Service Browser is introduced in this thesis as such a generic client. It provides a familiar environment for using services by offering client generation as part of the browser. Data transfers are integrated directly into service invocation, so data transmissions do not have to be performed explicitly. For multimedia MIME types, special plugins allow the consumption of multimedia data. To enable end users to build applications that also leverage high-performance computing resources, the Service-enabled Mashup Editor is presented, which lets the user combine popular web applications with web and Grid services. Again, the communication policy provides the descriptive means for file transfers, and Flex-SwA's reference concept is used for data exchange. To show the applicability of these novel concepts, several use cases from the area of multimedia processing have been selected.
    Based on the temporal policy, the communication policy, Flex-SwA, the Web and Grid Service Browser, and the Service-enabled Mashup Editor, the development of a scalable service-oriented multimedia architecture is presented. The multimedia SOA offers, among other services, a face detection workflow, a video-on-demand service, and an audio resynthesis service. For example, the video-on-demand service describes its dependency on a multicast protocol using a communication policy. A temporal policy is then used to describe a switch from one multicast protocol to another by changing the communication policy at the end of its validity period. After the protocol switch, the Service-enabled Mashup Editor is used as a client for the new multicast protocol. Flex-SwA is used to stream single frames from a frame decoder service to a face detection service (both part of the face detection workflow) and to transfer audio files with the different Flex-SwA communication patterns to an audio resynthesis service. The invocation of the face detection workflow and the audio resynthesis service is realized with the Web and Grid Service Browser.
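
    The core idea of the communication policy, a service advertising alternative protocol dependencies that a client evaluates before invocation, can be illustrated with a small sketch. WS-Policy itself is XML-based; the Python below is only an analogue, and the protocol names are invented.

```python
# Illustrative analogue only: the thesis defines the communication policy as a
# concrete WS-Policy (XML document); here its core idea is modelled with plain
# Python data structures. Each inner list is one policy alternative, and every
# protocol in an alternative is required for that alternative to hold.
communication_policy = [
    ["gridftp"],        # alternative 1: bulk file transfer after invocation
    ["rtp", "rtsp"],    # alternative 2: streaming after invocation
]

client_protocols = {"http", "rtp", "rtsp"}  # what this client supports

def invocation_reasonable(policy, supported):
    """True if the client satisfies at least one policy alternative."""
    return any(all(p in supported for p in alt) for alt in policy)

if invocation_reasonable(communication_policy, client_protocols):
    print("invoke service: a common protocol alternative exists")
else:
    print("skip invocation: no alternative can be satisfied")
```

    This mirrors the evaluation the abstract describes: the client can decide beforehand whether invoking the service is reasonable, without attempting the communication first.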

    A FRAMEWORK FOR BIOPROFILE ANALYSIS OVER GRID

    An important trend in modern medicine is towards the individualisation of healthcare, tailoring care to the needs of the individual. This makes it possible, for example, to personalise diagnosis and treatment to improve outcomes. However, these benefits can only be fully realised if healthcare and ICT resources are exploited (e.g. to provide access to relevant data, analysis algorithms, knowledge and expertise). The grid can potentially play an important role here by allowing the sharing of resources and expertise to improve the quality of care. The integration of the grid and the new concept of the bioprofile represents a new topic in the healthgrid for the individualisation of healthcare. A bioprofile represents a personal dynamic "fingerprint" that fuses together a person's current and past bio-history, biopatterns and prognosis. It combines not just data, but also analysis and predictions of future or likely susceptibility to disease, such as brain diseases and cancer. The creation and use of bioprofiles require the support of a number of healthcare and ICT technologies and techniques, such as medical imaging, electrophysiology and related facilities, analysis tools, data storage and computation clusters. The need to share clinical data, storage and computation resources between different bioprofile centres creates not only local problems but also global ones. Existing ICT technologies are inappropriate for bioprofiling because of the difficulties in using and managing heterogeneous IT resources at different bioprofile centres. The grid, as an emerging resource-sharing concept, fulfils the needs of bioprofiling in several respects, including the discovery, access, monitoring and allocation of distributed bioprofile databases, computation resources, bioprofile knowledge bases, etc. However, the challenge of integrating grid and bioprofile technologies to offer an advanced distributed bioprofile environment for individualised healthcare remains. The aim of this project is to develop a framework for one of the key meta-level bioprofile applications: bioprofile analysis over grid to support individualised healthcare. Bioprofile analysis is a critical part of bioprofiling (i.e. the creation, use and update of bioprofiles). Analysis makes it possible, for example, to extract markers from data for diagnosis and to assess an individual's health status. The framework provides the basis for a "grid-based" solution to the challenge of "distributed bioprofile analysis" in bioprofiling. The main contributions of the thesis are fourfold: A. An architecture for bioprofile analysis over grid. The design of a suitable architecture is fundamental to the development of any ICT system. The architecture creates a means for the categorisation, determination and organisation of core grid components to support the development and use of grid for bioprofile analysis. B. A service model for bioprofile analysis over grid. The service model proposes a service design principle, a service architecture for bioprofile analysis over grid, and a distributed EEG analysis service model. The service design principle addresses the main design considerations behind the service model with respect to usability, flexibility, extensibility, reusability, etc. The service architecture identifies the main categories of services and outlines an approach to organising services to realise the functionalities required by distributed bioprofile analysis applications. The EEG analysis service model demonstrates the utilisation and development of services to enable bioprofile analysis over grid. C. Two grid test-beds and a practical implementation of EEG analysis over grid. The two grid test-beds, the BIOPATTERN grid and PlymGRID, are built on existing grid middleware tools. They provide essential experimental platforms for research in bioprofiling over grid. The work here demonstrates how resources, grid middleware and services can be utilised, organised and implemented to support distributed EEG analysis for the early detection of dementia. The distributed electroencephalography (EEG) analysis environment can be used to support a variety of research activities in EEG analysis. D. A scheme for organising multiple (heterogeneous) descriptions of individual grid entities for knowledge representation of the grid. The scheme solves the compatibility and adaptability problems in managing heterogeneous descriptions (i.e. descriptions using different languages and schemas/ontologies) for the collaborative representation of a grid environment at different scales. It underpins the concept of bioprofile analysis over grid with respect to knowledge-based global coordination between the components of bioprofile analysis over grid.
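
    As a rough illustration of the service model in contribution B, the sketch below spreads per-patient EEG analyses across service instances. All names, signatures and the "marker" computation are invented for illustration; the BIOPATTERN grid and PlymGRID test-beds rely on real grid middleware instead.

```python
# Hypothetical sketch of distributed EEG analysis: an orchestrating client
# hands per-patient signals to analysis service instances and collects a
# marker per patient. Not the thesis's actual API.
import statistics

class EEGAnalysisService:
    """Stand-in for a grid-hosted EEG analysis service instance."""
    def extract_marker(self, eeg_signal):
        # Placeholder "marker": signal variability, not a clinical measure.
        return statistics.pstdev(eeg_signal)

def analyse_over_grid(records, services):
    # Round-robin the per-patient analyses across available instances.
    return {
        patient: services[i % len(services)].extract_marker(signal)
        for i, (patient, signal) in enumerate(records.items())
    }

records = {"patient-a": [1.0, 1.2, 0.9, 1.1],
           "patient-b": [1.0, 2.5, 0.2, 1.9]}
markers = analyse_over_grid(records, [EEGAnalysisService(),
                                      EEGAnalysisService()])
print(markers)
```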

    Proceedings of the 12th International Conference on Digital Preservation

    The 12th International Conference on Digital Preservation (iPRES) was held from November 2 to 6, 2015, in Chapel Hill, North Carolina, USA. There were 327 delegates from 22 countries. The program included 12 long papers, 15 short papers, 33 posters, 3 demos, 6 workshops, 3 tutorials and 5 panels, as well as several interactive sessions and a Digital Preservation Showcase.

    A Hybrid Scavenger Grid Approach to Intranet Search

    According to a 2007 global survey of 178 organisational intranets, 3 out of 5 organisations are not satisfied with their intranet search services. However, as intranet data collections grow large, effective full-text intranet search services are needed more than ever before. To provide an effective full-text search service based on current information retrieval algorithms, organisations have to deal with the need for greater computational power. Hardware architectures are needed that can scale to large data collections and can be obtained and maintained at reasonable cost. Web search engines address scalability and cost-effectiveness by using large-scale centralised cluster architectures. The scalability of cluster architectures is evident in the ability of Web search engines to respond to millions of queries within a few seconds while searching very large data collections. Though more cost-effective than high-end supercomputers, cluster architectures still have relatively high acquisition and maintenance costs. Where information retrieval is not the core business of an organisation, a cluster-based approach may not be economically viable. A hybrid scavenger grid is proposed as an alternative architecture: it consists of a combination of dedicated resources and dynamic resources in the form of idle desktop workstations. From the dedicated resources the architecture gets predictability and reliability, whereas from the dynamic resources it gets scalability. An experimental search engine was deployed on a hybrid scavenger grid and evaluated. Test results showed that the resources of the grid can be organised to deliver the best performance by using the optimal number of machines and scheduling the optimal combination of tasks across them. A system efficiency and cost-effectiveness comparison of a grid and a multi-core machine showed that for workloads of modest to large sizes, the grid architecture delivers better throughput per unit cost than the multi-core machine, at a system efficiency comparable to that of the multi-core. The study has shown that a hybrid scavenger grid is a feasible search engine architecture that is cost-effective and scales to medium- and large-scale data collections.
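
    The scheduling idea, keeping reliability-critical work on dedicated machines while spilling bursty indexing work onto idle desktops, can be sketched as follows. Host names, task names and the assignment rule are illustrative, not the paper's actual scheduler.

```python
# Hedged sketch of the hybrid scavenger grid idea: dedicated nodes give
# predictability for critical tasks, scavenged desktop nodes give scalability
# for bursty indexing work. All names and the rule itself are invented.
dedicated = ["ded-1", "ded-2"]                 # predictable, always available
scavenged = ["desk-7", "desk-12", "desk-40"]   # dynamic, may disappear

tasks = [("serve-queries", True),   # (task name, reliability-critical?)
         ("index-shard-1", False),
         ("index-shard-2", False),
         ("index-shard-3", False)]

assignment = {}
spill = iter(scavenged)
for name, critical in tasks:
    if critical:
        assignment[name] = dedicated[0]              # pin to a dedicated node
    else:
        assignment[name] = next(spill, dedicated[-1])  # scavenge, else fall back
print(assignment)
```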

    Methods and design issues for next generation network-aware applications

    Networks are becoming an essential component of modern cyberinfrastructure, and this work describes methods of designing distributed applications for high-speed networks to improve application scalability, performance and capabilities. As the amount of data generated by scientific applications continues to grow, applications should be designed to use parallel, distributed resources and high-speed networks to handle and process it. For scalable application design, developers should move away from the current component-based approach and instead implement an integrated, non-layered architecture in which applications can use specialized low-level interfaces. The main focus of this research is on interactive, collaborative visualization of large datasets. This work describes how a visualization application can be improved by using distributed resources and high-speed network links to interactively visualize tens of gigabytes of data and handle terabyte datasets while maintaining high quality. The application supports interactive frame rates, high resolution and collaborative visualization, and sustains remote I/O bandwidths of several Gbps (up to 30 times faster than local I/O). Motivated by the distributed visualization application, this work also investigates remote data access systems. Because wide-area networks may have high latency, the remote I/O system uses an architecture that effectively hides latency. Five remote data access architectures are analyzed, and the results show that an architecture combining bulk and pipeline processing is the best solution for high-throughput remote data access. The resulting system, which also supports high-speed transport protocols and configurable remote operations, is up to 400 times faster than a comparable existing remote data access system. Transport protocols are compared to determine which can best utilize high-speed network connections; a rate-based protocol proves the best solution, being 8 times faster than standard TCP. An experiment with an HD-based remote teaching application illustrates the potential of network-aware applications in a production environment. Future research areas are presented, with emphasis on network-aware optimization, execution and deployment scenarios.
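
    The latency-hiding principle behind the remote I/O system can be sketched by keeping several block requests in flight so that transfer overlaps with processing. In the sketch below, fetch_block() is a stand-in for a real remote read, and the latency, block count and worker count are invented.

```python
# Sketch of latency hiding via pipelined remote reads: instead of one blocking
# request at a time, several block requests stay in flight, so wide-area
# round-trips overlap. fetch_block() simulates a remote read.
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_block(block_id, latency=0.05):
    time.sleep(latency)            # simulated wide-area round-trip
    return f"data-{block_id}"

blocks = range(32)
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:   # 8 requests in flight
    for data in pool.map(fetch_block, blocks):
        pass                       # process each block as it arrives
print(f"pipelined wall time: {time.perf_counter() - start:.2f}s "
      f"(serial would be about {32 * 0.05:.2f}s)")
```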

    Publications and Lectures 2007 by the Members of the Faculty of Informatics

    A Study of Massively Parallel Processing on the Internet

    Advisor: Marco Aurélio Amaral Henriques. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: This thesis explores the possibility of using the aggregated processing power of computers connected by the Internet to solve large problems. The issue is studied from both the theoretical and practical points of view. From the theoretical perspective, this work studies the characteristics that parallel applications must have in order to exploit a computing environment with a large number of weakly coupled, heterogeneous computers. From the practical perspective, the thesis identifies the fundamental problems to be solved in order to build a parallel virtual computer with these characteristics and proposes solutions to some of the most important ones, such as load balancing and fault tolerance. The results obtained indicate that it is possible to build a robust, scalable and fault-tolerant parallel virtual computer and to obtain good results when executing applications with a high computation/communication ratio. Doctorate in Computer Engineering (degree: Doctor of Electrical Engineering).
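
    The two mechanisms the thesis singles out, load balancing and fault tolerance, can be sketched with a minimal master-worker loop: idle hosts pull tasks from a shared queue, and a task lost to a failed host is re-queued. The hosts and the failure flag are simulated; this is not the thesis's actual system.

```python
# Minimal sketch of an Internet-wide virtual parallel computer's core loop:
# pull-based load balancing plus re-queueing for fault tolerance.
from collections import deque

tasks = deque(range(10))            # independent, compute-heavy work units
results = {}
workers = {"fast-host": False, "flaky-host": True, "slow-host": False}

while tasks:
    for host in list(workers):      # each pass, every live host pulls a task
        if not tasks:
            break
        task = tasks.popleft()
        if workers[host]:           # simulated failure on this host
            tasks.append(task)      # fault tolerance: re-queue the lost task
            del workers[host]       # drop the failed worker
        else:
            results[task] = task * task   # stand-in for real computation
print(dict(sorted(results.items())))
```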

    A statistical mechanics approach for an effective, scalable, and reliable distributed load balancing scheme for grid networks

    The advances in computer and networking technologies over the past decades have produced a new type of collaborative computing environment called Grid networks. A Grid network is a parallel and distributed computing system that can achieve higher computing throughput by taking advantage of the many computing resources available in the network. To achieve a scalable and reliable Grid network system, the workload needs to be distributed efficiently among the resources accessible on the network. A novel distributed algorithm based on statistical mechanics that provides an efficient load-balancing paradigm without any centralised monitoring is proposed here. The resulting load balancer can be integrated into a Grid network to increase its efficiency and resource utilisation. This distributed and scalable load-balancing framework is based on the biased random sampling (BRS) algorithm. In this thesis, a novel statistical mechanics approach is proposed that yields a distributed load-balancing scheme by generating almost regular networks. The generated network system is self-organised and depends only on local information for load distribution and resource discovery. The in-degree of each node reflects its free resources, and the job assignment and resource updating processes required for load balancing are accomplished using random sampling (RS). An analytical solution for the stationary degree distributions has been derived, confirming that the edge distribution of the proposed network system is compatible with Erdős–Rényi (ER) random networks. The generated network system can therefore provide an effective load-balancing paradigm for the distributed resources accessible on large-scale network systems. Furthermore, it is demonstrated that introducing a geographic awareness factor into the random walk sampling can reduce the effects of communication latency in the Grid network environment. Theoretical and simulation results show that the proposed BRS scheme provides an effective, scalable, and reliable distributed load-balancing scheme for the distributed resources available on Grid networks.
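
    A rough sketch of the biased random sampling idea follows: a node's in-degree encodes its free capacity, jobs are assigned via a short random walk that needs only local information, and an allocation consumes one incoming edge. The graph, walk length and random seed are invented for illustration.

```python
# Illustrative sketch of biased random sampling (BRS) load balancing.
# out_edges[u] lists the nodes u points to; a node's in-degree (how often it
# appears as a target) encodes its currently free resources.
import random
random.seed(1)

out_edges = {
    "n1": ["n2", "n3", "n3"],   # n3 advertises the most free capacity
    "n2": ["n3"],
    "n3": ["n1"],
}

def in_degree(node):
    return sum(targets.count(node) for targets in out_edges.values())

def assign_job(steps=3):
    walker = random.choice(list(out_edges))
    for _ in range(steps):              # short walk along out-edges
        if out_edges[walker]:
            walker = random.choice(out_edges[walker])
    # the walk tends to end at nodes with high in-degree (more free resources)
    for targets in out_edges.values():  # allocation consumes one unit of
        if walker in targets:           # capacity: drop one incoming edge
            targets.remove(walker)
            break
    return walker

print([assign_job() for _ in range(3)],
      {n: in_degree(n) for n in out_edges})
```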