    Flexible, wide-area storage for distributed systems using semantic cues

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 81-87). There is a growing set of Internet-based services that are too big, or too important, to run at a single site. Examples include Web services for e-mail, video and image hosting, and social networking. Splitting such services over multiple sites can increase capacity, improve fault tolerance, and reduce network delays to clients. These services often need storage infrastructure to share data among the sites. This dissertation explores the use of a new file system (WheelFS) specifically designed to be the storage infrastructure for wide-area distributed services. WheelFS allows applications to adjust the semantics of their data via semantic cues, which provide application control over consistency, failure handling, and file and replica placement. This dissertation describes a particular set of semantic cues that reflect the specific challenges that storing data over the wide-area network entails: high-latency and low-bandwidth links, coupled with increased node and link failures, when compared to local-area networks. By augmenting a familiar POSIX interface with support for semantic cues, WheelFS provides a wide-area distributed storage system intended to help multi-site applications share data and gain fault tolerance, in the form of a distributed file system. Its design allows applications to adjust the tradeoff between prompt visibility of updates from other sites and the ability for sites to operate independently despite failures and long delays. WheelFS is implemented as a user-level file system and is deployed on PlanetLab and Emulab. Six applications (an all-pairs-pings script, a distributed Web cache, an email service, large file distribution, distributed compilation, and protein sequence alignment software) demonstrate that WheelFS's file system interface simplifies construction of distributed applications by allowing reuse of existing software. These applications would perform poorly with the strict semantics implied by a traditional file system interface, but by providing cues to WheelFS they are able to achieve good performance. Measurements show that applications built on WheelFS deliver comparable performance to services such as CoralCDN and BitTorrent that use specialized wide-area storage systems. By Jeremy Andrew Stribling. Ph.D.

    Semantic analysis of field sports video using a petri-net of audio-visual concepts

    The most common approach to automatic summarisation and highlight detection in sports video is to train an automatic classifier to detect semantic highlights based on occurrences of low-level features such as action replays, excited commentators or changes in a scoreboard. We propose an alternative approach based on the detection of perception concepts (PCs) and the construction of Petri-Nets which can be used for both semantic description and event detection within sports videos. Low-level algorithms for the detection of perception concepts using visual, aural and motion characteristics are proposed, and a series of Petri-Nets composed of perception concepts is formally defined to describe video content. We call this a Perception Concept Network-Petri Net (PCN-PN) model. Using PCN-PNs, personalized high-level semantic descriptions of video highlights can be facilitated and queries on high-level semantics can be achieved. A particular strength of this framework is that we can easily build semantic detectors based on PCN-PNs to search within sports videos and locate interesting events. Experimental results based on recorded sports video data across three types of sports games (soccer, basketball and rugby), each from multiple broadcasters, are used to illustrate the potential of this framework.

    Context for Ubiquitous Data Management

    In response to the advance of ubiquitous computing technologies, we believe that for computer systems to be ubiquitous, they must be context-aware. In this paper, we address the impact of context-awareness on ubiquitous data management. To do this, we survey different characteristics of context in order to develop a clear understanding of context, as well as its implications and requirements for context-aware data management. References to recent research activities and applicable techniques are also provided.

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval. Published or submitted for publication.

    Supporting document management in complex multitask environments

    In this thesis, the challenges for the support of information workers in the domain of personal information management are addressed. In Chapter 1 three major challenges are identified: 1) information overload and fragmentation, 2) multitasking within an unstructured, frequently interrupted workflow, and 3) increasing mobility demand. It has been argued that dedicated support of current needs in personal information management will help to overcome the challenges, reduce information and cognitive overload, and facilitate performance of information workers. Investigating the current needs of information workers, one has to focus on those that are currently supported by paper document management and transfer the mechanisms of this support to the digital domain. Our studies have addressed the role of paper documents in dealing with each of the three identified challenges. In the first study, presented in Chapter 2, paper document management has been discussed in relation to information overload and fragmentation. The study used a contextual interviewing technique, with participants interviewed at their workplace. The results showed that information workers keep actively using task-related collections of paper documents. By grouping task-related documents from different origins together, information workers create a representation of a "stable state" within a task, which helps to resume the task after an interruption that is almost inevitable in a multitasking environment. To investigate task-switching patterns related to document manipulation, and the factors influencing the occurrence of the patterns, an observational study was performed, described in Chapter 3. This study identified eight task-switching patterns, which varied in the explicitness of an indication of a task state in the environment and in the level of the subject’s activity directed to indicate the task state at the moments of switching.
Among the identified influencing factors, the reason for the switch (self-switching or external interruption) had an effect on the occurrence of subject-active patterns. Self-switching usually resulted in user-active document manipulation in the environment, which could not be observed during external interruptions. The domain where the last action was performed also had an influence on the switching pattern, with active manipulation of documents occurring more often in the physical than in the digital domain. It has been concluded that, while switching tasks in an unstructured multitasking workflow, manipulation of paper documents plays an important role in creating a stable state at the moments of switching between tasks. We hypothesized that paper documents possess visually distinctive attributes that are associated with the semantics of the related tasks. When task-related documents are manipulated at the moments of task switching, these visually distinctive attributes change, reflecting the changes in the task state. This hypothesis has been tested in a study using a triad elicitation interview technique in combination with laddering, presented in Chapter 4. As a result, we developed a clustered model of relationships among identified visual cues of paper documents and semantic judgments of the tasks. The relationships among clusters have been analyzed based on three criteria: content-dependency, flexibility, and effort, which together define ease of manipulation for each cluster of visual cues. It has been concluded that the physical environment, in particular task-relevant paper documents, allows flexible encoding of task-related semantic cues into available environmental visual cues. This mechanism needs to be transferred to the digital domain, especially to support mobility of information workers.
This research suggested that the extensive use of paper documents in the digital era can be largely explained by the embodiment of paper as a part of the physical environment in which a human acts. Chapter 5 summarized the results of all studies into a set of requirements for the design of a personal information management system. We proposed a layered framework for presenting the requirements from the point of view of task decomposition and discussed the needs of the information workers related to each layer. For each of the aforementioned layers within the framework, requirements for the design of a digital system were presented and discussed in detail. Chapter 6 revisited the challenges discussed in Chapter 1 from the point of view of the findings, summarized the methodology and contribution of the research, and reflected on the most prominent results.

    Methodologies for the Automatic Location of Academic and Educational Texts on the Internet

    Traditionally, online databases of web resources have been compiled by a human editor, or through the submissions of authors or interested parties. Considerable resources are needed to maintain a constant level of input and relevance in the face of increasing material quantity and quality, and much of what is in databases is of an ephemeral nature. These pressures dictate that many databases stagnate after an initial period of enthusiastic data entry. The solution to this problem would seem to be the automatic harvesting of resources; however, this process necessitates the automatic classification of resources as ‘appropriate’ to a given database, a problem only solved by complex text content analysis. This paper outlines the component methodologies necessary to construct such an automated harvesting system, including a number of novel approaches. In particular, this paper looks at the specific problems of automatically identifying academic research work and Higher Education pedagogic materials. Where appropriate, experimental data is presented from searches in the field of Geography as well as the Earth and Environmental Sciences. In addition, appropriate software is reviewed where it exists, and future directions are outlined.

    Cognition and the Web

    Empirical research related to the Web has typically focused on its impact on social relationships and wider society; however, the cognitive impact of the Web is also an increasing focus of scientific interest and research attention. In this paper, I attempt to provide an overview of what I see as the important issues in the debate regarding the relationship between human cognition and the Web. I argue that the Web is potentially poised to transform our cognitive and epistemic profiles, but that in order to understand the nature of this influence we need to countenance a position that factors in the available scientific evidence, the changing nature of our interaction with the Web, and the possibility that many of our everyday cognitive achievements rely on complex webs of social and technological scaffolding. I review the literature relating to the cognitive effects of current Web technology, and I attempt to anticipate the cognitive impact of next-generation technologies, such as Web-based augmented reality systems and the transition to data-centric modes of information representation. I suggest that additional work is required to more fully understand the cognitive impact of both current and future Web technologies, and I identify some of the issues for future scientific work in this area. Given that recent scientific effort around the Web has coalesced into a new scientific discipline, namely that of Web Science, I suggest that many of the issues related to cognition and the Web could form part of the emerging Web Science research agenda.