
    Doctor of Philosophy

    Get PDF
    In the past few years, we have seen a tremendous increase in digital data being generated. By 2011, storage vendors had shipped 905 PB of purpose-built backup appliances. By 2013, the number of objects stored in Amazon S3 had reached 2 trillion. Facebook had stored 20 PB of photos by 2010. All of these require an efficient storage solution. To improve space efficiency, compression and deduplication are widely used. Compression works by identifying repeated strings and replacing them with more compact encodings, while deduplication partitions data into fixed-size or variable-size chunks and removes duplicate blocks. While these two approaches have greatly improved space efficiency, they still have limitations. First, traditional compressors are limited in their ability to detect redundancy across a large range, since they search for redundant data at a fine-grained level (string level). For deduplication, metadata embedded in an input file changes more frequently than the data itself, which introduces unnecessary unique chunks and leads to poor deduplication. In addition, cloud storage systems suffer from unpredictable and inefficient performance because of interference among different types of workloads. This dissertation proposes techniques to improve the effectiveness of traditional compressors and deduplication in improving space efficiency, and a new IO scheduling algorithm to improve performance predictability and efficiency for cloud storage systems. The common idea is to utilize similarity. To improve the effectiveness of compression and deduplication, similarity in content is used to transform an input file into a compression- or deduplication-friendly format. We propose Migratory Compression, a generic data transformation that identifies similar data at a coarse-grained level (block level) and then groups similar blocks together. It can be used as a preprocessing stage for any traditional compressor. We find that metadata has a large impact on reducing the benefit of deduplication. To isolate the impact of metadata, we propose separating metadata from data. Three approaches are presented for use cases with different constraints. For the commonly used tar format, we propose Migratory Tar: a data transformation and also a new tar format that deduplicates better. We also present a case study in which we use deduplication to reduce storage consumption for storing disk images while at the same time achieving high performance in image deployment. Finally, we apply the same principle of utilizing similarity to IO scheduling to prevent interference between random and sequential workloads, leading to efficient, consistent, and predictable performance for sequential workloads and high disk utilization.
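    To make the two space-saving techniques concrete, the following is a minimal sketch (not the dissertation's implementation) of fixed-size chunk deduplication and of a Migratory-Compression-style pass that reorders similar blocks before handing the stream to a conventional compressor. The per-block similarity feature used here (the smallest 8-byte window hash) is an illustrative assumption; grouping similar blocks lets the compressor's limited search window find redundancy that would otherwise lie too far apart.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # fixed-size chunking; real systems often use variable-size chunks

def deduplicate(data: bytes):
    """Split data into fixed-size chunks and keep only one copy of each unique chunk."""
    store = {}   # fingerprint -> chunk payload
    recipe = []  # ordered fingerprints needed to rebuild the original file
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fp = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fp, chunk)  # duplicate chunks are stored only once
        recipe.append(fp)
    return store, recipe

def migratory_compress(data: bytes) -> bytes:
    """Group similar blocks together before compression (a coarse sketch).

    A real Migratory Compression implementation computes resemblance features
    per block; here each block's 'feature' is simply its smallest 8-byte window
    hash, which tends to cluster byte-wise similar blocks next to each other.
    """
    blocks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

    def feature(block: bytes) -> int:
        windows = (block[j:j + 8] for j in range(0, max(len(block) - 8, 1)))
        return min(zlib.crc32(w) for w in windows)

    order = sorted(range(len(blocks)), key=lambda i: feature(blocks[i]))
    reordered = b"".join(blocks[i] for i in order)
    # The permutation 'order' must be stored alongside the output so the
    # original block layout can be restored after decompression (omitted here).
    return zlib.compress(reordered)
```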

    An Investigation of Students' Use and Understanding of Evaluation Strategies

    Get PDF
    One expected outcome of physics instruction is that students develop quantitative reasoning skills, including evaluation of problem solutions. To investigate students' use of evaluation strategies, we developed and administered tasks prompting students to check the validity of a given expression. We collected written (N>673) and interview (N=31) data at the introductory, sophomore, and junior levels. Tasks were administered in three different physics contexts: the velocity of a block at the bottom of an incline with friction, the electric field due to three point charges of equal magnitude, and the final velocities of two masses in an elastic collision. Responses were analyzed using modified grounded theory and phenomenology. In these three contexts, we explored different facets of students' use and understanding of evaluation strategies. First, we document and analyze the various evaluation strategies students use when prompted, comparing them to canonical strategies. Second, we describe how the identified strategies relate to prior work, with particular emphasis on how a strategy we describe as grouping relates to the phenomenon of chunking as described in cognitive science. Finally, we examine how the prevalence of these strategies varies across different levels of the physics curriculum. From our quantitative data, we found that while all the surveyed student populations drew from the same set of evaluation strategies, the percentage of students who used sophisticated evaluation strategies was higher in the sophomore and junior/senior populations than in the first-year population. From our case studies of two pair interviews (one pair of first-years and one pair of juniors), we found that while evaluating an expression, both juniors and first-years performed similar actions. However, while the first-year students focused on computation and checked for arithmetic consistency with the laws of physics, juniors checked for computational correctness and probed whether the equation accurately described the physical world and obeyed the laws of physics. Our case studies suggest that a key difference between expert and novice evaluation is that experts extract physical meaning from their results and make sense of them by comparing them to other representations of the laws of physics and to real-life experience. We conclude with remarks including implications for classroom instruction as well as suggestions for future work.
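    As a concrete illustration of the kind of canonical evaluation strategy such tasks target (the abstract does not reproduce the expressions actually given to students), a limiting-case check for the incline context might run as follows, assuming a block released from rest that slides a distance L down an incline of angle theta with kinetic friction coefficient mu:

```latex
% Energy balance: kinetic energy gained equals gravitational work minus frictional work
\frac{1}{2}mv^{2} = mgL\sin\theta - \mu m g\cos\theta\, L
\quad\Longrightarrow\quad
v = \sqrt{2gL\left(\sin\theta - \mu\cos\theta\right)}

% Limiting-case checks an expert might perform:
% \mu \to 0:            v = \sqrt{2gL\sin\theta} = \sqrt{2gh}, the frictionless result.
% \mu \to \tan\theta:   v = 0, since friction exactly balances gravity along the incline.
% \mu > \tan\theta:     the radicand is negative, signalling the block never reaches the bottom.
```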

    Toward a Heuristic Model for Evaluating the Complexity of Computer Security Visualization Interface

    Get PDF
    Computer security visualization has gained much attention in the research community in the past few years. However, the advancement of security visualization research has been hampered by the lack of standardization in visualization design, centralized datasets, and evaluation methods. We propose a new heuristic model for evaluating the complexity of computer security visualizations. This complexity evaluation method is designed to evaluate the efficiency of performing visual search in security visualizations by measuring the critical memory capacity load needed to perform such tasks. Our method is based on research in cognitive psychology along with characteristics found in a majority of security visualizations. The main goal of developing this complexity evaluation method is to guide computer security visualization design and to compare different visualization designs. Finally, we compare several well-known computer security visualization systems. The proposed method has the potential to be extended to other areas of information visualization.

    What is the nature of the knowledge specialist teachers conceive of as deep subject and pedagogical knowledge of primary mathematics?

    Get PDF
    One of the key recommendations of the Williams review of primary mathematics (2008) was for every school to have a primary mathematics specialist teacher (MaST) with ‘deep mathematical subject and pedagogical knowledge’ (Williams, 2008, p. 7). This knowledge would act as a ‘nucleus’ (p. 1) for the whole school, with MaSTs supporting the teaching and learning of mathematics across the primary phase. As yet there is no model for the knowledge of these specialist teachers. This study aimed to examine the nature of this knowledge as conceived of by a small sample of MaSTs, by conducting interviews as they undertook the role, and again after they had developed it over two years and completed the Masters-level training programme. The interviewer identified with the MaSTs the knowledge they conceived that they drew on in their teaching of one aspect of the mathematics curriculum and which they identified as deep subject knowledge. There were common features in this knowledge, which are argued to be indicative of the knowledge of the specialist teachers more generally. These features related to knowledge of progression across the primary phase. The MaSTs perceived that they gained new knowledge of mathematics and pedagogy which enabled them to support other staff but also impacted on their own teaching. The research found only a partial relationship between the current models which articulate the knowledge of primary classroom teachers of mathematics (Rowland et al., 2009; Ball et al., 2008; Ma, 1999) and the knowledge which MaSTs conceived that they drew on and identified as deep. The research examined the relationship between the perceived knowledge of these teachers as specialists and as class teachers, finding examples of case and strategic knowledge (Shulman, 1986). The MaSTs identified their new knowledge as distinct from that gained by classroom experience and valued the Masters aspects of their training programme.

    The effects of age and expertise on discourse processing

    Get PDF
    The paradoxical nature of adult development is that it is marked by a decline in processing capacity but an increase in knowledge. A specialized formulation of increased knowledge that can occur throughout the lifespan is expertise. Because discourse processing is both a method of acquiring domain expertise and a process facilitated by domain expertise, the nature of this interrelationship is central to successful aging. However, the processes through which expertise facilitates discourse processing are virtually unexplored within the cognitive aging literature. Four experiments investigating this issue are presented. The first experiment investigated age differences in on-line reading strategies of readers with high and low recall, using passages in which expertise was induced by giving high-knowledge subjects titles to passages that were otherwise incoherent. In Experiment 2, age differences in the parsing mechanisms underlying discourse processing of high- and low-knowledge listeners were examined using speech segmentation methodology. Experiment 3 was conducted to examine age differences in the effects of task demands on the reading strategies of high- and low-knowledge adults. Lastly, in Experiment 4, age differences in discourse processing strategies were investigated in the real-world domain of cooking.

    The Resonant Dynamics of Speech Perception: Interword Integration and Duration-Dependent Backward Effects

    Full text link
    How do listeners integrate temporally distributed phonemic information into coherent representations of syllables and words? During fluent speech perception, variations in the durations of speech sounds and silent pauses can produce different perceived groupings. For example, increasing the silence interval between the words "gray chip" may result in the percept "great chip", whereas increasing the duration of fricative noise in "chip" may alter the percept to "great ship" (Repp et al., 1978). The ARTWORD neural model quantitatively simulates such context-sensitive speech data. In ARTWORD, sequential activation and storage of phonemic items in working memory provides bottom-up input to unitized representations, or list chunks, that group together sequences of items of variable length. The list chunks compete with each other as they dynamically integrate this bottom-up information. The winning groupings feed back to provide top-down support to their phonemic items. Feedback establishes a resonance which temporarily boosts the activation levels of selected items and chunks, thereby creating an emergent conscious percept. Because the resonance evolves more slowly than working memory activation, it can be influenced by information presented after relatively long intervening silence intervals. The same phonemic input can hereby yield different groupings depending on its arrival time. Processes of resonant transfer and competitive teaming help determine which groupings win the competition. Habituating levels of neurotransmitter along the pathways that sustain the resonant feedback lead to a resonant collapse that permits the formation of subsequent resonances.

    Air Force Office of Scientific Research (F49620-92-J-0225); Defense Advanced Research Projects Agency and Office of Naval Research (N00014-95-1-0409); National Science Foundation (IRI-97-20333); Office of Naval Research (N00014-92-J-1309, N00014-95-1-0657)
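    To give a feel for the item-and-chunk dynamics described above, here is a highly simplified toy simulation. It is not ARTWORD's published equations: the update rules, rate constants, and two-chunk inventory are illustrative assumptions chosen only to show bottom-up evidence, chunk competition, slower chunk integration, and top-down support in a runnable form.

```python
dt = 0.01
steps = 600

items = ["gray", "great", "chip", "ship"]          # phonemic items in working memory
chunks = {"gray chip": ["gray", "chip"],            # candidate list chunks (groupings)
          "great chip": ["great", "chip"]}

x = {it: 0.0 for it in items}    # item activations (fast layer)
y = {ck: 0.0 for ck in chunks}   # chunk activations (slow layer)

def external_input(t, it):
    """Toy bottom-up speech input: 'gray' arrives early, 'chip' after a silent interval."""
    if it == "gray" and 0.0 <= t < 1.0:
        return 1.0
    if it == "chip" and 2.0 <= t < 3.0:
        return 1.0
    return 0.0

for step in range(steps):
    t = step * dt
    # Item layer: passive decay + external input + top-down support from chunks
    # that contain the item (the feedback that sustains a resonance).
    for it in items:
        top_down = sum(y[ck] for ck, members in chunks.items() if it in members)
        x[it] += dt * (-0.5 * x[it] + external_input(t, it) + 0.8 * top_down)
    # Chunk layer: integrates bottom-up item evidence, competes via lateral
    # inhibition, and evolves more slowly than the item layer (rate 0.2).
    total = sum(y.values())
    for ck, members in chunks.items():
        bottom_up = sum(x[it] for it in members) / len(members)
        inhibition = total - y[ck]
        y[ck] += dt * 0.2 * (-y[ck] + bottom_up - 1.5 * inhibition)
        y[ck] = max(y[ck], 0.0)

# The grouping with the larger final activation is the "winning" percept.
print({ck: round(v, 3) for ck, v in y.items()})
```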

    On the Use of Parsing for Named Entity Recognition

    Get PDF
    Parsing is a core natural language processing technique that can be used to obtain the structure underlying sentences in human languages. Named entity recognition (NER) is the task of identifying the entities that appear in a text. NER is a challenging natural language processing task that is essential to extract knowledge from texts in multiple domains, ranging from financial to medical. It is intuitive that the structure of a text can help determine whether or not a certain portion of it is an entity and, if so, establish its concrete limits. However, parsing has been a relatively little-used technique in NER systems, since most of them have chosen shallow approaches to deal with text. In this work, we study the characteristics of NER, a task that is far from being solved despite its long history; we analyze the latest advances in parsing that make its use advisable in NER settings; we review the different approaches to NER that make use of syntactic information; and we propose a new way of using parsing in NER based on casting parsing itself as a sequence labeling task.

    Xunta de Galicia; ED431C 2020/11. Xunta de Galicia; ED431G 2019/01. This work has been funded by MINECO, AEI and FEDER of UE through the ANSWER-ASAP project (TIN2017-85160-C2-1-R); and by Xunta de Galicia through a Competitive Reference Group grant (ED431C 2020/11). CITIC, as Research Center of the Galician University System, is funded by the Consellería de Educación, Universidade e Formación Profesional of the Xunta de Galicia through the European Regional Development Fund (ERDF/FEDER), with 80% from the Galicia ERDF 2014-20 Operational Programme and the remaining 20% from the Secretaría Xeral de Universidades (Ref. ED431G 2019/01). Carlos Gómez-Rodríguez has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, Grant No. 714150).
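    To make the final proposal more concrete: casting both tasks as sequence labeling means each token receives one label for NER (e.g. a BIO tag) and one label encoding its syntactic attachment (e.g. the relative position of its head plus the dependency relation). The sketch below illustrates that idea under those assumptions; the sentence, tree, and label encoding are illustrative choices, not necessarily the exact scheme used in the paper.

```python
from typing import List, Tuple

# Toy sentence with gold NER (BIO) tags and a gold dependency tree
# (1-based head index per token, 0 = root). All values are illustrative.
tokens = ["John", "Smith", "works", "at", "Google", "."]
bio    = ["B-PER", "I-PER", "O", "O", "B-ORG", "O"]
heads  = [2, 3, 0, 5, 3, 3]
rels   = ["compound", "nsubj", "root", "case", "obl", "punct"]

def parse_as_sequence_labels(heads: List[int], rels: List[str]) -> List[str]:
    """Encode a dependency tree as one label per token: relative head offset + relation.

    A token at position i with head h gets the label "<h-i>@<rel>", so parsing
    reduces to predicting one tag per token, exactly like sequence-labeling NER.
    """
    labels = []
    for i, (h, rel) in enumerate(zip(heads, rels), start=1):
        offset = h - i if h != 0 else 0   # the root keeps offset 0 by convention here
        labels.append(f"{offset:+d}@{rel}")
    return labels

def merge_views(bio: List[str], syn: List[str]) -> List[Tuple[str, str]]:
    """Pair the NER tag with the syntactic tag so a single tagger (or two heads
    of the same tagger) can predict both views over the same token sequence."""
    return list(zip(bio, syn))

syn_labels = parse_as_sequence_labels(heads, rels)
for tok, (ner_tag, syn_tag) in zip(tokens, merge_views(bio, syn_labels)):
    print(f"{tok:8s} NER={ner_tag:6s} SYN={syn_tag}")
```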