14 research outputs found

    On the design of efficient caching systems

    Content distribution is currently the prevalent Internet use case, accounting for the majority of global Internet traffic and growing exponentially. There is general consensus that the most effective method to deal with the large amount of content demand is through the deployment of massively distributed caching infrastructures as the means to localise content delivery traffic. Solutions based on caching have been already widely deployed through Content Delivery Networks. Ubiquitous caching is also a fundamental aspect of the emerging Information-Centric Networking paradigm which aims to rethink the current Internet architecture for long term evolution. Distributed content caching systems are expected to grow substantially in the future, in terms of both footprint and traffic carried and, as such, will become substantially more complex and costly. This thesis addresses the problem of designing scalable and cost-effective distributed caching systems that will be able to efficiently support the expected massive growth of content traffic and makes three distinct contributions. First, it produces an extensive theoretical characterisation of sharding, which is a widely used technique to allocate data items to resources of a distributed system according to a hash function. Based on the findings unveiled by this analysis, two systems are designed contributing to the abovementioned objective. The first is a framework and related algorithms for enabling efficient load-balanced content caching. This solution provides qualitative advantages over previously proposed solutions, such as ease of modelling and availability of knobs to fine-tune performance, as well as quantitative advantages, such as 2x increase in cache hit ratio and 19-33% reduction in load imbalance while maintaining comparable latency to other approaches. The second is the design and implementation of a caching node enabling 20 Gbps speeds based on inexpensive commodity hardware. 
We believe these contributions significantly advance the state of the art in distributed caching systems.
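
The sharding technique named in this abstract (allocating items to resources according to a hash function) can be illustrated with a minimal sketch. The function names, the choice of SHA-256, and the example keys are assumptions made for illustration; they are not taken from the thesis.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a content identifier to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce it
    # modulo the number of shards.
    return int.from_bytes(digest[:8], "big") % num_shards


def load_per_shard(keys, num_shards):
    """Count how many items land on each shard."""
    counts = Counter(shard_for(k, num_shards) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


# Hypothetical content identifiers, used only to exercise the functions.
keys = [f"/videos/item-{i}" for i in range(10_000)]
loads = load_per_shard(keys, 8)

# Ratio of the busiest shard's load to the mean load (1.0 is perfect balance).
imbalance = max(loads) / (sum(loads) / len(loads))
print(loads, round(imbalance, 3))
```

This sketch counts items only. Load in a real cache also depends on how popular each item is, which is the kind of effect a theoretical characterisation of sharding would need to account for.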

    Improving Data Management and Data Movement Efficiency in Hybrid Storage Systems

    University of Minnesota Ph.D. dissertation. July 2017. Major: Computer Science. Advisor: David Du. 1 computer file (PDF); ix, 116 pages. In the big data era, large volumes of continuously generated data drive the emergence of high-performance, large-capacity storage systems. To reduce the total cost of ownership, storage systems are built in a more composite way with many different types of emerging storage technologies/devices including Storage Class Memory (SCM), Solid State Drives (SSD), Shingled Magnetic Recording (SMR), Hard Disk Drives (HDD), and even off-premise cloud storage. To make better use of each type of storage, industry has provided multi-tier storage by dynamically placing hot data in the faster tiers and cold data in the slower tiers. Data movement happens between devices on a single system as well as between devices connected via various networks. Toward improving data management and data movement efficiency in such hybrid storage systems, this work makes the following contributions: To bridge the giant semantic gap between applications and modern storage systems, passing a small piece of useful information (I/O access hints) from upper layers to the block storage layer may greatly improve application performance or ease data management in heterogeneous storage systems. We present and develop a generic and flexible framework, called HintStor, to execute and evaluate various I/O access hints on heterogeneous storage systems with minor modifications to the kernel and applications. The design of HintStor contains a new application/user level interface, a file system plugin and a block storage data manager. With HintStor, storage systems composed of various storage devices can perform pre-devised data placement, space reallocation and data migration policies assisted by the added access hints.
Each storage device/technology has its own unique price-performance tradeoffs and idiosyncrasies with respect to the workload characteristics it prefers to support. To explore the internal access patterns and thus efficiently place data on storage systems with fully connected (i.e., data can move from one device to any other device instead of moving tier by tier) differential pools (each pool consists of storage devices of a particular type), we propose a chunk-level storage-aware workload analyzer framework, abbreviated as ChewAnalyzer. With ChewAnalyzer, the storage manager can adequately distribute and move the data chunks across different storage pools. To reduce the duplicate content transferred between local storage devices and devices in remote data centers, an inline Network Redundancy Elimination (NRE) process with a Content-Defined Chunking (CDC) policy can obtain a higher Redundancy Elimination (RE) ratio but may suffer from a considerably higher computational requirement than fixed-size chunking. We build an inline NRE appliance which incorporates an improved FPGA-based scheme to speed up CDC processing. To efficiently utilize the hardware resources, the whole NRE process is handled by a Virtualized NRE (VNRE) controller. The uniqueness of the VNRE that we developed lies in its ability to exploit the redundancy patterns of different TCP flows and customize the chunking process to achieve a higher RE ratio.
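
To make the contrast between Content-Defined Chunking and fixed-size chunking concrete, here is a minimal software sketch. It uses a gear-style rolling hash, which is an assumption chosen for brevity and not the FPGA-based scheme described in the dissertation; the mask, size limits, and all names are likewise illustrative.

```python
import random

# Deterministic table of per-byte random values for the rolling hash.
_rng = random.Random(42)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def cdc_chunks(data: bytes, mask: int = 0x3FF,
               min_size: int = 256, max_size: int = 4096):
    """Split data at positions chosen by its content, not by fixed offsets."""
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        # Shifting left drops the influence of bytes older than 32 positions,
        # so the hash reflects only a short window of recent content.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        length = i - start + 1
        # Cut when the hash matches the mask (after min_size bytes),
        # or unconditionally once the chunk reaches max_size.
        if (length >= min_size and (h & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


# Synthetic input; a one-byte insertion at the front shifts every fixed-size
# block, while content-defined boundaries can realign further along.
gen = random.Random(1)
data = bytes(gen.getrandbits(8) for _ in range(50_000))
chunks = cdc_chunks(data)
shifted_chunks = cdc_chunks(b"X" + data)
shared = len(set(chunks) & set(shifted_chunks))
print(len(chunks), len(shifted_chunks), shared)
```

The per-byte hashing loop is the extra computation that fixed-size chunking avoids, which is consistent with the abstract's motivation for hardware acceleration.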

    Proactive Mechanisms for Video-on-Demand Content Delivery

    Video delivery over the Internet is the dominant source of network load all over the world. In particular, VoD streaming services such as YouTube, Netflix, and Amazon Video have propelled the proliferation of VoD in many people's everyday lives. VoD allows watching video from a large quantity of content at any time and on a multitude of devices, including smart TVs, laptops, and smartphones. Studies show that many people under the age of 32 grew up with VoD services and have never subscribed to a traditional cable TV service. This shift in video consumption behavior is continuing with an ever-growing number of users. To satisfy this large demand, VoD service providers usually rely on CDNs, which make VoD streaming scalable by operating a geographically distributed network of several hundreds of thousands of servers. Thereby, they deliver content from locations close to the users, which keeps traffic local and enables a fast playback start. CDNs experience heavy utilization during the day and are usually reactive to the user demand, which is not optimal as it leads to expensive over-provisioning to cope with traffic peaks and to overreacting content eviction that decreases the CDN's performance. To sustain future VoD streaming projections with hundreds of millions of users, new approaches are required to increase the content delivery efficiency. To this end, this thesis identifies three key research areas that have the potential to address the future demand for VoD content. Our first contribution is the design of vFetch, a privacy-preserving prefetching mechanism for mobile devices. It focuses explicitly on OTT VoD providers such as YouTube. vFetch learns the user's interest in different content channels and uses these insights to prefetch content on a user terminal. To do so, it continually monitors the user behavior and the device's mobile connectivity pattern to allow for resource-efficient download scheduling.
Thereby, vFetch illustrates how personalized prefetching can reduce the mobile data volume and relieve mobile networks by offloading peak-hour traffic. Our second contribution focuses on proactive in-network caching. To this end, we present the design of the ProCache mechanism, which divides the available cache storage among separate content categories. The available storage is allocated to these divisions based on their contribution to the overall cache efficiency. We propose a general workflow that covers multiple categories of a mixed content workload, in addition to a workflow tailored for music video content, the dominant traffic source on YouTube. Thereby, ProCache shows how content-awareness can contribute to efficient in-network caching. Our third contribution targets the application of multicast for VoD scenarios. Many users request popular VoD content with only small differences in their playback start time, which offers a potential for multicast. Therefore, we present the design of the VoDCast mechanism that leverages this potential to multicast parts of popular VoD content. Thereby, VoDCast illustrates how ISPs can collaborate with CDNs to coordinate on content that should be delivered by ISP-internal multicast.

    USING THE DIAMETRICAL MODEL TO EXAMINE THE RELATIONSHIP BETWEEN THE AUTISM AND PSYCHOSIS SPECTRA

    Schizophrenia and autism spectrum disorders (SSD; ASD) share clinical features, although they are considered distinct. Theories contrast ASD and SSD social cognition. The reasoning for this thesis is based on dimensional models of personality spanning from healthy to pathological variations. Under this scenario, do some healthy autistic traits oppose schizotypic ones on a Mentalism continuum? Also, does this psychometric opposition correspond to a behavioural one, for instance in processing face and gaze? First, we validated schizotypic and autistic trait questionnaires in French. Second, we identified shared and diametrical traits. Third, we conducted 3 experiments to measure face pareidolia-proneness. We expected larger pareidolia-proneness with larger positive schizotypy, and smaller autistic trait scores. Fourth, we assessed gaze direction discrimination, and gaze cueing of attention. We expected larger sensitivity to gaze with larger positive schizotypy, but a smaller one with larger autistic traits. Psychometrically, we replicated oppositions between autistic mentalizing deficits and positive schizotypic traits. Although pareidolia-proneness was unrelated to personality, configural face processing was impaired with larger positive schizotypy, but preserved with smaller autistic mentalizing deficits scores. Also, gaze sensitivity was decreased in men with larger autistic mentalizing traits, but unassociated with positive schizotypy. Our results partially support ASD-SSD opposition in social cognition, to be further confirmed by future studies. Pareidolia-proneness may be better measured using other measurement strategies. Gaze direction attribution might better contrast ASD and SSD. Comparing resembling disorder-related phenotypes is promising for understanding underlying aetiological mechanisms, notably using a transdiagnostic approach associating personality, cognitive styles, endophenotypes, and multidimensional or network models.
-- Schizophrenia spectrum and autism spectrum disorders (SSD; ASD) resemble each other clinically but are categorically distinct. Some theories contrast social cognition in ASD and SSD. The reasoning of this thesis rests on dimensional models of personality as linking the normal and the pathological. Accordingly, do certain autistic traits oppose schizotypic traits? Does a psychometric opposition correspond to a behavioural one, i.e. in the processing of faces and gaze? First, we validated the schizotypic and autistic personality questionnaires. Second, we identified shared and opposed traits. Third, we conducted 3 experiments on face pareidolia, which we expected to be associated with more positive schizotypy and fewer autistic traits. Fourth, we examined gaze direction discrimination and gaze cueing of attention, which we expected to be associated with more positive schizotypy and fewer autistic traits. Psychometrically, we replicated the oppositions between autistic traits of deficient mentalizing and positive schizotypic traits. Although pareidolia and personality were unrelated, configural processing of facial information was impaired with more positive schizotypy, but preserved with more autistic mentalizing deficits. Also, gaze sensitivity was lower in men with more autistic mentalizing deficits, but unrelated to positive schizotypy. Our results partially support an ASD-SSD opposition in social cognition, to be confirmed by future studies. Pareidolia-proneness would benefit from being measured with other strategies. Attribution of gaze direction might better distinguish ASD and SSD.
Comparing resembling psychiatric phenotypes is a promising approach for understanding underlying aetiological mechanisms, notably through a transdiagnostic approach associating personality, cognitive styles, endophenotypes, and multidimensional or network models.

    Automated Digital Forensic Triage: Rapid Detection of Anti-Forensic Tools

    We live in the information age. Our world is interconnected by digital devices and electronic communication. As such, criminals are finding opportunities to exploit our information-rich electronic data. In 2014, the estimated annual cost from computer-related crime was more than 800 billion dollars. Examples include the theft of intellectual property, electronic fraud, identity theft and the distribution of illicit material. Digital forensics grew out of necessity to combat computer crime and involves the investigation and analysis of electronic data after a suspected criminal act. Challenges in digital forensics exist due to constant changes in technology. Investigation challenges include exponential growth in the number of cases and the size of targets; for example, forensic practitioners must analyse multi-terabyte cases comprised of numerous digital devices. A variety of applied challenges also exist, due to continual technological advancements; for example, anti-forensic tools, including the malicious use of encryption or data wiping tools, hinder digital investigations by hiding or removing the availability of evidence. In response, the objective of the research reported here was to automate the effective and efficient detection of anti-forensic tools. A design science research methodology was selected as it provides an applied research method to design, implement and evaluate an innovative Information Technology (IT) artifact to solve a specified problem. The research objective required that a system be designed and implemented to perform automated detection of digital artifacts (e.g., data files and Windows Registry entries) on a target data set. The goal of the system is to automatically determine if an anti-forensic tool is present, or absent, in order to prioritise additional in-depth investigation.
The system performs rapid forensic triage, suitable for execution against multiple investigation targets, providing an analyst with high-level information regarding potential malicious anti-forensic tool usage. The system is divided into two main stages: 1) Design and implementation of a solution to automate creation of an application profile (application software reference set) of known unique digital artifacts; and 2) Digital artifact matching between the created reference set and a target data set. Two tools were designed and implemented: 1) A live differential analysis tool, named LiveDiff, to reverse engineer application software with a specific emphasis on digital forensic requirements; 2) A digital artifact matching framework, named Vestigium, to correlate digital artifact metadata and detect anti-forensic tool presence. In addition, a forensic data abstraction, named Application Profile XML (APXML), was designed to store and distribute digital artifact metadata. An associated Application Programming Interface (API), named apxml.py, was authored to provide automated processing of APXML documents. Together, the tools provided an automated triage system to detect anti-forensic tool presence on an investigation target. A two-phase approach was employed in order to assess the research products. The first phase of experimental testing involved demonstration in a controlled laboratory environment. First, the LiveDiff tool was used to create application profiles for three anti-forensic tools. The automated data collection and comparison procedure was more effective and efficient than previous approaches. Two data reduction techniques were tested to remove irrelevant operating system noise: application profile intersection and dynamic blacklisting were found to be effective in this regard. Second, the profiles were used as input to Vestigium and automated digital artifact matching was performed against authored known data sets. 
The results established the desired system functionality and demonstration then led to refinements of the system, as per the cyclical nature of design science. The second phase of experimental testing involved evaluation using two additional data sets to establish effectiveness and efficiency in a real-world investigation scenario. First, a public data set was subjected to testing to provide research reproducibility, as well as to evaluate system effectiveness in a variety of complex detection scenarios. Results showed the ability to detect anti-forensic tools using a different version than that included in the application profile and on a different Windows operating system version. Both are scenarios where traditional hash set analysis fails. Furthermore, Vestigium was able to detect residual and deleted information, even after a tool had been uninstalled by the user. The efficiency of the system was determined and refinements made, resulting in an implementation that can meet forensic triage requirements. Second, a real-world data set was constructed using a collection of second-hand hard drives. The goal was to test the system using unpredictable and diverse data to provide more robust findings in an uncontrolled environment. The system detected one anti-forensic tool on the data set and processed all input data successfully without error, further validating system design and implementation. The key outcome of this research is the design and implementation of an automated system to detect anti-forensic tool presence on a target data set. Evaluation suggested the solution was both effective and efficient, adhering to forensic triage requirements. 
Furthermore, techniques not previously utilised in forensic analysis were designed and applied throughout the research: dynamic blacklisting and profile intersection removed irrelevant operating system noise from application profiles; metadata matching methods resulted in efficient digital artifact detection; and path normalisation aided full path correlation in complex matching scenarios. The system was subjected to rigorous experimental testing on three data sets that comprised more than 10 terabytes of data. The ultimate outcome is a practically implemented solution that has been executed on hundreds of forensic disk images, thousands of Windows Registry hives, more than 10 million data files, and approximately 50 million Registry entries. The research has resulted in the design of a scalable triage system implemented as a set of computer forensic tools.
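
The ideas of profile intersection, blacklisting, and path normalisation mentioned in this abstract can be sketched in a few lines. This is an illustrative assumption about how such steps might fit together; it does not reproduce LiveDiff, Vestigium, or apxml.py, and the tool name `WipeTool`, the paths, and every function name are hypothetical.

```python
def normalise_path(path: str) -> str:
    """Lower-case a Windows path and mask the per-user directory name."""
    parts = path.replace("/", "\\").lower().split("\\")
    if len(parts) > 2 and parts[1] in ("users", "documents and settings"):
        parts[2] = "%user%"  # placeholder so different accounts compare equal
    return "\\".join(parts)


def build_profile(runs, blacklist=()):
    """Keep artifacts seen in every run, then drop blacklisted noise."""
    if not runs:
        return set()
    normalised = [{normalise_path(p) for p in run} for run in runs]
    common = set.intersection(*normalised)
    return common - {normalise_path(p) for p in blacklist}


def match_ratio(profile, target_paths):
    """Fraction of profile artifacts that also appear on the target."""
    if not profile:
        return 0.0
    target = {normalise_path(p) for p in target_paths}
    return len(profile & target) / len(profile)


# Two hypothetical observation runs of the same (made-up) tool; each also
# contains one unrelated operating-system artifact.
run_a = [r"C:\Program Files\WipeTool\wipe.exe",
         r"C:\Users\alice\AppData\Roaming\WipeTool\cfg.ini",
         r"C:\Windows\Prefetch\SVCHOST.EXE-1.pf"]
run_b = [r"C:\Program Files\WipeTool\wipe.exe",
         r"C:\Users\bob\AppData\Roaming\WipeTool\cfg.ini",
         r"C:\Windows\Temp\tmp42.dat"]

profile = build_profile([run_a, run_b])
target = [r"c:\users\carol\appdata\roaming\wipetool\cfg.ini",
          r"C:\Windows\notepad.exe"]
print(sorted(profile), match_ratio(profile, target))
```

In this example the configuration file is still matched after the executable is gone, which loosely mirrors the abstract's point about detecting residual information after a tool has been uninstalled.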

    Digital behaviours and cognitions of individuals convicted of online child pornography offences

    BACKGROUND: Modern Child Sexual Exploitation Material (CSEM) offences predominantly occur within a technological ecosystem. The behaviours and cognitions of CSEM offenders influence, and are influenced by, their choice of facilitative technologies that form that ecosystem. OBJECTIVES: This thesis will review the prior research on cognitive distortions present in and technology usage by CSEM offenders, and present a new theory, Lawless Space Theory (LST), to explain those interactions. The cognitions and technical behaviours of previously convicted CSEM offenders will be examined in a psychosocial context and recommendations for deterrence, investigative, and treatment efforts made. PARTICIPANTS AND SETTING: Data were collected using an online survey of two samples, one from a reference population of the general public (n=524) and one from a population of previously convicted CSEM offenders (n=78), both of which were composed of adults living in the United States. METHODS: Two reviews were conducted using a PRISMA methodology - a systematic review of the cognitive distortions of CSEM offenders and an integrative review of their technology usage. A theoretical basis for LST was developed, and then seven investigations of the survey data were conducted evaluating the public’s endorsement of lawless spaces; the public’s perceptions of CSEM offenders; the self-perceptions of CSEM offenders; the suicidality of the offender sample; the use of technology and countermeasures by the offender sample; the collecting and viewing behaviours of the offender sample; and the idiographic profiles of the offender sample. RESULTS: The reviews found that the endorsement of traditional child contact offender cognitive distortions by CSEM offenders was low, and that they continued to use technology beyond its normative lifecycle.
LST was developed to explain these behaviours, and the view of the Internet as generally lawless was endorsed by the reference and offender samples. The public sample showed biased beliefs that generally overestimated the prevalence of, and risk associated with, CSEM offending when compared to the offender sample. Offenders were found to have viewed investigators as lacking understanding and compassion, and they exhibited very high suicidal ideation following their interaction with law enforcement. Offenders exhibited similar technical abilities and lower technophilia than the reference sample, chose technologies both to reduce psychological strain and for utility purposes, and many exhibited cyclic deletions of their collections as part of a guilt/shame cycle. CONCLUSIONS AND IMPLICATIONS: Understanding CSEM offenders’ technological behaviours and cognitions can inform more effective investigative, deterrence, and treatment efforts. Law enforcement showing compassion during investigations may generate fuller disclosures while facilitating offender engagement with resources to reduce suicidality. Deterrence efforts focused on establishing capable guardianship and reducing perceived lawlessness provide the potential to reduce offending. Treatment of criminogenic needs for the majority of CSEM offenders is not supported by evidence, but non-criminogenic treatment warrants broader consideration.

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Nowadays, open science is a hot topic at all levels and is also one of the priorities of the European Research Area. Components that are commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may have great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institutes.