
    Co-Occurring Disorders: An Outpatient Latent Class Analysis

    Over the past 20 years, researchers and health care practitioners have come to realize that, in addition to having high prevalence rates, individuals with co-occurring disorders do not represent a homogeneous group (Drake et al., 1998, 2001; Lehman et al., 1994, 2000; Mueser et al., 2000). This heterogeneity is essential to consider when developing new treatment modalities, making it pivotal to identify these differences for treatment approaches and program goals. Research shows that the heterogeneity of treatment populations can be reduced through empirically derived homogeneous groups based on multivariate analysis (Ries et al., 1993; Lehman et al., 2000; Mueser et al., 2000). The purpose of the current study was to address a significant void in knowledge about the heterogeneity of co-occurring disorders by determining whether homogeneous subgroups exist within an outpatient population presenting for treatment and, if so, how many groups exist and what constitutes group membership. Identifying subgroups provides a mechanism to better understand the interrelationships among the determinants that contribute to etiology and problem severity at the individual and group levels. Additionally, in an effort to improve service delivery, empirically derived subgroups hold important clinical implications for treatment models. This exploratory research was conducted as a retrospective analysis seeking a parsimonious model of subgroups among individuals with co-occurring disorders entering an outpatient program, using latent class analysis (LCA). The best-fitting statistical model was one in which the overall sample was composed of three subgroups: a three-class model that included alcohol use, illegal drug use, education level, and serious depression.
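    The core of such a study is fitting candidate class counts and comparing fit indices. The following is a minimal sketch, assuming hypothetical binary intake indicators and invented data, of a latent class model fit by EM with BIC-based selection of the number of classes; it is not the study's code or dataset.

```python
# Sketch: latent class analysis over binary indicators, selecting the
# number of classes by BIC. Data and indicator set are hypothetical.
import numpy as np
from scipy.special import logsumexp

def fit_lca(X, k, n_iter=300, seed=0, eps=1e-6):
    """EM for a latent class model with binary indicators; returns BIC."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                # class mixing weights
    theta = rng.uniform(0.3, 0.7, (k, d))   # P(indicator j = 1 | class c)
    for _ in range(n_iter):
        # E-step: joint log-probability of each observation and class
        ll = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
        resp = np.exp(ll - logsumexp(ll, axis=1, keepdims=True))
        # M-step: re-estimate mixing weights and conditional probabilities
        pi = np.clip(resp.mean(axis=0), eps, None)
        theta = np.clip((resp.T @ X) / resp.sum(axis=0)[:, None], eps, 1 - eps)
    ll = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
    loglik = logsumexp(ll, axis=1).sum()
    n_params = (k - 1) + k * d              # free weights + conditionals
    return -2 * loglik + n_params * np.log(n)

rng = np.random.default_rng(1)
X = (rng.random((500, 4)) < 0.4).astype(float)  # toy 0/1 indicator matrix
for k in range(1, 5):                           # candidate class counts
    print(f"{k} classes: BIC = {fit_lca(X, k):.1f}")  # lowest BIC wins
```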

    New Statistical Algorithms for the Analysis of Mass Spectrometry Time-Of-Flight Mass Data with Applications in Clinical Diagnostics

    Mass spectrometry (MS) based techniques have emerged as a standard for large-scale protein analysis. Ongoing progress in terms of more sensitive machines and improved data analysis algorithms has led to a constant expansion of its fields of application. Recently, MS was introduced into clinical proteomics with the prospect of early disease detection using proteomic pattern matching. Analyzing biological samples (e.g. blood) by mass spectrometry generates mass spectra that represent the components (molecules) contained in a sample as masses and their respective relative concentrations. In this work, we are interested in those components that are constant within a group of individuals but differ strongly between individuals of two distinct groups. These distinguishing components, which depend on a particular medical condition, are generally called biomarkers. Since not all biomarkers found by the algorithms are of equal (discriminating) quality, we are only interested in a small biomarker subset that, as a combination, can be used as a fingerprint for a disease. Once a fingerprint for a particular disease (or medical condition) is identified, it can be used in clinical diagnostics to classify unknown spectra. In this thesis we have developed new algorithms for the automatic extraction of disease-specific fingerprints from mass spectrometry data. Special emphasis has been put on designing highly sensitive methods with respect to signal detection. Thanks to our statistically based approach, our methods are able to detect signals, such as those of hormones, even below the noise level inherent in data acquired by common MS machines. To provide collaborating groups with access to these new classes of algorithms, we have created a web-based analysis platform that provides all necessary interfaces for data transfer, data analysis, and result inspection. To prove the platform's practical relevance, it has been utilized in several clinical studies, two of which are presented in this thesis. These studies showed that our platform is superior to commercial systems with respect to fingerprint identification. As an outcome of these studies, several fingerprints for different cancer types (bladder, kidney, testicle, pancreas, colon, and thyroid) have been detected and validated. The clinical partners in fact emphasize that these results would be impossible with a less sensitive analysis tool (such as the currently available systems). In addition to the issue of reliably finding and handling signals in noise, we faced the problem of handling very large amounts of data, since an average dataset for an individual is about 2.5 gigabytes in size and we have data from hundreds to thousands of persons. To cope with these large datasets, we developed a new framework for a heterogeneous (quasi) ad-hoc Grid - an infrastructure that allows the integration of thousands of computing resources (e.g. desktop computers, computing clusters, or specialized hardware such as IBM's Cell processor in a PlayStation 3).
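    To make the fingerprint-extraction idea concrete, here is a deliberately simplified sketch, not the thesis's algorithms: it ranks discretised m/z bins of hypothetical spectra by a two-sample t-test between a disease and a control group and keeps the most discriminating bins as a candidate fingerprint. The data, group sizes, and planted peaks are invented for illustration.

```python
# Sketch: pick a small discriminating subset of m/z bins as a fingerprint.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_bins = 2000                                   # discretised m/z axis
disease = rng.normal(0.0, 1.0, (40, n_bins))    # 40 toy spectra per group
control = rng.normal(0.0, 1.0, (40, n_bins))
disease[:, [120, 480, 1333]] += 1.5             # planted "biomarker" peaks

# Bin-wise two-sample t-test between the two groups
t, p = stats.ttest_ind(disease, control, axis=0)
fingerprint = np.argsort(p)[:5]                 # most discriminating bins
print("candidate fingerprint bins:", sorted(fingerprint.tolist()))
```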

    Security and privacy in network naming

    Security and Privacy are now at the forefront of modern concerns and drive a significant part of the debate on digital society. One aspect with significant bearing on both topics is the naming of resources in the network, because it directly impacts how networks work, but also affects how security mechanisms are implemented and what the privacy implications of metadata disclosure are. This issue is further exacerbated by interoperability mechanisms that make this information increasingly available beyond its intended scope. This work focuses on the security and privacy implications of naming in namespaces used by network protocols, in particular on the implementation of solutions that provide additional security through naming policies or increase privacy. To achieve this, different techniques are used to either embed security information in existing namespaces or to minimise privacy exposure. The former allows bootstrapping secure transport protocols on top of insecure discovery protocols, while the latter introduces privacy policies as part of name assignment and resolution. The main vehicle for the implementation of these solutions is general-purpose protocols and services; however, there is a strong parallel with ongoing research topics that leverage name resolution systems for interoperability, such as the Internet of Things (IoT) and Information Centric Networks (ICN), where these approaches are also applicable.
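    One well-known way to embed security information in a namespace, offered here only as an illustrative sketch and not necessarily the mechanism developed in this work, is a self-certifying name: the advertised name carries a digest of the service's public key, so a client that discovers the name over an insecure protocol can still authenticate the endpoint it connects to. The service type and key bytes below are hypothetical stand-ins.

```python
# Sketch: self-certifying instance names bootstrapping trust over
# insecure discovery. Keys are random stand-ins, not real key material.
import base64
import hashlib
import os

def name_for(pubkey: bytes, service: str) -> str:
    """Derive an instance name that cryptographically binds the key."""
    digest = hashlib.sha256(pubkey).digest()[:15]        # truncated binding
    tag = base64.b32encode(digest).decode().lower()
    return f"{tag}.{service}"

def verify(advertised: str, presented_pubkey: bytes, service: str) -> bool:
    """Check the key presented at connection time against the name."""
    return advertised == name_for(presented_pubkey, service)

pubkey = os.urandom(32)                                  # stand-in key
name = name_for(pubkey, "_myapp._tcp.local")             # hypothetical service
print(name)
print(verify(name, pubkey, "_myapp._tcp.local"))         # True
print(verify(name, os.urandom(32), "_myapp._tcp.local")) # False: key mismatch
```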

    Computing and estimating information leakage with a quantitative point-to-point information flow model

    Information leakage occurs when a system exposes its secret information to an unauthorised entity. Information flow analysis is concerned with tracking flows of information through systems to determine whether they process information securely or leak it. We present a novel information flow model that permits an arbitrary amount of secret and publicly-observable information to occur at any point, and in any order, in a system. This is an improvement over previous models, which generally assume that systems process a single piece of secret information present before execution and produce a single piece of publicly-observable information upon termination. Our model precisely quantifies the information leakage from secret to publicly-observable values at user-defined points - hence, a "point-to-point" model - using the information-theoretic measures of mutual information and min-entropy leakage; it is well suited to analysing systems of low to moderate complexity. We also present a relaxed version of our information flow model that estimates, rather than computes, mutual information and min-entropy leakage via sampling of a system, using statistical techniques to bound the accuracy of the estimates it provides. We demonstrate how this relaxed model is better suited to analysing complex systems by implementing it in a quantitative information flow analysis tool for Java programs.
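    For a system abstracted as a prior over secrets and a channel matrix P(observable | secret), the two measures named above have direct closed forms. The sketch below computes both for a toy two-value secret; the channel is invented for illustration and says nothing about the thesis's case studies.

```python
# Sketch: Shannon leakage (mutual information) and min-entropy leakage
# for a discrete prior and channel matrix. Toy numbers only.
import numpy as np

def mutual_information(prior, channel):
    """I(S;O) in bits: expected information gained about the secret."""
    joint = prior[:, None] * channel          # P(s, o)
    p_o = joint.sum(axis=0)                   # marginal P(o)
    nz = joint > 0                            # avoid log(0) terms
    return (joint[nz] * np.log2(joint[nz] / (prior[:, None] * p_o)[nz])).sum()

def min_entropy_leakage(prior, channel):
    """log2(V(S|O) / V(S)): gain in a one-guess adversary's success."""
    v_prior = prior.max()                     # best guess before observing
    v_post = (prior[:, None] * channel).max(axis=0).sum()  # after observing
    return np.log2(v_post / v_prior)

prior = np.array([0.5, 0.5])                  # uniform one-bit secret
channel = np.array([[0.9, 0.1],               # P(o | s): a noisy disclosure
                    [0.2, 0.8]])
print(mutual_information(prior, channel))     # ~0.40 bits leaked on average
print(min_entropy_leakage(prior, channel))    # ~0.77 bits of guessing gain
```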

    Approximate Data Analytics Systems

    Today, most modern online services make use of big data analytics systems to extract useful information from raw digital data. The data normally arrives as a continuous stream at high speed and in huge volumes, and the cost of handling this massive data can be significant. Providing interactive latency in processing the data is often impractical because the data is growing exponentially, even faster than Moore's law predicts. To overcome this problem, approximate computing has recently emerged as a promising solution. Approximate computing is based on the observation that many modern applications are amenable to an approximate, rather than an exact, output. Unlike traditional computing, approximate computing tolerates lower accuracy to achieve lower latency by computing over a partial subset instead of the entire input data. Unfortunately, advancements in approximate computing are primarily geared towards batch analytics and cannot provide low-latency guarantees in the context of stream processing, where new data continuously arrives as an unbounded stream. In this thesis, we design and implement approximate computing techniques for processing and interacting with high-speed, large-scale stream data to achieve low latency and efficient utilization of resources. To achieve these goals, we have designed and built the following approximate data analytics systems:
    • StreamApprox - a data stream analytics system for approximate computing. It supports approximate computing for low-latency stream analytics in a transparent way and can adapt to rapid fluctuations of input data streams. For this system, we designed an online adaptive stratified reservoir sampling algorithm to produce approximate output with bounded error (a simplified sketch of the underlying sampling idea follows this abstract).
    • IncApprox - a data analytics system for incremental approximate computing. It combines approximate and incremental computing in stream processing to achieve high throughput and low latency with efficient resource utilization. For this system, we designed an online stratified sampling algorithm that uses self-adjusting computation to produce an incrementally updated approximate output with bounded error.
    • PrivApprox - a data stream analytics system for privacy-preserving and approximate computing. It supports high-utility, low-latency data analytics while preserving users' privacy, by combining privacy-preserving data analytics with approximate computing.
    • ApproxJoin - an approximate distributed join system that improves the performance of joins, critical but expensive operations in big data systems. Here we employed a sketching technique (Bloom filters) to avoid shuffling non-joinable data items through the network, and proposed a novel sampling mechanism that executes during the join to obtain an unbiased representative sample of the join output.
    Our evaluation, based on micro-benchmarks and real-world case studies, shows that these systems achieve significant performance speedups over state-of-the-art systems while tolerating negligible accuracy loss in the analytics output. In addition, our systems allow users to systematically trade accuracy against throughput and latency, and require no or only minor modifications to existing applications.
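    As promised above, here is a minimal sketch of plain per-stratum reservoir sampling, the fixed-size building block that StreamApprox's online adaptive algorithm generalises; it is illustrative only, not the system's implementation, and the strata and items are toy values.

```python
# Sketch: stratified reservoir sampling over an unbounded stream using
# classic "Algorithm R" independently within each stratum.
import random
from collections import defaultdict

class StratifiedReservoir:
    def __init__(self, per_stratum: int, seed: int = 0):
        self.k = per_stratum                      # reservoir size per stratum
        self.rng = random.Random(seed)
        self.seen = defaultdict(int)              # items observed per stratum
        self.sample = defaultdict(list)           # current reservoirs

    def add(self, stratum, item):
        self.seen[stratum] += 1
        res = self.sample[stratum]
        if len(res) < self.k:
            res.append(item)                      # reservoir not yet full
        else:
            j = self.rng.randrange(self.seen[stratum])
            if j < self.k:                        # keep with prob k / seen
                res[j] = item

sampler = StratifiedReservoir(per_stratum=3)
for i in range(1000):
    sampler.add(stratum=i % 4, item=i)            # four toy substreams
print(dict(sampler.sample))                       # bounded sample per stratum
```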

    TOWARDS A HOLISTIC EFFICIENT STACKING ENSEMBLE INTRUSION DETECTION SYSTEM USING NEWLY GENERATED HETEROGENEOUS DATASETS

    With the exponential growth of network-based applications globally, there has been a transformation in organizations' business models. Furthermore, the falling cost of both computational devices and internet access has made people more technology dependent. Consequently, the inordinate use of computer networks has given rise to new risks, making it crucial to improve the speed and accuracy of security mechanisms. Although abundant new security tools have been developed, the rapid growth of malicious activities continues to be a pressing issue, as ever-evolving attacks create severe threats to network security. Classical security techniques, for instance firewalls, are used as a first line of defense against security problems but remain unable to detect internal intrusions or adequately provide security countermeasures. Thus, network administrators tend to rely predominantly on Intrusion Detection Systems to detect such intrusive network activities. Machine Learning is one of the practical approaches to intrusion detection that learns from data to differentiate between normal and malicious traffic. Although Machine Learning approaches are used frequently, an in-depth analysis of Machine Learning algorithms in the context of intrusion detection has received less attention in the literature. Moreover, adequate datasets are necessary to train and evaluate anomaly-based network intrusion detection systems. A number of such datasets, such as DARPA, KDDCUP, and NSL-KDD, have been widely adopted by researchers to train and evaluate the performance of their proposed intrusion detection approaches. Based on several studies, many of these datasets are outdated and unreliable. Furthermore, some suffer from a lack of traffic diversity and volume, do not cover the variety of attacks, have anonymized packet information and payloads that cannot reflect current trends, or lack a feature set and metadata. This thesis provides a comprehensive analysis of some of the existing Machine Learning approaches for identifying network intrusions. Specifically, it analyzes the algorithms along various dimensions, namely feature selection, sensitivity to hyper-parameter selection, and class imbalance problems, that are inherent to intrusion detection. It also produces a new, reliable dataset labeled Game Theory and Cyber Security (GTCS) that matches real-world criteria, contains normal traffic and different classes of attacks, and reflects current network traffic trends. The GTCS dataset is used to evaluate the performance of the different approaches, and a detailed experimental evaluation summarizing the effectiveness of each approach is presented. Finally, the thesis proposes an ensemble classifier model composed of multiple classifiers with different learning paradigms to address the issues of detection accuracy and false alarm rate in intrusion detection systems.
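    A stacking ensemble over heterogeneous base learners can be sketched in a few lines with scikit-learn. This is only in the spirit of the proposed classifier: the thesis's actual base models, features, and the GTCS dataset are not reproduced here, and the synthetic data below merely stands in for labelled traffic records.

```python
# Sketch: stacking ensemble with base learners from different paradigms
# and a logistic-regression meta-learner over their out-of-fold outputs.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for labelled network-traffic features (normal vs. attack)
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[                      # heterogeneous base learners
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("nb", GaussianNB()),
    ],
    final_estimator=LogisticRegression(),  # meta-learner
    cv=5,                             # out-of-fold predictions for stacking
)
stack.fit(X_tr, y_tr)
print("held-out accuracy:", stack.score(X_te, y_te))
```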

    Confidential Data-Outsourcing and Self-Optimizing P2P-Networks: Coping with the Challenges of Multi-Party Systems

    This work addresses the inherent lack of control and trust in Multi-Party Systems, using the examples of the Database-as-a-Service (DaaS) scenario and public Distributed Hash Tables (DHTs). In the DaaS field, it is shown how confidential information in a database can be protected while still allowing the external storage provider to process incoming queries. For public DHTs, it is shown how these highly dynamic systems can be managed by facilitating monitoring, simulation, and self-adaptation.
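    One common technique for provider-side query processing over protected data, offered as an illustrative sketch rather than the scheme developed in this work, is keyed deterministic tokenisation: the client replaces sensitive values with HMAC tokens, so equal plaintexts yield equal tokens and the provider can answer equality queries without seeing plaintext. The key, column, and values below are hypothetical, and key management is elided.

```python
# Sketch: equality queries over outsourced data via keyed deterministic
# tokens. The provider only ever sees tokens, never plaintext values.
import hashlib
import hmac

SECRET_KEY = b"client-side key, never sent to the provider"  # hypothetical

def token(value: str) -> str:
    """Deterministic keyed token: equal plaintexts map to equal tokens."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()

# Client side: tokenise sensitive columns before outsourcing
outsourced_rows = [{"diagnosis": token("flu")}, {"diagnosis": token("asthma")}]

# Provider side: match an equality query using tokens alone
query = token("flu")                       # token computed by the client
matches = [r for r in outsourced_rows if r["diagnosis"] == query]
print(len(matches))                        # 1, without revealing "flu"
```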