    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Program analysis for android security and reliability

    The recent, widespread growth and adoption of mobile devices have revolutionized the way users interact with technology. As mobile apps have become increasingly prevalent, concerns regarding their security and reliability have gained significant attention. The ever-expanding mobile app ecosystem presents unique challenges in ensuring the protection of user data and maintaining app robustness. This dissertation expands the field of program analysis with techniques and abstractions tailored explicitly to enhancing Android security and reliability. The research introduces approaches for addressing critical issues related to sensitive information leakage, device and user fingerprinting, mobile medical score calculators, and termination-induced data loss. Through a series of comprehensive studies employing novel approaches that combine static and dynamic analysis, this work provides valuable insights and practical solutions to the aforementioned challenges. In summary, this dissertation makes the following contributions: (1) precise identifier leak tracking via a novel algebraic representation of leak signatures, (2) identifier processing graphs (IPGs), an abstraction for extracting and subverting user-based and device-based fingerprinting schemes, (3) interval-based verification of medical score calculator correctness, and (4) identification of potential data losses caused by app termination.
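    Contribution (3) concerns interval-based verification of medical score calculators. As a rough illustration of the general idea, the Python sketch below propagates intervals through a hypothetical score formula and checks an assumed documented output range; the formula, bounds, and names are invented for illustration and are not taken from the dissertation.

        # Interval-arithmetic sketch for range-checking a hypothetical score
        # calculator; the formula and bounds are illustrative assumptions.
        class Interval:
            def __init__(self, lo, hi):
                self.lo, self.hi = lo, hi

            def __add__(self, other):
                return Interval(self.lo + other.lo, self.hi + other.hi)

            def scale(self, k):
                a, b = k * self.lo, k * self.hi
                return Interval(min(a, b), max(a, b))

        def score_range(age, heart_rate):
            # hypothetical score: 0.1 * age + 0.05 * heart_rate
            return age.scale(0.1) + heart_rate.scale(0.05)

        # Check that the score stays inside an assumed documented range [0, 25]
        # for all ages in [0, 120] and heart rates in [20, 220].
        result = score_range(Interval(0, 120), Interval(20, 220))
        assert 0 <= result.lo and result.hi <= 25, "score can leave its documented range"
        print(result.lo, result.hi)  # 1.0 23.0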

    CogStack - experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital.

    BACKGROUND: Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern information economy expands in scope and permeates the healthcare domain, there is an increasing urgency for healthcare organisations to offer information systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined as the 3R principle (right data, right place, right time) to address deficiencies in organisational data flow while retaining the strict information governance policies that apply within the UK National Health Service (NHS). Here, we describe our work on creating and deploying a low-cost structured and unstructured information retrieval and extraction architecture within King's College Hospital, the management of governance concerns, and the associated use cases and cost-saving opportunities that such components present. RESULTS: To date, our CogStack architecture has processed over 300 million lines of clinical data, making it available for internal service improvement projects at King's College London. On generated data designed to simulate real-world clinical text, our de-identification algorithm achieved up to 94% precision and up to 96% recall. CONCLUSION: We describe a toolkit which we feel is of huge value to the UK (and beyond) healthcare community. It is the only open-source, easily deployable solution designed for the UK healthcare environment, in a landscape populated by expensive proprietary systems. Solutions such as these provide a crucial foundation for the genomic revolution in medicine.
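    For readers unfamiliar with the reported de-identification metrics, the short Python sketch below shows how precision and recall are computed from true-positive, false-positive, and false-negative counts; the counts are invented and chosen only so the outputs land near the reported 94% and 96%, not CogStack's actual evaluation data.

        # Precision and recall from hypothetical de-identification counts.
        def precision_recall(tp, fp, fn):
            return tp / (tp + fp), tp / (tp + fn)

        # e.g. 960 correctly masked identifiers, 61 spurious masks, 40 missed
        p, r = precision_recall(960, 61, 40)
        print(f"precision={p:.1%} recall={r:.1%}")  # precision=94.0% recall=96.0%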

    3D Surface Registration Using Geometric Spectrum of Shapes

    Morphometric analysis of 3D surface objects is very important in many biomedical applications and clinical diagnoses. Its critical step lies in shape comparison and registration. Because the deformations of most organs, such as heart or brain structures, are non-isometric, it is very difficult to find the correspondence between the shapes before and after deformation, which makes diagnosis very challenging. To address these challenges, we propose two spectrum-based methods. The first method employs the variation of the eigenvalues of the Laplace-Beltrami operator of the shape and optimizes a quadratic equation in order to minimize the distance between two shapes' eigenvalues. This method can determine multi-scale, non-isometric deformations through the variation of the Laplace-Beltrami spectra of two shapes. Given two triangle meshes, the spectra can be varied from one to the other with a scale function defined on each vertex. The variation is expressed as a linear interpolation of the eigenvalues of the two shapes. In each iteration step, a quadratic programming problem is constructed, based on our derived spectrum variation theorem and a smoothness energy constraint, to compute the spectrum variation. The derivative of the scale function is the solution of this problem. The final scale function can therefore be obtained by integrating the derivatives from each step, which, in turn, quantitatively describes the non-isometric deformation between the two shapes. However, this method cannot find the point-to-point correspondence between two shapes. Our second method extends the first and uses feature points generated from the eigenvectors of the two shapes to minimize the difference between the shapes' eigenvectors in addition to their eigenvalues. To register two surfaces, we map both the eigenvalues and the eigenvectors of the Laplace-Beltrami operator of the shapes by optimizing an energy function. The function is defined by the integration of a smoothness term to align the eigenvalues and a distance term between the eigenvectors at feature points to align the eigenvectors. The feature points are generated using the static points of certain eigenvectors of the surfaces. By using both the eigenvalues and the eigenvectors at these feature points, the computational efficiency is improved considerably, without losing accuracy, in comparison to approaches that use the eigenvectors for all vertices. The variation of the shape is expressed using a scale function defined at each vertex. Consequently, the total energy function to align the two given surfaces can be defined using the linear interpolation of the scale function derivatives. By optimizing the energy function, the scale function can be solved and the alignment achieved. After the alignment, the eigenvectors can be employed to calculate the point-to-point correspondence of the surfaces. Therefore, the proposed method can accurately define the displacement of the vertices. We evaluate both methods through experiments on synthetic and real hippocampus and heart data; these experiments demonstrate the advantages and accuracy of our methods. We then integrate our methods into a workflow system named DataView. Using this workflow system, users can design, save, run, and share their workflows in a web browser without installing any software and regardless of the power of their computers. We have also integrated the Grid into this system, so the same task can be executed on up to 64 different cases, which increases the performance of the system enormously.
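    Both methods operate on the Laplace-Beltrami spectrum of a triangle mesh. As a minimal sketch of what that spectrum is, the Python code below assembles a standard cotangent-weight Laplacian with a lumped mass matrix and returns its smallest generalized eigenvalues; this is a generic textbook construction offered for illustration, not the authors' own implementation.

        # Cotangent-weight Laplace-Beltrami spectrum of a triangle mesh.
        import numpy as np
        from scipy.sparse import coo_matrix, diags
        from scipy.sparse.linalg import eigsh

        def laplace_beltrami_spectrum(verts, faces, k=10):
            """verts: (n, 3) float array, faces: (m, 3) int array -> k smallest eigenvalues."""
            n = len(verts)
            rows, cols, vals = [], [], []
            for tri in faces:
                for c in range(3):
                    i, j, o = tri[c], tri[(c + 1) % 3], tri[(c + 2) % 3]
                    u, v = verts[i] - verts[o], verts[j] - verts[o]
                    w = 0.5 * np.dot(u, v) / np.linalg.norm(np.cross(u, v))  # half-cotangent weight
                    rows += [i, j, i, j]
                    cols += [j, i, i, j]
                    vals += [-w, -w, w, w]
            L = coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()
            # Lumped mass matrix: each vertex receives a third of its incident triangle areas.
            areas = np.zeros(n)
            for tri in faces:
                e1, e2 = verts[tri[1]] - verts[tri[0]], verts[tri[2]] - verts[tri[0]]
                areas[tri] += 0.5 * np.linalg.norm(np.cross(e1, e2)) / 3.0
            M = diags(areas)
            eigenvalues, _ = eigsh(L, k=k, M=M, sigma=-1e-8)  # shift-invert: smallest eigenvalues
            return eigenvalues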

    N-of-1: better living through self-experimentation

    This project's aim was to create a platform for personalized health data analysis, testing, and prediction, making it easier for ordinary people who are interested in N-of-1 trials to do their own self-experiments and take control of their mental and physical health. In these studies, a single subject is observed and different interventions are systematically evaluated on them over time. The studies are typically longitudinal, occurring over weeks or months, with several rounds of treatments and evaluations in the form of a number of AB assignments. Wearable technology, trackers, apps, sensors, and other IoT devices may be used to record information about the subject multiple times per day or week, if not constantly. In this study, a single self-experimenter collected data on themselves from several different sources, such as mood questionnaires and a Fitbit wearable, among others. The data from the various sources was merged so that a variety of statistical methods could be applied. A few different modes of experimenting went into this study. One experiment tested the claim that spending 15 minutes per day writing in a gratitude journal had an effect on the subject's mood. This was achieved through a BABABA crossover phase-design study, with each of the three phases being 28 days, for a total of 84 days in the experiment. The tests done for this experiment were the more traditional ANOVA-RM (repeated-measures ANOVA) and ANCOVA, which were used to discover whether the intervention (B) phases were significantly different from the baseline (A) phases with respect to the subject's mood. Another test evaluated the claim that the subject's mood differed between the pre-experiment period and the experimental phase, using a Mann-Whitney U test. The last part of the study was a more complex machine learning (ML) pipeline that sought to predict the subject's mood based on over 3 years of daily collected data. The ML pipeline ingested the data, created several different ML models such as random forests and support vector machines, and compared which model was best at predicting the subject's mood. Feature importance was extracted from the best model through SHapley Additive exPlanations (SHAP), which gave the weight of the various feature effects on the target, in this case the subject's mood. This told the subject which behaviors had an effect on their mood. These different modes of experimenting were then compared to see which was easier to implement or understand for future self-experimenters.
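    A minimal Python sketch of two of the analyses named above, run on synthetic daily data: a Mann-Whitney U comparison of intervention and baseline days, followed by a random forest whose feature effects are summarized with SHAP. The variable names, effect sizes, and numbers are invented for illustration and are not the subject's records or the project's actual pipeline.

        import numpy as np
        import pandas as pd
        import shap
        from scipy.stats import mannwhitneyu
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(0)
        days = pd.DataFrame({
            "sleep_hours": rng.normal(7, 1, 200),
            "steps": rng.normal(8000, 2000, 200),
            "journaling": rng.integers(0, 2, 200),   # 1 on intervention (B) days
        })
        days["mood"] = 0.4 * days["sleep_hours"] + 0.5 * days["journaling"] + rng.normal(0, 0.3, 200)

        # Phase comparison: does mood differ between baseline (A) and intervention (B) days?
        u, p = mannwhitneyu(days.loc[days["journaling"] == 0, "mood"],
                            days.loc[days["journaling"] == 1, "mood"])
        print(f"Mann-Whitney U={u:.0f}, p={p:.3g}")

        # Mood prediction and per-feature effects, in the spirit of the ML pipeline above.
        X, y = days.drop(columns="mood"), days["mood"]
        model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
        shap_values = shap.TreeExplainer(model).shap_values(X)
        print(pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns))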

    Data Infrastructure for Medical Research

    While we are witnessing rapid growth in data across the sciences and in many applications, this growth is particularly remarkable in the medical domain, be it because of higher-resolution instruments and diagnostic tools (e.g. MRI), new sources of structured data like activity trackers, the widespread use of electronic health records, or many others. The sheer volume of the data is not, however, the only challenge to be faced when using medical data for research. Other crucial challenges include data heterogeneity, data quality, data privacy and so on. In this article, we review solutions addressing these challenges by discussing the current state of the art in the areas of data integration, data cleaning, data privacy, and scalable data access and processing in the context of medical data. The techniques and tools we present will give practitioners (computer scientists and medical researchers alike) a starting point to understand the challenges and solutions, and ultimately to analyse medical data and gain better and quicker insights.

    Personalized data analytics for internet-of-things-based health monitoring

    The Internet-of-Things (IoT) has great potential to fundamentally alter the delivery of modern healthcare, enabling healthcare solutions outside the limits of conventional clinical settings. It can offer ubiquitous monitoring to at-risk population groups and allow diagnostic care, preventive care, and early intervention in everyday life. These services can have profound impacts on many aspects of health and well-being. However, this field is still in its infancy, and the use of IoT-based systems in real-world healthcare applications introduces new challenges. Healthcare applications necessitate satisfactory quality attributes such as reliability and accuracy due to their mission-critical nature, while at the same time IoT-based systems mostly operate over constrained, shared sensing, communication, and computing resources. There is a need to investigate this synergy between IoT technologies and healthcare applications from a user-centered perspective. Such a study should examine the role and requirements of IoT-based systems in real-world health monitoring applications. Moreover, conventional computing architectures and data analytic approaches introduced for IoT systems are insufficient when targeting health and well-being purposes, as they are unable to overcome the limitations of IoT systems while fulfilling the needs of healthcare applications. This thesis aims to address these issues by proposing an intelligent use of data and computing resources in IoT-based systems, which can lead to high performance and satisfy these stringent requirements. For this purpose, the thesis first delves into state-of-the-art IoT-enabled healthcare systems proposed for in-home and in-hospital monitoring. The findings are analyzed and categorized into different domains from a user-centered perspective. The selection of home-based applications focuses on the monitoring of the elderly, who require more remote care and support compared to other groups of people, whereas the hospital-based applications cover the role of existing IoT in patient monitoring and hospital management systems. The objectives and requirements of each domain are then investigated and discussed. This thesis proposes personalized data analytic approaches to fulfill the requirements and meet the objectives of IoT-based healthcare systems. In this regard, a new computing architecture is introduced, using computing resources in different layers of IoT to provide a high level of availability and accuracy for healthcare services. This architecture allows the hierarchical partitioning of machine learning algorithms in these systems and enables adaptive system behavior with respect to the user's condition. In addition, personalized data fusion and modeling techniques are presented, exploiting multivariate and longitudinal data in IoT systems to improve the quality attributes of healthcare applications. First, a real-time missing-data-resilient decision-making technique is proposed for health monitoring systems. The technique tailors various data resources in IoT systems to accurately estimate health decisions despite missing data in the monitoring. Second, a personalized model is presented, enabling variation and event detection in long-term monitoring systems; the model evaluates the sleep quality of users according to their own historical data. Finally, the performance of the computing architecture and the techniques is evaluated using two case studies. The first case study consists of real-time arrhythmia detection in electrocardiography signals collected from patients suffering from cardiovascular diseases. The second case study is continuous maternal health monitoring during pregnancy and postpartum; it includes a real human-subject trial carried out with twenty pregnant women for seven months.
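    The missing-data-resilient decision-making idea can be pictured, in much simplified form, as keeping one model per combination of available sensors and answering from the largest subset that is present. The Python sketch below illustrates that fallback scheme; the sensor names, training data, and classifier choice are assumptions made for illustration, not the thesis' actual algorithm.

        import itertools
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        SENSORS = ["heart_rate", "spo2", "activity"]

        def train_submodels(X, y, n_sensors=len(SENSORS)):
            """Fit one classifier per non-empty subset of sensor columns."""
            models = {}
            for r in range(1, n_sensors + 1):
                for subset in itertools.combinations(range(n_sensors), r):
                    models[subset] = LogisticRegression().fit(X[:, list(subset)], y)
            return models

        def decide(models, sample):
            """Answer from the largest sensor subset that is present (not NaN)."""
            present = {i for i, v in enumerate(sample) if not np.isnan(v)}
            subset = max((s for s in models if set(s) <= present), key=len)
            return models[subset].predict([[sample[i] for i in subset]])[0]

        # Illustrative use with random data; the second sensor reading is missing.
        rng = np.random.default_rng(1)
        X, y = rng.normal(size=(300, 3)), rng.integers(0, 2, 300)
        models = train_submodels(X, y)
        print(decide(models, np.array([0.2, np.nan, -1.3])))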