Predicting CMS datasets popularity with machine learning
A Data Analytics project has been launched in CMS and, within it, a specific pilot activity that aims to exploit Machine Learning techniques to predict the popularity of CMS datasets. This is a very delicate observable: its successful prediction would allow CMS to build smarter data placement models, enable broad optimizations of storage usage at all Tier levels, and would form the basis for the introduction of a solid dynamic and adaptive data management system. This thesis describes the work done to tackle this challenge using a new pilot prototype called DCAFPilot, written entirely in Python.
Workload-Aware Performance Tuning for Autonomous DBMSs
Optimal configuration is vital for a DataBase Management System (DBMS) to achieve high performance. There is no one-size-fits-all configuration that works for different workloads, since each workload has varying patterns with different resource requirements. There is a relationship between configuration, workload, and system performance. If a configuration cannot adapt to the dynamic changes of a workload, there can be a significant degradation in the overall performance of the DBMS unless a sophisticated administrator continuously re-configures it. In this tutorial, we focus on autonomous workload-aware performance tuning, which is expected to automatically and continuously tune the configuration as the workload changes. We survey three research directions: 1) workload classification, 2) workload forecasting, and 3) workload-based tuning. While the first two topics address the issue of obtaining accurate workload information, the third one tackles the problem of how to properly use that information to optimize performance. We also identify research challenges and open problems, and give real-world examples of leveraging workload information for database tuning in commercial products (e.g., Amazon Redshift). We will demonstrate workload-aware performance tuning in Amazon Redshift in the presentation.
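As a purely illustrative sketch of the first research direction the tutorial surveys (workload classification), a workload can be summarized as a feature vector of resource metrics and assigned to the nearest known workload class; the feature names, centroid values, and class labels below are hypothetical, not taken from the tutorial or from Amazon Redshift.

```python
# Illustrative sketch: nearest-centroid workload classification.
# Features and centroids are hypothetical.

import math

# Hypothetical centroids: average (cpu_share, read_ratio, avg_query_ms)
# observed for two canonical workload classes.
CENTROIDS = {
    "oltp": (0.30, 0.55, 5.0),    # many small, mixed read/write queries
    "olap": (0.85, 0.95, 900.0),  # few large, read-heavy analytic queries
}

def classify_workload(features, centroids=CENTROIDS):
    """Return the class whose centroid is nearest in Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda label: dist(features, centroids[label]))

# A long, read-heavy scan profile should land in the analytic class.
print(classify_workload((0.80, 0.99, 750.0)))  # -> olap
```

A real tuner would then look up (or forecast) the configuration associated with the predicted class, which is the "workload-based tuning" step the abstract describes.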
Studies of CMS data access patterns with machine learning techniques
This thesis presents a study of the Grid data access patterns in distributed analysis in
the CMS experiment at the LHC accelerator.
This study ranges from a deep analysis of the historical patterns of access to the
most relevant data types in CMS, to the exploitation of a supervised Machine Learning classification system to set up machinery able to predict future data access patterns, i.e. the so-called "popularity" of the CMS datasets on the Grid, with a focus on specific data types. All the CMS workflows run on the Worldwide LHC Computing Grid (WLCG) computing centers (Tiers), and in particular the distributed analysis system sustains hundreds of users and applications submitted every day. These applications (or "jobs") access different data types hosted on disk storage systems at a large set of WLCG Tiers. The detailed study of how these data are accessed, in terms of data types, hosting Tiers, and time periods, gives valuable insight into storage occupancy over time and into the different access patterns, and ultimately allows suggested actions to be extracted from this information (e.g. targeted disk clean-up and/or data replication). In this sense, the application of Machine Learning techniques makes it possible to learn from past data and to gain predictive power over future CMS data access patterns.
Chapter 1 provides an introduction to High Energy Physics at the LHC.
Chapter 2 describes the CMS Computing Model, with special focus on the data management sector, also discussing the concept of dataset popularity.
Chapter 3 describes the study of CMS data access patterns with different depth levels.
Chapter 4 offers a brief introduction to basic machine learning concepts, introduces their application in CMS, and discusses the results obtained by using this approach in the context of this thesis.
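The supervised-classification idea described above can be sketched in miniature: label each dataset's past access record as popular or not, fit a simple model, and use it to predict next week's label. The features (`naccesses`, `nusers`), the toy data, and the single-threshold learner are all hypothetical illustrations, not the thesis's actual DCAFPilot pipeline.

```python
# Minimal, hypothetical sketch of supervised dataset-popularity prediction.

def extract_features(access_log):
    """Hypothetical per-dataset features: total accesses, distinct users."""
    return [access_log["naccesses"], access_log["nusers"]]

def fit_threshold(samples, labels):
    """Learn a single access-count threshold minimizing training error."""
    best_t, best_err = 0, len(labels)
    for t in sorted(s[0] for s in samples):
        err = sum((s[0] >= t) != y for s, y in zip(samples, labels))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Toy training data: (features, was_popular_next_week)
train = [([5, 2], False), ([8, 3], False), ([120, 40], True), ([300, 90], True)]
threshold = fit_threshold([f for f, _ in train], [y for _, y in train])
predict = lambda log: extract_features(log)[0] >= threshold

print(predict({"naccesses": 150, "nusers": 50}))  # -> True
```

A production system would use far richer features (data type, hosting Tier, time window) and a real classifier, but the record-label-fit-predict loop is the same.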
AN EXAMINATION OF CONCURRENT DISCRIMINATION LEARNING WITHIN INDIVIDUALS WITH PARKINSON'S DISEASE
The main focus of this research is to further understand memory formation by examining the role of the basal ganglia in learning. Broadly, this study examines how the basal ganglia may play a role in a task that has been associated with declarative memory mechanisms, in this case the concurrent discrimination task (CDT). Specifically, we examine how performance on the CDT is affected when structures of the basal ganglia are compromised, by recruiting individuals with Parkinson's disease (PD). Past work examining the performance of individuals with PD on a CDT has had contradictory results and has proposed that participants may adopt different strategies, relying variously on either a declarative or a non-declarative strategy (Moody et al., 2010). We aimed to reduce strategy differences by changing the stimuli, significantly increasing the number of stimuli, increasing the number of learning blocks, and making all participants explicitly aware of the task structure and goals. By making the goals explicit, we predicted that we would engage a declarative mechanism in both PD and control individuals. To examine declarative memory formation we used the Remember/Know (RK) task. However, since we used a significantly larger set of stimuli, we hypothesized that individuals with PD would perform significantly worse on the CDT than control individuals. The current study reveals no significant differences in performance between individuals with PD and control participants on either the CDT or the RK task. We attribute these results to the design of our paradigm and stimuli, which may have influenced individuals to engage in declarative strategies and perform the CDT reasonably well.
Proceedings, MSVSCC 2016
Proceedings of the 10th Annual Modeling, Simulation & Visualization Student Capstone Conference held on April 14, 2016 at VMASC in Suffolk, Virginia
Optimizing the neural response to electrical stimulation and exploring new applications of neurostimulation
Electrical stimulation has been successful in treating patients who suffer from neurologic and neuropsychiatric disorders that are resistant to standard treatments. For deep brain stimulation (DBS), officially approved use has been limited mainly to motor disorders, such as Parkinson's disease and essential tremor. Alcohol use disorder, and addictive disorders in general, is a prevalent condition that is difficult to treat long-term. To determine whether DBS can reduce alcohol drinking in animals, voluntary alcohol consumption of alcohol-preferring rats before, during, and after stimulation of the nucleus accumbens shell was compared. Intake levels in the low stimulus intensity group (n=3, 100 μA current) decreased by as much as 43% during stimulation, but the effect did not persist. In the high stimulus intensity group (n=4, 200 μA current), alcohol intake decreased by as much as 59%, and the effect was sustained. These results demonstrate the potent, reversible effects of DBS.
Left vagus nerve stimulation (VNS) is approved for treating epilepsy and depression. However, the standard method of determining stimulus parameters is imprecise, and patient responses are highly variable. I developed a method of designing custom stimulus waveforms and assessing the nerve response to optimize stimulation selectivity and efficiency. VNS experiments were performed in rats, aiming to increase the selectivity of slow nerve fibers while assessing activation efficiency. When producing 50% of maximal activation of slow fibers, customized stimuli were able to activate as few as 12.8% of fast fibers, while the lowest for standard rectangular waveforms was 35.0% (n=4-6 animals). However, the stimulus with the highest selectivity required 19.6 nC of charge per stimulus phase, while the rectangular stimulus required only 13.2 nC.
Right VNS is currently under clinical investigation for preventing sudden unexpected death in epilepsy and for treating heart failure. Activation of the right vagal parasympathetic fibers led to waveform-independent reductions in heart rate, ejection ratio, and stroke volume. Customized stimulus design with response feedback produces reproducible and predictable patterns of nerve activation and physiological effects, which will lead to more consistent patient responses.
Dynamic power management: from portable devices to high performance computing
Electronic applications are nowadays converging under the umbrella of the cloud computing vision. The future ecosystem of information and communication technology will integrate clouds of portable clients and embedded devices exchanging information, through the internet layer, with processing clusters of servers, data-centers and high performance computing systems. Even though the whole of society is waiting to embrace this revolution, there is another side to the story. Portable devices require batteries to work far from power plugs, and battery capacity does not scale as fast as their power requirements increase. At the other end, processing clusters, such as data-centers and server farms, are built upon the integration of thousands of multiprocessors. For each of them, technology scaling during the last decade has produced a dramatic increase in power density, with significant spatial and temporal variability. This leads to power and temperature hot-spots, which may cause non-uniform ageing and accelerated chip failure. Moreover, all the heat removed from the silicon translates into high cooling costs. Finally, trends in the ICT carbon footprint show that the run-time power consumption of the whole spectrum of devices accounts for a significant slice of worldwide carbon emissions.
This thesis work embraces the full ICT ecosystem and its dynamic power consumption concerns by describing a set of new and promising system-level resource management techniques to reduce power consumption and its related issues for two corner cases: Mobile Devices and High Performance Computing.
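One common form of the dynamic power management techniques this abstract alludes to is utilization-driven frequency scaling, where the system picks the lowest clock frequency that still leaves headroom for the current load. The sketch below is purely illustrative; the frequency table, the 80% headroom threshold, and the policy itself are assumptions, not the thesis's actual method.

```python
# Illustrative sketch of utilization-driven frequency scaling (DVFS).
# Frequencies and thresholds are hypothetical.

FREQS_MHZ = [800, 1600, 2400]  # hypothetical available P-states

def select_frequency(utilization, current_mhz):
    """Pick the lowest frequency keeping projected utilization below 80%."""
    # Work per second is roughly utilization * frequency; find the
    # smallest frequency that can absorb it with 20% headroom.
    demand = utilization * current_mhz
    for f in FREQS_MHZ:
        if demand / f < 0.8:
            return f
    return FREQS_MHZ[-1]

print(select_frequency(0.25, 2400))  # light load: scale down -> 800
print(select_frequency(0.90, 1600))  # heavy load: scale up -> 2400
```

Lowering frequency (and, on real hardware, voltage with it) cuts dynamic power roughly with the square of voltage, which is why such governors matter both for battery life on mobile devices and for cooling costs in data-centers.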
Broadening Responsibilities: Consideration Of The Potential To Broaden The Role Of Uniformed Fire Service Employees
What is this report about? This report, commissioned by the National Joint Council for Local Authority Fire and Rescue Services (NJC), aims to identify what impact, if any, firefighters can have on the delivery of emergency medical response and wider community health interventions in the UK. What are the overall conclusions? Appropriately trained and equipped firefighters co-responding to targeted, specific time-critical medical events, such as cardiac arrest, can improve patient survival rates. The data also indicate that there is support from fire service staff, and a potential need from members of the public (particularly the elderly, isolated or vulnerable), to expand "wider work". This includes winter warmth assessments, Safe and Well checks, community defibrillator training, and client referrals when staff believe someone may have dementia, be vulnerable or, for example, have substance dependencies such as an alcohol addiction. However, there is currently insufficient data to estimate the net benefit of this work.
Deterministic, Mutable, and Distributed Record-Replay for Operating Systems and Database Systems
Application record and replay is the ability to record an application's execution and replay it at a later time. Record-replay has many use cases, including diagnosing and debugging applications by capturing and reproducing hard-to-find bugs, providing transparent application fault tolerance by maintaining a live replica of a running program, and offline instrumentation that would be too costly to run in a production environment. Different record-replay systems may offer different levels of replay faithfulness, the strongest level being deterministic replay, which guarantees an identical reenactment of the original execution. Such a guarantee requires capturing all sources of nondeterminism during the recording phase. In the general case, such record-replay systems can dramatically hinder application performance, rendering them impractical in certain application domains. Furthermore, various use cases are incompatible with strictly replaying the original execution. For example, in a primary-secondary database scenario, the secondary database would be unable to serve additional traffic while being replicated. No record-replay system fits all use cases.
This dissertation shows how to make deterministic record-replay fast and efficient, how broadening replay semantics can enable powerful new use cases, and how choosing the right level of abstraction for record-replay can support distributed and heterogeneous database replication with little effort.
We explore four record-replay systems with different semantics enabling different use cases. We first present Scribe, an OS-level deterministic record-replay mechanism that supports multi-process applications on multi-core systems. One of the main challenges is to record the interaction of threads running on different CPU cores in an efficient manner. Scribe introduces two new lightweight OS mechanisms, rendezvous points and sync points, to efficiently record nondeterministic interactions such as related system calls, signals, and shared memory accesses. Scribe allows the capture and replication of hard-to-find bugs to facilitate debugging and serves as a solid foundation for the two systems that follow.
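Scribe itself operates inside the OS kernel, but the core idea of deterministic replay, logging the outcome of every nondeterministic operation during recording and feeding the identical values back during replay, can be shown with a toy user-level analogue. Everything below (the `RecordReplay` class and its modes) is a hypothetical illustration, not Scribe's actual interface.

```python
# Toy user-level analogue of deterministic record-replay: log the results
# of nondeterministic calls once, then replay them verbatim.

import random

class RecordReplay:
    def __init__(self, mode, log=None):
        self.mode = mode             # "record" or "replay"
        self.log = log if log is not None else []

    def call(self, fn):
        if self.mode == "record":
            result = fn()            # run the nondeterministic operation
            self.log.append(result)  # and remember its outcome
            return result
        return self.log.pop(0)       # replay: return the recorded outcome

rec = RecordReplay("record")
first_run = [rec.call(lambda: random.randint(0, 10**6)) for _ in range(3)]

rep = RecordReplay("replay", log=list(rec.log))
second_run = [rep.call(lambda: random.randint(0, 10**6)) for _ in range(3)]

print(first_run == second_run)  # -> True: replay reproduces the execution
```

A real system must interpose on every nondeterministic source (system call results, signals, shared-memory access ordering) and do so efficiently, which is exactly what Scribe's rendezvous and sync points address.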
We then present RacePro, a process race detection system to improve software correctness. Process races occur when multiple processes access shared operating system resources, such as files, without proper synchronization. Detecting process races is difficult due to the elusive nature of these bugs, and the heterogeneity of frameworks involved in such bugs. RacePro is the first tool to detect such process races. RacePro records application executions in deployed systems, allowing offline race detection by analyzing the previously recorded log. RacePro then replays the application execution and forces the manifestation of detected races to check their effect on the application. Upon failure, RacePro reports potentially harmful races to developers.
Third, we present Dora, a mutable record-replay system which allows a recorded execution of an application to be replayed with a modified version of the application. Mutable record-replay provides a number of benefits for reproducing, diagnosing, and fixing software bugs. Given a recording and a modified application, finding a mutable replay is challenging, and undecidable in the general case. Despite the difficulty of the problem, we show a very simple but effective algorithm to search for suitable replays.
Lastly, we present Synapse, a heterogeneous database replication system designed for Web applications. Web applications are increasingly built using a service-oriented architecture that integrates services powered by a variety of databases. Often, the same data, needed by multiple services, must be replicated across different databases and kept in sync. Unfortunately, these databases use vendor-specific data replication engines which are not compatible with each other. To solve this challenge, Synapse operates at the application level to access a unified data representation through object relational mappers. Additionally, Synapse leverages application semantics to replicate data with good consistency semantics, using mechanisms similar to those of Scribe.
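The application-level replication idea can be sketched as an ORM save hook that publishes each change as a vendor-neutral document, which a subscriber then applies to a different backing store. The model class, hook, and publish/subscribe shape below are hypothetical illustrations of the approach, not Synapse's real API.

```python
# Hypothetical sketch of ORM-level heterogeneous replication: a save()
# hook publishes changes; subscribers apply them to other stores.

import json

subscribers = []

def publish(change):
    for apply_change in subscribers:
        # Serialize to a vendor-neutral payload before handing it off.
        apply_change(json.loads(json.dumps(change)))

class User:
    """Toy 'ORM model' whose save() publishes its attributes."""
    def __init__(self, uid, name):
        self.uid, self.name = uid, name

    def save(self, primary_store):
        primary_store[self.uid] = {"name": self.name}
        publish({"model": "User", "id": self.uid, "attrs": {"name": self.name}})

# A second, heterogeneous store stays in sync via the subscription.
sql_store, doc_store = {}, {}
subscribers.append(lambda c: doc_store.__setitem__(c["id"], c["attrs"]))

User(1, "ada").save(sql_store)
print(doc_store)  # -> {1: {'name': 'ada'}}
```

Because both stores are updated from the same application-level representation, the scheme sidesteps the incompatibility of vendor-specific replication engines, at the cost of having to reason about consistency at the application layer.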