1,828 research outputs found

    GPU-optimized approaches to molecular docking-based virtual screening in drug discovery: A comparative analysis

    Get PDF
    Finding a novel drug is a very long and complex procedure. Using computer simulations, it is possible to accelerate the preliminary phases by performing a virtual screening that filters a large set of drug candidates to a manageable number. This paper presents the implementations and comparative analysis of two GPU-optimized implementations of a virtual screening algorithm targeting novel GPU architectures. This work focuses on the analysis of parallel computation patterns and their mapping onto the target architecture. The first method adopts a traditional approach that spreads the computation for a single molecule across the entire GPU. The second uses a novel batched approach that exploits the parallel architecture of the GPU to evaluate more molecules in parallel. Experimental results showed a different behavior depending on the size of the database to be screened, either reaching a performance plateau sooner or having a more extended initial transient period to achieve a higher throughput (up to 5x), which is more suitable for extreme-scale virtual screening campaigns

    Evaluating Architectural Safeguards for Uncertain AI Black-Box Components

    Get PDF
    Although tremendous progress has been made in Artificial Intelligence (AI), it entails new challenges. The growing complexity of learning tasks requires more complex AI components, which increasingly exhibit unreliable behaviour. In this book, we present a model-driven approach to model architectural safeguards for AI components and analyse their effect on the overall system reliability

    Reshaping Higher Education for a Post-COVID-19 World: Lessons Learned and Moving Forward

    Get PDF
    No abstract available

    Memory-friendly fixed-point iteration method for nonlinear surface mode oscillations of acoustically driven bubbles: from the perspective of high-performance GPU programming

    Get PDF
    A fixed-point iteration technique is presented to handle the implicit nature of the governing equations of nonlinear surface mode oscillations of acoustically excited microbubbles. The model is adopted from the theoretical work of Shaw [1], where the dynamics of the mean bubble radius and the surface modes are bi-directionally coupled via nonlinear terms. The model comprises a set of second-order ordinary differential equations. It extends the classic Keller–Miksis equation and the linearized dynamical equations for each surface mode. Only the implicit parts (containing the second derivatives) are reevaluated during the iteration process. The performance of the technique is tested at various parameter combinations. The majority of the test cases needs only a single reevaluation to achieve 10^-9 error. Although the arithmetic operation count is higher than the Gauss elimination, due to its memory-friendly matrix-free nature, it is a viable alternative for high-performance GPU computations of massive parameter studies

    Farming out : a study.

    Get PDF
    Farming is one of severals ways of arranging for a group of individuals to perform work simultaneously. Farming is attractive. It is a simple concept, and yet it allocates work dynamically, balancing the load automatically. This gives rise to potentially great efficiency; yet the range of applications that can be farmed efficiently and which implementation strategies are the most effective has not been classified. This research has investigated the types of application, design and implementation that farm efficiently on computer systems constructed from a network of communicating parallel processors. This research shows that all applications can be farmed and identifies those concerns that dictate efficiency. For the first generation of transputer hardware, extensive experiments have been performed using Occam, independent of any specific application. This study identified the boundary conditions that dictate which design parameters farm efficiently. These boundary conditions are expressed in a general form that is directly amenable to other architectures. The specific quantitative results are of direct use to others who wish to implement farms on this architecture. Because of farming’s simplicity and potential for high efficiency, this work concludes that architects of parallel hardware should consider binding this paradigm into future systems so as to enable the dynamic allocation of processes to processors to take place automatically. As well as resulting in high levels of machine utilisation for all programs, this would also permanently remove the burden of allocation from the programmer

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    Constraint-based simulation of virtual crowds

    Get PDF
    Central to simulating pedestrian crowds is their motion and behaviour. It is required to understand how pedestrians move to simulate and predict scenarios with crowds of people. Pedestrian behaviours enhance the range of motions people can demonstrate, resulting in greater variety, believability, and accuracy. Models with complex computations and motion have difficulty in being extended with additional behaviours. This is because the structure of these models are not designed in a way that is generally compatible with collision avoidance behaviours. To address this issue, this thesis will research a possible pedestrian model that can simulate collision response with a wide range of additional behaviours. The model will do so by using constraints, a limit on the velocity of a person's movement. The proposed model will use constraints as the core computation. By describing behaviours in terms of constraints, these behaviours can be combined with the proposed model. Pedestrian simulations strike a balance between model complexity and runtime speed. Some models focus entirely on the complexity and accuracy of people, while other models focus on creating believable yet lightweight and performant simulations. Believable crowds look realistic to human observation, but do not match up to numerical analysis under scrutiny. The larger the population, and the more complex the motion of people, the slower the simulation will run. One route for improving performance of software is by using Graphical Processing Units (GPUs). GPUs are devices with theoretical performance that far outperforms equivalent multi-core CPUs. Research literature tends to focus on either the accuracy, or the performance optimisations of pedestrian crowd simulations. This suggests that there is opportunity to create more accurate models that run relatively quickly. Real time is a useful measure of model runtime. A simulation that runs in real time can be interactive and respond live to user input. By increasing the performance of the model, larger and more complex models can be simulated. This in turn increases the range of applications the model can represent. This thesis will develop a performant pedestrian simulation that runs in real time. It will explore how suitable the model is for GPU acceleration, and what performance gains can be obtained by implementing the model on the GPU

    Undergraduate and Graduate Course Descriptions, 2023 Spring

    Get PDF
    Wright State University undergraduate and graduate course descriptions from Spring 2023

    Privaatsust säilitavad paralleelarvutused graafiülesannete jaoks

    Get PDF
    Turvalisel mitmeosalisel arvutusel põhinevate reaalsete privaatsusrakenduste loomine on SMC-protokolli arvutusosaliste ümmarguse keerukuse tõttu keeruline. Privaatsust säilitavate tehnoloogiate uudsuse ja nende probleemidega kaasnevate suurte arvutuskulude tõttu ei ole paralleelseid privaatsust säilitavaid graafikualgoritme veel uuritud. Graafikalgoritmid on paljude arvutiteaduse rakenduste selgroog, nagu navigatsioonisüsteemid, kogukonna tuvastamine, tarneahela võrk, hüperspektraalne kujutis ja hõredad lineaarsed lahendajad. Graafikalgoritmide suurte privaatsete andmekogumite töötlemise kiirendamiseks ja kõrgetasemeliste arvutusnõuete täitmiseks on vaja privaatsust säilitavaid paralleelseid algoritme. Seetõttu esitleb käesolev lõputöö tipptasemel protokolle privaatsuse säilitamise paralleelarvutustes erinevate graafikuprobleemide jaoks, ühe allika lühima tee, kõigi paaride lühima tee, minimaalse ulatuva puu ja metsa ning algebralise tee arvutamise. Need uued protokollid on üles ehitatud kombinatoorsete ja algebraliste graafikualgoritmide põhjal lisaks SMC protokollidele. Nende protokollide koostamiseks kasutatakse ka ühe käsuga mitut andmeoperatsiooni, et vooru keerukust tõhusalt vähendada. Oleme väljapakutud protokollid juurutanud Sharemind SMC platvormil, kasutades erinevaid graafikuid ja võrgukeskkondi. Selles lõputöös kirjeldatakse uudseid paralleelprotokolle koos nendega seotud algoritmide, tulemuste, kiirendamise, hindamiste ja ulatusliku võrdlusuuringuga. Privaatsust säilitavate ühe allika lühimate teede ja minimaalse ulatusega puuprotokollide tegelike juurutuste tulemused näitavad tõhusat meetodit, mis vähendas tööaega võrreldes varasemate töödega sadu kordi. Lisaks ei ole privaatsust säilitavate kõigi paaride lühima tee protokollide hindamine ja ulatuslik võrdlusuuringud sarnased ühegi varasema tööga. Lisaks pole kunagi varem käsitletud privaatsust säilitavaid metsa ja algebralise tee arvutamise protokolle.Constructing real-world privacy applications based on secure multiparty computation is challenging due to the round complexity of the computation parties of SMC protocol. Due to the novelty of privacy-preserving technologies and the high computational costs associated with these problems, parallel privacy-preserving graph algorithms have not yet been studied. Graph algorithms are the backbone of many applications in computer science, such as navigation systems, community detection, supply chain network, hyperspectral image, and sparse linear solvers. In order to expedite the processing of large private data sets for graphs algorithms and meet high-end computational demands, privacy-preserving parallel algorithms are needed. Therefore, this Thesis presents the state-of-the-art protocols in privacy-preserving parallel computations for different graphs problems, single-source shortest path (SSSP), All-pairs shortest path (APSP), minimum spanning tree (MST) and forest (MSF), and algebraic path computation. These new protocols have been constructed based on combinatorial and algebraic graph algorithms on top of the SMC protocols. Single-instruction-multiple-data (SIMD) operations are also used to build those protocols to reduce the round complexities efficiently. We have implemented the proposed protocols on the Sharemind SMC platform using various graphs and network environments. This Thesis outlines novel parallel protocols with their related algorithms, the results, speed-up, evaluations, and extensive benchmarking. The results of the real implementations of the privacy-preserving single-source shortest paths and minimum spanning tree protocols show an efficient method that reduced the running time hundreds of times compared with previous works. Furthermore, the evaluation and extensive benchmarking of privacy-preserving All-pairs shortest path protocols are not similar to any previous work. Moreover, the privacy-preserving minimum spanning forest and algebraic path computation protocols have never been addressed before.https://www.ester.ee/record=b555865

    Architecture and Advanced Electronics Pathways Toward Highly Adaptive Energy- Efficient Computing

    Get PDF
    With the explosion of the number of compute nodes, the bottleneck of future computing systems lies in the network architecture connecting the nodes. Addressing the bottleneck requires replacing current backplane-based network topologies. We propose to revolutionize computing electronics by realizing embedded optical waveguides for onboard networking and wireless chip-to-chip links at 200-GHz carrier frequency connecting neighboring boards in a rack. The control of novel rate-adaptive optical and mm-wave transceivers needs tight interlinking with the system software for runtime resource management
    corecore