189 research outputs found

    Tracing and profiling machine learning dataflow applications on GPU

    In this paper, we propose a profiling and tracing method for dataflow applications with GPU acceleration. Dataflow models can be represented by graphs and are widely used in many domains such as signal processing and machine learning. Within the graph, data flows along the edges, and the nodes correspond to the computing units that process the data. To accelerate execution, co-processing units such as GPUs are often used for compute-intensive nodes. The work in this paper aims at providing useful information about the execution of the dataflow graph on the available hardware, in order to understand and possibly improve performance. The collected traces include low-level information about the CPU from the Linux kernel (system calls), as well as mid-level and high-level information about intermediate libraries such as CUDA, HIP or HSA, and about the dataflow model, respectively. This is followed by post-mortem analysis and visualization steps that enhance the trace and present useful information to the user. To demonstrate the effectiveness of the method, it was evaluated on TensorFlow, a well-known machine learning library that uses a dataflow computational graph to represent its algorithms. We present a few examples of machine learning applications that can be optimized with the help of the information provided by our method. For example, we reduce the execution time of a face recognition application by a factor of 5, we suggest a better placement of computation nodes on the available hardware for a distributed application, and we improve the memory management of an application to speed up its execution.
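    As a rough illustration of the kind of post-mortem analysis such traces enable, the sketch below aggregates per-node execution times from a toy event stream. The event format, node names, and timestamps are hypothetical assumptions for illustration, not the paper's actual trace schema:

    ```python
    from collections import defaultdict

    # Hypothetical trace events: (timestamp_us, event_kind, node, device).
    # Real traces would come from kernel tracers (e.g. LTTng) and from
    # interception of intermediate libraries such as CUDA, HIP or HSA.
    events = [
        (0,    "begin", "conv1",  "GPU0"),
        (1200, "end",   "conv1",  "GPU0"),
        (1250, "begin", "dense1", "CPU"),
        (4250, "end",   "dense1", "CPU"),
        (4300, "begin", "conv1",  "GPU0"),
        (5400, "end",   "conv1",  "GPU0"),
    ]

    def per_node_totals(events):
        """Sum execution time per (node, device) by pairing begin/end events."""
        open_ts = {}
        totals = defaultdict(int)
        for ts, kind, node, dev in events:
            key = (node, dev)
            if kind == "begin":
                open_ts[key] = ts
            else:
                totals[key] += ts - open_ts.pop(key)
        return dict(totals)

    print(per_node_totals(events))
    # → {('conv1', 'GPU0'): 2300, ('dense1', 'CPU'): 3000}
    ```

    Such per-node totals are exactly the kind of summary that reveals which graph nodes dominate execution time and whether they run on the intended device.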

    Tracing and profiling of dataflow-style machine learning applications using a graphics processor

    Currently, the demand for computing power keeps growing, while hardware improvements are beginning to slow down. Processor performance, and clock frequency in particular, has stagnated for physical reasons such as transistor feature size and heat dissipation. To overcome these limits, parallel computing appears to be a promising solution, through the use of heterogeneous architectures. These combine several computing units of possibly different types, which allows them to operate in a highly parallel fashion. Even so, using all the hardware efficiently remains difficult, and programming these architectures at the software level is a challenge. Consequently, different models have emerged, notably dataflow approaches. These offer characteristics well suited to such a parallel context: they make it easier to program the various computing units so as to make the most of the available hardware. In the search for optimal performance, it is essential to have tools for diagnosing potential problems. Some solutions have already proved effective for more traditional, sequential programming models, with or without a graphics processor. Examples include tools such as LTTng or Ftrace for analyzing the central processor. For graphics processors, the proprietary, closed-source tools offered by hardware vendors are generally the most complete and the ones programmers prefer. This has a limitation, however, since these solutions are not general and remain tied to a given vendor's hardware. Moreover, they offer limited flexibility, with fixed, predefined visualizations and analyses that can be neither modified nor extended to fit a user's needs. Finally, no existing tool specifically targets dataflow models.

    ABSTRACT: Recently, increasing computing capabilities have been required in various areas like scientific computing, video games and graphical rendering, or artificial intelligence. These domains usually involve the processing of a large amount of data, intended to be performed as fast as possible. Unfortunately, hardware improvements have recently slowed down. The CPU clock speed, for example, is not increasing much any more, possibly nearing technological limits. Physical constraints like heat dissipation or fine etching are the main reasons for that. Consequently, new opportunities like parallel processing using heterogeneous architectures became popular. In this context, the traditional processors get support from other computing units like graphical processors. In order to program these, the dataflow model offers several advantages: it is inherently parallel and thus well adapted. In this context, guaranteeing optimal performance is another main concern. For that, tracing and profiling the central and graphical processing units are two useful techniques that can be considered. Several tools exist, like LTTng and Ftrace, that can trace the operating system and focus on the central processor. In addition, proprietary tools offered by hardware vendors can help to analyze and monitor the graphical processor. However, these tools are specific to one type of hardware and lack flexibility. Moreover, none of them specifically targets dataflow applications executed on a heterogeneous platform.

    Data-Driven 3D Reconstruction of Dressed Humans From Sparse Views

    Code is available at https://gitlab.inria.fr/pzins/data-driven-3d-reconstruction-of-dressed-humans-from-sparse-views. Recently, data-driven single-view reconstruction methods have shown great progress in modeling 3D dressed humans. However, such methods suffer heavily from the depth ambiguities and occlusions inherent to single-view inputs. In this paper, we tackle this problem by considering a small set of input views and investigate the best strategy to suitably exploit information from these views. We propose a data-driven end-to-end approach that reconstructs an implicit 3D representation of dressed humans from sparse camera views. Specifically, we introduce three key components: first, a spatially consistent reconstruction that allows for arbitrary placement of the person in the input views using a perspective camera model; second, an attention-based fusion layer that learns to aggregate visual information from several viewpoints; and third, a mechanism that encodes local 3D patterns under the multi-view context. In the experiments, we show that the proposed approach outperforms the state of the art on standard data, both quantitatively and qualitatively. To demonstrate the spatially consistent reconstruction, we apply our approach to dynamic scenes. Additionally, we apply our method to real data acquired with a multi-camera platform and demonstrate that our approach can obtain results comparable to multi-view stereo with dramatically fewer views.
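    To make the attention-based fusion idea concrete, here is a minimal NumPy sketch of softmax-weighted aggregation of per-view features for a single 3D query point. The feature dimensions, random weight matrices, and mean-pooled query are illustrative assumptions, not the paper's actual architecture:

    ```python
    import numpy as np

    def attention_fuse(view_feats, w_q, w_k):
        """Fuse per-view feature vectors into one vector via attention.

        view_feats: (n_views, d) features extracted from each input image.
        w_q, w_k:   (d, d) projection matrices (learned in a real model;
                    random here, for illustration only).
        Returns a single (d,) fused feature vector.
        """
        q = view_feats.mean(axis=0) @ w_q       # shared query (d,)
        keys = view_feats @ w_k                 # per-view keys (n_views, d)
        scores = keys @ q / np.sqrt(q.size)     # scaled dot-product scores
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                # softmax over the views
        return weights @ view_feats             # weighted sum of view features

    rng = np.random.default_rng(0)
    d, n_views = 8, 4
    feats = rng.normal(size=(n_views, d))
    fused = attention_fuse(feats, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
    print(fused.shape)  # → (8,)
    ```

    The point of such a layer is that views with more informative features (e.g. unoccluded ones) can receive larger softmax weights, rather than all views being averaged uniformly.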

    Status and new operation modes of the versatile VLT/NACO

    This paper gives an update on the most versatile adaptive-optics-fed instrument to date, the well-known and successful NACO. Although NACO is only scheduled for about two more years at the Very Large Telescope (VLT), it keeps evolving, with additional operation modes bringing original astronomical results. The high-contrast imaging community uses it creatively as a test bench for SPHERE and other second-generation planet imagers. A new visible wavefront sensor (WFS) optimized for Laser Guide Star (LGS) operations has been installed and tested, the cube mode is more and more requested for frame selection on bright sources, a seeing-enhancer mode (no tip/tilt correction) is now offered to provide full sky coverage and welcome all kinds of extragalactic applications, etc. The Instrument Operations Team (IOT) and Paranal engineers are currently working hard at maintaining the instrument's overall performance, but also at improving it and offering new capabilities, providing the community with a well-tuned and original instrument for the remaining time it is in use. The present contribution delivers a non-exhaustive overview of the new modes and experiments that have been carried out in the past months. Comment: 10 pages, 7 figures, SPIE 2010 Astronomical Instrumentation Proceedings.

    Reducing socio-economic inequalities in all-cause mortality: a counterfactual mediation approach.

    Socio-economic inequalities in mortality are well established, yet the contribution of intermediate risk factors that may underlie these relationships remains unclear. We evaluated the role of multiple modifiable intermediate risk factors underlying socio-economic-associated mortality and quantified the potential impact of reducing early all-cause mortality by hypothetically altering socio-economic risk factors. Data were from seven cohort studies participating in the LIFEPATH Consortium (total n = 179 090). Using both socio-economic position (SEP) (based on occupation) and education, we estimated the natural direct effect on all-cause mortality and the natural indirect effect via the joint mediating role of smoking, alcohol intake, dietary patterns, physical activity, body mass index, hypertension, diabetes and coronary artery disease. Hazard ratios (HRs) were estimated using counterfactual natural effect models under different hypothetical actions of either lower or higher SEP or education. Lower SEP and education were associated with an increase in all-cause mortality within an average follow-up time of 17.5 years. Mortality was reduced via modelled hypothetical actions of increasing SEP or education. Through higher education, the HR was 0.85 [95% confidence interval (CI) 0.84, 0.86] for women and 0.71 (95% CI 0.70, 0.74) for men, compared with lower education. In addition, 34% and 38% of the effect was jointly mediated for women and men, respectively. The benefits from altering SEP were slightly more modest. These observational findings support policies to reduce mortality both through improving socio-economic circumstances and increasing education, and by altering intermediaries, such as lifestyle behaviours and morbidities.
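    The reported "proportion mediated" can be illustrated with a back-of-the-envelope decomposition on the log-hazard scale, where the total effect factors into direct and indirect components. The hazard ratios below are hypothetical stand-ins, not the LIFEPATH estimates:

    ```python
    import math

    # Hypothetical hazard ratios for a total effect and its natural
    # direct component, assuming HR_total = HR_direct * HR_indirect
    # on the multiplicative (log-hazard) scale.
    hr_total = 0.85     # total effect of higher vs. lower education
    hr_direct = 0.897   # natural direct effect (illustrative value)

    hr_indirect = hr_total / hr_direct
    # Proportion mediated, computed as the share of the log-scale effect
    # carried by the indirect (mediated) pathway:
    proportion_mediated = math.log(hr_indirect) / math.log(hr_total)
    print(f"{proportion_mediated:.0%}")  # → 33%
    ```

    With these illustrative numbers, roughly a third of the protective effect flows through the mediators, which is in the same ballpark as the 34-38% joint mediation reported in the abstract.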

    The GRAVITY+ Project: Towards All-sky, Faint-Science, High-Contrast Near-Infrared Interferometry at the VLTI

    The GRAVITY instrument has been revolutionary for near-infrared interferometry by pushing sensitivity and precision to previously unknown limits. With the upgrade of GRAVITY and the Very Large Telescope Interferometer (VLTI) in GRAVITY+, these limits will be pushed even further, with vastly improved sky coverage, as well as faint-science and high-contrast capabilities. This upgrade includes the implementation of wide-field off-axis fringe-tracking, new adaptive optics systems on all Unit Telescopes, and laser guide stars in an upgraded facility. GRAVITY+ will open up the sky to the measurement of black hole masses across cosmic time in hundreds of active galactic nuclei, use the faint stars in the Galactic centre to probe General Relativity, and enable the characterisation of dozens of young exoplanets to study their formation, bearing the promise of another scientific revolution to come at the VLTI. Comment: Published in the ESO Messenger.

    Autoantibodies against type I IFNs in patients with critical influenza pneumonia

    In an international cohort of 279 patients with hypoxemic influenza pneumonia, we identified 13 patients (4.6%) with autoantibodies neutralizing IFN-alpha and/or IFN-omega, which were previously reported to underlie 15% of cases of life-threatening COVID-19 pneumonia and one third of severe adverse reactions to the live-attenuated yellow fever vaccine. Autoantibodies neutralizing type I interferons (IFNs) can underlie critical COVID-19 pneumonia and yellow fever vaccine disease. We report here on 13 patients harboring autoantibodies neutralizing IFN-alpha 2 alone (five patients) or together with IFN-omega (eight patients), from a cohort of 279 patients (4.7%) aged 6-73 yr with critical influenza pneumonia. Nine and four patients had antibodies neutralizing high and low concentrations, respectively, of IFN-alpha 2, and six and two patients had antibodies neutralizing high and low concentrations, respectively, of IFN-omega. The patients' autoantibodies increased influenza A virus replication in both A549 cells and reconstituted human airway epithelia. The prevalence of these antibodies was significantly higher than that in the general population for patients <70 yr of age, but not for patients >70 yr of age (3.1 vs. 4.4%, P = 0.68). The risk of critical influenza was highest in patients with antibodies neutralizing high concentrations of both IFN-alpha 2 and IFN-omega (OR = 11.7, P = 1.3 x 10(-5)), especially those <70 yr old (OR = 139.9, P = 3.1 x 10(-10)). We also identified 10 patients in additional influenza patient cohorts. Autoantibodies neutralizing type I IFNs account for ~5% of cases of life-threatening influenza pneumonia in patients <70 yr old.
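    As an illustration of how an odds ratio like those above is computed, the sketch below derives an OR and a Woolf-method confidence interval from a 2x2 table. The counts are invented for illustration and are not the study's data:

    ```python
    import math

    # Hypothetical 2x2 table: autoantibody carriage vs. critical influenza.
    a, b = 13, 266    # critical cases: carriers / non-carriers
    c, d = 5, 1195    # controls:       carriers / non-carriers

    odds_ratio = (a * d) / (b * c)       # cross-product ratio

    # Woolf's method: standard error of log(OR) from the cell counts.
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)
    lo = math.exp(math.log(odds_ratio) - 1.96 * se)
    hi = math.exp(math.log(odds_ratio) + 1.96 * se)
    print(f"OR = {odds_ratio:.1f}, 95% CI [{lo:.1f}, {hi:.1f}]")
    ```

    The wide confidence interval such small carrier counts produce is why the abstract's ORs come with very small P values only when the effect itself is large.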

    Autoantibodies neutralizing type I IFNs are present in ~4% of uninfected individuals over 70 years old and account for ~20% of COVID-19 deaths

    Circulating autoantibodies (auto-Abs) neutralizing high concentrations (10 ng/ml, in plasma diluted 1:10) of IFN-alpha and/or IFN-omega are found in about 10% of patients with critical COVID-19 (coronavirus disease 2019) pneumonia, but not in individuals with asymptomatic infections. We detect auto-Abs neutralizing 100-fold lower, more physiological concentrations of IFN-alpha and/or IFN-omega (100 pg/ml, in 1:10 dilutions of plasma) in 13.6% of 3595 patients with critical COVID-19, including 21% of 374 patients >80 years, and in 6.5% of 522 patients with severe COVID-19. These antibodies are also detected in 18% of the 1124 deceased patients (aged 20 days to 99 years; mean: 70 years). Moreover, another 1.3% of patients with critical COVID-19 and 0.9% of the deceased patients have auto-Abs neutralizing high concentrations of IFN-beta. We also show, in a sample of 34,159 uninfected individuals from the general population, that auto-Abs neutralizing high concentrations of IFN-alpha and/or IFN-omega are present in 0.18% of individuals between 18 and 69 years, 1.1% between 70 and 79 years, and 3.4% >80 years. Moreover, the proportion of individuals carrying auto-Abs neutralizing lower concentrations is greater in a subsample of 10,778 uninfected individuals: 1% of individuals <70 years, and a higher proportion >80 years. By contrast, auto-Abs neutralizing IFN-beta do not become more frequent with age. Auto-Abs neutralizing type I IFNs predate SARS-CoV-2 infection and sharply increase in prevalence after the age of 70 years. They account for about 20% of both critical COVID-19 cases in the over-80s and total fatal COVID-19 cases.

    Low incidence of SARS-CoV-2, risk factors of mortality and the course of illness in the French national cohort of dialysis patients
