200 research outputs found

    Vamsa: Automated Provenance Tracking in Data Science Scripts

    There has recently been a lot of research in the areas of fairness, bias and explainability of machine learning (ML) models due to the self-evident or regulatory requirements of various ML applications. We make the following observation: all of these approaches require a robust understanding of the relationship between ML models and the data used to train them. In this work, we introduce the ML provenance tracking problem: the fundamental idea is to automatically track which columns in a dataset have been used to derive the features/labels of an ML model. We discuss the challenges in capturing such information in the context of Python, the most common language used by data scientists. We then present Vamsa, a modular system that extracts provenance from Python scripts without requiring any changes to the users' code. Using 26K real data science scripts, we verify the effectiveness of Vamsa in terms of coverage and performance. We also evaluate Vamsa's accuracy on a smaller subset of manually labeled data. Our analysis shows that Vamsa's precision and recall range from 90.4% to 99.1% and its latency is on the order of milliseconds for average-size scripts. Drawing from our experience in deploying ML models in production, we also present an example in which Vamsa helps automatically identify models that are affected by data corruption issues.
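
To make the provenance-tracking idea concrete, here is a minimal sketch of static column extraction using Python's standard `ast` module. It is not Vamsa's method, which the abstract does not detail; it only collects string keys used to subscript one named DataFrame variable. The function name `columns_read`, the variable name `df`, the file `patients.csv`, and the column names are all hypothetical, and the code assumes Python 3.9 or later (where `ast.Subscript.slice` holds the key expression directly).

```python
import ast


def columns_read(source: str, frame_name: str = "df") -> set[str]:
    """Collect string keys used to subscript `frame_name`,
    e.g. df["age"] or df[["age", "bmi"]]."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Subscript):
            continue
        # Only consider subscripts applied directly to the named variable.
        if not (isinstance(node.value, ast.Name) and node.value.id == frame_name):
            continue
        key = node.slice
        if isinstance(key, ast.Constant) and isinstance(key.value, str):
            # Single column: df["col"]
            found.add(key.value)
        elif isinstance(key, (ast.List, ast.Tuple)):
            # Several columns: df[["col_a", "col_b"]]
            for element in key.elts:
                if isinstance(element, ast.Constant) and isinstance(element.value, str):
                    found.add(element.value)
    return found


# Hypothetical data science script, analysed without being executed.
SCRIPT = '''
import pandas as pd
df = pd.read_csv("patients.csv")
X = df[["age", "bmi"]]
y = df["outcome"]
'''

print(sorted(columns_read(SCRIPT)))  # ['age', 'bmi', 'outcome']
```

This toy version does not follow aliases, method calls such as `df.drop(...)`, or the flow of columns into features versus labels, which is the harder part of the problem the abstract describes.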

    Moment inversion problem for piecewise D-finite functions

    We consider the problem of exact reconstruction of univariate functions with jump discontinuities at unknown positions from their moments. These functions are assumed to satisfy an a priori unknown linear homogeneous differential equation with polynomial coefficients on each continuity interval. Therefore, they may be specified by a finite amount of information. This reconstruction problem has practical importance in Signal Processing and other applications. It is somewhat of a "folklore" that the sequence of the moments of such "piecewise D-finite" functions satisfies a linear recurrence relation of bounded order and degree. We derive this recurrence relation explicitly. It turns out that the coefficients of the differential operator which annihilates every piece of the function, as well as the locations of the discontinuities, appear in this recurrence in a precisely controlled manner. This leads to the formulation of a generic algorithm for reconstructing a piecewise D-finite function from its moments. We investigate the conditions for solvability of the resulting linear systems in the general case, as well as analyze a few particular examples. We provide results of numerical simulations for several types of signals, which test the sensitivity of the proposed algorithm to noise.
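
The simplest instance of this reconstruction can be worked through by hand. The sketch below is an illustration under stated assumptions, not the paper's general algorithm: it takes a single box function of unknown height on an unknown interval [a, b] with a < b (piecewise constant, so each piece satisfies f' = 0). Its moments are m_k = height * (b^(k+1) - a^(k+1)) / (k+1), so the rescaled sequence s_k = k * m_(k-1) = height * (b^k - a^k) obeys the linear recurrence s_(k+2) - (a + b) s_(k+1) + a b s_k = 0, and the jump locations are the roots of z^2 - (a + b) z + a b. The function names are hypothetical.

```python
import math


def moments_of_box(a, b, height, count):
    """Exact moments m_k = integral of height * x**k over [a, b], k = 0..count-1."""
    return [height * (b ** (k + 1) - a ** (k + 1)) / (k + 1) for k in range(count)]


def recover_box(m0, m1, m2):
    """Recover (a, b, height) of one box function from its first three moments.

    Assumes a < b and a nonzero height.
    """
    # Rescaled moments: s_k = k * m_{k-1} = height * (b**k - a**k), with s_0 = 0.
    s1, s2, s3 = m0, 2.0 * m1, 3.0 * m2
    # Solve s_{k+2} + c1 * s_{k+1} + c0 * s_k = 0 for k = 0 and k = 1.
    c1 = -s2 / s1                 # equals -(a + b)
    c0 = -(s3 + c1 * s2) / s1     # equals a * b
    # The jump locations are the roots of z**2 + c1 * z + c0.
    disc = math.sqrt(c1 * c1 - 4.0 * c0)
    a = (-c1 - disc) / 2.0
    b = (-c1 + disc) / 2.0
    return a, b, m0 / (b - a)


m0, m1, m2 = moments_of_box(0.25, 0.75, 2.0, 3)
print(recover_box(m0, m1, m2))
```

With noisy moments the two divisions by s1 amplify errors when the interval is short, which gives a small-scale view of the noise sensitivity the abstract mentions.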

    Geometric Approach to Quantum Statistical Mechanics and Application to Casimir Energy and Friction Properties

    A geometric approach to general quantum statistical systems (including the harmonic oscillator) is presented. It is applied to Casimir energy and the dissipative system with friction. We regard the (N+1)-dimensional Euclidean coordinate system (X^i, τ) as the quantum statistical system of N quantum (statistical) variables (X^i) and one Euclidean time variable (τ). Introducing paths (lines or hypersurfaces) in this space (X^i, τ), we adopt the path-integral method to quantize the mechanical system. This is a new view of (statistical) quantization of the mechanical system. The system Hamiltonian appears as the area. We show quantization is realized by the minimal area principle in the present geometric approach. When we take a line as the path, the path-integral expressions of the free energy are shown to be the ordinary ones (such as N harmonic oscillators) or their simple variation. When we take a hypersurface as the path, the system Hamiltonian is given by the area of the hypersurface, which is defined as a closed-string configuration in the bulk space. In this case, the system becomes an O(N) nonlinear model. We show the recently proposed 5-dimensional Casimir energy (arXiv:0801.3064, 0812.1263) is valid. We apply this approach to the visco-elastic system, and present a new method using the path-integral for the calculation of the dissipative properties. Comment: 20 pages, 8 figures, Proceedings of ICFS2010 (2010.9.13-18, Ise-Shima, Mie, Japan)

    Synthesis of Bimetallic Uranium and Neptunium Complexes of a Binucleating Macrocycle and Determination of the Solid-State Structure by Magnetic Analysis

    Syntheses of the bimetallic uranium(III) and neptunium(III) complexes [(UI)2(L)], [(NpI)2(L)], and [{U(BH4)}2(L)] of the Schiff-base pyrrole macrocycles L are described. In the absence of single-crystal structural data, fitting of the variable-temperature solid-state magnetic data allows the prediction of polymeric structures for these compounds in the solid state. JRC.E.6-Actinides research

    Review of the methods of determination of directed connectivity from multichannel data

    The methods applied for estimation of functional connectivity from multichannel data are described, with special emphasis on estimators of directedness such as the directed transfer function (DTF) and partial directed coherence. These estimators, based on a multivariate autoregressive model, are free of the pitfalls connected with the application of bivariate measures. Examples of applications illustrating the performance of the methods are given. Time-varying estimators of directedness (short-time DTF and adaptive methods) are presented.
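
As an illustration of how a directedness estimator follows from a multivariate autoregressive (MVAR) model, the sketch below computes the squared normalized DTF for a two-channel, order-1 model x(t) = A1 x(t-1) + e(t). The abstract does not state the formula; the code assumes the commonly used definition, in which H(f) is the inverse of A(f) = I - A1 exp(-2πi f) and DTF²(j → i) = |H_ij|² / Σ_m |H_im|². The function name and the coefficient matrix are hypothetical, and in practice the coefficients would be fitted to data rather than chosen by hand.

```python
import cmath


def dtf_two_channel(a1, freq):
    """Squared normalized DTF of a 2-channel, order-1 MVAR model.

    a1   -- 2x2 coefficient matrix (nested lists) of x(t) = a1 @ x(t-1) + e(t)
    freq -- frequency in cycles per sample
    Returns a 2x2 nested list; entry [i][j] describes the influence j -> i.
    """
    z = cmath.exp(-2j * cmath.pi * freq)
    # A(f) = I - a1 * exp(-2*pi*i*f)
    a00, a01 = 1 - a1[0][0] * z, -a1[0][1] * z
    a10, a11 = -a1[1][0] * z, 1 - a1[1][1] * z
    det = a00 * a11 - a01 * a10
    # Transfer matrix H(f) = A(f)^{-1}, written out for the 2x2 case.
    h = [[a11 / det, -a01 / det], [-a10 / det, a00 / det]]
    dtf = []
    for row in h:
        power = [abs(value) ** 2 for value in row]
        total = sum(power)
        # Normalize each row so inflows to a channel sum to 1.
        dtf.append([p / total for p in power])
    return dtf


# Channel 0 drives channel 1 (nonzero a1[1][0]); nothing flows back.
A1 = [[0.5, 0.0], [0.4, 0.5]]
result = dtf_two_channel(A1, freq=0.1)
print(result[1][0], result[0][1])
```

Here `result[1][0]` is positive while `result[0][1]` is zero, matching the one-way coupling built into `A1`.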

    On the impact of different approaches to classify age-related macular degeneration: Results from the German AugUR study

    While age-related macular degeneration (AMD) poses an important personal and public health burden, comparing epidemiological studies on AMD is hampered by differing approaches to classify AMD. In our AugUR study survey, recruiting residents from in/around Regensburg, Germany, aged 70+, we analyzed the AMD status derived from color fundus images applying two different classification systems. Based on 1,040 participants with gradable fundus images for at least one eye, we show that including individuals with only one gradable eye (n = 155) underestimates AMD prevalence and we provide a correction procedure. Bias-corrected and standardized to the Bavarian population, late AMD prevalence is 7.3% (95% confidence interval = [5.4; 9.4]). We find substantially different prevalence estimates for "early/intermediate AMD" depending on the classification system: 45.3% (95%-CI = [41.8; 48.7]) applying the Clinical Classification (early/intermediate AMD) or 17.1% (95%-CI = [14.6; 19.7]) applying the Three Continent AMD Consortium Severity Scale (mild/moderate/severe early AMD). We thus provide a first effort to grade AMD in a complete study with different classification systems, a first approach for bias-correction from individuals with only one gradable eye, and the first AMD prevalence estimates from a German elderly population. Our results underscore substantial differences for early/intermediate AMD prevalence estimates between classification systems and an urgent need for harmonization.

    Scaling Effects and Spatio-Temporal Multilevel Dynamics in Epileptic Seizures

    Epileptic seizures are one of the most well-known dysfunctions of the nervous system. During a seizure, a highly synchronized behavior of neural activity is observed that can cause symptoms ranging from mild sensory malfunctions to the complete loss of body control. In this paper, we aim to contribute towards a better understanding of the dynamical systems phenomena that cause seizures. Based on data analysis and modelling, seizure dynamics can be identified to possess multiple spatial scales and, on each spatial scale, also multiple time scales. At each scale, we reach several novel insights. On the smallest spatial scale we consider single model neurons and investigate early-warning signs of spiking. This introduces the theory of critical transitions to excitable systems. For clusters of neurons (or neuronal regions) we use patient data and find oscillatory behavior and new scaling laws near the seizure onset. These scalings substantiate the conjecture obtained from mean-field models that a Hopf bifurcation could be involved near seizure onset. On the largest spatial scale we introduce a measure based on phase-locking intervals and wavelets into seizure modelling. It is used to resolve synchronization between different regions in the brain and identifies time-shifted scaling laws at different wavelet scales. We also compare our wavelet-based multiscale approach with maximum linear cross-correlation and mean-phase coherence measures.

    Non-Agonistic Bivalent Antibodies That Promote c-MET Degradation and Inhibit Tumor Growth and Others Specific for Tumor Related c-MET

    The c-MET receptor has a function in many human cancers and is a proven therapeutic target. Generating antagonistic or therapeutic monoclonal antibodies (mAbs) targeting c-MET has been difficult because bivalent, intact anti-Met antibodies frequently display agonistic activity, necessitating the use of monovalent antibody fragments for therapy. By using a novel strategy that included immunizing with cells expressing c-MET, we obtained a range of mAbs. These c-MET mAbs were tested for binding specificity and anti-tumor activity using a range of cell-based techniques and in silico modeling. The LMH 80 antibody bound an epitope, contained in the small cysteine-rich domain of c-MET (amino acids 519–561), that was preferentially exposed on the c-MET precursor. Since the c-MET precursor is only expressed on the surface of cancer cells and not normal cells, this antibody is potentially tumor specific. An interesting subset of our antibodies displayed profound activities on c-MET internalization and degradation. LMH 87, an antibody binding the loop connecting strands 3d and 4a of the 7-bladed β-propeller domain of c-MET, displayed no intrinsic agonistic activity but promoted receptor internalization and degradation. LMH 87 inhibited HGF/SF-induced migration of SK-OV-3 ovarian carcinoma cells, the proliferation of A549 lung cancer cells and the growth of human U87MG glioma cells in a mouse xenograft model. These results indicate that c-MET antibodies targeting epitopes controlling receptor internalization and degradation provide new ways of controlling c-MET expression and activity and may enable the therapeutic targeting of c-MET by intact, bivalent antibodies.

    Seizure prediction: ready for a new era

    Acknowledgements: The authors acknowledge colleagues in the international seizure prediction group for valuable discussions. L.K. acknowledges funding support from the National Health and Medical Research Council (APP1130468) and the James S. McDonnell Foundation (220020419) and acknowledges the contribution of Dean R. Freestone at the University of Melbourne, Australia, to the creation of Fig. 3. Peer reviewed. Postprint