Vamsa: Automated Provenance Tracking in Data Science Scripts
Fairness, bias, and explainability of machine learning (ML) models have
recently attracted considerable research, driven by the self-evident
or regulatory requirements of various ML applications. We make the following
observation: All of these approaches require a robust understanding of the
relationship between ML models and the data used to train them. In this work,
we introduce the ML provenance tracking problem: the fundamental idea is to
automatically track which columns in a dataset have been used to derive the
features/labels of an ML model. We discuss the challenges in capturing such
information in the context of Python, the most common language used by data
scientists. We then present Vamsa, a modular system that extracts provenance
from Python scripts without requiring any changes to the users' code. Using 26K
real data science scripts, we verify the effectiveness of Vamsa in terms of
coverage and performance. We also evaluate Vamsa's accuracy on a smaller
subset of manually labeled data. Our analysis shows that Vamsa's precision and
recall range from 90.4% to 99.1% and its latency is on the order of
milliseconds for average-size scripts. Drawing from our experience in deploying
ML models in production, we also present an example in which Vamsa helps
automatically identify models that are affected by data corruption issues.
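To make the provenance-tracking idea concrete, here is a minimal sketch of static column extraction. It is not Vamsa's actual method: it only assumes that columns are read from a DataFrame variable with string-literal subscripts, and the function name `columns_used`, the variable name `df`, and the file `heart.csv` are hypothetical. It requires Python 3.9 or later, where `ast.Subscript.slice` holds the index expression directly.

```python
import ast


def columns_used(script: str, frame_name: str = "df") -> set[str]:
    """Collect string-literal column names subscripted from `frame_name`."""
    found = set()
    for node in ast.walk(ast.parse(script)):
        # Match expressions of the form frame_name[...]
        if (isinstance(node, ast.Subscript)
                and isinstance(node.value, ast.Name)
                and node.value.id == frame_name):
            index = node.slice
            # df["a"]: a single string constant
            if isinstance(index, ast.Constant) and isinstance(index.value, str):
                found.add(index.value)
            # df[["a", "b"]]: a list of string constants
            elif isinstance(index, ast.List):
                for item in index.elts:
                    if isinstance(item, ast.Constant) and isinstance(item.value, str):
                        found.add(item.value)
    return found


# Hypothetical user script; it is only parsed, never executed.
SCRIPT = '''
import pandas as pd
df = pd.read_csv("heart.csv")
X = df[["age", "chol"]]
y = df["target"]
'''

print(sorted(columns_used(SCRIPT)))  # ['age', 'chol', 'target']
```

Because the script is analyzed without being run or modified, the sketch shares the "no changes to the users' code" property described above, but it ignores aliasing, method calls such as `drop`, and dynamically built column names, all of which a real system would need to handle.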
Moment inversion problem for piecewise D-finite functions
We consider the problem of exact reconstruction of univariate functions with
jump discontinuities at unknown positions from their moments. These functions
are assumed to satisfy an a priori unknown linear homogeneous differential
equation with polynomial coefficients on each continuity interval. Therefore,
they may be specified by a finite amount of information. This reconstruction
problem has practical importance in Signal Processing and other applications.
It is somewhat of a "folklore" that the sequence of the moments of such
"piecewise D-finite" functions satisfies a linear recurrence relation of
bounded order and degree. We derive this recurrence relation explicitly. It
turns out that the coefficients of the differential operator which annihilates
every piece of the function, as well as the locations of the discontinuities,
appear in this recurrence in a precisely controlled manner. This leads to the
formulation of a generic algorithm for reconstructing a piecewise D-finite
function from its moments. We investigate the conditions for solvability of the
resulting linear systems in the general case, as well as analyze a few
particular examples. We provide results of numerical simulations for several
types of signals, which test the sensitivity of the proposed algorithm to
noise.
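As a toy illustration of how a moment recurrence can encode a jump location (this is not the paper's general algorithm), take f(x) = 1 on [0, xi] and f(x) = 0 on (xi, 1]. Each piece satisfies f' = 0, and the moments are m_k = xi^(k+1) / (k+1), so they obey the first-order recurrence (k+2) m_{k+1} = xi (k+1) m_k with polynomial coefficients in k. The function names below are hypothetical.

```python
def moments_of_step(xi: float, count: int) -> list[float]:
    """Moments m_k = integral of x**k over [0, xi], for k = 0..count-1."""
    return [xi ** (k + 1) / (k + 1) for k in range(count)]


def recover_jump(moments: list[float]) -> float:
    """Recover xi from the recurrence (k+2) m_{k+1} = xi (k+1) m_k at k = 0."""
    return 2.0 * moments[1] / moments[0]


m = moments_of_step(0.3, 5)
xi_hat = recover_jump(m)

# The same recurrence holds for every k, not only k = 0.
residuals = [(k + 2) * m[k + 1] - xi_hat * (k + 1) * m[k] for k in range(4)]
```

With exact moments the jump is recovered up to rounding error; with noisy moments the ratio becomes sensitive, which is the kind of effect the numerical simulations mentioned above are meant to probe.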
Geometric Approach to Quantum Statistical Mechanics and Application to Casimir Energy and Friction Properties
A geometric approach to general quantum statistical systems (including the
harmonic oscillator) is presented. It is applied to Casimir energy and the
dissipative system with friction. We regard the (N+1)-dimensional Euclidean
coordinate system (X, τ) as the quantum statistical system of N
quantum (statistical) variables (X) and one Euclidean time variable
(τ). Introducing paths (lines or hypersurfaces) in this space
(X, τ), we adopt the path-integral method to quantize the mechanical
system. This is a new view of (statistical) quantization of the
mechanical system. The system Hamiltonian appears as the area. We show
that quantization is realized by the minimal area principle in the present
geometric approach. When we take a line as the path, the path-integral
expressions of the free energy are shown to be the ordinary ones (such as N
harmonic oscillators) or their simple variations. When we take a
hypersurface as the path, the system Hamiltonian is given by the area
of the hypersurface, which is defined as a closed-string
configuration in the bulk space. In this case, the system becomes an O(N)
non-linear model. We show the recently proposed 5-dimensional Casimir energy
(arXiv:0801.3064, 0812.1263) is valid. We apply this approach to the
visco-elastic system, and present a new method using the path integral for the
calculation of the dissipative properties. Comment: 20 pages, 8 figures, Proceedings of ICFS2010 (2010.9.13-18,
Ise-Shima, Mie, Japan)
Synthesis of Bimetallic Uranium and Neptunium Complexes of a Binucleating Macrocycle and Determination of the Solid-State Structure by Magnetic Analysis
Syntheses of the bimetallic uranium(III) and neptunium(III) complexes [(UI)2(L)], [(NpI)2(L)], and [{U(BH4)}2(L)] of the Schiff-base
pyrrole macrocycles L are described. In the absence of single-crystal structural data, fitting of the variable-temperature solid-state magnetic data allows the prediction of polymeric structures for these compounds in the solid state.
Review of the methods of determination of directed connectivity from multichannel data
The methods applied for the estimation of functional connectivity from multichannel data are described, with special emphasis on estimators of directedness such as the directed transfer function (DTF) and partial directed coherence. These estimators, based on a multivariate autoregressive model, are free of the pitfalls connected with the application of bivariate measures. Examples of applications illustrating the performance of the methods are given. Time-varying estimators of directedness, namely short-time DTF and adaptive methods, are also presented.
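The DTF estimator mentioned above can be sketched as follows. This is a minimal example under stated assumptions: the multivariate autoregressive (MVAR) coefficients are already fitted and passed in as an array, the function name `dtf` and the example coefficients are hypothetical, and the value returned is the squared DTF in its common row-normalized form |H_ij(f)|^2 / sum_m |H_im(f)|^2, where H(f) is the inverse of A(f) = I - sum_k A_k exp(-2 pi i f k / fs).

```python
import numpy as np


def dtf(coeffs: np.ndarray, freqs: np.ndarray, fs: float = 1.0) -> np.ndarray:
    """Squared, row-normalized DTF of x[t] = sum_k coeffs[k-1] @ x[t-k] + e[t].

    coeffs has shape (order, channels, channels). Entry [f, i, j] of the
    result is the influence of channel j on channel i at frequency freqs[f].
    """
    order, n, _ = coeffs.shape
    out = np.empty((len(freqs), n, n))
    for idx, f in enumerate(freqs):
        # A(f) = I - sum_k A_k * exp(-2*pi*i*f*k/fs)
        a_f = np.eye(n, dtype=complex)
        for k in range(order):
            a_f -= coeffs[k] * np.exp(-2j * np.pi * f * (k + 1) / fs)
        h = np.linalg.inv(a_f)  # transfer matrix H(f)
        power = np.abs(h) ** 2
        out[idx] = power / power.sum(axis=1, keepdims=True)  # normalize each row
    return out


# Hypothetical order-1 model: channel 0 drives channel 1, not the reverse.
A = np.array([[[0.5, 0.0],
               [0.4, 0.5]]])
freqs = np.linspace(0.0, 0.5, 5)
d = dtf(A, freqs)
```

In this example `d[:, 1, 0]` (influence of channel 0 on channel 1) is positive at every frequency, while `d[:, 0, 1]` is zero, which reflects the one-way coupling built into the coefficients.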
On the impact of different approaches to classify age-related macular degeneration: Results from the German AugUR study
While age-related macular degeneration (AMD) poses an important personal and public health burden, comparing epidemiological studies on AMD is hampered by differing approaches to classify AMD. In our AugUR study survey, recruiting residents from in/around Regensburg, Germany, aged 70+, we analyzed the AMD status derived from color fundus images applying two different classification systems. Based on 1,040 participants with gradable fundus images for at least one eye, we show that including individuals with only one gradable eye (n = 155) underestimates AMD prevalence and we provide a correction procedure. Bias-corrected and standardized to the Bavarian population, late AMD prevalence is 7.3% (95% confidence interval = [5.4; 9.4]). We find substantially different prevalence estimates for "early/intermediate AMD" depending on the classification system: 45.3% (95%-CI = [41.8; 48.7]) applying the Clinical Classification (early/intermediate AMD) or 17.1% (95%-CI = [14.6; 19.7]) applying the Three Continent AMD Consortium Severity Scale (mild/moderate/severe early AMD). We thus provide a first effort to grade AMD in a complete study with different classification systems, a first approach for bias-correction from individuals with only one gradable eye, and the first AMD prevalence estimates from a German elderly population. Our results underscore substantial differences for early/intermediate AMD prevalence estimates between classification systems and an urgent need for harmonization.
Scaling Effects and Spatio-Temporal Multilevel Dynamics in Epileptic Seizures
Epileptic seizures are one of the most well-known dysfunctions of the nervous system. During a seizure, a highly synchronized behavior of neural activity is observed that can cause symptoms ranging from mild sensory malfunctions to the complete loss of body control. In this paper, we aim to contribute towards a better understanding of the dynamical systems phenomena that cause seizures. Data analysis and modelling show that seizure dynamics possess multiple spatial scales and, on each spatial scale, multiple time scales. At each scale, we reach several novel insights. On the smallest spatial scale we consider single model neurons and investigate early-warning signs of spiking. This introduces the theory of critical transitions to excitable systems. For clusters of neurons (or neuronal regions) we use patient data and find oscillatory behavior and new scaling laws near the seizure onset. These scalings substantiate the conjecture obtained from mean-field models that a Hopf bifurcation could be involved near seizure onset. On the largest spatial scale we introduce a measure based on phase-locking intervals and wavelets into seizure modelling. It is used to resolve synchronization between different regions in the brain and identifies time-shifted scaling laws at different wavelet scales. We also compare our wavelet-based multiscale approach with maximum linear cross-correlation and mean-phase coherence measures.
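One of the comparison measures named above, mean phase coherence, is commonly defined as R = |mean(exp(i (phi_a - phi_b)))|. The sketch below assumes that the instantaneous phases of the two signals are already available (in practice they are often extracted with a Hilbert transform, which is not shown here); the function name and the test signals are hypothetical.

```python
import cmath
import math


def mean_phase_coherence(phases_a, phases_b):
    """R = |mean(exp(i * (phi_a - phi_b)))|, a value between 0 and 1."""
    diffs = [a - b for a, b in zip(phases_a, phases_b)]
    return abs(sum(cmath.exp(1j * d) for d in diffs) / len(diffs))


n = 200
base = [0.1 * k for k in range(n)]

# Constant phase lag: the two signals are perfectly phase-locked.
locked = mean_phase_coherence(base, [p - 0.7 for p in base])

# Phase difference spread evenly over the circle: no phase locking.
spread = mean_phase_coherence([2 * math.pi * k / n for k in range(n)], [0.0] * n)
```

Here `locked` is 1 up to rounding error and `spread` is 0 up to rounding error, the two extremes between which measured coherence values fall.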
Uranium Nitrogen Multiple Bonding: Isostructural Anionic, Neutral, and Cationic Uranium Nitride Complexes Featuring a Linear U=N=U Core
Non-Agonistic Bivalent Antibodies That Promote c-MET Degradation and Inhibit Tumor Growth and Others Specific for Tumor Related c-MET
The c-MET receptor has a function in many human cancers and is a proven therapeutic target. Generating antagonistic or therapeutic monoclonal antibodies (mAbs) targeting c-MET has been difficult because bivalent, intact anti-Met antibodies frequently display agonistic activity, necessitating the use of monovalent antibody fragments for therapy. By using a novel strategy that included immunizing with cells expressing c-MET, we obtained a range of mAbs. These c-MET mAbs were tested for binding specificity and anti-tumor activity using a range of cell-based techniques and in silico modeling. The LMH 80 antibody bound an epitope, contained in the small cysteine-rich domain of c-MET (amino acids 519–561), that was preferentially exposed on the c-MET precursor. Since the c-MET precursor is only expressed on the surface of cancer cells and not normal cells, this antibody is potentially tumor specific. An interesting subset of our antibodies displayed profound activities on c-MET internalization and degradation. LMH 87, an antibody binding the loop connecting strands 3d and 4a of the 7-bladed β-propeller domain of c-MET, displayed no intrinsic agonistic activity but promoted receptor internalization and degradation. LMH 87 inhibited HGF/SF-induced migration of SK-OV-3 ovarian carcinoma cells, the proliferation of A549 lung cancer cells and the growth of human U87MG glioma cells in a mouse xenograft model. These results indicate that c-MET antibodies targeting epitopes controlling receptor internalization and degradation provide new ways of controlling c-MET expression and activity and may enable the therapeutic targeting of c-MET by intact, bivalent antibodies.
Seizure prediction: ready for a new era
Acknowledgements: The authors acknowledge colleagues in the international seizure prediction group for valuable discussions. L.K. acknowledges funding support from the National Health and Medical Research Council (APP1130468) and the James S. McDonnell Foundation (220020419) and acknowledges the contribution of Dean R. Freestone at the University of Melbourne, Australia, to the creation of Fig. 3.