2,771 research outputs found
From Zero to Hero: Detecting Leaked Data through Synthetic Data Injection and Model Querying
Safeguarding the Intellectual Property (IP) of data has become critically
important as machine learning applications continue to proliferate, and their
success heavily relies on the quality of training data. While various
mechanisms exist to secure data during storage, transmission, and consumption,
fewer studies have addressed how to detect whether the data have already been
leaked and used for model training without authorization. This issue is particularly challenging
due to the absence of information and control over the training process
conducted by potential attackers.
In this paper, we concentrate on the domain of tabular data and introduce a
novel methodology, Local Distribution Shifting Synthesis (LDSS), to
detect leaked data that are used to train classification models. The core
concept behind LDSS involves injecting a small volume of synthetic
data--characterized by local shifts in class distribution--into the owner's
dataset. This enables the effective identification of models trained on leaked
data through model querying alone, as the synthetic data injection results in a
pronounced disparity in the predictions of models trained on leaked and
modified datasets. LDSS is model-oblivious and hence compatible
with a diverse range of classification models, such as Naive Bayes, Decision
Tree, and Random Forest. We have conducted extensive experiments on seven types
of classification models across five real-world datasets. The comprehensive
results affirm the reliability, robustness, fidelity, security, and efficiency
of LDSS.
Comment: 13 pages, 11 figures, and 4 tables
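The injection-and-query idea in this abstract can be illustrated with a toy sketch. It is not the authors' algorithm: the two-feature dataset, the choice of region, the k-nearest-neighbour stand-in for the suspect model, and all function names below are assumptions made purely for illustration.

```python
import random

def knn_predict(train, query, k=3):
    """Majority vote over the k training rows closest to `query` (stand-in model)."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    ones = sum(label for _, label in nearest)
    return 1 if 2 * ones > k else 0

def real_rows(rng, n):
    """Hypothetical owner data: the class is 1 when the first feature is positive."""
    rows = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        rows.append((x, 1 if x[0] > 0 else 0))
    return rows

def synthetic_rows(rng, n, center=(0.8, 0.8), half_width=0.05, label=0):
    """Rows packed into a small square where real data says 1, labelled 0 instead."""
    return [
        ((center[0] + rng.uniform(-half_width, half_width),
          center[1] + rng.uniform(-half_width, half_width)), label)
        for _ in range(n)
    ]

def shifted_fraction(train, queries, injected_label=0):
    """Share of queries for which a model trained on `train` returns the injected label."""
    hits = sum(knn_predict(train, q) == injected_label for q in queries)
    return hits / len(queries)

rng = random.Random(0)
owner = real_rows(rng, 200)
modified = owner + synthetic_rows(rng, 20)   # dataset the owner stores; this copy may leak
unrelated = real_rows(rng, 200)              # data an independent party might collect
queries = [x for x, _ in synthetic_rows(rng, 20)]  # fresh probes in the shifted region

score_leaked = shifted_fraction(modified, queries)
score_clean = shifted_fraction(unrelated, queries)
print(score_leaked, score_clean)
```

A model fitted on the modified dataset answers most probes with the injected label, while a model fitted on unrelated data does not; the gap between the two fractions is the detection signal in this sketch.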
Collider Inclusive Jet Data and the Gluon Distribution
Inclusive jet production data are important for constraining the gluon
distribution in the global QCD analysis of parton distribution functions. With
the addition of recent CDF and D0 Run II jet data, we study a number of issues
that play a role in determining the up-to-date gluon distribution and its
uncertainty, and produce a new set of parton distributions that make use of
that data. We present in detail the general procedures used to study the
compatibility between new data sets and the previous body of data used in a
global fit. We introduce a new method in which the Hessian matrix for
uncertainties is ``rediagonalized'' to obtain eigenvector sets that
conveniently characterize the uncertainty of a particular observable.
Comment: Published version
Discrete Symmetries in the Weyl Expansion for Quantum Billiards
We consider two- and three-dimensional quantum billiards with discrete
symmetries. We derive the first terms of the Weyl expansion for the level
density projected onto the irreducible representations of the symmetry group.
As an illustration, the method is applied to the icosahedral billiard. The paper
was published in J. Phys. A 27 (1994) 4317-4323.
Comment: 8 printed pages, LaTeX file
BIOSPHERE MODELING AT YUCCA MOUNTAIN, NEVADA
The objectives of the biosphere modeling efforts are to assess how radionuclides potentially released from the proposed repository could be transported through a variety of environmental media. The study of these transport mechanisms, referred to as pathways, is critical in calculating the potential radiation dose to man. Since most of the existing and pending regulations applicable to the Project are radiation-dose-based standards, the biosphere modeling effort will provide crucial technical input to support the Viability Assessment (VA), the Working Draft of License Application (WDLA), and the Environmental Impact Statement (EIS).
D2ADA: Dynamic Density-aware Active Domain Adaptation for Semantic Segmentation
In the field of domain adaptation, a trade-off exists between the model
performance and the number of target domain annotations. Active learning,
which maximizes model performance with a small number of informative labeled
samples, is well suited to this scenario. In this work, we present D2ADA, a general active domain
adaptation framework for semantic segmentation. To adapt the model to the
target domain with minimum queried labels, we propose acquiring labels of the
samples with high probability density in the target domain yet with low
probability density in the source domain, complementary to the existing source
domain labeled data. To further facilitate labeling efficiency, we design a
dynamic scheduling policy to adjust the labeling budgets between domain
exploration and model uncertainty over time. Extensive experiments show that
our method outperforms existing active learning and domain adaptation baselines
on two benchmarks, GTA5 -> Cityscapes and SYNTHIA -> Cityscapes. With fewer than
5% of the target domain annotations, our method achieves results comparable to
those of full supervision.
Comment: 14 pages, 5 figures
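The acquisition rule described in this abstract (prefer samples that are dense in the target domain but sparse in the source domain) can be sketched in a few lines. This is an illustrative simplification, not the D2ADA implementation: it assumes one-dimensional features, fits a single Gaussian per domain, and ignores the dynamic scheduling policy; the function names and data are hypothetical.

```python
import math
from statistics import NormalDist

def density_gap_scores(source_feats, target_feats):
    """Score each target sample by log p_target(x) - log p_source(x)."""
    p_src = NormalDist.from_samples(source_feats)  # assumed Gaussian source density
    p_tgt = NormalDist.from_samples(target_feats)  # assumed Gaussian target density
    return [math.log(p_tgt.pdf(x)) - math.log(p_src.pdf(x)) for x in target_feats]

def select_for_labeling(source_feats, target_feats, budget):
    """Return indices of the `budget` target samples with the largest density gap."""
    scores = density_gap_scores(source_feats, target_feats)
    ranked = sorted(range(len(target_feats)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]

# Hypothetical features: the source clusters near 0, most target samples near 1.
source = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05]
target = [1.0, 1.1, 0.9, 1.2, 0.1, 0.0, 1.05]

picked = select_for_labeling(source, target, budget=3)
print([target[i] for i in picked])
```

The target samples near 0 are already well covered by the source data and are ranked last, so the labeling budget goes to samples the source domain cannot explain.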
The Pitfalls of Necessary Assignments
It has been shown that finding necessary assignments during a search process, such as the Automatic Test Pattern Generation (ATPG) process, can significantly improve search performance. The techniques for finding necessary assignments include static learning, dynamic learning, and recursive learning. All of these techniques have improved ATPG performance. However, in our experience with real circuits, we found that necessary assignments can create unnecessary requirements in an ATPG process. Sometimes these unnecessary requirements cannot be justified, so that a testable fault may be mistaken for an untestable one.
Direct and indirect control of the initiation of meiotic recombination by DNA damage checkpoint mechanisms in budding yeast
Meiotic recombination plays an essential role in the proper segregation of chromosomes at meiosis I in many sexually reproducing organisms. Meiotic recombination is initiated by the scheduled formation of genome-wide DNA double-strand breaks (DSBs). The timing of DSB formation is strictly controlled because unscheduled DSB formation is detrimental to genome integrity. Here, we investigated the role of DNA damage checkpoint mechanisms in the control of meiotic DSB formation using budding yeast. By using recombination-defective mutants in which meiotic DSBs are not repaired, the effect of DNA damage checkpoint mutations on DSB formation was evaluated. The Tel1 (ATM) pathway mainly responds to unresected DSB ends; thus, the sae2 mutant background, in which DSB ends remain intact, was employed. On the other hand, the Mec1 (ATR) pathway is primarily used when DSB ends are resected; thus, the rad51 dmc1 double mutant background, in which highly resected DSBs accumulate, was employed. In order to separate the effect caused by unscheduled cell cycle progression, which is often associated with DNA damage checkpoint defects, we also employed the ndt80 mutation, which permanently arrests the meiotic cell cycle at prophase I. In the absence of Tel1, DSB formation was reduced in larger chromosomes (IV, VII, II and XI), whereas no significant reduction was found in smaller chromosomes (III and VI). On the other hand, the absence of Rad17 (a critical component of the ATR pathway) led to an increase in DSB formation (chromosomes VII and II were tested). We propose that, within prophase I, the Tel1 pathway facilitates DSB formation, especially in bigger chromosomes, while the Mec1 pathway negatively regulates DSB formation. We also identified prophase I exit, which is under the control of the DNA damage checkpoint machinery, to be a critical event associated with down-regulating meiotic DSB formation.