712 research outputs found
Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels
One important assumption underlying common classification models is the
stationarity of the data. However, in real-world streaming applications, the
data concept indicated by the joint distribution of feature and label is not
stationary but drifting over time. Concept drift detection aims to detect such
drifts and adapt the model so as to mitigate any deterioration in the model's
predictive performance. Unfortunately, most existing concept drift detection
methods rely on a strong and over-optimistic condition that the true labels are
available immediately for all already classified instances. In this paper, a
novel Hierarchical Hypothesis Testing framework with Request-and-Reverify
strategy is developed to detect concept drifts by requesting labels only when
necessary. Two methods, namely Hierarchical Hypothesis Testing with
Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with
Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed respectively under the
novel framework. In experiments with benchmark datasets, our methods
demonstrate overwhelming advantages over state-of-the-art unsupervised drift
detectors. More importantly, our methods even outperform DDM (the widely used
supervised drift detector) when we use significantly fewer labels.
Comment: Published as a conference paper at IJCAI 201
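The request-and-reverify idea described above can be illustrated with a minimal sketch. It is not the authors' HHT-CU or HHT-AG procedure: the z-score rule in the first layer, the permutation test in the second layer, and the `request_labels` callback are all assumptions chosen for illustration. The first layer watches a label-free statistic (classification uncertainty); labels are requested only if it raises an alarm, and the second layer then re-verifies the suspected drift on the labeled errors.

```python
import random
from statistics import mean, stdev


def uncertainty_alarm(reference, recent, z_threshold=3.0):
    """Layer I (no labels): flag a shift in mean classification uncertainty.

    Assumption: a simple z-score rule stands in for the paper's first-layer test.
    """
    standard_error = stdev(reference) / len(recent) ** 0.5
    return abs(mean(recent) - mean(reference)) > z_threshold * standard_error


def permutation_pvalue(errors_before, errors_after, n_perm=2000, seed=0):
    """Layer II (labels needed): permutation test on the change in error rate."""
    rng = random.Random(seed)
    k = len(errors_before)
    pooled = list(errors_before) + list(errors_after)

    def gap(values):
        # Absolute difference between the error rates of the two segments.
        return abs(sum(values[k:]) / (len(values) - k) - sum(values[:k]) / k)

    observed = gap(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if gap(pooled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def request_and_reverify(reference_unc, recent_unc, request_labels, alpha=0.05):
    """Request labels only when Layer I is suspicious, then confirm with Layer II.

    `request_labels` is a hypothetical callback returning two lists of 0/1
    error indicators (before and after the suspected change point).
    """
    if not uncertainty_alarm(reference_unc, recent_unc):
        return False  # no alarm, so no labels are requested
    errors_before, errors_after = request_labels()
    return permutation_pvalue(errors_before, errors_after) < alpha
```

The point of the two-layer design is that the expensive step (labeling) sits behind the cheap one, so a stable stream costs no labels at all.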
Modeling and analysis of thick suspended deep x-ray liga inductors on CMOS/BiCMOS substrate
Modeling and simulation results for two types of 150 μm high air-suspended inductors proposed for LIGA fabrication are presented. The inductor substrates used model the TSMC 0.18 μm CMOS/BiCMOS substrates. The RF performance of the suspended structure is compared with that of its unsuspended counterpart, and the advantage of the suspended structures is explored. The potential of LIGA for fabricating high suspended inductors with good performance and for combining these with CMOS/BiCMOS is demonstrated.
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints
Text generation from a knowledge base aims to translate knowledge triples to
natural language descriptions. Most existing methods ignore the faithfulness
between a generated text description and the original table, leading to
generated information that goes beyond the content of the table. In this paper,
for the first time, we propose a novel Transformer-based generation framework
to achieve the goal. The core techniques in our method to enforce faithfulness
include a new table-text optimal-transport matching loss and a table-text
embedding similarity loss based on the Transformer model. Furthermore, to
evaluate faithfulness, we propose a new automatic metric specialized to the
table-to-text generation problem. We also provide detailed analysis on each
component of our model in our experiments. Automatic and human evaluations show
that our framework can outperform the state of the art by a large
margin.
Comment: Accepted at ACL202
Interannual Variations and Trends in Global Land Surface Phenology Derived from Enhanced Vegetation Index During 1982-2010
Land surface phenology is widely retrieved from satellite observations at regional and global scales, and its long-term record has been demonstrated to be a valuable tool for reconstructing past climate variations, monitoring the dynamics of terrestrial ecosystems in response to climate impacts, and predicting biological responses to future climate scenarios. This study detected global land surface phenology from the Advanced Very High Resolution Radiometer (AVHRR) and the Moderate Resolution Imaging Spectroradiometer (MODIS) data from 1982 to 2010. Based on daily enhanced vegetation index at a spatial resolution of 0.05 degrees, we simulated the seasonal vegetative trajectory for each individual pixel using piecewise logistic models, which was then used to detect the onset of greenness increase (OGI) and the length of vegetation growing season (GSL). Further, both overall interannual variations and pixel-based trends were examined across Koeppen's climate regions for the periods of 1982-1999 and 2000-2010, respectively. The results show that OGI and GSL varied considerably during 1982-2010 across the globe. Generally, the interannual variation could be more than a month in precipitation-controlled tropical and dry climates while it was mainly less than 15 days in temperature-controlled temperate, cold, and polar climates. OGI, overall, shifted earlier, and GSL was prolonged from 1982 to 2010 in most climate regions in North America and Asia, while consistently significant trends occurred only in the cold and polar climates of North America. The overall trends in Europe were generally insignificant. Over South America, late OGI was consistent (particularly from 1982 to 1999) while either positive or negative GSL trends in a climate region were mostly reversed between the periods of 1982-1999 and 2000-2010.
In the Northern Hemisphere of Africa, OGI trends were mostly insignificant, but prolonged GSL was evident over individual climate regions during the last 3 decades. OGI mainly showed late trends in the Southern Hemisphere of Africa while GSL was reversed from reduced GSL trends (1982-1999) to prolonged trends (2000-2010). In Australia, GSL exhibited considerable interannual variation, but a consistent trend was absent in most regions. Finally, the proportion of pixels with significant trends was less than 1% in most climate regions although it could be as large as 10%.
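As a rough illustration of how a logistic greenness curve can yield an onset date, the sketch below evaluates one logistic segment and picks the day on which its second derivative peaks. The parameter names `a`, `b`, `c`, `d`, the example values, and the use of the second-derivative maximum as the onset criterion are assumptions made here for simplicity; the abstract does not state the exact criterion used to detect OGI.

```python
import math


def logistic_evi(t, a, b, c, d):
    """One logistic segment: EVI(t) = c / (1 + exp(a + b*t)) + d.

    With b < 0 the curve rises from the background value d to d + c (green-up).
    """
    return c / (1.0 + math.exp(a + b * t)) + d


def second_derivative(t, a, b, c):
    """Analytic second time derivative of the logistic segment."""
    s = 1.0 / (1.0 + math.exp(a + b * t))
    return c * b * b * s * (1.0 - s) * (1.0 - 2.0 * s)


def onset_of_greenness(a, b, c, days=range(1, 366)):
    """Day of year on which the curve bends upward most sharply.

    Assumption: this bend is used as a simple stand-in for the onset of
    greenness increase; it is not necessarily the paper's criterion.
    """
    return max(days, key=lambda t: second_derivative(t, a, b, c))


# Hypothetical parameters: mid-point of green-up at day 120 (a / -b = 120).
onset_day = onset_of_greenness(a=12.0, b=-0.1, c=0.5)
```

In practice the four parameters would first be fitted to each pixel's EVI time series; only the detection step is sketched here.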
NaturalConv: A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation
In this paper, we propose a Chinese multi-turn topic-driven conversation
dataset, NaturalConv, which allows the participants to chat about anything they
want as long as any element from the topic is mentioned and the topic shift is
smooth. Our corpus contains 19.9K conversations from six domains, and 400K
utterances with an average turn number of 20.1. These conversations contain
in-depth discussions on related topics or broad, natural transitions between
multiple topics. We believe either way is normal for human conversation. To
facilitate the research on this corpus, we provide results of several benchmark
models. Comparative results show that for this dataset, our current models are
not able to provide significant improvement by introducing background
knowledge/topic. Therefore, the proposed dataset should be a good benchmark for
further research to evaluate the validity and naturalness of multi-turn
conversation systems. Our dataset is available at
https://ai.tencent.com/ailab/nlp/dialogue/#datasets.
Comment: Accepted as a main track paper at AAAI 202
Detecting and quantifying natural selection at two linked loci from time series data of allele frequencies with forward-in-time simulations
Recent advances in DNA sequencing techniques have made it possible to monitor genomes in great detail over time. This improvement provides an opportunity for us to study natural selection based on time series samples of genomes while accounting for the effect of genetic recombination and local linkage information. Such time series genomic data allow for more accurate estimation of population genetic parameters and hypothesis testing on the recent action of natural selection. In this work, we develop a novel Bayesian statistical framework for inferring natural selection at a pair of linked loci by capitalising on the temporal aspect of DNA data with the additional flexibility of modeling the sampled chromosomes that contain unknown alleles. Our approach is built on a hidden Markov model where the underlying process is a two-locus Wright-Fisher diffusion with selection, which enables us to explicitly model genetic recombination and local linkage. The posterior probability distribution for selection coefficients is computed by applying the particle marginal Metropolis-Hastings algorithm, which allows us to efficiently calculate the likelihood. We evaluate the performance of our Bayesian inference procedure through extensive simulations, showing that our approach can deliver accurate estimates of selection coefficients, and the addition of genetic recombination and local linkage brings about significant improvement in the inference of natural selection. We also illustrate the utility of our method on real data with an application to ancient DNA data associated with white spotting patterns in horses.
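To make the forward-in-time idea concrete, here is a minimal sketch of a discrete two-locus Wright-Fisher simulation with selection and recombination. It is not the diffusion approximation or the particle marginal Metropolis-Hastings sampler from the abstract. The haplotype order (AB, Ab, aB, ab), the per-haplotype fitness values, and the order of events within a generation (selection, then recombination, then drift) are assumptions made for illustration.

```python
import random

# Sign of the recombination term for haplotypes AB, Ab, aB, ab.
ETA = (1, -1, -1, 1)


def expected_next(freqs, fitness, r):
    """Deterministic part of one generation: selection, then recombination.

    freqs   -- haplotype frequencies (AB, Ab, aB, ab), summing to 1
    fitness -- relative fitness of each haplotype (assumed haploid-style model)
    r       -- recombination rate between the two loci
    """
    mean_w = sum(w * x for w, x in zip(fitness, freqs))
    selected = [w * x / mean_w for w, x in zip(fitness, freqs)]
    # Linkage disequilibrium after selection.
    ld = selected[0] * selected[3] - selected[1] * selected[2]
    return [x - eta * r * ld for x, eta in zip(selected, ETA)]


def simulate(freqs, fitness, r, pop_size, generations, seed=0):
    """Forward simulation: multinomial resampling of pop_size haplotypes per generation."""
    rng = random.Random(seed)
    path = [list(freqs)]
    for _ in range(generations):
        probs = expected_next(path[-1], fitness, r)
        draws = rng.choices(range(4), weights=probs, k=pop_size)
        path.append([draws.count(h) / pop_size for h in range(4)])
    return path


# Hypothetical run: allele A carries a 10% fitness advantage, loci loosely linked.
trajectory = simulate([0.25] * 4, [1.1, 1.1, 1.0, 1.0], r=0.01,
                      pop_size=200, generations=10)
```

Simulated trajectories of this kind are what a likelihood-based method would compare against the observed allele-frequency time series.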
A Comparison of Tropical Rainforest Phenology Retrieved From Geostationary (SEVIRI) and Polar-Orbiting (MODIS) Sensors Across the Congo Basin
The seasonal and interannual dynamics of tropical rainforests play a critical role in the global carbon cycle and climate change. This paper retrieved and compared land surface phenology from observations acquired by the Spinning Enhanced Visible and Infrared Imager (SEVIRI) onboard geostationary satellites and the Moderate Resolution Imaging Spectroradiometer (MODIS) on polar-orbiting satellites over the Congo Basin. To achieve this, we first retrieved canopy greenness cycles (CGCs) and their transition timing from two-band enhanced vegetation index (EVI2) derived from SEVIRI and MODIS data between 2006 and 2013. We then assessed the influences of SEVIRI and MODIS data quality on the reconstruction of the EVI2 temporal trajectory, the detection of the CGC onset and end timing, and the total number of successful CGC retrievals. The significance of influences was determined using the one-tailed two-sample Kolmogorov–Smirnov test. The results indicate that diurnal SEVIRI observations greatly increased the probability of capturing cloud-free daily EVI2 in the rainforest-dominated region of the Congo Basin, where the proportion of good quality (PGQ) observations during a CGC was up to 80% higher than that from MODIS. As a result, the double annual CGCs of the Congo Basin rainforests were well identified from SEVIRI but sparsely detected from MODIS, whereas the single annual CGC in the savanna-dominated northern and southern Congo Basin was successfully retrieved from both SEVIRI and MODIS. Moreover, decreases in PGQ in an EVI2 time series were found to significantly increase the uncertainties of retrieved phenological timings and increase the probabilities of CGC retrieval failures.
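The one-tailed two-sample Kolmogorov–Smirnov test mentioned above can be sketched in a few lines. The function name, the sample values, and the use of the large-sample approximation for the p-value are assumptions made here; the abstract does not say how the test was computed.

```python
import math
from bisect import bisect_right


def ks_one_sided(sample_a, sample_b):
    """One-tailed two-sample KS test.

    Returns (d_plus, p_approx), where d_plus is the largest amount by which
    the empirical CDF of sample_a exceeds that of sample_b. A large d_plus
    suggests that sample_a tends to take smaller values than sample_b.
    The p-value uses the asymptotic formula exp(-2 * d^2 * n*m / (n+m)),
    which is only an approximation for small samples.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    d_plus = max(
        bisect_right(a, x) / n - bisect_right(b, x) / m for x in a + b
    )
    d_plus = max(d_plus, 0.0)
    p_approx = math.exp(-2.0 * d_plus ** 2 * n * m / (n + m))
    return d_plus, p_approx


# Hypothetical proportions of good-quality observations (not from the paper).
modis_pgq = [0.10, 0.15, 0.20, 0.25, 0.30]
seviri_pgq = [0.60, 0.70, 0.75, 0.80, 0.90]
d_plus, p_approx = ks_one_sided(modis_pgq, seviri_pgq)
```

The one-tailed form matters because the question is directional: whether one sensor's values are systematically lower, not merely different.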
Passive monitoring of nonlinear relaxation of cracked polymer concrete samples using Acoustic Emission
International audience