A Pre-trained Data Deduplication Model based on Active Learning
In the era of big data, the issue of data quality has become increasingly
prominent. One of the main challenges is the problem of duplicate data, which
can arise from repeated entry or the merging of multiple data sources. These
"dirty data" problems can significantly limit the effective application of big
data. To address the problem of deduplication, we propose a pre-trained deduplication model based on active learning, the first work to apply active learning to deduplication at the semantic level. The model is built on a pre-trained Transformer and fine-tuned to treat deduplication as a sequence-to-classification task. It is the first to integrate the Transformer with active learning in an end-to-end architecture that selects the most valuable data for training the deduplication model, and the first to employ the R-Drop method to augment each round of labeled data, which reduces the cost of manual labeling and improves the model's performance. Experimental results demonstrate that the proposed model outperforms the previous state of the art (SOTA) in duplicate identification, achieving up to a 28% improvement in Recall on benchmark datasets.
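The core active-learning loop described above, pool-based selection of the pairs the current model is least certain about, can be sketched as follows. The scoring model here is a random stand-in for the fine-tuned Transformer's duplicate probability, and all names are illustrative, not the paper's code:

```python
import random

def uncertainty(p):
    # Distance from the 0.5 decision boundary: smaller = less certain.
    return abs(p - 0.5)

def select_batch(pool, predict, k):
    """Pick the k record pairs the current model is least certain about."""
    return sorted(pool, key=lambda pair: uncertainty(predict(pair)))[:k]

# Toy pool of candidate record pairs scored by a stand-in model.
random.seed(0)
pool = [("rec%d" % i, "rec%d" % (i + 1)) for i in range(10)]
scores = {pair: random.random() for pair in pool}
predict = lambda pair: scores[pair]

# These are the pairs a human would be asked to label next.
batch = select_batch(pool, predict, k=3)
```

Each labeled batch would then be augmented (the paper uses R-Drop) and used to fine-tune the classifier before the next selection round.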
Low geriatric nutritional risk index as a poor prognostic biomarker for immune checkpoint inhibitor treatment in solid cancer
Objective: In this investigation, we focused on the geriatric nutritional risk index (GNRI), a comprehensive metric that takes into account the patient's ideal weight, actual weight, and serum albumin levels to measure malnutrition. Our primary objective was to examine the predictive value of GNRI-defined malnutrition in determining the response to immunotherapy among cancer patients.
Methods: Relevant articles for this study were systematically searched in PubMed, the Cochrane Library, EMBASE, and Google Scholar up to July 2023. Our analysis evaluated overall survival (OS), progression-free survival (PFS), objective response rate (ORR), and disease control rate (DCR) as clinical outcomes.
Results: This analysis comprised a total of eleven articles encompassing 1,417 patients. The pooled results revealed that cancer patients with low GNRI levels exhibited shorter OS (HR: 2.64, 95% CI: 2.08–3.36, p < 0.001) and PFS (HR: 1.87, 95% CI: 1.46–2.41, p < 0.001), and lower ORR (OR: 0.46, 95% CI: 0.33–0.65, p < 0.001) and DCR (OR: 0.42, 95% CI: 0.29–0.61, p < 0.001). Sensitivity analyses confirmed that the above results were stable. Egger's and Begg's tests revealed no publication bias in the above results.
Conclusion: Our results imply that the GNRI is a useful predictor of immunotherapy response in cancer patients.
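For reference, the GNRI combines serum albumin with the ratio of actual to ideal body weight. A minimal sketch of the commonly used formula (Bouillanne et al., 2005), assuming the conventional cap of the weight ratio at 1; the example values are illustrative only:

```python
def gnri(albumin_g_per_l, weight_kg, ideal_weight_kg):
    """Geriatric Nutritional Risk Index:
    GNRI = 1.489 * albumin (g/L) + 41.7 * (weight / ideal weight),
    with the weight ratio capped at 1 when actual >= ideal weight."""
    ratio = min(weight_kg / ideal_weight_kg, 1.0)
    return 1.489 * albumin_g_per_l + 41.7 * ratio

# A patient at ideal weight with albumin 40 g/L scores 101.26;
# a lighter patient with albumin 30 g/L falls below the commonly
# cited ~98 threshold that flags nutritional risk.
score_normal = gnri(40, 70, 70)
score_low = gnri(30, 60, 70)
```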
CMOS + stochastic nanomagnets: heterogeneous computers for probabilistic inference and learning
Extending Moore's law by augmenting complementary-metal-oxide semiconductor
(CMOS) transistors with emerging nanotechnologies (X) has become increasingly
important. Accelerating Monte Carlo algorithms that rely on random sampling with such CMOS+X technologies could have a significant impact on fields ranging from probabilistic machine learning and optimization to quantum simulation. In this paper, we combine stochastic magnetic tunnel junction (sMTJ)-based probabilistic bits (p-bits) with versatile Field Programmable Gate Arrays (FPGAs) to design a CMOS + X (X = sMTJ) prototype. Our
approach enables high-quality true randomness that is essential for Monte Carlo
based probabilistic sampling and learning. Our heterogeneous computer
successfully performs probabilistic inference and asynchronous Boltzmann
learning, despite device-to-device variations in sMTJs. A comprehensive
comparison using a CMOS predictive process design kit (PDK) reveals that
compact sMTJ-based p-bits replace 10,000 transistors while dissipating two orders of magnitude less energy (2 fJ per random bit) compared to digital CMOS p-bits. Scaled and integrated versions of our CMOS + stochastic nanomagnet approach can significantly advance probabilistic computing and its applications across domains by providing massively parallel, truly random numbers at extremely high throughput and energy efficiency.
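A p-bit's behavior can be sketched in software: it outputs ±1 with a mean set by the hyperbolic tangent of its input, and a coupled network of such bits performs Gibbs-like sampling of a Boltzmann distribution. A minimal two-p-bit sketch; the coupling strength, step count, and update scheme here are illustrative, not the paper's hardware:

```python
import math
import random

def pbit(I, rng):
    """One p-bit update: emit +/-1 with mean tanh(I), emulating the
    thresholded random telegraph output of a stochastic MTJ."""
    return 1 if rng.uniform(-1.0, 1.0) < math.tanh(I) else -1

def sample(J, h, steps, rng):
    """Asynchronous sweeps over a two-p-bit network with coupling J
    and biases h; returns how often each joint state was visited."""
    m = [1, 1]
    counts = {}
    for _ in range(steps):
        for i in (0, 1):
            I = h[i] + J * m[1 - i]   # input current to p-bit i
            m[i] = pbit(I, rng)
        counts[tuple(m)] = counts.get(tuple(m), 0) + 1
    return counts

rng = random.Random(1)
counts = sample(J=1.0, h=[0.0, 0.0], steps=20000, rng=rng)
# With ferromagnetic coupling (J > 0) the aligned states dominate,
# as Boltzmann statistics predict.
aligned = counts.get((1, 1), 0) + counts.get((-1, -1), 0)
```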
Sciences for the 2.5-meter Wide Field Survey Telescope (WFST)
The Wide Field Survey Telescope (WFST) is a dedicated photometric survey
facility under construction jointly by the University of Science and Technology
of China and Purple Mountain Observatory. It is equipped with a primary mirror
of 2.5m in diameter, an active optical system, and a mosaic CCD camera of 0.73
Gpix on the main focus plane to achieve high-quality imaging over a field of
view of 6.5 square degrees. The installation of WFST at the Lenghu observing site is planned for the summer of 2023, and operations are scheduled to commence within three months afterward. WFST will scan the northern sky in
four optical bands (u, g, r, and i) at cadences from hourly/daily to
semi-weekly in the deep high-cadence survey (DHS) and the wide field survey
(WFS) programs, respectively. The WFS reaches depths of 22.27, 23.32, 22.84, and 22.31 AB magnitudes in the u, g, r, and i bands, respectively, in a nominal 30-second exposure during a photometric night, enabling us to search for a tremendous number of transients in the low-z universe and to systematically investigate the variability of Galactic and extragalactic objects. Intranight 90 s exposures as deep as 23 and 24 mag in the u and g bands via the DHS provide a unique opportunity to facilitate explorations of energetic transients that demand high sensitivity, including
the electromagnetic counterparts of gravitational-wave events detected by the
second/third-generation GW detectors, supernovae within a few hours of their
explosions, tidal disruption events and luminous fast optical transients even
beyond a redshift of 1. Meanwhile, the final 6-year co-added images,
anticipated to reach g about 25.5 mag in WFS or even deeper by 1.5 mag in DHS,
will be of significant value to general Galactic and extragalactic sciences.
The highly uniform legacy surveys of WFST will also serve as an indispensable complement to those of LSST, which monitors the southern sky. (Comment: 46 pages, submitted to SCMP)
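The gain in depth from co-adding follows from sky-limited noise statistics: stacking N equal exposures improves the signal-to-noise by sqrt(N), i.e. the limiting magnitude deepens by 1.25 log10(N). A back-of-the-envelope sketch using the quoted single-visit g-band depth; the implied visit count is an illustration of the scaling, not the survey's actual observing plan:

```python
import math

def coadd_depth(m_single, n_exposures):
    """Limiting magnitude after stacking n equal exposures, assuming
    sky-noise-limited imaging (S/N grows as sqrt(n))."""
    return m_single + 1.25 * math.log10(n_exposures)

def visits_needed(m_single, m_target):
    """Visits needed to reach a target co-added depth under the
    same sqrt(n) assumption."""
    return math.ceil(10 ** ((m_target - m_single) / 1.25))

# Ten stacked 30 s visits deepen the g band from 23.32 to ~24.57 mag;
# reaching the anticipated ~25.5 mag co-add takes a few dozen visits.
ten_visit_depth = coadd_depth(23.32, 10)
n_visits = visits_needed(23.32, 25.5)
```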
Improved Collaborative Representation Classifier Based on l2-Regularization for Human Action Recognition
Human action recognition is an important and challenging task. Projecting depth images onto three depth motion maps (DMMs) and extracting deep convolutional neural network (DCNN) features yields discriminative descriptors that characterize the spatiotemporal information of a specific action from a sequence of depth images. In this paper, a unified improved collaborative representation framework is proposed in which the probability that a test sample belongs to the collaborative subspace of all classes can be well defined and calculated. The improved collaborative representation classifier (ICRC), based on l2 regularization, is presented for human action recognition to maximize the likelihood that a test sample belongs to each class; theoretical investigation into ICRC shows that it obtains the final classification by computing this likelihood for each class. Coupled with the DMM and DCNN features, experiments on depth image-based action recognition, including the MSRAction3D and MSRGesture3D datasets, demonstrate that the proposed approach, using a distance-based representation classifier, achieves superior performance over state-of-the-art methods, including SRC, CRC, and SVM.
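The l2-regularized collaborative representation step has a closed form: the test sample is coded over all training samples jointly via ridge regression, and the class whose portion of the code best reconstructs the sample wins. A minimal sketch with toy data rather than the paper's DMM/DCNN features; the improved probabilistic weighting of ICRC is omitted:

```python
import numpy as np

def crc_classify(X, labels, y, lam=0.01):
    """l2-regularized collaborative representation classification:
    code y over ALL training columns of X at once, then assign the
    class whose slice of the code reconstructs y with least residual."""
    # Ridge-regression code: alpha = (X^T X + lam*I)^-1 X^T y
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    labels = np.array(labels)
    residuals = {c: np.linalg.norm(y - X[:, labels == c] @ alpha[labels == c])
                 for c in set(labels.tolist())}
    return min(residuals, key=residuals.get)

# Toy data: training samples are columns; two well-separated classes.
rng = np.random.default_rng(0)
c0 = rng.normal(0.0, 0.1, (5, 3))   # class 0 clustered near 0
c1 = rng.normal(2.0, 0.1, (5, 3))   # class 1 clustered near 2
X = np.hstack([c0, c1])
labels = [0, 0, 0, 1, 1, 1]
pred = crc_classify(X, labels, c1[:, 0] + rng.normal(0, 0.05, 5))
```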
A Theoretical Study of Temperature-dependent Photodissociation Cross Sections and Rates for O2
The photodissociation of O2 is thought to play a vital role in blocking UV radiation in the Earth's atmosphere and is likely of great importance in characterizing exoplanetary atmospheres. This work considers four photodissociation processes of O2 associated with its four electronic states, whose potential energy curves and transition dipole moments are calculated at the icMRCI+Q/aug-cc-pwCV5Z-DK level of theory. A quantum-mechanical approach is used to compute the state-resolved cross sections for two triplet transitions from the ground X state to the excited B and E states, and for two singlet transitions from the a^1Δ_g and b states to the 1^1Π_u state, considering photon wavelengths from 500 Å to the relevant threshold. Assuming the populations of the initial states satisfy a Boltzmann distribution, the temperature-dependent photodissociation cross sections are estimated at gas dynamic temperatures of 0–10,000 K, in which the discrete progressions of the B and E transitions are also considered. The photodissociation rates of O2 in the interstellar, solar, and blackbody radiation fields are also calculated using the temperature-dependent cross sections. The resulting photodissociation cross sections and rates are important for the atmospheric chemistry of Earth and may also be useful for the atmospheric exploration of exoplanets.
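The temperature dependence enters through the Boltzmann populations of the initial levels: the thermal cross section is the population-weighted sum of the state-resolved cross sections. A minimal sketch at a single wavelength point; the level spacing is roughly the O2 ground-state vibrational fundamental, but the cross-section values are made up for illustration:

```python
import math

K_B = 0.6950348  # Boltzmann constant in cm^-1 per kelvin

def boltzmann_weights(energies_cm, T):
    """Normalized Boltzmann populations of the initial levels."""
    w = [math.exp(-e / (K_B * T)) for e in energies_cm]
    z = sum(w)
    return [x / z for x in w]

def thermal_cross_section(state_sigmas, energies_cm, T):
    """Temperature-dependent cross section at one wavelength point:
    the population-weighted sum of state-resolved cross sections."""
    w = boltzmann_weights(energies_cm, T)
    return sum(wi * si for wi, si in zip(w, state_sigmas))

# Two vibrational levels: v=0 at 0 cm^-1, v=1 near 1556 cm^-1.
# At 300 K only v=0 is populated; at 5000 K the hotter level's
# (hypothetical) larger cross section pulls the thermal value up.
sigma_300 = thermal_cross_section([1.0e-17, 4.0e-17], [0.0, 1556.0], 300.0)
sigma_5000 = thermal_cross_section([1.0e-17, 4.0e-17], [0.0, 1556.0], 5000.0)
```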
TW-Co-MFC: Two-level weighted collaborative fuzzy clustering based on maximum entropy for multi-view data
Multi-Task Adversarial Network Bottleneck Features for Noise-Robust Speaker Verification
FAR: Fourier Aerial Video Recognition
We present an algorithm, Fourier Activity Recognition (FAR), for UAV video
activity recognition. Our formulation uses a novel Fourier object
disentanglement method to innately separate out the human agent (which is
typically small) from the background. Our disentanglement technique operates in
the frequency domain to characterize the extent of temporal change of spatial
pixels, and exploits convolution-multiplication properties of Fourier transform
to map this representation to the corresponding object-background entangled
features obtained from the network. To encapsulate contextual information and
long-range space-time dependencies, we present a novel Fourier Attention
algorithm, which emulates the benefits of self-attention by modeling the
weighted outer product in the frequency domain. Our Fourier attention formulation requires far fewer computations than self-attention. We have evaluated our approach on multiple UAV datasets, including UAV Human RGB, UAV Human Night, Drone Action, and NEC Drone, and demonstrate a relative improvement of 8.02%–38.69% in top-1 accuracy and up to 3 times faster runtime than prior works. (Comment: ECCV 2022 poster paper)
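The convolution-multiplication property the abstract leans on is the standard convolution theorem: a circular convolution in the spatial or temporal domain becomes a cheap pointwise product after an FFT. A minimal illustration of that property (not the authors' disentanglement or attention code):

```python
import numpy as np

def circular_conv(a, b):
    """Direct circular convolution, O(n^2)."""
    n = len(a)
    return np.array([sum(a[j] * b[(i - j) % n] for j in range(n))
                     for i in range(n)])

def fft_conv(a, b):
    """Same result via the convolution theorem: pointwise product
    of the spectra, O(n log n)."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

rng = np.random.default_rng(3)
a, b = rng.normal(size=8), rng.normal(size=8)
direct = circular_conv(a, b)
fast = fft_conv(a, b)
```

The same identity is what lets frequency-domain formulations such as FAR trade expensive pairwise interactions for elementwise products.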
Implementation of a Drone-Based Information Gathering System
This paper demonstrates the implementation of the Personal Information Gathering System (PIGS). The system comprises several specially designed drones, a mobile phone, a base station, and a router. Traditionally, in combat situations, humans must put themselves at risk to gain information and identify potential threats. PIGS enables users to gather comprehensive information autonomously while remaining safe from threats. The operator can use the mobile device to remotely command the drones to obtain information, explore different regions, and perform other information-gathering tasks. With 802.11ac Wi-Fi and a lightweight computer vision model, PIGS allows the operator to interface with the drones through high-level commands and receive visual information with optional computer vision analysis. The proposed system offers a safer and more efficient way to gather information in dangerous environments. (Proceedings from the International Telemetering Conference are made available by the International Foundation for Telemetering and the University of Arizona Libraries.)