Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Recently, attention based models have been used extensively in many
sequence-to-sequence learning systems. Especially for image captioning, the
attention based models are expected to ground correct image regions with proper
generated words. However, for each time step in the decoding process, the
attention based models usually use the hidden state of the current input to
attend to the image regions. Under this setting, these attention models have a
"deviated focus" problem: they calculate the attention weights based on
previous words instead of the one to be generated, impairing the performance of
both grounding and captioning. In this paper, we propose Prophet Attention, a
module similar in form to self-supervision. In the training stage, this module
utilizes the future information to calculate the "ideal" attention weights
towards image regions. These calculated "ideal" weights are further used to
regularize the "deviated" attention. In this manner, image regions are grounded
with the correct words. The proposed Prophet Attention can be easily
incorporated into existing image captioning models to improve their performance
of both grounding and captioning. The experiments on the Flickr30k Entities and
the MSCOCO datasets show that the proposed Prophet Attention consistently
outperforms baselines in both automatic metrics and human evaluations. Notably,
we set a new state of the art on both benchmark datasets and achieve 1st place
on the leaderboard of the online MSCOCO benchmark in terms of the default
ranking score, i.e., CIDEr-c40.
Comment: Accepted by NeurIPS 202
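The regularization idea in this abstract can be illustrated with a small sketch. The abstract does not state the form of the regularizer, so the L1 distance used here is an assumption, and the names `attention_weights`, `l1_attention_penalty`, `prev_word_state`, and `future_word_state`, along with the toy region features, are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, regions):
    """Dot-product attention of one query vector over image-region vectors."""
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    return softmax(scores)

def l1_attention_penalty(deviated, ideal):
    """L1 distance between two attention distributions (assumed regularizer)."""
    return sum(abs(d - i) for d, i in zip(deviated, ideal))

# Hypothetical toy data: three image regions with 2-d features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prev_word_state = [2.0, 0.0]    # query built from the previous word
future_word_state = [0.0, 2.0]  # query built from the word to be generated

deviated = attention_weights(prev_word_state, regions)  # available at inference
ideal = attention_weights(future_word_state, regions)   # training stage only
penalty = l1_attention_penalty(deviated, ideal)
print(f"regularization penalty: {penalty:.4f}")
```

In training, a penalty of this kind would be added to the captioning loss so that the attention computed from previous words is pulled toward the attention computed with future information.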
Protected Health Information filter (Philter): accurately and securely de-identifying free-text clinical notes.
There is a great and growing need to ascertain what exactly is the state of a patient, in terms of disease progression, actual care practices, pathology, adverse events, and much more, beyond the paucity of data available in structured medical record data. Ascertaining these harder-to-reach data elements is now critical for the accurate phenotyping of complex traits, detection of adverse outcomes, efficacy of off-label drug use, and longitudinal patient surveillance. Clinical notes often contain the most detailed and relevant digital information about individual patients, the nuances of their diseases, the treatment strategies selected by physicians, and the resulting outcomes. However, notes remain largely unused for research because they contain Protected Health Information (PHI), which is synonymous with individually identifying data. Previous clinical note de-identification approaches have been rigid and still too inaccurate to see any substantial real-world use, primarily because they have been trained with medical text corpora that were too small. To build a new de-identification tool, we created the largest manually annotated clinical note corpus for PHI and developed a customizable open-source de-identification software called Philter ("Protected Health Information filter"). Here we describe the design and evaluation of Philter, and show how it offers substantial real-world improvements over prior methods.
Type-IV DCT, DST, and MDCT algorithms with reduced numbers of arithmetic operations
We present algorithms for the type-IV discrete cosine transform (DCT-IV) and
discrete sine transform (DST-IV), as well as for the modified discrete cosine
transform (MDCT) and its inverse, that achieve a lower count of real
multiplications and additions than previously published algorithms, without
sacrificing numerical accuracy. Asymptotically, the operation count is reduced
from ~2NlogN to ~(17/9)NlogN for a power-of-two transform size N, and the exact
count is strictly lowered for all N > 4. These results are derived by
considering the DCT to be a special case of a DFT of length 8N, with certain
symmetries, and then pruning redundant operations from a recent improved fast
Fourier transform algorithm (based on a recursive rescaling of the
conjugate-pair split radix algorithm). The improved algorithms for DST-IV and
MDCT follow immediately from the improved count for the DCT-IV.
Comment: 11 page
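For reference, the transform in this abstract can be written out directly. The sketch below is a naive O(N^2) evaluation of the standard unnormalized DCT-IV definition, not the fast algorithm the abstract describes. It also compares the two leading-order operation counts quoted above. The function names are hypothetical, and a base-2 logarithm is assumed for "log".

```python
import math

def dct4_naive(x):
    """Unnormalized DCT-IV: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * (k + 1/2))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * (k + 0.5))
            for n in range(n_pts))
        for k in range(n_pts)
    ]

def leading_term_counts(n_pts):
    """Leading-order operation counts only; lower-order terms are ignored."""
    log2n = math.log2(n_pts)  # assumption: "log" in the abstract means log base 2
    return 2.0 * n_pts * log2n, (17.0 / 9.0) * n_pts * log2n

# With this normalization the DCT-IV is its own inverse up to a factor of N/2.
x = [1.0, 2.0, 3.0, 4.0]
round_trip = dct4_naive(dct4_naive(x))
recovered = [2.0 * v / len(x) for v in round_trip]

old_count, new_count = leading_term_counts(1024)
print(recovered)
print(f"leading-term ratio new/old: {new_count / old_count:.4f}")
```

The ratio of the leading terms is 17/18, so the asymptotic saving is a little under 6 percent of the arithmetic operations.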
Qwen Technical Report
Large language models (LLMs) have revolutionized the field of artificial
intelligence, enabling natural language processing tasks that were previously
thought to be exclusive to humans. In this work, we introduce Qwen, the first
installment of our large language model series. Qwen is a comprehensive
language model series that encompasses distinct models with varying parameter
counts. It includes Qwen, the base pretrained language models, and Qwen-Chat,
the chat models finetuned with human alignment techniques. The base language
models consistently demonstrate superior performance across a multitude of
downstream tasks, and the chat models, particularly those trained using
Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The
chat models possess advanced tool-use and planning capabilities for creating
agent applications, showcasing impressive performance even when compared to
bigger models on complex tasks like utilizing a code interpreter. Furthermore,
we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as
well as mathematics-focused models, Math-Qwen-Chat, which are built upon base
language models. These models demonstrate significantly improved performance in
comparison with open-source models, and slightly fall behind the proprietary
models.
Comment: 59 pages, 5 figure
Nowhere to run: oligo (p-phenylene vinylene) kills oral intracellular bacteria photodynamically
Bacterial infections pose a severe threat to human health due to the exacerbation of antibiotic resistance and intracellular bacterial infections. Research suggests that oligo(p-phenylene vinylene) (OPV), commonly employed in the manufacture of organic solar batteries, can help address this issue. This study demonstrates the ability of OPV to target and sterilize intracellular Porphyromonas gingivalis and methicillin-resistant Staphylococcus aureus (MRSA) photodynamically. Most notably, OPV specifically targets bacteria without affecting healthy cells under dark conditions. Its chemical composition includes a conjugated backbone and ionic imidazole side chains, which allow OPV to bind to cell membranes. Furthermore, dental blue light curing lamps may excite OPV. Compared with antibiotics and traditional photosensitizers, OPV proves to be a potentially superior solution to eradicate intracellular microbial infections, both in fundamental research and clinical applications.
Methods on COVID-19 epidemic curve estimation during emergency based on Baidu search engine and ILI traditional surveillance in Beijing, China
Surveillance is essential to infectious disease prevention and control. When the pandemic occurred, the inadequacy of traditional surveillance was exposed, but it also provided a valuable opportunity to explore new surveillance methods. This study aimed to estimate the transmission dynamics and epidemic curve of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Omicron BF.7 in Beijing under the emergent situation using the Baidu index and influenza-like illness (ILI) surveillance. A novel hybrid model (multiattention bidirectional gated recurrent unit (MABG)–susceptible–exposed–infected–removed (SEIR)) was developed, which leveraged a deep learning algorithm (MABG) to analyze the past records of ILI occurrences and the Baidu index of diverse symptom-related terms such as fever, pyrexia, cough, sore throat, anti-fever medicine, and runny nose. By considering the current Baidu index and the correlation between ILI cases and coronavirus disease 2019 (COVID-19) cases, a transmission dynamics model (SEIR) was formulated to estimate the transmission dynamics and epidemic curve of SARS-CoV-2. During the COVID-19 pandemic, when conventional surveillance measures were suspended temporarily, cases of ILI can serve as a useful indicator for estimating the epidemiological trends of COVID-19. In the specific case of Beijing, it has been ascertained that the cumulative infection attack rate has surpassed 80.25% (95% confidence interval (95% CI): 77.51%–82.99%) since December 17, 2022, with the peak of the outbreak projected to occur on December 12. The number of existing patients is expected to peak three days after this date. The effective reproduction number (Rt), which represents the average number of secondary infections generated from a single infected individual at a specific point in time during an epidemic, has remained below 1 since December 17, 2022.
Traditional disease surveillance systems should be complemented with information from modern surveillance data, such as online data sources with advanced technical support. Modern surveillance channels should be used primarily in emerging infectious disease outbreaks. Syndromic surveillance of COVID-19 should be established to follow the epidemic, clinical severity, and medical resource demand.
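The SEIR component of the hybrid model can be illustrated with a minimal compartmental simulation. This sketch covers only a textbook SEIR model integrated with forward Euler steps. It does not reproduce the MABG network or the study's fitted parameters: every parameter value below is hypothetical, and the relation Rt = (beta / gamma) * S / N is the standard one for this simple model rather than the study's estimator.

```python
def simulate_seir(beta, sigma, gamma, population, initial_infected, days, dt=0.1):
    """Integrate a basic SEIR model with forward Euler; returns daily (S, E, I, R)."""
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = []
    for _ in range(days):
        history.append((s, e, i, r))
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt  # S -> E
            new_infectious = sigma * e * dt               # E -> I
            new_removed = gamma * i * dt                  # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
    return history

def effective_reproduction_number(beta, gamma, susceptible, population):
    """Rt for this simple model: basic reproduction number scaled by S/N."""
    return beta / gamma * susceptible / population

# Hypothetical parameters, chosen for illustration only.
BETA, SIGMA, GAMMA = 0.5, 0.33, 0.2
POPULATION = 1_000_000
history = simulate_seir(BETA, SIGMA, GAMMA, POPULATION, initial_infected=10, days=120)

peak_day = max(range(len(history)), key=lambda d: history[d][2])
attack_rate = (history[0][0] - history[-1][0]) / POPULATION
rt_series = [effective_reproduction_number(BETA, GAMMA, day[0], POPULATION)
             for day in history]
print(f"peak day: {peak_day}, cumulative attack rate: {attack_rate:.2%}")
```

The cumulative attack rate is read off as the fraction of the population that has left the susceptible compartment, and Rt falls below 1 once the susceptible share drops under gamma / beta.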