COMMA: Co-Articulated Multi-Modal Learning
Pretrained large-scale vision-language models such as CLIP have demonstrated
excellent generalizability over a series of downstream tasks. However, they are
sensitive to the variation of input text prompts and need a selection of prompt
templates to achieve satisfactory performance. Recently, various methods have
been proposed to dynamically learn the prompts as the textual inputs to avoid
the need for laborious hand-crafted prompt engineering in the fine-tuning
process. We notice that these methods are suboptimal in two aspects. First, the
prompts of the vision and language branches in these methods are usually
separated or uni-directionally correlated. Thus, the prompts of both branches
are not fully correlated and may not provide enough guidance to align the
representations of both branches. Second, it's observed that most previous
methods usually achieve better performance on seen classes but cause
performance degradation on unseen classes compared to CLIP. This is because
the essential generic knowledge learned in the pretraining stage is partly
forgotten in the fine-tuning process. In this paper, we propose Co-Articulated
Multi-Modal Learning (COMMA) to handle the above limitations. Specifically, our
method considers prompts from both branches to generate the prompts to enhance
the representation alignment of both branches. Besides, to alleviate forgetting
of the essential knowledge, we minimize the feature discrepancy between the
learned prompts and the embeddings of hand-crafted prompts in the pre-trained
CLIP in the late transformer layers. We evaluate our method across three
representative tasks of generalization to novel classes, new target datasets
and unseen domain shifts. Experimental results demonstrate the superiority of
our method by exhibiting a favorable performance boost on all tasks with high
efficiency. Comment: Accepted to AAAI2024. Code is available at
https://github.com/hulianyuyy/COMM
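The knowledge-preservation idea in this abstract (keeping learned-prompt features close to the frozen hand-crafted-prompt features in the late transformer layers) can be illustrated with a minimal sketch. The function name, the mean-squared-difference measure, and the `first_late_layer` parameter are assumptions made for illustration; the abstract does not specify the distance used or which layers count as late.

```python
def late_layer_discrepancy(learned, frozen, first_late_layer):
    """Mean squared difference between per-layer feature vectors.

    learned, frozen: lists of per-layer feature vectors (lists of floats),
    one entry per transformer layer. Only layers with index >=
    first_late_layer contribute. The squared-difference measure is an
    assumption, not taken from the abstract.
    """
    total, count = 0.0, 0
    for layer in range(first_late_layer, len(learned)):
        for a, b in zip(learned[layer], frozen[layer]):
            total += (a - b) ** 2
            count += 1
    return total / count if count else 0.0


# Toy features for three layers (synthetic values, for illustration only).
learned_feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
frozen_feats = [[5.0, 5.0], [1.0, 1.0], [2.0, 4.0]]

# Penalize only the last two layers; the early-layer mismatch is ignored.
late_loss = late_layer_discrepancy(learned_feats, frozen_feats, first_late_layer=1)
```

In a training loop a term like this would be added to the task loss, so that early layers stay free to adapt while late layers are pulled toward the pretrained representation.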
AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition
Raw videos have been shown to contain considerable feature redundancy: in
many cases only a portion of frames can already meet the requirements for
accurate recognition. In this paper, we are interested in whether such
redundancy can be effectively leveraged to facilitate efficient inference in
continuous sign language recognition (CSLR). We propose a novel adaptive model
(AdaBrowse) to dynamically select the most informative subsequence from input
video sequences by modelling this problem as a sequential decision task.
Specifically, we first utilize a lightweight network to quickly scan input videos
to extract coarse features. Then these features are fed into a policy network
to intelligently select a subsequence to process. The corresponding subsequence
is finally inferred by a normal CSLR model for sentence prediction. As only a
portion of frames are processed in this procedure, the total computation can
be considerably reduced. Besides temporal redundancy, we are also interested in
whether the inherent spatial redundancy can be seamlessly integrated together
to achieve further efficiency, i.e., dynamically selecting the lowest input
resolution for each sample; the resulting model is referred to as AdaBrowse+. Extensive
experimental results on four large-scale CSLR datasets, i.e., PHOENIX14,
PHOENIX14-T, CSL-Daily and CSL, demonstrate the effectiveness of AdaBrowse and
AdaBrowse+ by achieving comparable accuracy with state-of-the-art methods with
1.44× throughput and 2.12× fewer FLOPs. Comparisons with other
commonly-used 2D CNNs and adaptive efficient methods verify the effectiveness
of AdaBrowse. Code is available at
https://github.com/hulianyuyy/AdaBrowse. Comment: ACMMM202
NCACO-score: An effective main-chain dependent scoring function for structure modeling
Background: Development of effective scoring functions is a critical component to the success of protein structure modeling. Previously, many efforts have been dedicated to the development of scoring functions. Despite these efforts, development of an effective scoring function that can achieve both good accuracy and fast speed still presents a grand challenge. Results: Based on a coarse-grained representation of a protein structure by using only four main-chain atoms: N, Cα, C and O, we develop a knowledge-based scoring function, called NCACO-score, that integrates different structural information to rapidly model protein structure from sequence. In testing on the Decoys'R'Us sets, we found that NCACO-score can effectively recognize native conformers from their decoys. Furthermore, we demonstrate that NCACO-score can effectively guide fragment assembly for protein structure prediction, which has achieved a good performance in building the structure models for hard targets from CASP8 in terms of both accuracy and speed. Conclusions: Although NCACO-score is developed based on a coarse-grained model, it is able to discriminate native conformers from decoy conformers with high accuracy. NCACO is a very effective scoring function for structure modeling.
A centimeter-scale achromatic hybrid metalens with polarization-insensitivity in the visible
Metalenses, featuring ultra-compactness and CMOS compatibility, are limited
by the compromise between the diameter, numerical aperture, and working
waveband. To address this problem, we propose and numerically demonstrate a
centimeter-scale metasurface-refractive hybrid metalens working in the band of
440 - 700 nm. Revisiting the general Snell law, we present the phase profile of
a chromatic aberration correction metasurface that can apply to a plano-convex
refractive lens of an arbitrary surface type. Simulated by our semi-vector
method, the designed achromatic hybrid metalens achieves 81% chromatic
aberration suppression and polarization insensitivity. Broadband imaging
results of the hybrid metalens are further provided, verifying the achromatism
of the designed hybrid metalens. It can find applications in camera lenses and
other optical systems that need compact, high-performance lenses. Comment: 10 pages, 5 figures
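For reference, the generalized law of refraction for an interface that imposes a position-dependent phase is commonly written as below. The symbols (refractive indices n_i and n_t, angles θ_i and θ_t, vacuum wavelength λ_0, interface phase Φ, in-plane coordinate x) are notation assumed here rather than taken from the abstract, and the paper's own correction phase profile is not reproduced.

```latex
\[
n_t \sin\theta_t - n_i \sin\theta_i = \frac{\lambda_0}{2\pi}\,\frac{\mathrm{d}\Phi}{\mathrm{d}x}
\]
```

Setting the phase gradient to zero recovers the ordinary Snell law, which is why a metasurface phase term can be layered on top of a conventional refractive surface.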
Efficient wide-bandgap perovskite solar cells with open-circuit voltage deficit below 0.4 V via hole-selective interface engineering
Wide-bandgap mixed-halide perovskite solar cells (WBG-PSCs) are promising top cells for efficient tandem photovoltaics to achieve high power conversion efficiency (PCE) at low cost. However, the open-circuit voltage (VOC) of WBG-PSCs is still unsatisfactory as the VOC-deficit is generally larger than 0.45 V. Herein, we report a buried interface engineering strategy that substantially improves the VOC of WBG-PSCs by inserting amphiphilic molecular hole-selective materials featuring a cyanovinyl phosphonic acid (CPA) anchoring group between the perovskite and substrate. The assembly and redistribution of CPA-based amphiphilic molecules at the perovskite-substrate buried interface not only promote the growth of a low-defect crystalline perovskite thin film, but also suppress the photo-induced halide phase separation. The energy level alignment between wide-bandgap perovskite and the hole-selective layer is further improved by modulating the substituents on the triphenylamine donor moiety (methoxyls for MPA-CPA, methyls for MePA-CPA, and bare TPA-CPA). Using a 1.68 eV bandgap perovskite, the MePA-CPA-based devices achieved an unprecedentedly high VOC of 1.29 V and PCE of 22.3% under standard AM 1.5 sunlight. The VOC-deficit (<0.40 V) is the lowest value reported for WBG-PSCs. This work not only provides an effective approach to decreasing the VOC-deficit of WBG-PSCs, but also confirms the importance of energy level alignment at the charge-selective layers in PSCs.
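The reported figures can be checked against each other under the usual definition of the VOC-deficit as the bandgap voltage minus the open-circuit voltage. That definition is assumed here, since the abstract does not state it explicitly.

```latex
\[
V_{\mathrm{OC}}^{\mathrm{deficit}} = \frac{E_g}{q} - V_{\mathrm{OC}}
= 1.68\,\mathrm{V} - 1.29\,\mathrm{V} = 0.39\,\mathrm{V} < 0.40\,\mathrm{V}
\]
```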
The Influence of Neighbourhood Environment on Airbnb: a Geographically Weighed Regression Analysis
Sharing accommodation has emerged recently as a new business model in the accommodation sector. Due to the potential gentrification Airbnb might bring to an area, it is critical to understand the spatial patterns of the sharing economy and its possible determinants. The neighbourhood environment has proven to be an important factor in the traditional hotel business, and whether it is the same for sharing accommodation is worth investigating. In this study, location data of 29,780 houses/apartments on Airbnb.com in London was collected. Using Ordinary Least Squares and Geographically Weighted Regression analysis, the spatial distribution features of Airbnb and its relationship with neighbourhood environment in London were explored. The results show that sharing accommodation is mainly located in the city centre and around tourist attractions. Neighbourhood elements such as Water, Vegetation Coverage, Art & Human Landscape, Travel & Transport, University, and Nightlife Spot emerged as important factors influencing Airbnb. In addition, the distribution of Airbnb in London is spatially non-stationary: in some areas high Airbnb is associated with higher transportation accessibility, while in other areas high Airbnb is associated with more attractions or nightlife spots, suggesting that the role of different factors varies in different regions, proving Tobler’s first law of geography.
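The difference between a single global regression and a geographically weighted one can be sketched on synthetic data. This is a minimal illustration, not the study's analysis: the Gaussian distance kernel, the fixed bandwidth, the single predictor, and all data values are assumptions. Each local fit is a weighted least-squares regression in which observations closer to the target location receive larger weights.

```python
import math


def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted simple regression centred on `target`.

    Returns (intercept, slope). Weights follow a Gaussian kernel of the
    distance between each observation and the target location.
    """
    weights = [
        math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords
    ]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic example: the response follows y = x in the west, y = 3x in the east.
coords = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 0), (10, 1), (11, 0), (11, 1)]
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 4.0, 3.0, 6.0, 9.0, 12.0]

_, west_slope = gwr_local_fit(coords, x, y, target=(0.5, 0.5), bandwidth=2.0)
_, east_slope = gwr_local_fit(coords, x, y, target=(10.5, 0.5), bandwidth=2.0)
# A very large bandwidth gives every point equal weight, i.e. a global fit.
_, global_slope = gwr_local_fit(coords, x, y, target=(5.0, 0.5), bandwidth=1e9)
```

The global fit averages the two regimes into one coefficient, while the local fits recover a different slope on each side. Coefficients that change with location are what "spatially non-stationary" refers to.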
Clinical, radiological, and laboratory features of HIV-negative pulmonary cryptococcosis with regard to serum lateral flow assay
Introduction: Cryptococcosis is the second most common invasive yeast infection in China. Pulmonary cryptococcosis (PC) is difficult to diagnose due to the lack of specific clinical features and the limitation of diagnostic techniques. Although lateral flow assay was very useful in diagnosing cryptococcal infection, quite a few patients with PC presented negative serum lateral flow assay (sLFA). Methods: We conducted a retrospective study of HIV-negative patients who were diagnosed with PC in our hospital over the past decade to explore the potential relationship between the clinical profiles and sLFA in PC. Results: In total, 112 patients with sLFA tested were enrolled in this study, of which 58.93% were male. The positivity rate of sLFA for PC was 91.07%. The extent of pulmonary lesions was positively correlated with sLFA grade (Spearman r = 0.268, p < 0.01). Solitary nodule (SN) and pneumonia were the most common imaging findings in PC with negative and positive sLFA respectively. Among 65 symptomatic PC patients, 14 presented with fever and had higher hypersensitive C-reactive protein (hsCRP) level and more extensive pulmonary involvement (Mann-Whitney U test, p < 0.05) than those without fever. Symptomatic PC patients were more likely to have positive results of sLFA (Mann-Whitney U test, p = 0.05) compared against asymptomatic ones. Discussion: In conclusion, negative sLFA cannot exclude PC in patients with a solitary nodule in lung. Positive sLFA is more reliable in diagnosing PC in symptomatic patients with diffuse lesions in lung who generally experience a more severe systemic inflammatory reaction.
Kinetic-MHD hybrid simulation of fishbone modes excited by fast ions on the experimental advanced superconducting tokamak (EAST)
Kinetic-MagnetoHydroDynamic hybrid simulations are carried out to investigate fishbone modes excited by fast ions on the Experimental Advanced Superconducting Tokamak. The simulations use a realistic equilibrium reconstructed from experimental data with the constraint of the q = 1 surface location (q is the safety factor). An anisotropic slowing-down distribution is used to model the distribution of the fast ions from neutral beam injection. The resonance condition is used to identify the interaction between the fishbone mode and the fast ions, which shows that the fishbone mode is simultaneously in resonance with the bounce motion of the trapped particles and the transit motion of the passing particles. Both the passing and trapped particles are important in destabilizing the fishbone mode. The simulations show that the mode frequency chirps down as the mode reaches the nonlinear stage, during which there is a substantial flattening of the perpendicular pressure of fast ions, compared with that of the parallel pressure. For passing particles, the resonance remains within the q = 1 surface, while, for trapped particles, the resonant location moves out radially during the nonlinear evolution. In addition, parameter scanning is performed to examine the dependence of the linear frequency and growth rate of fishbones on the pressure and injection velocity of fast ions.
1,7-Dihydroxy-2,3,4-trimethoxy-9H-xanthen-9-one monohydrate from Halenia elliptica
The title compound, C16H14O7·H2O, possesses a planar three-ring skeleton; its carbonyl, one of the two hydroxy and two of the three methoxy O atoms and the water molecule form hydrogen bonds, giving rise to a layer structure.