Lexical Retrieval Hypothesis in Multimodal Context
Multimodal corpora have become an essential language resource for language
science and grounded natural language processing (NLP) systems due to the
growing need to understand and interpret human communication across various
channels. In this paper, we first present our efforts in building the first
Multimodal Corpus for Languages in Taiwan (MultiMoco). Based on the corpus, we
conduct a case study investigating the Lexical Retrieval Hypothesis (LRH),
specifically examining whether the hand gestures co-occurring with speech
constants facilitate lexical retrieval or serve other discourse functions. With
detailed annotations on eight parliamentary interpellations in Taiwan Mandarin,
we explore the co-occurrence between speech constants and non-verbal features
(i.e., head movement, face movement, hand gesture, and function of hand
gesture). Our findings suggest that while hand gestures do serve as
facilitators for lexical retrieval in some cases, they also serve the purpose
of information emphasis. This study highlights the potential of the MultiMoco
Corpus to provide an important resource for in-depth analysis and further
research in multimodal communication studies.
Exploring Affordance and Situated Meaning in Image Captions: A Multimodal Analysis
This paper explores the grounding issue regarding multimodal semantic
representation from a computational cognitive-linguistic view. We annotate
images from the Flickr30k dataset with five perceptual properties: Affordance,
Perceptual Salience, Object Number, Gaze Cueing, and Ecological Niche
Association (ENA), and examine their association with textual elements in the
image captions. Our findings reveal that images with Gibsonian affordance show
a higher frequency of captions containing 'holding-verbs' and 'container-nouns'
compared to images displaying telic affordance. Perceptual Salience, Object
Number, and ENA are also associated with the choice of linguistic expressions.
Our study demonstrates that comprehensive understanding of objects or events
requires cognitive attention, semantic nuances in language, and integration
across multiple modalities. We highlight the vital importance of situated
meaning and affordance grounding in natural language understanding, with the
potential to advance human-like interpretation in various scenarios.
Comment: 10 pages, 9 figures
D-STAR: Dual Simultaneously Transmitting and Reflecting Reconfigurable Intelligent Surfaces for Joint Uplink/Downlink Transmission
The joint uplink/downlink (JUD) design of simultaneously transmitting and
reflecting reconfigurable intelligent surfaces (STAR-RIS) is conceived in
support of both uplink (UL) and downlink (DL) users. Furthermore, the dual
STAR-RISs (D-STAR) concept is conceived as a promising architecture for
360-degree full-plane service coverage, including UL/DL users located between
the base station (BS) and the D-STAR as well as beyond. The corresponding
regions are termed the primary (P) and secondary (S) regions. Both the BS and
users are located in the P-region, whereas only users are located in the
S-region. The primary
STAR-RIS (STAR-P) plays an important role in terms of tackling the P-region
inter-user interference, the self-interference (SI) from the BS and from the
reflective as well as refractive UL users imposed on the DL receiver. By
contrast, the secondary STAR-RIS (STAR-S) aims to mitigate the S-region
interference. The non-linear and non-convex rate-maximization problem
formulated is solved by alternating optimization amongst the decomposed convex
sub-problems of the BS beamformer, and the D-STAR amplitude as well as phase
shift configurations. We also propose a D-STAR based active beamforming and
passive STAR-RIS amplitude/phase (DBAP) optimization scheme to solve the
respective sub-problems by Lagrange dual with Dinkelbach's transformation,
alternating direction method of multipliers (ADMM) with successive convex
approximation (SCA), and penalty convex-concave procedure (PCCP). Our
simulation results reveal that the proposed D-STAR architecture outperforms the
conventional single RIS, single STAR-RIS, and half-duplex networks. The
proposed DBAP of D-STAR outperforms the state-of-the-art solutions found in the
open literature for different numbers of quantization levels, geographic
deployment, transmit power and for diverse numbers of transmit antennas, patch
partitions as well as D-STAR elements.
Comment: Accepted by IEEE TCO
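The abstract above names Dinkelbach's transformation as one ingredient of the DBAP scheme. As a rough illustration of that general technique only, and not of the authors' beamforming formulation, the sketch below maximizes a ratio `numer(x) / denom(x)` over a finite candidate set by repeatedly solving the parametric problem `numer(x) - lam * denom(x)`. The function name `dinkelbach`, the candidate grid, and the toy objective in the usage note are all assumptions made for this example.

```python
import math


def dinkelbach(numer, denom, candidates, tol=1e-9, max_iter=100):
    """Maximize numer(x) / denom(x) over a finite candidate list.

    Assumes denom(x) > 0 for every candidate. This is a generic toy
    version of Dinkelbach's method, not the optimization in the paper.
    """
    lam = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        # Parametric subproblem: maximize numer(x) - lam * denom(x).
        best = max(candidates, key=lambda x: numer(x) - lam * denom(x))
        gap = numer(best) - lam * denom(best)
        if gap < tol:
            # The parametric optimum is (numerically) zero, so lam is
            # the best achievable ratio on this candidate set.
            break
        # Otherwise update lam to the ratio attained by the new point.
        lam = numer(best) / denom(best)
    return best, numer(best) / denom(best)
```

For instance, with a hypothetical ratio `log2(1 + p) / (p + 1)` on a grid of `p` values, `dinkelbach(lambda p: math.log2(1 + p), lambda p: p + 1.0, grid)` returns the grid point with the largest ratio together with that ratio. In a real system each subproblem would be a continuous optimization rather than a search over a list.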
Retinal capillary perfusion heterogeneity in diabetic retinopathy detected by optical coherence tomography angiography
Background:
Diabetic retinopathy (DR) is a leading cause of blindness and involves retinal capillary damage, microaneurysms, and altered blood flow regulation. Optical coherence tomography angiography (OCTA) is a non-invasive way of visualizing retinal vasculature but has not been used extensively to study blood flow heterogeneity. The purpose of this study is to detect and quantify blood flow heterogeneity utilizing en-face swept source OCTA in patients with DR.
Methods:
This is a prospective clinical study which examined patients with either type 1 or 2 diabetes mellitus. Each included eye was graded clinically as no DR, mild DR, or moderate-severe DR. Ten consecutive en face 6 × 6 mm foveal SS-OCTA images were obtained from each eye using a PLEX Elite 9000 (Zeiss Meditec, Dublin, CA). Built-in fixation-tracking and follow-up functions were utilized to reduce motion artifacts and ensure that the same location was imaged in sequential frames. Images of the superficial and deep vascular complexes (SVC and DVC) were arranged in temporal stacks of 10 and registered to a reference frame for segmentation using a deep neural network. The vessel segmentation was then masked onto each stack to calculate the pixel intensity coefficient of variance (PICoV) and map the spatiotemporal perfusion heterogeneity of each stack.
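The masking and PICoV step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it reads "coefficient of variance" as the usual coefficient of variation (standard deviation divided by mean) taken per pixel across the registered frames, uses the population standard deviation, and leaves pixels outside the vessel mask or with zero mean intensity undefined. The function names `picov_map` and `mean_picov` are hypothetical.

```python
from statistics import mean, pstdev


def picov_map(stack, mask):
    """Per-pixel coefficient of variation across a temporal stack.

    stack: list of registered frames, each a 2D list of intensities.
    mask:  2D list of booleans, True where a vessel was segmented.
    Returns a 2D list with the CV for masked pixels and None elsewhere.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue  # pixel is outside the vessel segmentation
            series = [frame[r][c] for frame in stack]
            mu = mean(series)
            # Assumption: CV is left undefined when the mean is zero.
            out[r][c] = pstdev(series) / mu if mu > 0 else None
    return out


def mean_picov(cv_map):
    """Average the defined CV values into one summary number."""
    values = [v for row in cv_map for v in row if v is not None]
    return mean(values) if values else None
```

A higher per-pixel value means the intensity at that vessel location fluctuated more across the ten frames; the summary in `mean_picov` is one possible way to reduce the map to a single number per stack and is likewise an assumption.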
Results:
Twenty-nine eyes were included: 7 controls, 7 diabetics with no DR, 8 mild DR, and 7 moderate-severe DR. The PICoV correlated significantly and positively with DR severity. In patients with DR, the perfusion heterogeneity was higher in the temporal half of the macula, particularly in areas of capillary dropout. PICoV also correlated as expected with the established OCTA metrics of perfusion density and vessel density.
Conclusion:
PICoV is a novel way to analyze OCTA imaging and quantify perfusion heterogeneity. Retinal capillary perfusion heterogeneity in both the SVC and DVC increased with DR severity. This may be related to the loss of retinal capillary perfusion autoregulation in diabetic retinopathy.