
    Lexical Retrieval Hypothesis in Multimodal Context

    Multimodal corpora have become an essential language resource for language science and grounded natural language processing (NLP) systems due to the growing need to understand and interpret human communication across various channels. In this paper, we first present our efforts in building the first Multimodal Corpus for Languages in Taiwan (MultiMoco). Based on the corpus, we conduct a case study investigating the Lexical Retrieval Hypothesis (LRH), specifically examining whether the hand gestures co-occurring with speech constants facilitate lexical retrieval or serve other discourse functions. With detailed annotations on eight parliamentary interpellations in Taiwan Mandarin, we explore the co-occurrence between speech constants and non-verbal features (i.e., head movement, face movement, hand gesture, and function of hand gesture). Our findings suggest that while hand gestures do serve as facilitators for lexical retrieval in some cases, they also serve the purpose of information emphasis. This study highlights the potential of the MultiMoco Corpus to provide an important resource for in-depth analysis and further research in multimodal communication studies.

    Exploring Affordance and Situated Meaning in Image Captions: A Multimodal Analysis

    This paper explores the grounding issue regarding multimodal semantic representation from a computational cognitive-linguistic view. We annotate images from the Flickr30k dataset with five perceptual properties: Affordance, Perceptual Salience, Object Number, Gaze Cueing, and Ecological Niche Association (ENA), and examine their association with textual elements in the image captions. Our findings reveal that images with Gibsonian affordance show a higher frequency of captions containing 'holding-verbs' and 'container-nouns' compared to images displaying telic affordance. Perceptual Salience, Object Number, and ENA are also associated with the choice of linguistic expressions. Our study demonstrates that comprehensive understanding of objects or events requires cognitive attention, semantic nuances in language, and integration across multiple modalities. We highlight the vital importance of situated meaning and affordance grounding in natural language understanding, with the potential to advance human-like interpretation in various scenarios. Comment: 10 pages, 9 figures

    D-STAR: Dual Simultaneously Transmitting and Reflecting Reconfigurable Intelligent Surfaces for Joint Uplink/Downlink Transmission

    The joint uplink/downlink (JUD) design of simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) is conceived in support of both uplink (UL) and downlink (DL) users. Furthermore, the dual STAR-RISs (D-STAR) concept is conceived as a promising architecture for 360-degree full-plane service coverage, including UL/DL users located between the base station (BS) and the D-STAR as well as beyond. The corresponding regions are termed the primary (P) and secondary (S) regions. Both the BS and users exist in the P-region, but only users are located in the S-region. The primary STAR-RIS (STAR-P) plays an important role in tackling the P-region inter-user interference, the self-interference (SI) from the BS and from the reflective as well as refractive UL users imposed on the DL receiver. By contrast, the secondary STAR-RIS (STAR-S) aims to mitigate the S-region interference. The non-linear and non-convex rate-maximization problem formulated is solved by alternating optimization amongst the decomposed convex sub-problems of the BS beamformer, and the D-STAR amplitude as well as phase shift configurations. We also propose a D-STAR based active beamforming and passive STAR-RIS amplitude/phase (DBAP) optimization scheme to solve the respective sub-problems by Lagrange dual with Dinkelbach's transformation, alternating direction method of multipliers (ADMM) with successive convex approximation (SCA), and penalty convex-concave procedure (PCCP). Our simulation results reveal that the proposed D-STAR architecture outperforms the conventional single RIS, single STAR-RIS, and half-duplex networks. The proposed DBAP of D-STAR outperforms the state-of-the-art solutions found in the open literature for different numbers of quantization levels, geographic deployment, transmit power and for diverse numbers of transmit antennas, patch partitions as well as D-STAR elements. Comment: Accepted by IEEE TCO

    Retinal capillary perfusion heterogeneity in diabetic retinopathy detected by optical coherence tomography angiography

    Background: Diabetic retinopathy (DR) is a leading cause of blindness and involves retinal capillary damage, microaneurysms, and altered blood flow regulation. Optical coherence tomography angiography (OCTA) is a non-invasive way of visualizing retinal vasculature but has not been used extensively to study blood flow heterogeneity. The purpose of this study is to detect and quantify blood flow heterogeneity utilizing en-face swept source OCTA in patients with DR. // Methods: This is a prospective clinical study which examined patients with either type 1 or 2 diabetes mellitus. Each included eye was graded clinically as no DR, mild DR, or moderate-severe DR. Ten consecutive en face 6 × 6 mm foveal SS-OCTA images were obtained from each eye using a PLEX Elite 9000 (Zeiss Meditec, Dublin, CA). Built-in fixation-tracking and follow-up functions were utilized to reduce motion artifacts and ensure same location imaging in sequential frames. Images of the superficial and deep vascular complexes (SVC and DVC) were arranged in temporal stacks of 10 and registered to a reference frame for segmentation using a deep neural network. The vessel segmentation was then masked onto each stack to calculate the pixel intensity coefficient of variance (PICoV) and map the spatiotemporal perfusion heterogeneity of each stack. // Results: Twenty-nine eyes were included: 7 controls, 7 diabetics with no DR, 8 mild DR, and 7 moderate-severe DR. The PICoV correlated significantly and positively with DR severity. In patients with DR, the perfusion heterogeneity was higher in the temporal half of the macula, particularly in areas of capillary dropout. PICoV also correlated as expected with the established OCTA metrics of perfusion density and vessel density. // Conclusion: PICoV is a novel way to analyze OCTA imaging and quantify perfusion heterogeneity. Retinal capillary perfusion heterogeneity in both the SVC and DVC increased with DR severity. This may be related to the loss of retinal capillary perfusion autoregulation in diabetic retinopathy.
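
    The PICoV computation described in the Methods above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the metric is the per-pixel standard deviation divided by the per-pixel mean across the registered frames, evaluated only at pixels inside the vessel mask. The function names `picov_map` and `mean_picov`, the use of the population standard deviation, and the handling of zero-mean pixels are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def picov_map(stack, vessel_mask):
    """Per-pixel coefficient of variation across a registered temporal stack.

    stack: list of frames, each a 2D list of pixel intensities (same shape).
    vessel_mask: 2D list of booleans, True where a vessel was segmented.
    Returns a 2D list holding std/mean at vessel pixels and None elsewhere.
    (Assumed definition; the abstract does not spell out the formula.)
    """
    rows, cols = len(vessel_mask), len(vessel_mask[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not vessel_mask[r][c]:
                continue  # background pixels are excluded from the map
            series = [frame[r][c] for frame in stack]
            mu = mean(series)
            # Pixels with zero mean intensity are skipped to avoid dividing by zero.
            out[r][c] = pstdev(series) / mu if mu > 0 else None
    return out


def mean_picov(cov_map):
    """Average the map over all pixels that received a value."""
    values = [v for row in cov_map for v in row if v is not None]
    return mean(values) if values else None


# Toy 2 x 2 example with two frames (the study used stacks of 10).
toy_stack = [
    [[10, 10], [5, 0]],
    [[10, 30], [5, 0]],
]
toy_mask = [[True, True], [False, True]]
toy_map = picov_map(toy_stack, toy_mask)
print(toy_map, mean_picov(toy_map))
```

    In this toy case the stable pixel scores 0, the fluctuating pixel scores 0.5, and the masked-out and zero-signal pixels are left empty, so a higher stack average corresponds to more temporal variability within the segmented vessels.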