
    Multi-Mode Clustering for Graph-Based Lifelog Retrieval

    As part of the 6th Lifelog Search Challenge, this paper presents an approach to arranging Lifelog data in a multi-modal knowledge graph based on cluster hierarchies. We use multiple sequence clustering approaches to address the multi-modal nature of Lifelogs with respect to temporal, spatial, and visual factors. The resulting clusters, along with semantic metadata captions and augmentations based on OpenCLIP, provide the semantic structure of a graph that includes all Lifelogs as entries. Textual queries against this hierarchical graph can retrieve individual Lifelogs as well as clusters of Lifelogs.
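
    The pipeline described above lends itself to a compact illustration. The following is a minimal sketch, assuming hypothetical entry fields (timestamps, GPS coordinates, OpenCLIP-style embeddings) and illustrative thresholds rather than the authors' actual configuration: entries are clustered per modality and linked to cluster nodes in a small graph.

```python
# Illustrative sketch of per-modality clustering of lifelog entries and a simple
# knowledge graph over the results. All field names, thresholds, and cluster
# counts are placeholder assumptions, not values from the paper.
import numpy as np
import networkx as nx
from sklearn.cluster import DBSCAN, KMeans

rng = np.random.default_rng(0)
n = 200
timestamps = np.sort(rng.uniform(0, 86400, n))        # seconds within a day
locations = rng.normal([48.2, 16.4], 0.01, (n, 2))    # lat/lon around one city
clip_embeddings = rng.normal(size=(n, 512))           # stand-in for OpenCLIP vectors

# Temporal clusters: split whenever the gap between consecutive entries exceeds 15 min.
gaps = np.diff(timestamps) > 900
temporal_labels = np.concatenate([[0], np.cumsum(gaps)])

# Spatial clusters: density-based grouping of GPS coordinates.
spatial_labels = DBSCAN(eps=0.005, min_samples=5).fit_predict(locations)

# Visual clusters: k-means over image embeddings.
visual_labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(clip_embeddings)

# Build a small graph: each entry node links to one cluster node per modality.
g = nx.Graph()
for i in range(n):
    g.add_edge(f"entry:{i}", f"time:{temporal_labels[i]}")
    g.add_edge(f"entry:{i}", f"space:{spatial_labels[i]}")
    g.add_edge(f"entry:{i}", f"visual:{visual_labels[i]}")

print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")
```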

    Risk Spillovers in Returns for Chinese and International Tourists to Taiwan

    Fluctuations in the numbers of visitors directly affect the rates of return on tourism business activities. This article is the first to examine the spillover effects between the rate of change in the number of Chinese tourist arrivals and the rate of change in the number of international traveler arrivals. The study used daily data from January 1, 2014, to October 31, 2016, with 1,035 observations. The diagonal BEKK model was used to analyze the co-volatility spillover effects between the rate of change in the number of international travelers and the rate of change in the number of Chinese tourists visiting Taiwan. The empirical findings suggest that Taiwan should abandon its development strategy of focusing only on a single market, namely China. Moreover, with the reduction in Chinese tour groups visiting Taiwan and the increase in individual travelers, the Taiwan Government should change its previous travel policies of mainly attracting Chinese tour-group travelers and actively promote in-depth tourism among international tourists.
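
    For context, the diagonal BEKK(1,1) model referenced above has a standard conditional-covariance specification; the form below is the usual textbook notation, not necessarily the paper's exact parameterization.

```latex
% Diagonal BEKK(1,1): conditional covariance of the two return-change series,
% with shocks $\varepsilon_t$, lower-triangular $C$, and diagonal $A$, $B$.
\begin{equation}
  H_t = C'C + A'\,\varepsilon_{t-1}\varepsilon_{t-1}'\,A + B'\,H_{t-1}\,B ,
\end{equation}
% In the diagonal case, the co-volatility spillover from series $j$ to series $i$
% is governed by $a_{ii} a_{jj}\,\varepsilon_{i,t-1}\varepsilon_{j,t-1}$.
```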

    Standardized low-power wireless communication technologies for distributed sensing applications

    Recent standardization efforts on low-power wireless communication technologies, including time-slotted channel hopping (TSCH) and DASH7 Alliance Mode (D7AM), are starting to change industrial sensing applications, enabling networks to scale up to thousands of nodes whilst achieving high reliability. Past technologies, such as ZigBee, rooted in IEEE 802.15.4, and ISO 18000-7, rooted in frame-slotted ALOHA (FSA), are based on contention-based medium access control (MAC) layers and have very poor performance in dense networks, thus preventing the Internet of Things (IoT) paradigm from really taking off. Industrial sensing applications, such as those being deployed in oil refineries, have stringent requirements on data reliability and are being built using new standards. Despite the benefits of these new technologies, industrial shifts are not happening due to the enormous technology development and adoption costs and the fact that the new standards are not well known and completely understood. In this article, we provide a deep analysis of TSCH and D7AM, outlining operational and implementation details with the aim of facilitating the adoption of these technologies by sensor application developers.
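
    To make the TSCH mechanism concrete, the sketch below models a slotframe as a set of (timeslot, channel-offset) cells and applies the standard channel-hopping rule; the slotframe length, hopping sequence, and link assignments are illustrative assumptions, not values from the article.

```python
# Illustrative sketch of TSCH scheduling: cells are (timeslot, channel offset)
# pairs, and the physical channel for a cell changes every slotframe iteration via
# channel = HOP_SEQ[(ASN + channel_offset) % len(HOP_SEQ)].
# Slotframe length, hopping sequence, and links are made-up example values.

HOP_SEQ = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]  # IEEE 802.15.4 channels
SLOTFRAME_LENGTH = 101  # timeslots per slotframe (example value)

# Example schedule: (transmitter, receiver) -> dedicated cell (timeslot, channel offset)
schedule = {
    ("A", "B"): (3, 5),
    ("B", "C"): (7, 2),
    ("C", "A"): (20, 9),
}

def physical_channel(asn: int, channel_offset: int) -> int:
    """Map an absolute slot number (ASN) and channel offset to a radio channel."""
    return HOP_SEQ[(asn + channel_offset) % len(HOP_SEQ)]

# Show how the same cell hops across channels in consecutive slotframes.
for iteration in range(3):
    for (tx, rx), (slot, offset) in schedule.items():
        asn = iteration * SLOTFRAME_LENGTH + slot
        print(f"slotframe {iteration}: {tx}->{rx} in slot {slot} on channel {physical_channel(asn, offset)}")
```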

    Datasheet for subjective and objective quality assessment datasets

    Over the years, many subjective and objective quality assessment datasets have been created and made available to the research community. However, there is no standard process for documenting the various aspects of a dataset, such as details about the source sequences, the number of test subjects, the test methodology, the encoding settings, etc. Such information is often of great importance to the users of the dataset, as it can help them get a quick understanding of its motivation and scope. Without such a template, it is left to each reader to collate the information from the relevant publication or website, which is a tedious and time-consuming process. In some cases, the absence of a template to guide the documentation process can result in the unintentional omission of important information. This paper addresses this simple but significant gap by proposing a datasheet template for documenting various aspects of subjective and objective quality assessment datasets for multimedia data. The contributions presented in this work aim to simplify the documentation process for existing and new datasets and improve their reproducibility. The proposed datasheet template is available on GitHub, along with sample datasheets for a few open-source audiovisual subjective and objective datasets.
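
    As an illustration of the kind of information such a datasheet captures, a minimal machine-readable version might look like the following; the field names are plausible examples inferred from the abstract, not the fields of the actual proposed template.

```python
# Hypothetical, minimal machine-readable datasheet for a quality assessment
# dataset. Field names are illustrative; the actual proposed template is on GitHub.
from dataclasses import dataclass, field, asdict
from typing import List
import json

@dataclass
class QualityDatasheet:
    name: str
    dataset_type: str                 # "subjective" or "objective"
    source_sequences: int             # number of pristine source sequences
    test_conditions: int              # e.g. encoding settings / degradation levels
    num_subjects: int                 # 0 for purely objective datasets
    test_methodology: str             # e.g. "ACR", "DCR", "pairwise comparison"
    encoding_settings: List[str] = field(default_factory=list)
    license: str = "unspecified"

sheet = QualityDatasheet(
    name="ExampleVQA",
    dataset_type="subjective",
    source_sequences=24,
    test_conditions=120,
    num_subjects=30,
    test_methodology="ACR",
    encoding_settings=["H.264 @ 1-8 Mbps", "1080p/720p/480p"],
)
print(json.dumps(asdict(sheet), indent=2))
```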

    A Study on the Suitability of Visual Languages for Non-Expert Robot Programmers

    A visual programming language allows users and developers to create programs by manipulating program elements graphically. Several studies have shown the benefits of visual languages for learning purposes and their applicability to robot programming. However, at present, there are not enough comparative studies on the suitability of textual and visual languages for this purpose. In this paper, we study whether, as with a textual language, the use of a visual language could also be suitable in the context of robot programming and, if so, what the main advantages of using a visual language would be. For our experiments, we selected a sample of 60 individuals among students with adequate knowledge of procedural programming, which was divided into three groups. For the first group of 20 students, a learning scenario based on a textual object-oriented language was used for programming a specific commercial robotic ball with sensing, wireless communication, and output capabilities, whereas for the second and third groups, two learning scenarios based on visual languages were used for programming the robot. After taking a course on programming the robot in the corresponding learning scenario, each group was evaluated by completing three programming exercises related to the robot features (i.e., motion, lighting, and collision detection). Our results show that the students who worked with visual languages perceived a higher level of clarity in their understanding of the course exposition and a higher level of enjoyment in the use of the programming environment. Moreover, they also achieved an overall better mark.
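
    For a sense of what the textual-language scenario involves, the sketch below exercises the three evaluated features (motion, lighting, collision detection) against a purely hypothetical robotic-ball API; it is not the SDK or course material used in the study.

```python
# Purely illustrative textual program for a robotic ball with motion, lighting,
# and collision-detection capabilities. The Robot class below is a hypothetical
# stand-in, not the actual SDK used in the course.
import time

class Robot:
    """Minimal stand-in for a robotic-ball SDK (hypothetical API)."""
    def roll(self, heading_deg: int, speed: int, duration_s: float) -> None:
        print(f"rolling at heading {heading_deg} deg, speed {speed}, for {duration_s}s")
        time.sleep(duration_s)
    def set_color(self, r: int, g: int, b: int) -> None:
        print(f"LED set to ({r}, {g}, {b})")
    def on_collision(self, handler) -> None:
        self._collision_handler = handler  # would be invoked by the firmware

robot = Robot()

# Exercise 1: motion - drive a square path.
for heading in (0, 90, 180, 270):
    robot.roll(heading, speed=100, duration_s=1.0)

# Exercise 2: lighting - signal completion in green.
robot.set_color(0, 255, 0)

# Exercise 3: collision detection - flash red when a collision is reported.
robot.on_collision(lambda: robot.set_color(255, 0, 0))
```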

    Filamentary Network and Magnetic Field Structures Revealed with BISTRO in the High-Mass Star-Forming Region NGC2264 : Global Properties and Local Magnetogravitational Configurations

    We report 850 μm continuum polarization observations toward the filamentary high-mass star-forming region NGC 2264, taken as part of the B-fields In STar forming Regions Observations (BISTRO) large program on the James Clerk Maxwell Telescope (JCMT). These data reveal a well-structured, non-uniform magnetic field in the NGC 2264C and 2264D regions with a prevailing orientation around 30 deg from north to east. Field strength estimates and a virial analysis for the major clumps indicate that NGC 2264C is globally dominated by gravity, while in 2264D the magnetic, gravitational, and kinetic energies are roughly balanced. We present an analysis scheme that utilizes the locally resolved magnetic field structures, together with the locally measured gravitational vector field and the extracted filamentary network. From this, we infer statistical trends showing that this network consists of two main groups of filaments oriented approximately perpendicular to one another. Additionally, gravity shows one dominant converging direction that is roughly perpendicular to one of the filament orientations, which is suggestive of mass accretion along this direction. Beyond these statistical trends, we identify two types of filaments. The type-I filament is perpendicular to the magnetic field, with local gravity transitioning from parallel to perpendicular to the magnetic field from the outside to the filament ridge. The type-II filament is parallel to the magnetic field and local gravity. We interpret these two types of filaments as originating from the competition between radial collapse, driven by filament self-gravity, and longitudinal collapse, driven by the region's global gravity. Comment: Accepted for publication in the Astrophysical Journal; 43 pages, 32 figures, and 4 tables (including Appendix).
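
    For background on the quantities mentioned above, field strength estimates from dust polarization and clump virial analyses commonly use the following standard relations; the paper's exact prescription may differ.

```latex
% Davis-Chandrasekhar-Fermi estimate of the plane-of-sky field strength,
% with gas density $\rho$, velocity dispersion $\sigma_v$, polarization-angle
% dispersion $\sigma_\theta$, and correction factor $Q \approx 0.5$:
\begin{equation}
  B_\mathrm{pos} \simeq Q \sqrt{4\pi\rho}\,\frac{\sigma_v}{\sigma_\theta}.
\end{equation}
% Virial parameter of a clump of mass $M$ and radius $R$:
\begin{equation}
  \alpha_\mathrm{vir} = \frac{5\sigma_v^2 R}{G M}.
\end{equation}
```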

    Gesture retrieval and its application to the study of multimodal communication

    Comprehending communication depends on analyzing the different modalities of conversation, including audio, visual, and others. This is a natural process for humans, but in digital libraries, where the preservation and dissemination of digital information are crucial, it is a complex task. A rich conversational model, encompassing all modalities and their co-occurrences, is required to effectively analyze and interact with digital information. Currently, the analysis of co-speech gestures in videos is done through manual annotation by linguistic experts based on textual searches. However, this approach is limited and does not fully utilize the visual modality of gestures. This paper proposes a visual gesture retrieval method using a deep learning architecture to extend current research in this area. The method is based on body keypoints and uses an attention mechanism to focus on specific groups. Experiments were conducted on a subset of the NewsScape dataset, which presents challenges such as multiple people, camera perspective changes, and occlusions. A user study was conducted to assess the usability of the results, establishing a baseline for future gesture retrieval methods in real-world video collections. The results of the experiment demonstrate the high potential of the proposed method in multimodal communication research and highlight the significance of visual gesture retrieval in enhancing interaction with video content. The integration of visual similarity search for gestures into the open-source multimedia retrieval stack vitrivr can greatly contribute to the field of computational linguistics. This research advances the understanding of the role of the visual modality in co-speech gestures and highlights the need for further development in this area.
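
    A minimal sketch of keypoint-based gesture retrieval is given below; the keypoint layout, the toy averaging embedding, and the cosine-similarity ranking are illustrative stand-ins for the paper's deep architecture and attention mechanism.

```python
# Illustrative sketch (not the paper's architecture): embed gesture clips from
# body keypoints and retrieve the most similar clips by cosine similarity.
import numpy as np

N_CLIPS, N_FRAMES, N_KEYPOINTS = 500, 30, 17     # e.g. COCO-style 2D body keypoints

rng = np.random.default_rng(1)
# (clips, frames, keypoints, xy) - stand-in for pose-estimator output
keypoints = rng.normal(size=(N_CLIPS, N_FRAMES, N_KEYPOINTS, 2))

def embed(clip: np.ndarray) -> np.ndarray:
    """Toy gesture embedding: per-frame flattened poses averaged over time."""
    flat = clip.reshape(clip.shape[0], -1)        # (frames, keypoints*2)
    vec = flat.mean(axis=0)                       # temporal average
    return vec / (np.linalg.norm(vec) + 1e-9)     # L2-normalize

index = np.stack([embed(c) for c in keypoints])   # (clips, dim)

def retrieve(query_clip: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k most similar gesture clips."""
    q = embed(query_clip)
    scores = index @ q                            # cosine similarity on unit vectors
    return np.argsort(-scores)[:k]

print(retrieve(keypoints[42]))
```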

    No-reference video quality estimation based on machine learning for passive gaming video streaming applications

    Recent years have seen increasing growth and popularity of gaming services, both interactive and passive. While interactive gaming video streaming applications have received much attention, passive gaming video streaming, despite its huge success and growth in recent years, has seen much less interest from the research community. For the continued growth of such services in the future, it is imperative that the end-user gaming quality of experience (QoE) be estimated so that it can be controlled and maximized to ensure user acceptance. Previous quality assessment studies have shown the unsatisfactory performance of existing no-reference (NR) video quality assessment (VQA) metrics. Also, due to the inherent nature and different requirements of gaming video streaming applications, as well as the fact that gaming videos are perceived differently from non-gaming content (as they are usually computer generated and contain artificial/synthetic content), there is a need for application-specific, lightweight, no-reference gaming video quality prediction models. In this paper, we present two NR machine-learning-based quality estimation models for gaming video streaming, NR-GVSQI and NR-GVSQE, using NR features such as bitrate, resolution, blockiness, etc. We evaluate their performance on different gaming video datasets and show that the proposed models outperform the current state-of-the-art no-reference metrics, while also reaching a prediction accuracy comparable to the best-known full-reference metric.
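
    In the same spirit as the models described above, the sketch below regresses a quality score from stream-level NR features (bitrate, resolution, blockiness) with a generic learner on synthetic data; the feature set, model choice, and data are illustrative and not the NR-GVSQI/NR-GVSQE designs.

```python
# Minimal sketch of a no-reference quality estimator: predict a MOS-like score
# from stream-level features. Ground-truth scores here are synthetic, purely to
# make the example runnable end to end.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 400
bitrate_kbps = rng.uniform(300, 6000, n)
height_px = rng.choice([480, 720, 1080], n)
blockiness = rng.uniform(0, 1, n)

X = np.column_stack([bitrate_kbps, height_px, blockiness])
# Synthetic "ground truth" MOS in [1, 5].
mos = 1 + 4 * (0.5 * np.log1p(bitrate_kbps) / np.log1p(6000)
               + 0.3 * height_px / 1080 + 0.2 * (1 - blockiness))
mos = np.clip(mos + rng.normal(0, 0.2, n), 1, 5)

X_tr, X_te, y_tr, y_te = train_test_split(X, mos, test_size=0.25, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("PLCC on held-out data:", round(pearsonr(pred, y_te)[0], 3))
```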

    A telepresence wheelchair with 360-degree vision using WebRTC

    This paper presents an innovative approach to developing an advanced 360-degree-vision telepresence wheelchair for healthcare applications. The study aims to provide a wide field of view around the wheelchair, enabling safe wheelchair navigation and efficient assistance for wheelchair users. A dual-fisheye camera is mounted in front of the wheelchair to capture images, which can then be streamed over the Internet. A web real-time communication (WebRTC) protocol was implemented to provide efficient video and data streaming. An estimation model based on artificial neural networks was developed to evaluate the quality of experience (QoE) of video streaming. Experimental results confirmed that the proposed telepresence wheelchair system was able to stream a 360-degree video surrounding the wheelchair smoothly in real time. The average streaming rate of the entire 360-degree video was 25.83 frames per second (fps), and the average peak signal-to-noise ratio (PSNR) was 29.06 dB. Simulation results of the proposed QoE estimation scheme provided a prediction accuracy of 94%. Furthermore, the results showed that the designed system could be controlled remotely via the wireless Internet to follow a desired path with high accuracy. The overall results demonstrate the effectiveness of our proposed approach for a 360-degree-vision telepresence wheelchair for assistive technology applications.
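
    The two reported measurements, PSNR and frame rate, can be computed as in the short sketch below; the frame data and arrival times are synthetic stand-ins, not measurements from the system.

```python
# Simple sketch of the two quantities reported above: per-frame PSNR of the
# received video against the captured frames, and the achieved frame rate.
# All data here is synthetic and purely illustrative.
import numpy as np

def psnr(reference: np.ndarray, received: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two 8-bit frames."""
    mse = np.mean((reference.astype(np.float64) - received.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, (480, 960), dtype=np.uint8)     # stitched 360-degree frame
received = np.clip(reference + rng.normal(0, 9, reference.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(reference, received):.2f} dB")

# Frame rate over a streaming session, from frame arrival timestamps.
arrival_times = np.cumsum(rng.exponential(1 / 25.0, 500))         # example arrivals at ~25 fps
fps = (len(arrival_times) - 1) / (arrival_times[-1] - arrival_times[0])
print(f"average frame rate: {fps:.2f} fps")
```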