
    I Buried the Fireworks Under the Tree

    Unreachable memories always surround me. I've been trying to extract logical parts from my chaotic memories, hoping to find a connection with the world within the soundless, intangible black fireworks stored in my retina beneath the grand fireworks display. When I first encountered intaglio printmaking, I impulsively drew subconscious memories on the plate, arranging them along chaotic storylines. Gradually, I realized that I needed to create my own logical structure, so I began using specific visual symbols and repeating them, using the repetition of the printmaking process to search for logical clues. Printmaking, with its special rhythm, allowed me to rediscover a connection with the world through the repetitive process of image making. The stability brought by this connection made me trust my intuition and thinking more.

    Data visualization of virtual reality library user data

    Abstract. User research is an important part of software development. In the gaming industry, analysing user behaviour is an increasingly important part of research. However, as game analytics is relatively new to the game industry, only a limited amount of research is available. In this work, we discuss how to visualise data collected in virtual reality environments in a meaningful way to improve product quality and extract user behaviour patterns. We use clustering algorithms and analytical functions to take a more comprehensive look at test participants' behaviour with our data visualization tool. This behaviour is then presented using path maps, heat maps, and data charts. Originally, our aim was to conduct research on user behaviour in the Oulu Virtual Library application, but due to the COVID-19 pandemic we had to shift our focus from user research to designing and implementing a tool that lets researchers analyse data sets similar to our example data. Even though we had no concrete user data, researchers can use the tool we developed, with relatively small modifications, when dealing with similar data cases in the future. Usability improvements and real-world experience are still needed to make the tool more robust.
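The heat-map presentation described above can be sketched in outline. The following is a minimal, hypothetical example (the names `grid_heatmap` and `positions` and the cell size are illustrative, not taken from the thesis) that bins tracked user positions into a 2-D grid of visit counts, the aggregation step underlying a heat map:

```python
from collections import Counter

def grid_heatmap(positions, cell_size=1.0):
    """Bin (x, y) position samples into square cells and count visits per cell."""
    counts = Counter()
    for x, y in positions:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical tracked positions of one test participant in a VR scene.
positions = [(0.2, 0.3), (0.4, 0.9), (1.5, 0.1), (1.7, 0.4), (1.9, 0.2)]
heat = grid_heatmap(positions, cell_size=1.0)
# Cells with higher counts would be rendered "hotter" in the visualization.
```

A renderer would then map each cell's count to a colour; a path map would instead connect the same samples in time order.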

    Sounding Video Generator: A Unified Framework for Text-guided Sounding Video Generation

    As a combination of visual and audio signals, video is inherently multi-modal. However, existing video generation methods are primarily intended for the synthesis of visual frames, whereas the audio signals of realistic videos are disregarded. In this work, we concentrate on the rarely investigated problem of text-guided sounding video generation and propose the Sounding Video Generator (SVG), a unified framework for generating realistic videos along with audio signals. Specifically, we present SVG-VQGAN to transform visual frames and audio mel-spectrograms into discrete tokens. SVG-VQGAN applies a novel hybrid contrastive learning method to model inter-modal and intra-modal consistency and to improve the quantized representations. A cross-modal attention module is employed to extract associated features of visual frames and audio signals for contrastive learning. A Transformer-based decoder then models associations between texts, visual frames, and audio signals at the token level for auto-regressive sounding video generation. AudioSetCap, a human-annotated text-video-audio paired dataset, is produced for training SVG. Experimental results demonstrate the superiority of our method over existing text-to-video generation methods as well as audio generation methods on the Kinetics and VAS datasets.
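The abstract does not give the form of the hybrid contrastive loss, but the inter-modal consistency idea can be sketched with a generic InfoNCE-style objective: each video embedding should score highest against its own paired audio embedding. All names and values below are illustrative, not SVG-VQGAN's actual loss:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(video_embs, audio_embs, temperature=0.1):
    """InfoNCE-style contrastive loss: for each video embedding, the paired
    audio embedding is the positive and all other pairs are negatives."""
    loss = 0.0
    for i, v in enumerate(video_embs):
        logits = [dot(v, a) / temperature for a in audio_embs]
        log_denom = math.log(sum(math.exp(s) for s in logits))
        loss += -(logits[i] - log_denom)  # negative log-softmax of the positive
    return loss / len(video_embs)

# Toy paired embeddings: each video vector is aligned with its own audio vector.
video = [[1.0, 0.0], [0.0, 1.0]]
audio = [[0.9, 0.1], [0.1, 0.9]]
loss = info_nce(video, audio)
```

Well-aligned pairs yield a loss near zero; swapping the audio embeddings (mismatching the pairs) makes the loss much larger, which is what pushes corresponding tokens of the two modalities together during training.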

    VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

    Vision and text have been fully explored in contemporary video-text foundation models, while other modalities such as audio and subtitles in videos have not received sufficient attention. In this paper, we establish connections between multi-modality video tracks, including Vision, Audio, Subtitle, and Text, by exploring an automatically generated large-scale omni-modality video caption dataset called VAST-27M. Specifically, we first collect 27 million open-domain video clips and separately train a vision captioner and an audio captioner to generate vision and audio captions. Then, we employ an off-the-shelf Large Language Model (LLM) to integrate the generated captions, together with subtitles and instructional prompts, into omni-modality captions. Based on the proposed VAST-27M dataset, we train an omni-modality video-text foundation model named VAST, which can perceive and process the vision, audio, and subtitle modalities of a video, and better supports various tasks including vision-text, audio-text, and multi-modal video-text tasks (retrieval, captioning, and QA). Extensive experiments demonstrate the effectiveness of our proposed VAST-27M corpus and VAST foundation model. VAST achieves 22 new state-of-the-art results on various cross-modality benchmarks. Code, model, and dataset will be released at https://github.com/TXH-mercury/VAST. Comment: 23 pages, 5 figures.
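The LLM integration step described above amounts to prompt assembly: the per-modality captions and the subtitle are combined with an instructional prompt and sent to the model. The prompt wording and function name below are purely illustrative, not the actual VAST-27M pipeline:

```python
def build_integration_prompt(vision_caption, audio_caption, subtitle):
    """Assemble an instructional prompt asking an LLM to merge per-modality
    captions into a single omni-modality caption for one video clip."""
    return (
        "Combine the following descriptions of one video clip into a single "
        "fluent caption covering what is seen, heard, and said.\n"
        f"Vision: {vision_caption}\n"
        f"Audio: {audio_caption}\n"
        f"Subtitle: {subtitle}\n"
        "Omni-modality caption:"
    )

# Hypothetical per-modality outputs for a single clip.
prompt = build_integration_prompt(
    "a chef slices vegetables on a wooden board",
    "rhythmic chopping sounds over soft background music",
    "today we are making a simple stir-fry",
)
# `prompt` would then be sent to an off-the-shelf LLM.
```

Running this over 27 million clips is what turns three narrow caption streams into the single omni-modality caption each training example needs.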

    Transportation Density Reduction Caused by City Lockdowns Across the World during the COVID-19 Epidemic: From the View of High-resolution Remote Sensing Imagery

    As the COVID-19 epidemic worsened in the first months of 2020, stringent lockdown policies were implemented in numerous cities throughout the world to control human transmission and mitigate its spread. Although the reduction in transportation density inside cities was felt subjectively, there has thus far been no objective, quantitative study of its variation that reflects intracity population flows and their relationship with lockdown policy stringency from the view of high-resolution (under 1 m) remote sensing imagery. Accordingly, we here provide a quantitative investigation of the transportation density reduction before and after lockdown was implemented in six epicenter cities (Wuhan, Milan, Madrid, Paris, New York, and London) during the COVID-19 epidemic, accomplished by extracting vehicles from multi-temporal high-resolution remote sensing images. A novel vehicle detection model combining unsupervised vehicle candidate extraction and deep learning identification was specifically proposed for images with a resolution of 0.5 m. Our results indicate that transportation densities were reduced by an average of approximately 50% (and by as much as 75.96%) in these six cities following lockdown. The transportation density reduction rates are also highly correlated with policy stringency, with an R^2 value exceeding 0.83. Even within a single city, the transportation density changes differed and tended to be distributed in accordance with the city's land-use patterns. Considering that public transportation was mostly reduced or even forbidden, our results indicate that city lockdown policies are effective at limiting human transmission within cities. Comment: 14 pages, 7 figures, submitted to IEEE JSTAR
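The two quantities the abstract reports can be sketched directly: the density reduction rate from before/after vehicle counts, and the R^2 of a simple linear fit between policy stringency and reduction rate. All numbers below are illustrative placeholders, not the paper's measurements:

```python
def reduction_rate(before, after):
    """Relative drop in detected vehicle count after lockdown."""
    return (before - after) / before

def r_squared(xs, ys):
    """Coefficient of determination of a simple linear fit (squared Pearson r)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov * cov / (vx * vy)

# Hypothetical vehicle counts for one city, before vs. after lockdown.
rate = reduction_rate(1200, 400)          # two-thirds fewer vehicles detected

# Hypothetical stringency indices and per-city reduction rates.
stringency = [60.0, 70.0, 80.0, 90.0]
reduction = [0.35, 0.48, 0.62, 0.74]
r2 = r_squared(stringency, reduction)     # near 1.0 when strongly correlated
```

With six cities instead of four, the same computation yields the kind of R^2 > 0.83 relationship the paper reports between stringency and density reduction.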