Audio Guidance to Enable Vision-Impaired Individuals to Move Independently

Author: Ayalon Dror
Burke Ryan
Hall Matt
Meron Tomer
Sirotenko Mikhail
Watkinson John
Yang Xuan
Publication venue: Technical Disclosure Commons
Publication date: 20/05/2021

At present, when an individual who is blind or has low vision runs or walks for exercise, they might use a treadmill, rely on a guide dog, or use a tethered human guide. Independent and safe exercise, whether walking or running, is one way to increase personal agency and improve the quality of life for vision-impaired persons. This disclosure describes techniques that use on-device machine learning to enable a vision-impaired individual to walk or run independently, e.g., for exercise. A guideline is taped or painted along the running path. A mobile device camera detects the guideline. An app on the phone estimates the user's position to the left or to the right of the guideline. The app provides audio cues in stereo to direct the person to stay in close proximity to the guideline while walking or running.
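The stereo cue described in this abstract can be illustrated with a minimal sketch. The disclosure summary does not say how the lateral offset is mapped to audio, so everything here is an assumption made for illustration: the function name `stereo_gains`, the `max_offset_m` parameter, the sign convention (positive offset means the user is to the right of the line), the choice to make the cue louder on the side the user has drifted toward, and the constant-power pan law.

```python
import math


def stereo_gains(offset_m: float, max_offset_m: float = 1.0) -> tuple[float, float]:
    """Map a lateral offset from the guideline to (left, right) channel gains.

    Hypothetical convention: offset_m > 0 means the user is to the right
    of the guideline, and the cue gets louder in the ear on that side.
    """
    if max_offset_m <= 0:
        raise ValueError("max_offset_m must be positive")
    # Normalise the offset and clamp it to a pan position in [-1, 1].
    pan = max(-1.0, min(1.0, offset_m / max_offset_m))
    # Constant-power pan law: left^2 + right^2 is always 1.
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)
```

With the user centred on the line (`offset_m = 0.0`) both channels receive the same gain; as the offset grows toward `max_offset_m`, the gain shifts fully to one channel. A real app might instead vary pitch or volume, or invert the side convention.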

Technical Disclosure Commons

PolyMaX: General Dense Prediction with Mask Transformer

Author: Adam Hartwig
Chen Liang-Chieh
Debats Stephanie
Gu Xiuye
Qiao Siyuan
Sharma Astuti
Sirotenko Mikhail
Wang Huisheng
Wilber Kimberly
Yang Xuan
Yuan Liangzhe
Publication date: 09/11/2023

Dense prediction tasks, such as semantic segmentation, depth estimation, and surface normal prediction, can be easily formulated as per-pixel classification (discrete outputs) or regression (continuous outputs). This per-pixel prediction paradigm has remained popular due to the prevalence of fully convolutional networks. However, on the recent frontier of segmentation tasks, the community has been witnessing a paradigm shift from per-pixel prediction to cluster-prediction with the emergence of transformer architectures, particularly mask transformers, which directly predict a label for a mask instead of a pixel. Despite this shift, methods based on the per-pixel prediction paradigm still dominate the benchmarks on the other dense prediction tasks that require continuous outputs, such as depth estimation and surface normal prediction. Motivated by the success of DORN and AdaBins in depth estimation, achieved by discretizing the continuous output space, we propose to generalize the cluster-prediction based method to general dense prediction tasks. This allows us to unify dense prediction tasks with the mask transformer framework. Remarkably, the resulting model PolyMaX demonstrates state-of-the-art performance on three benchmarks of the NYUD-v2 dataset. We hope our simple yet effective design can inspire more research on exploiting mask transformers for more dense prediction tasks. Code and model will be made available. Comment: WACV 202
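The idea of discretizing a continuous output space, which this abstract credits to DORN and AdaBins, can be sketched for a single pixel as follows. This is not the PolyMaX implementation: the uniform bins, the function names (`bin_centers`, `softmax`, `expected_depth`), and the depth range are all assumptions chosen for illustration. The sketch classifies a pixel over depth bins and recovers a continuous value as the probability-weighted sum of bin centers.

```python
import math


def bin_centers(d_min: float, d_max: float, num_bins: int) -> list[float]:
    """Centers of num_bins equal-width bins covering [d_min, d_max] (assumed uniform)."""
    width = (d_max - d_min) / num_bins
    return [d_min + (k + 0.5) * width for k in range(num_bins)]


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over one pixel's per-bin scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_depth(logits: list[float], centers: list[float]) -> float:
    """Continuous depth as the probability-weighted sum of bin centers."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, centers))
```

For example, with `bin_centers(0.0, 10.0, 5)` and equal scores for every bin, `expected_depth` returns the midpoint of the range; a score that strongly favours one bin pulls the result toward that bin's center.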

arXiv.org e-Print Archive

VideoGLUE: Video General Understanding Evaluation of Foundation Models

Author: Adam Hartwig
Cui Yin
Friedman Luke
Gong Boqing
Gundavarapu Nitesh Bharadwaj
Jia Menglin
Jiang Lu
Liu Ting
Schroff Florian
Sirotenko Mikhail
Wang Huisheng
Weyand Tobias
Yang Ming-Hsuan
Yang Xuan
Yuan Liangzhe
Zhao Long
Zhou Hao
Publication date: 06/07/2023

We evaluate existing foundation models' video understanding capabilities using a carefully designed experiment protocol consisting of three hallmark tasks (action recognition, temporal localization, and spatiotemporal localization), eight datasets well received by the community, and four adaptation methods tailoring a foundation model (FM) for a downstream task. Moreover, we propose a scalar VideoGLUE score (VGS) to measure an FM's efficacy and efficiency when adapting to general video understanding tasks. Our main findings are as follows. First, task-specialized models significantly outperform the six FMs studied in this work, in sharp contrast to what FMs have achieved in natural language and image understanding. Second, video-native FMs, whose pretraining data contains the video modality, are generally better than image-native FMs in classifying motion-rich videos, localizing actions in time, and understanding a video of more than one action. Third, the video-native FMs can perform well on video tasks under light adaptations to downstream tasks (e.g., freezing the FM backbones), while image-native FMs win in full end-to-end finetuning. The first two observations reveal the need and tremendous opportunities to conduct research on video-focused FMs, and the last confirms that both tasks and adaptation methods matter when it comes to the evaluation of FMs.

arXiv.org e-Print Archive

Guanidine Derivatives of Quinazoline-2,4(1H,3H)-Dione as NHE-1 Inhibitors and Anti-Inflammatory Agents

Author: Aida Kucheryavenko
Alena Taran
Alexander Borisov
Alexander Ozerov
Alexander Spasov
Alexey Smirnov
Darya Merezhkina
Denis Babkov
Elena Sokolova
Ludmila Naumenko
Mikhail Miroshnikov
Nadezhda Ovsyankina
Natalia Gurova
Vadim Kosolapov
Viktor Sirotenko
Vladlen Klochkov
Yulia Velikorodnaya
Publication venue: MDPI AG
Publication date: 01/10/2022

Quinazolines are a rich source of bioactive compounds. Previously, we showed NHE-1 inhibitory, anti-inflammatory, antiplatelet, intraocular-pressure-lowering, and antiglycating activity for a series of quinazoline-2,4(1H,3H)-diones and quinazoline-4(3H)-one guanidine derivatives. In the present work, novel N1,N3-bis-substituted quinazoline-2,4(1H,3H)-dione derivatives bearing two guanidine moieties were synthesized and pharmacologically profiled. The most potent NHE-1 inhibitor 3a also possesses antiplatelet and intraocular-pressure-reducing activity. Compound 4a inhibits NO synthesis and IL-6 secretion in murine macrophages without immunotoxicity and alleviates neutrophil infiltration, edema, and tissue lesions in a model of LPS-induced acute lung injury. Hence, we considered quinazoline derivative 4a as a potential agent for suppression of cytokine-mediated inflammatory response and acute lung injury.

Directory of Open Access Journals

PubMed Central