89 research outputs found

    Audio Guidance to Enable Vision-Impaired Individuals to Move Independently

    At present, when an individual who is blind or has low vision runs or walks for exercise, they might use a treadmill, rely on a guide dog, or use a tethered human guide. Independent and safe exercise, whether walking or running, is one way to increase personal agency and improve the quality of life for vision-impaired persons. This disclosure describes techniques that use on-device machine learning to enable a vision-impaired individual to independently walk or run, e.g., for exercise. A guideline is taped or painted along the running path. A mobile device camera detects the guideline. An app on the phone estimates the user's position to the left or to the right of the guideline. The app provides audio cues in stereo to direct the person to stay in close proximity to the guideline while walking or running.
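
One way to picture the stereo cue step is a constant-power pan driven by the estimated lateral offset. The sketch below is only an illustration under stated assumptions: the function name `stereo_gains`, the `max_offset_m` parameter, and the choice to make the cue louder in the ear on the side the user has drifted toward are all hypothetical and are not taken from the disclosure.

```python
import math


def stereo_gains(offset_m, max_offset_m=1.0):
    """Return (left_gain, right_gain) for an audio cue.

    offset_m > 0 is assumed to mean the user is to the right of the
    guideline; offset_m < 0 means to the left. Both the sign convention
    and max_offset_m are illustrative assumptions.
    """
    # Normalise the offset and clamp it to the range [-1, 1].
    x = max(-1.0, min(1.0, offset_m / max_offset_m))
    # Constant-power panning: map x in [-1, 1] to an angle in [0, pi/2].
    angle = (x + 1.0) * math.pi / 4.0
    # left^2 + right^2 == 1 for every offset, so loudness stays steady.
    return math.cos(angle), math.sin(angle)


if __name__ == "__main__":
    for offset in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = stereo_gains(offset)
        print(f"offset={offset:+.1f} m  left={left:.3f}  right={right:.3f}")
```

With the user centred on the line, both channels receive equal gain; as the offset grows, the cue shifts toward one ear, which gives a continuous sense of direction without changing overall loudness.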

    PolyMaX: General Dense Prediction with Mask Transformer

    Dense prediction tasks, such as semantic segmentation, depth estimation, and surface normal prediction, can be easily formulated as per-pixel classification (discrete outputs) or regression (continuous outputs). This per-pixel prediction paradigm has remained popular due to the prevalence of fully convolutional networks. However, on the recent frontier of segmentation tasks, the community has been witnessing a paradigm shift from per-pixel prediction to cluster-prediction with the emergence of transformer architectures, particularly mask transformers, which directly predict a label for a mask instead of a pixel. Despite this shift, methods based on the per-pixel prediction paradigm still dominate the benchmarks on the other dense prediction tasks that require continuous outputs, such as depth estimation and surface normal prediction. Motivated by the success of DORN and AdaBins in depth estimation, achieved by discretizing the continuous output space, we propose to generalize the cluster-prediction based method to general dense prediction tasks. This allows us to unify dense prediction tasks with the mask transformer framework. Remarkably, the resulting model, PolyMaX, demonstrates state-of-the-art performance on three benchmarks of the NYUD-v2 dataset. We hope our simple yet effective design can inspire more research on exploiting mask transformers for more dense prediction tasks. Code and model will be made available. Comment: WACV 202
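
The idea of discretizing a continuous output space can be sketched for a single pixel as follows. This is a generic illustration with uniform bins, not the binning scheme of DORN, AdaBins, or PolyMaX; the helper names `make_bin_centers`, `depth_to_bin`, and `expected_depth` and the depth range used are hypothetical.

```python
def make_bin_centers(d_min, d_max, n_bins):
    """Centers of n_bins equal-width bins covering [d_min, d_max]."""
    width = (d_max - d_min) / n_bins
    return [d_min + (i + 0.5) * width for i in range(n_bins)]


def depth_to_bin(depth, d_min, d_max, n_bins):
    """Map a continuous depth to a class index (the classification target)."""
    width = (d_max - d_min) / n_bins
    idx = int((depth - d_min) / width)
    # Clamp so values at or beyond the range edges fall in the end bins.
    return max(0, min(n_bins - 1, idx))


def expected_depth(probs, centers):
    """Recover a continuous value as the probability-weighted bin center."""
    return sum(p * c for p, c in zip(probs, centers))


if __name__ == "__main__":
    centers = make_bin_centers(0.0, 10.0, 5)
    print(centers)
    print(depth_to_bin(4.2, 0.0, 10.0, 5))
    print(expected_depth([0.0, 0.5, 0.5, 0.0, 0.0], centers))
```

Treating the bin index as a class label turns regression into classification, and the weighted average over bin centers turns the predicted distribution back into a continuous value.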

    VideoGLUE: Video General Understanding Evaluation of Foundation Models

    We evaluate existing foundation models' video understanding capabilities using a carefully designed experiment protocol consisting of three hallmark tasks (action recognition, temporal localization, and spatiotemporal localization), eight datasets well received by the community, and four adaptation methods tailoring a foundation model (FM) for a downstream task. Moreover, we propose a scalar VideoGLUE score (VGS) to measure an FM's efficacy and efficiency when adapting to general video understanding tasks. Our main findings are as follows. First, task-specialized models significantly outperform the six FMs studied in this work, in sharp contrast to what FMs have achieved in natural language and image understanding. Second, video-native FMs, whose pretraining data contains the video modality, are generally better than image-native FMs in classifying motion-rich videos, localizing actions in time, and understanding a video of more than one action. Third, the video-native FMs can perform well on video tasks under light adaptations to downstream tasks (e.g., freezing the FM backbones), while image-native FMs win in full end-to-end finetuning. The first two observations reveal the need and tremendous opportunities to conduct research on video-focused FMs, and the last confirms that both tasks and adaptation methods matter when it comes to the evaluation of FMs.

    Guanidine Derivatives of Quinazoline-2,4(1H,3H)-Dione as NHE-1 Inhibitors and Anti-Inflammatory Agents

    Quinazolines are a rich source of bioactive compounds. Previously, we showed NHE-1 inhibitory, anti-inflammatory, antiplatelet, intraocular-pressure-lowering, and antiglycating activity for a series of quinazoline-2,4(1H,3H)-diones and quinazoline-4(3H)-one guanidine derivatives. In the present work, novel N1,N3-bis-substituted quinazoline-2,4(1H,3H)-dione derivatives bearing two guanidine moieties were synthesized and pharmacologically profiled. The most potent NHE-1 inhibitor, 3a, also possesses antiplatelet and intraocular-pressure-reducing activity. Compound 4a inhibits NO synthesis and IL-6 secretion in murine macrophages without immunotoxicity and alleviates neutrophil infiltration, edema, and tissue lesions in a model of LPS-induced acute lung injury. Hence, we considered quinazoline derivative 4a as a potential agent for suppression of cytokine-mediated inflammatory response and acute lung injury.