The Verbal and Non Verbal Signals of Depression -- Combining Acoustics, Text and Visuals for Estimating Depression Level
Depression is a serious medical condition that affects a large number of
people around the world, significantly altering how one feels and causing a
persistent lowering of mood. In this paper, we propose a novel
attention-based deep neural network which facilitates the fusion of various
modalities. We use this network to regress the depression level. Acoustic, text
and visual modalities have been used to train our proposed network. Various
experiments have been carried out on the benchmark dataset, namely, Distress
Analysis Interview Corpus - a Wizard of Oz (DAIC-WOZ). From the results, we
empirically justify that the fusion of all three modalities helps in giving the
most accurate estimation of depression level. Our proposed approach outperforms
the state-of-the-art by 7.17% on root mean squared error (RMSE) and 8.08% on
mean absolute error (MAE).
Comment: 10 pages including references, 2 figures
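The abstract does not give implementation details, but the core idea of attention-based fusion of acoustic, text, and visual features for score regression can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual model: the class name, feature dimensions, and single-layer attention design are all assumptions.

```python
import torch
import torch.nn as nn

class AttentionFusionRegressor(nn.Module):
    """Illustrative sketch (not the paper's architecture): project each
    modality into a shared space, weight the modalities with a learned
    attention score, and regress a scalar depression level."""

    def __init__(self, acoustic_dim, text_dim, visual_dim, hidden_dim=64):
        super().__init__()
        # One projection per modality into a shared hidden space.
        self.proj = nn.ModuleList(
            [nn.Linear(d, hidden_dim) for d in (acoustic_dim, text_dim, visual_dim)]
        )
        self.attn = nn.Linear(hidden_dim, 1)  # scores each modality embedding
        self.head = nn.Linear(hidden_dim, 1)  # regression head

    def forward(self, acoustic, text, visual):
        # Stack projected modality embeddings: (batch, 3, hidden_dim).
        feats = torch.stack(
            [p(x) for p, x in zip(self.proj, (acoustic, text, visual))], dim=1
        )
        # Softmax over the modality axis gives fusion weights: (batch, 3, 1).
        weights = torch.softmax(self.attn(feats), dim=1)
        fused = (weights * feats).sum(dim=1)  # attention-weighted fusion
        return self.head(fused).squeeze(-1)   # one score per example
```

For example, with hypothetical 40-d acoustic, 300-d text, and 128-d visual features, `AttentionFusionRegressor(40, 300, 128)` maps a batch of the three modality tensors to a `(batch,)` tensor of predicted scores.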
An EcoSage Assistant: Towards Building A Multimodal Plant Care Dialogue Assistant
In recent times, there has been an increasing awareness about imminent
environmental challenges, resulting in people showing a stronger dedication to
taking care of the environment and nurturing green life. The current $19.6
billion indoor gardening industry, reflective of this growing sentiment, not
only signifies a monetary value but also speaks of a profound human desire to
reconnect with the natural world. However, several recent surveys cast a
revealing light on the fate of plants within our care, with more than half
succumbing primarily due to the silent menace of improper care. Thus, the need
for accessible expertise capable of assisting and guiding individuals through
the intricacies of plant care has become more critical than ever. In this
work, we make the very first attempt at building a plant care assistant, which
aims to assist people with plant(-ing) concerns through conversations. We
propose a plant care conversational dataset named Plantational, which contains
around 1K dialogues between users and plant care experts. Our end-to-end
proposed approach is two-fold: (i) we first benchmark the dataset with the
help of various large language models (LLMs) and vision-language models (VLMs) by
studying the impact of instruction tuning (zero-shot and few-shot prompting)
and fine-tuning techniques on this task; (ii) finally, we build EcoSage, a
multi-modal plant care assisting dialogue generation framework, incorporating
an adapter-based modality infusion using a gated mechanism. We performed an
extensive examination (both automatic and manual evaluation) of the performance
exhibited by various LLMs and VLMs in generating domain-specific dialogue
responses, underscoring the respective strengths and weaknesses of these
diverse models.
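The "adapter-based modality infusion using a gated mechanism" mentioned above can be sketched roughly as below. This is a hedged illustration under assumptions: the class name, bottleneck size, and the choice of a per-token sigmoid gate are inventions for the example, not details taken from EcoSage.

```python
import torch
import torch.nn as nn

class GatedModalityAdapter(nn.Module):
    """Illustrative sketch (not EcoSage's actual module): a small bottleneck
    adapter maps visual features into the language model's hidden space, and
    a learned sigmoid gate controls how much visual signal is mixed into
    each token's hidden state."""

    def __init__(self, visual_dim, hidden_dim, bottleneck=32):
        super().__init__()
        # Bottleneck adapter: visual_dim -> bottleneck -> hidden_dim.
        self.adapter = nn.Sequential(
            nn.Linear(visual_dim, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, hidden_dim),
        )
        # Gate conditioned on both the token state and the adapted visual feature.
        self.gate = nn.Linear(2 * hidden_dim, 1)

    def forward(self, text_hidden, visual_feat):
        # text_hidden: (batch, seq, hidden); visual_feat: (batch, visual_dim).
        v = self.adapter(visual_feat).unsqueeze(1)  # (batch, 1, hidden)
        v = v.expand_as(text_hidden)                # broadcast over tokens
        g = torch.sigmoid(self.gate(torch.cat([text_hidden, v], dim=-1)))
        return text_hidden + g * v                  # gated residual infusion
```

Because the infusion is a gated residual term, the gate can learn to pass the text hidden states through nearly unchanged when the visual input is uninformative, which is one common motivation for gating in adapter designs.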