Recent Trends in Computational Intelligence
Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, and it spans the fields of machine learning and computational neuroscience. CI also comprises biologically inspired technologies, such as swarm intelligence as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book discusses the use of CI to solve a variety of applications optimally, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies makes a strong and reliable prediction tool for handling real-life applications.
Few-shot Bioacoustic Event Detection with Machine Learning Methods
Few-shot learning is a type of classification through which predictions are
made based on a limited number of samples for each class. This type of
classification is sometimes referred to as a meta-learning problem, in which
the model learns how to learn to identify rare cases. We seek to extract
information from five exemplar vocalisations of mammals or birds and detect and
classify these sounds in field recordings [2]. This task was provided in the
Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge of
2021. Rather than utilize deep learning, as is most commonly done, we
formulated a novel solution using only machine learning methods. Various models
were tested, and it was found that logistic regression outperformed both linear
regression and template matching. However, all of these methods over-predicted
the number of events in the field recordings. Comment: 7 pages, 6 tables, 1 figure
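
The template-matching baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes each exemplar and the field recording have already been reduced to a one-dimensional feature sequence (for example, an energy envelope); the function names, the correlation measure, and the threshold value are illustrative assumptions, not the authors' implementation.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den > 0 else 0.0


def detect_events(signal, templates, threshold=0.9):
    """Slide every template over the signal and return the start indices of matches.

    After a match, the search skips past the matched window so that one event
    is not reported several times.
    """
    hits = []
    i = 0
    while i < len(signal):
        matched_len = 0
        for template in templates:
            window = signal[i:i + len(template)]
            if len(window) != len(template):
                continue  # template would run past the end of the signal
            if normalized_correlation(window, template) >= threshold:
                matched_len = max(matched_len, len(template))
        if matched_len:
            hits.append(i)
            i += matched_len
        else:
            i += 1
    return hits


# Hypothetical toy data: one exemplar shape and a recording with two events.
exemplar = [0, 1, 2, 1, 0]
recording = [0] * 3 + [0, 1, 2, 1, 0] + [0] * 4 + [0, 2, 4, 2, 0] + [0] * 3
events = detect_events(recording, [exemplar])
```

In a sketch like this, lowering `threshold` makes the detector report more events, which is one way the kind of over-prediction described in the abstract could arise.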
Filler Word Detection and Classification: A Dataset and Benchmark
Filler words such as 'uh' or 'um' are sounds or words people use to signal
they are pausing to think. Finding and removing filler words from recordings is
a common and tedious task in media editing. Automatically detecting and
classifying filler words could greatly aid in this task, but few studies have
been published on this problem. A key reason is the absence of a dataset with
annotated filler words for training and evaluation. In this work, we present a
novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K
annotations of other sounds that commonly occur in podcasts such as breaths,
laughter, and word repetitions. We propose a pipeline that leverages VAD and
ASR to detect filler candidates and a classifier to distinguish between filler
word types. We evaluate our proposed pipeline on PodcastFillers, compare to
several baselines, and present a detailed ablation study. In particular, we
evaluate the importance of using ASR and how it compares to a
transcription-free approach resembling keyword spotting. We show that our
pipeline obtains state-of-the-art results, and that leveraging ASR strongly
outperforms a keyword spotting approach. We make PodcastFillers publicly
available, and hope our work serves as a benchmark for future research. Comment: Submitted to Interspeech 202
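
One plausible way to combine VAD and ASR for candidate detection is sketched below. The abstract only states that both are leveraged; the specific rule used here (treat voiced regions that the ASR transcript does not cover with a word as filler candidates) is an assumption made for illustration, as are the function name, the interval format in seconds, and the `min_duration` parameter.

```python
def filler_candidates(speech_segments, asr_words, min_duration=0.1):
    """Return the parts of VAD speech segments not covered by any ASR word.

    speech_segments: list of (start, end) tuples from a voice activity detector.
    asr_words: list of (start, end) tuples for words recognised by ASR.
    min_duration: gaps shorter than this (in seconds) are ignored.
    """
    candidates = []
    words = sorted(asr_words)
    for seg_start, seg_end in sorted(speech_segments):
        cursor = seg_start  # earliest time in this segment not yet explained
        for w_start, w_end in words:
            if w_end <= cursor or w_start >= seg_end:
                continue  # word does not overlap the unexplained part
            if w_start - cursor >= min_duration:
                candidates.append((cursor, w_start))
            cursor = max(cursor, w_end)
        if seg_end - cursor >= min_duration:
            candidates.append((cursor, seg_end))
    return candidates


# Hypothetical example: two voiced segments, two transcribed words.
speech = [(0.0, 2.0), (3.0, 4.0)]
words = [(0.0, 0.5), (1.0, 2.0)]
gaps = filler_candidates(speech, words)
```

Each returned interval would then be passed to a classifier that decides whether it is a filler and which type, as the abstract describes.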
Spatio-Temporal Multimedia Big Data Analytics Using Deep Neural Networks
With the proliferation of online services and mobile technologies, the world has stepped into a multimedia big data era, where new opportunities and challenges arise from highly diverse multimedia data together with a huge amount of social data. Multimedia data consisting of audio, text, image, and video has grown tremendously. With such an increase in the amount of multimedia data, the main question is how to analyze this high volume and variety of data efficiently and effectively. A vast amount of research has been done in the multimedia area, targeting different aspects of big data analytics, such as the capture, storage, indexing, mining, and retrieval of multimedia big data. However, there is insufficient research that provides a comprehensive framework for multimedia big data analytics and management.
To address the major challenges in this area, a new framework is proposed based on deep neural networks for multimedia semantic concept detection with a focus on spatio-temporal information analysis and rare event detection. The proposed framework is able to discover patterns and knowledge in multimedia data using both static deep data representation and temporal semantics. Specifically, it is designed to handle data with skewed distributions. The proposed framework includes the following components: (1) a synthetic data generation component based on simulation and adversarial networks for data augmentation and deep learning training, (2) an automatic sampling model to overcome the imbalanced data issue in multimedia data, (3) a deep representation learning model leveraging novel deep learning techniques to generate the most discriminative static features from multimedia data, (4) an automatic hyper-parameter learning component for faster training and convergence of the learning models, (5) a spatio-temporal deep learning model to analyze dynamic features from multimedia data, and finally (6) a multimodal deep learning fusion model to integrate different data modalities. The whole framework has been evaluated using various large-scale multimedia datasets that include the newly collected disaster-events video dataset and other public datasets.
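
To make the imbalanced-data issue in component (2) concrete, the sketch below shows naive random oversampling, a simple stand-in technique and not the automatic sampling model proposed in this work. The function name, the seed parameter, and the toy labels are assumptions for illustration only.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies (with replacement) to reach the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical skewed dataset: eight common clips and two rare-event clips.
clips = [f"clip_{i}" for i in range(10)]
tags = ["common"] * 8 + ["rare"] * 2
balanced_clips, balanced_tags = random_oversample(clips, tags)
class_counts = Counter(balanced_tags)
```

Random duplication balances class counts but adds no new information, which is one motivation for the synthetic data generation in component (1).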
PharmacyGPT: The AI Pharmacist
In this study, we introduce PharmacyGPT, a novel framework to assess the
capabilities of large language models (LLMs) such as ChatGPT and GPT-4 in
emulating the role of clinical pharmacists. Our methodology encompasses the
utilization of LLMs to generate comprehensible patient clusters, formulate
medication plans, and forecast patient outcomes. We conduct our investigation
using real data acquired from the intensive care unit (ICU) at the University
of North Carolina Chapel Hill (UNC) Hospital. Our analysis offers valuable
insights into the potential applications and limitations of LLMs in the field
of clinical pharmacy, with implications for both patient care and the
development of future AI-driven healthcare solutions. By evaluating the
performance of PharmacyGPT, we aim to contribute to the ongoing discourse
surrounding the integration of artificial intelligence in healthcare settings,
ultimately promoting the responsible and efficacious use of such technologies.