5,731 research outputs found

    Tune in to your emotions: a robust personalized affective music player

    Get PDF
    The emotional power of music is exploited in a personalized affective music player (AMP) that selects music for mood enhancement. A biosignal approach is used to measure listeners’ personal emotional reactions to their own music as input for affective user models. Regression and kernel density estimation are applied to model the physiological changes the music elicits. Using these models, personalized music selections based on an affective goal state can be made. The AMP was validated in real-world trials over the course of several weeks. Results show that our models can cope with noisy situations and handle large inter-individual differences in the music domain. The AMP augments music listening where its techniques enable automated affect guidance. Our approach provides valuable insights for affective computing and user modeling, for which the AMP is a suitable carrier application

    Audiovisual Metadata Platform Pilot Development (AMPPD), Final Project Report

    Get PDF
    This report documents the experience and findings of the Audiovisual Metadata Platform Pilot Development (AMPPD) project, which has worked to enable more efficient generation of metadata to support discovery and use of digitized and born-digital audio and moving image collections. The AMPPD project was carried out by partners Indiana University Libraries, AVP, University of Texas at Austin, and New York Public Library between 2018-2021

    INTERPRETABILITY FOR ARTIFICIAL INTELLIGENCE IN SPEAKER RECOGNITION TASKS

    Get PDF
    In the future, the DOD will more frequently incorporate AI into tasks with high consequences and, in turn, be scrutinized for mistakes. So far, much AI development has occurred in the private sector for which the consequences of error are lower than in defense. The DOD faces different incentives for the interpretability of AI and needs AI to aid decision-makers instead of replacing them. My thesis project implements techniques to improve trust in a speaker recognition task by focusing on meaningful and theoretically sound feature extraction and model simplicity. My main result is expected and elicits action from DOD. I find that a convolutional neural network (CNN) model performs substantially better than a multilayer perceptron and that logistic regression cannot discern speaker identity. Thus, the DOD needs to focus on research to develop interpretable models for complex tasks. The other result I find is surprising and motivates interpretability: When I construct features using mel frequency cepstral coefficients (MFCCs) on a human speech signal with an improperly long window, my CNN achieves an accuracy of 92%. Ill-defined MFCCs are theoretically meaningless; however, I find that they help predict speaker identity. Further confounding this result, I find that when I construct the MFCCs using the suggested 30 ms window, the model's accuracy falls to 72%. Future research should explore disentangled CNN-based models and the concept of an MFCC as windowing time grows.Civilian, Department of the NavyApproved for public release. Distribution is unlimited

    Systematic AI Approach for AGI: Addressing Alignment, Energy, and AGI Grand Challenges

    Full text link
    AI faces a trifecta of grand challenges the Energy Wall, the Alignment Problem and the Leap from Narrow AI to AGI. Contemporary AI solutions consume unsustainable amounts of energy during model training and daily operations.Making things worse, the amount of computation required to train each new AI model has been doubling every 2 months since 2020, directly translating to increases in energy consumption.The leap from AI to AGI requires multiple functional subsystems operating in a balanced manner, which requires a system architecture. However, the current approach to artificial intelligence lacks system design; even though system characteristics play a key role in the human brain from the way it processes information to how it makes decisions. Similarly, current alignment and AI ethics approaches largely ignore system design, yet studies show that the brains system architecture plays a critical role in healthy moral decisions.In this paper, we argue that system design is critically important in overcoming all three grand challenges. We posit that system design is the missing piece in overcoming the grand challenges.We present a Systematic AI Approach for AGI that utilizes system design principles for AGI, while providing ways to overcome the energy wall and the alignment challenges.Comment: International Journal on Semantic Computing (2024) Categories: Artificial Intelligence; AI; Artificial General Intelligence; AGI; System Design; System Architectur

    Unveiling the frontiers of deep learning: innovations shaping diverse domains

    Full text link
    Deep learning (DL) enables the development of computer models that are capable of learning, visualizing, optimizing, refining, and predicting data. In recent years, DL has been applied in a range of fields, including audio-visual data processing, agriculture, transportation prediction, natural language, biomedicine, disaster management, bioinformatics, drug design, genomics, face recognition, and ecology. To explore the current state of deep learning, it is necessary to investigate the latest developments and applications of deep learning in these disciplines. However, the literature is lacking in exploring the applications of deep learning in all potential sectors. This paper thus extensively investigates the potential applications of deep learning across all major fields of study as well as the associated benefits and challenges. As evidenced in the literature, DL exhibits accuracy in prediction and analysis, makes it a powerful computational tool, and has the ability to articulate itself and optimize, making it effective in processing data with no prior training. Given its independence from training data, deep learning necessitates massive amounts of data for effective analysis and processing, much like data volume. To handle the challenge of compiling huge amounts of medical, scientific, healthcare, and environmental data for use in deep learning, gated architectures like LSTMs and GRUs can be utilized. For multimodal learning, shared neurons in the neural network for all activities and specialized neurons for particular tasks are necessary.Comment: 64 pages, 3 figures, 3 table

    The selection and evaluation of a sensory technology for interaction in a warehouse environment

    Get PDF
    In recent years, Human-Computer Interaction (HCI) has become a significant part of modern life as it has improved human performance in the completion of daily tasks in using computerised systems. The increase in the variety of bio-sensing and wearable technologies on the market has propelled designers towards designing more efficient, effective and fully natural User-Interfaces (UI), such as the Brain-Computer Interface (BCI) and the Muscle-Computer Interface (MCI). BCI and MCI have been used for various purposes, such as controlling wheelchairs, piloting drones, providing alphanumeric inputs into a system and improving sports performance. Various challenges are experienced by workers in a warehouse environment. Because they often have to carry objects (referred to as hands-full) it is difficult to interact with traditional devices. Noise undeniably exists in some industrial environments and it is known as a major factor that causes communication problems. This has reduced the popularity of using verbal interfaces with computer applications, such as Warehouse Management Systems. Another factor that effects the performance of workers are action slips caused by a lack of concentration during, for example, routine picking activities. This can have a negative impact on job performance and allow a worker to incorrectly execute a task in a warehouse environment. This research project investigated the current challenges workers experience in a warehouse environment and the technologies utilised in this environment. The latest automation and identification systems and technologies are identified and discussed, specifically the technologies which have addressed known problems. Sensory technologies were identified that enable interaction between a human and a computerised warehouse environment. Biological and natural behaviours of humans which are applicable in the interaction with a computerised environment were described and discussed. The interactive behaviours included the visionary, auditory, speech production and physiological movement where other natural human behaviours such paying attention, action slips and the action of counting items were investigated. A number of modern sensory technologies, devices and techniques for HCI were identified with the aim of selecting and evaluating an appropriate sensory technology for MCI. iii MCI technologies enable a computer system to recognise hand and other gestures of a user, creating means of direct interaction between a user and a computer as they are able to detect specific features extracted from a specific biological or physiological activity. Thereafter, Machine Learning (ML) is applied in order to train a computer system to detect these features and convert them to a computer interface. An application of biomedical signals (bio-signals) in HCI using a MYO Armband for MCI is presented. An MCI prototype (MCIp) was developed and implemented to allow a user to provide input to an HCI, in a hands-free and hands-full situation. The MCIp was designed and developed to recognise the hand-finger gestures of a person when both hands are free or when holding an object, such a cardboard box. The MCIp applies an Artificial Neural Network (ANN) to classify features extracted from the surface Electromyography signals acquired by the MYO Armband around the forearm muscle. The MCIp provided the results of data classification for gesture recognition to an accuracy level of 34.87% with a hands-free situation. This was done by employing the ANN. The MCIp, furthermore, enabled users to provide numeric inputs to the MCIp system hands-full with an accuracy of 59.7% after a training session for each gesture of only 10 seconds. The results were obtained using eight participants. Similar experimentation with the MYO Armband has not been found to be reported in any literature at submission of this document. Based on this novel experimentation, the main contribution of this research study is a suggestion that the application of a MYO Armband, as a commercially available muscle-sensing device on the market, has the potential as an MCI to recognise the finger gestures hands-free and hands-full. An accurate MCI can increase the efficiency and effectiveness of an HCI tool when it is applied to different applications in a warehouse where noise and hands-full activities pose a challenge. Future work to improve its accuracy is proposed
    • 

    corecore