Generalized Hybrid Evolutionary Algorithm Framework with a Mutation Operator Requiring no Adaptation
This paper presents a generalized hybrid evolutionary optimization structure that combines nondeterministic and deterministic algorithms to exploit their respective strengths, and that also offers the behaviors of the three originating classes of evolutionary algorithms (EAs). In addition, a robust mutation operator is developed that removes the need for mutation adaptation, based on the mutation properties of binary-coded individuals in a genetic algorithm. The behavior of this mutation operator is examined in full and its performance is compared with that of adaptive mutations. The results show that the new mutation operator outperforms adaptive mutation operators while avoiding the complication of extra adaptive parameters in an EA representation.
Multimodal music information processing and retrieval: survey and future challenges
To improve performance in various music information processing
tasks, recent studies exploit different modalities that capture diverse
aspects of music. Such modalities include audio recordings, symbolic music
scores, mid-level representations, motion and gestural data, video recordings,
editorial or cultural tags, lyrics, and album cover art. This paper critically
reviews the various approaches adopted in Music Information Processing and
Retrieval and highlights how multimodal algorithms can help Music Computing
applications. First, we categorize the related literature based on the
application it addresses. Subsequently, we analyze existing information fusion
approaches, and we conclude with the set of challenges that the Music Information
Retrieval and Sound and Music Computing research communities should focus on in
the coming years.
Multimodal Convolutional Neural Networks for Matching Image and Sentence
In this paper, we propose multimodal convolutional neural networks (m-CNNs)
for matching image and sentence. Our m-CNN provides an end-to-end framework
with convolutional architectures to exploit image representation, word
composition, and the matching relations between the two modalities. More
specifically, it consists of one image CNN encoding the image content, and one
matching CNN learning the joint representation of image and sentence. The
matching CNN composes words into different semantic fragments and learns the
inter-modal relations between the image and the composed fragments at different
levels, thus fully exploiting the matching relations between image and sentence.
Experimental results on benchmark databases of bidirectional image and sentence
retrieval demonstrate that the proposed m-CNNs can effectively capture the
information necessary for image and sentence matching. Specifically, our
proposed m-CNNs achieve state-of-the-art performance for bidirectional image
and sentence retrieval on the Flickr30K and Microsoft COCO databases.
Comment: Accepted by ICCV 201
Continuous function optimization using hybrid ant colony approach with orthogonal design scheme
A hybrid Orthogonal Scheme Ant Colony Optimization (OSACO) algorithm for continuous function optimization (CFO) is presented in this paper. The methodology integrates the advantages of Ant Colony Optimization (ACO) and the Orthogonal Design Scheme (ODS). OSACO is based on the following principles: a) each independent variable space (IVS) of CFO is dispersed into a number of random and movable nodes; b) the pheromone carriers of ACO are shifted to the nodes; c) an ant obtains a solution path by choosing one appropriate node from each IVS; d) the best path found is further improved with the ODS. The proposed algorithm has been successfully applied to 10 benchmark test functions. Its performance is studied and compared with that of CACO and FEP.
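Principles a) to c) above can be illustrated with a minimal Python sketch. This is not the OSACO algorithm from the paper: the function name `node_aco_minimize`, every parameter value, the pheromone update rule, and the node relocation rule are assumptions made for illustration only. In particular, step d) is replaced here by a simple single-coordinate perturbation of the best path, not an orthogonal design.

```python
import random


def node_aco_minimize(f, bounds, n_nodes=10, n_ants=20, n_iters=100,
                      evaporation=0.1, seed=0):
    """Hypothetical node-based ACO sketch for continuous minimization."""
    rng = random.Random(seed)
    dims = len(bounds)
    # (a) each variable's range is represented by random, movable nodes
    nodes = [[rng.uniform(lo, hi) for _ in range(n_nodes)] for lo, hi in bounds]
    # (b) pheromone is stored on the nodes themselves
    pher = [[1.0] * n_nodes for _ in range(dims)]
    best_picks = [0] * dims
    best_val = f([nodes[d][0] for d in range(dims)])

    for _ in range(n_iters):
        for _ in range(n_ants):
            # (c) an ant builds a path by picking one node per variable
            picks = [rng.choices(range(n_nodes), weights=pher[d])[0]
                     for d in range(dims)]
            val = f([nodes[d][k] for d, k in enumerate(picks)])
            if val < best_val:
                best_val, best_picks = val, picks

        for d in range(dims):
            # evaporate, then reinforce the node on the best path (assumed rule)
            pher[d] = [(1.0 - evaporation) * p for p in pher[d]]
            pher[d][best_picks[d]] += 1.0
            # "movable" nodes: relocate the weakest node that is off the best path
            worst = min((k for k in range(n_nodes) if k != best_picks[d]),
                        key=lambda k: pher[d][k])
            lo, hi = bounds[d]
            nodes[d][worst] = rng.uniform(lo, hi)
            pher[d][worst] = 1.0

        # (d) stand-in for the ODS step: perturb one coordinate of the best
        # path and keep the change only if it improves the objective
        d = rng.randrange(dims)
        lo, hi = bounds[d]
        old = nodes[d][best_picks[d]]
        moved = old + rng.gauss(0.0, 0.1 * (hi - lo))
        nodes[d][best_picks[d]] = min(hi, max(lo, moved))
        trial = f([nodes[i][best_picks[i]] for i in range(dims)])
        if trial < best_val:
            best_val = trial
        else:
            nodes[d][best_picks[d]] = old

    return [nodes[d][best_picks[d]] for d in range(dims)], best_val


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = node_aco_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

The sphere function is used only as a convenient test objective; it is not claimed to be one of the 10 benchmark functions mentioned in the abstract.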
Early Turn-taking Prediction with Spiking Neural Networks for Human Robot Collaboration
Turn-taking is essential to the structure of human teamwork. Humans are
typically aware of team members' intention to keep or relinquish their turn
before a turn switch, where the responsibility of working on a shared task is
shifted. Future co-robots are also expected to provide such competence. To that
end, this paper proposes the Cognitive Turn-taking Model (CTTM), which
leverages cognitive models (i.e., Spiking Neural Networks) to achieve early
turn-taking prediction. The CTTM framework can process multimodal human
communication cues (both implicit and explicit) and predict human turn-taking
intentions at an early stage. The proposed framework is tested on a simulated
surgical procedure, where a robotic scrub nurse predicts the surgeon's
turn-taking intention. It was found that the proposed CTTM framework
outperforms the state-of-the-art turn-taking prediction algorithms by a large
margin. It also outperforms humans when presented with partial observations of
communication cues (i.e., less than 40% of full actions). This early prediction
capability enables robots to initiate turn-taking actions at an early stage,
which facilitates collaboration and increases overall efficiency.
Comment: Submitted to IEEE International Conference on Robotics and Automation
(ICRA) 201