1,149 research outputs found
Hybrid On-Device Cloud Scheme for Re-Identification of Persons Based on Shared Embedding Gallery
Generally, the present disclosure is directed to a system for facial and/or person recognition via machine learning and the Internet of Things (IoT). In particular, in some implementations, the systems and methods of the present disclosure can include or otherwise leverage a machine learning and IoT system or device to track and/or identify a person based on video images taken by one or more devices. For example, a hybrid on-device and cloud scheme can enable locally derived embeddings from multiple camera devices to be sent to a shared cloud space, which can cluster the embeddings to generate a person model for a given person. Later, a camera device participating in the scheme can again detect a face and can match an embedding generated for the face against the shared gallery of person models to (potentially) re-identify the previously observed person.
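The matching step described above can be illustrated with a minimal sketch. It assumes that each person model in the shared gallery is summarised by a single centroid embedding and that cosine similarity with a fixed threshold decides a match; neither detail is stated in the abstract, and the names `match_against_gallery`, `gallery` and `threshold` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_against_gallery(embedding, gallery, threshold=0.8):
    """Return (person_id, score) for the most similar person model.

    `gallery` maps a person id to a centroid embedding (an assumption of this
    sketch). If the best score is below `threshold`, person_id is None,
    meaning the face is treated as not previously observed.
    """
    best_id, best_score = None, -1.0
    for person_id, centroid in gallery.items():
        score = cosine_similarity(embedding, centroid)
        if score > best_score:
            best_id, best_score = person_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Hypothetical two-person gallery with 2-D embeddings for illustration only.
gallery = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}
known_id, known_score = match_against_gallery([0.9, 0.1], gallery)
unknown_id, unknown_score = match_against_gallery([1.0, 1.0], gallery)
```

In this sketch the threshold trades false re-identifications against missed ones; a real deployment would tune it on held-out data.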
Intelligent strategies for mobile robotics in laboratory automation
In this thesis a new intelligent framework is presented for mobile robots in laboratory automation. It comprises: a new multi-floor indoor navigation method with intelligent multi-floor path planning; a new signal filtering method with which the robots forecast their indoor coordinates; a new human-feature-based strategy for smart robot-human collision avoidance; a new robot power forecasting method used to decide on a distributed transportation task; and a new blind approach for the robots' arm manipulations.
Evaluation and analysis of hybrid intelligent pattern recognition techniques for speaker identification
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The rapid momentum of technological progress in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the content (i.e. text-independent), and to design efficient methods of combining face and voice to produce a robust authentication system.
A novel approach towards speaker identification is developed using wavelet analysis and multiple neural networks, including Probabilistic Neural Network (PNN), General Regressive Neural Network (GRNN) and Radial Basis Function Neural Network (RBF NN), with the AND voting scheme. This approach is tested on the GRID and VidTIMIT corpora, and comprehensive test results have been validated against state-of-the-art approaches. The system was found to be competitive: it improved the recognition rate by 15% compared to the classical Mel-frequency Cepstral Coefficients (MFCC), and reduced the recognition time by 40% compared to Back Propagation Neural Network (BPNN), Gaussian Mixture Models (GMM) and Principal Component Analysis (PCA).
Another novel approach using vowel formant analysis is implemented using Linear Discriminant Analysis (LDA). Vowel-formant-based speaker identification is best suited to real-time implementation and requires only a few bytes of information to be stored for each speaker, making it both storage and time efficient. Tested on GRID and VidTIMIT, the proposed scheme was found to be 85.05% accurate when Linear Predictive Coding (LPC) is used to extract the vowel formants, which is much higher than the accuracy of BPNN and GMM. Since the proposed scheme does not require any training time other than creating a small database of vowel formants, it is faster as well. Furthermore, an increasing number of speakers makes it difficult for BPNN and GMM to sustain their accuracy, but the proposed score-based methodology stays almost linear.
Finally, a novel audio-visual fusion based identification system is implemented using GMM and MFCC for speaker identification and PCA for face recognition. The results of speaker identification and face recognition are fused at different levels, namely the feature, score and decision levels. Both the score-level and decision-level (with OR voting) fusions were shown to outperform the feature-level fusion in terms of accuracy and error resilience. The result is in line with the distinct nature of the two modalities, which lose themselves when combined at the feature level. The GRID and VidTIMIT test results validate that the proposed scheme is one of the best candidates for the fusion of face and voice due to its low computational time and high recognition accuracy.
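The difference between score-level and decision-level fusion can be sketched as follows. This is an illustrative example only, not the thesis's implementation: it assumes each modality produces a normalised score in [0, 1] per enrolled identity, uses an equal-weight sum for score-level fusion, and treats OR and AND voting as simple Boolean combinations of per-modality accept decisions. The function names and the `voice_weight` parameter are hypothetical.

```python
def score_level_fusion(voice_scores, face_scores, voice_weight=0.5):
    """Combine per-identity scores with a weighted sum and return the top identity.

    Both inputs map identity -> score, assumed to be normalised to [0, 1]
    and to cover the same identities.
    """
    fused = {
        identity: voice_weight * voice_scores[identity]
        + (1.0 - voice_weight) * face_scores[identity]
        for identity in voice_scores
    }
    return max(fused, key=fused.get)


def decision_level_or(voice_accepts, face_accepts):
    """OR voting: accept the claimed identity if either modality accepts it."""
    return voice_accepts or face_accepts


def decision_level_and(voice_accepts, face_accepts):
    """AND voting: accept the claimed identity only if both modalities accept it."""
    return voice_accepts and face_accepts


# Hypothetical scores: the voice model weakly prefers alice,
# the face model strongly prefers bob.
voice = {"alice": 0.6, "bob": 0.4}
face = {"alice": 0.2, "bob": 0.9}
fused_identity = score_level_fusion(voice, face)
```

OR voting tolerates the failure of one modality (for example, a noisy recording), whereas AND voting is stricter and rejects unless both modalities agree.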
The effect of high variability and individual differences on phonetic training of Mandarin tones
High variability phonetic training (HVPT) has been found to be more effective than low variability phonetic training (LVPT) in learning various non-native phonetic contrasts. However, little research has considered whether this applies to the learning of tone contrasts. Two relevant studies suggested that the effect of high variability training depends on the perceptual aptitude of participants (Perrachione, Lee, Ha, & Wong, 2011; Sadakata & McQueen, 2014). It is also unclear how different types of individual difference measures interact with the learning of a tonal language. What work there is suggests that musical ability is related to discriminating tonal information and that, in general, attention and working memory are linked to language learning. The present study extends these findings by examining the interaction between individual aptitude and input variability, and between learning outcomes and individual measures, using natural, meaningful L2 input (both previous studies used pseudowords). In Study 1, forty English speakers took part in an eight-session phonetic training paradigm. They were assigned to high or low variability training groups. High variability training used four speakers during the training sessions while low variability training used one. All participants learned real Mandarin tones and words. Individual aptitude was measured using an identification and a categorisation task. Learning was measured using a categorical discrimination task, an identification task and two production tasks. Overall, all groups improved in both production and perception of tones, and this improvement transferred to novel voices and items, demonstrating the effectiveness of training despite the increased complexity of the training material compared with previous research.
Although the low variability group exhibited better learning during training than the high variability group, there was no evidence that the different variability training conditions led to different performances in any of the tests of generalisation. Moreover, although performance on one of the aptitude tasks significantly predicted overall performance in categorical discrimination, identification and training tasks, it did not predict improvement from pre- to post-test. Critically, there was also no interaction between individual aptitude and variability condition, contradicting previous findings. One possibility was that the high variability condition was too difficult because speakers were randomly presented during training, resulting in low trial-by-trial consistency. This greater difficulty might block any advantage of variability for generalisation. In order to examine this, Study 2 recruited an additional 20 native English speakers and tested them in a further condition, identical to the previous high variability condition except that each speaker was presented in their own block during the training. Although participants performed better in training compared with the high variability group from Study 1, there was again no difference in generalisation compared with the previous conditions, and again no interaction between individual aptitude and variability condition was found. Bayes Factors were also used to assess the null results. There was evidence for the null for the benefits of high variability for generalisation, but only ambiguous evidence regarding whether there was an interaction between variability and individual aptitude. The HVPT used in Study 1 and Study 2 did not replicate the interaction between variability condition and aptitude found in previous studies. Moreover, although one of the measures of aptitude did correlate with the baseline measures of performance, there was no evidence that it predicted learning due to training.
Additionally, the two individual aptitude measures used in Study 1 and 2 (taken from Perrachione et al. (2011) and Sadakata and McQueen (2014)) are not comprehensive. They are natural language-related tasks which directly measure tone perception itself, rather than the underlying cognitive factors which could underpin this ability. Another interesting question is whether these different cognitive factors might contribute differently to learners at different stages, particularly since language training studies vary as to whether they use current learners of the language or naïve participants, a factor that may contribute towards differing findings in the literature. To explore these issues, Study 3 investigated the relationship between a battery of cognitive individual difference measures and Mandarin tone learning. Sixty native English speakers (forty of whom were currently studying Mandarin at undergraduate level, twenty of whom were naïve learners) took part in a six-session training paradigm. With high-variability training stimuli similar to those used in Study 2 (four speakers, blocked), their learning outcomes were assessed by identification, categorical discrimination and production tasks similar to those in Study 1. Their working memory, attention and musical ability were also measured. Overall, both groups showed improvements during training and in the generalisation tasks. Although Mandarin learner participants performed better than naïve participants overall, their improvements were not generally greater than those of naïve participants. Each of the individual difference measures was used to predict participants' performance at pre-test and their improvement due to training. Bayes Factors were used as the key method of inference. For Mandarin learner participants, both performance at pre-test and pre- to post-test improvement were strongly predicted by attention measures, while for naïve speakers, musical ability was the dominant predictor of pre- to post-test improvement.
This series of studies demonstrates that Mandarin lexical tones can be trained using natural stimuli embedded in a word learning task, and that learning generalises to untrained voices and items as well as to production. Although there is no evidence in the current data that the type of training materials affected learning outcomes, tone learning is indeed affected by individual cognitive factors, such as attention and musical ability, with these playing a different role for learners at different stages.
Evaluating indoor positioning systems in a shopping mall : the lessons learned from the IPIN 2018 competition
The Indoor Positioning and Indoor Navigation (IPIN) conference holds an annual competition in which indoor localization systems from different research groups worldwide are evaluated empirically. The objective of this competition is to establish a systematic evaluation methodology with rigorous metrics both for real-time (on-site) and post-processing (off-site) situations, in a realistic environment unfamiliar to the prototype developers. For the IPIN 2018 conference, this competition was held on September 22nd, 2018, in Atlantis, a large shopping mall in Nantes (France). Four competition tracks (two on-site and two off-site) were designed. They consisted of several 1 km routes traversing several floors of the mall. Along these paths, 180 points were topographically surveyed with a 10 cm accuracy, to serve as ground truth landmarks, combining theodolite measurements, differential global navigation satellite system (GNSS) and 3D scanner systems. 34 teams effectively competed. The accuracy score corresponds to the third quartile (75th percentile) of an error metric that combines the horizontal positioning error and the floor detection. The best results for the on-site tracks showed an accuracy score of 11.70 m (Track 1) and 5.50 m (Track 2), while the best results for the off-site tracks showed an accuracy score of 0.90 m (Track 3) and 1.30 m (Track 4). These results showed that it is possible to obtain high accuracy indoor positioning solutions in large, realistic environments using wearable light-weight sensors without deploying any beacon. This paper describes the organization work of the tracks, analyzes the methodology used to quantify the results, reviews the lessons learned from the competition and discusses its future.
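The accuracy score described above (the third quartile of an error that combines horizontal error and floor detection) can be sketched as follows. The abstract does not say how the two components are combined; this sketch assumes a fixed penalty in metres is added per floor of error, and the value of `FLOOR_PENALTY_M` is an assumption chosen for illustration. The percentile uses linear interpolation between ranks, which is one of several common conventions.

```python
import math

# Assumed penalty (in metres) added per floor of error; not given in the abstract.
FLOOR_PENALTY_M = 15.0


def combined_error(horizontal_error_m, floor_difference):
    """Horizontal error plus a fixed penalty for each floor of misdetection."""
    return horizontal_error_m + FLOOR_PENALTY_M * abs(floor_difference)


def percentile(values, fraction):
    """Percentile with linear interpolation between the two closest ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = fraction * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def accuracy_score(samples):
    """Third quartile of the combined error over (horizontal_error_m, floor_difference) pairs."""
    return percentile([combined_error(h, f) for h, f in samples], 0.75)


# Hypothetical evaluation points: the last estimate is on the wrong floor.
samples = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (5.0, 1)]
score = accuracy_score(samples)
```

Using the third quartile rather than the mean keeps a small number of gross errors, such as the wrong-floor estimate in the example, from dominating the score, while still penalising systems that are wrong more than a quarter of the time.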