258 research outputs found

    The application of manifold based visual speech units for visual speech recognition

    This dissertation presents a new learning-based representation, referred to as a Visual Speech Unit (VSU), for visual speech recognition (VSR). The automated recognition of human speech using only features from the visual domain has become a significant research topic that plays an essential role in the development of many multimedia systems such as audio-visual speech recognition (AVSR), mobile phone applications, human-computer interaction (HCI) and sign language recognition. The inclusion of lip visual information is opportune since it can improve the overall accuracy of audio or hand recognition algorithms, especially when such systems are operated in environments characterized by a high level of acoustic noise. The main contribution of the work presented in this thesis lies in the development of this VSU representation. The main components of the developed VSR system are applied to: (a) segment the mouth region of interest, (b) extract the visual features from the real-time input video image and (c) identify the visual speech units. The major difficulty associated with VSR systems resides in the identification of the smallest elements contained in the image sequences that represent the lip movements in the visual domain. The VSU concept as proposed represents an extension of the standard viseme model that is currently applied for VSR. The VSU model augments the standard viseme approach by including in this new representation not only the data associated with the articulation of the visemes but also the transitory information between consecutive visemes. A large section of this thesis has been dedicated to analysing the performance of the new VSU model when compared with that attained for standard (MPEG-4) viseme models. Two experimental results indicate that: (1) the developed VSR system achieved 80-90% correct recognition when applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only 62-72%; (2) when 15 words are identified using VSUs and visemes as the visual speech element, the accuracy rate for word recognition based on VSUs is 7%-12% higher than the accuracy rate based on visemes.

    Dictionary-based lip reading classification

    Visual lip reading recognition is an essential stage in many multimedia systems such as “Audio Visual Speech Recognition” [6], “Mobile Phone Visual System for deaf people”, “Sign Language Recognition System”, etc. The use of lip visual features to help audio or hand recognition is appropriate because this information is robust to acoustic noise. In this paper, we describe our work towards developing a robust technique for lip reading classification that extracts the lips in a colour image and uses EMPCA feature extraction and k-nearest-neighbor classification. In order to reduce the dimensionality of the feature space, the lip motion is characterized by three templates that are modelled on different mouth shapes: a closed template, a semi-closed template, and a wide-open template. Our goal is to classify each image sequence based on the distribution of the three templates and to group the words into different clusters. The words that form the database were grouped into three different clusters as follows: group 1 (‘I’, ‘high’, ‘lie’, ‘hard’, ‘card’, ‘bye’), group 2 (‘you’, ‘owe’, ‘word’), group 3 (‘bird’).

    Mask-Wearing Behaviors in Air Travel During Coronavirus Pandemic – An Extended Theory of Planned Behavior Model

    The COVID-19 pandemic has devastated the air transport industry, forcing airlines to take measures to ensure the safety of passengers and crew members. Among the many protective measures, a mask mandate onboard the airplane is an important one, but travelers’ mask-wearing intentions during flight remain uninvestigated, especially in the US, where mask use is a topic of ongoing debate. This study focused on the mask use of airline passengers when they fly during COVID-19, using the theory of planned behavior (TPB) model to examine the relationship between nine predicting factors and the mask-wearing intention in the aircraft cabin. A survey instrument was developed to collect data from 1,124 air travelers on Amazon Mechanical Turk (MTurk), and the data were statistically analyzed using structural equation modeling and logistic regression. Results showed that attitude, descriptive norms, risk avoidance, and information seeking significantly influenced the travelers’ intention to wear a mask during flight during COVID-19. Group analysis indicated that the four factors influenced mask-wearing intentions differently in young, middle-aged, and senior travelers. The results further show a significant impact of the three factors on mask-wearing intention and a strong mediating effect of attitude, indicating that attitude can be used to better understand the relationships between the factors. When five demographic characteristics (age, gender, education, income, and ethnicity) were considered, all except gender could help to explain the group variations in factor impact and the mediating effect in mask-wearing intentions. It was also found that demographic and travel characteristics including age, education, income, and travel frequency can be used to predict whether an airline passenger was willing to pay a large amount to switch to airlines that adopted different mask policies during COVID-19. The findings of this study fill the research gap concerning air travelers’ intentions to wear a mask when flying during a global pandemic and provide recommendations for mask-wearing policies to help the air transport industry recover from COVID-19.

    Cardiovascular risks and bleeding with non-vitamin K antagonist oral anticoagulant versus warfarin in patients with type 2 diabetes : a tapered matching cohort study

    Background: We compared the risk of bleeding and cardiovascular disease (CVD) events between non-vitamin K antagonist oral anticoagulant (NOAC) and warfarin in people with type 2 diabetes (T2DM). Methods: 862 incident NOAC users and 626 incident warfarin users with T2DM were identified from 40 UK general practices (1/4/2017-30/9/2018). Outcomes included incident hospitalisation for bleeding, CVD and re-hospitalisation for CVD within 12 months of the first anticoagulant prescription, identified from linked hospitalisation data. A tapered matching method was applied to form comparison cohorts: coarsened exact matching restricted the comparison to areas of sufficient overlap in missingness and characteristics: (i) demographic characteristics; (ii) clinical measurements; (iii) prior bleeding and CVD history; (iv) prescriptions with bleeding; (v) anti-hypertensive treatment(s); (vi) anti-diabetes treatment(s). Entropy balancing sequentially balanced NOAC and warfarin users on their distribution of (i-vi). Weighted logistic regression modelling estimated outcome odds ratios (ORs), using entropy balancing weights from steps i-vi. Results: The 12-month ORs of bleeding with NOAC (n = 582) vs matched/balanced warfarin (n = 486) were 1.93 (95% confidence interval 0.97-3.84), 2.14 (1.03-4.44), 2.31 (1.10-4.85), 2.42 (1.14-5.14), 2.41 (1.12-5.18), and 2.51 (1.17-5.38) through steps i-vi. ORs for CVD re-hospitalisation were increased with NOAC treatment through steps i-vi: 2.21 (1.04-4.68), 2.13 (1.01-4.52), 2.47 (1.08-5.62), 2.46 (1.02-5.94), 2.51 (1.01-6.20), and 2.66 (1.02-6.94). Conclusions: Incident NOAC use among people with T2DM is associated with increased risk of bleeding hospitalisation and CVD re-hospitalisation compared with incident warfarin use. For T2DM, caution is required in prescribing NOACs as first anticoagulant treatment. Further large-scale replication studies in external datasets are warranted.
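
As a reading aid for the odds ratios above, the following is a minimal sketch of how a reported OR relates to its 95% confidence interval. It assumes the intervals are Wald-type intervals that are symmetric on the log-odds scale, which is common for logistic regression but is not stated in the abstract. Under that assumption the point estimate should sit near the geometric mean of the interval bounds, and the standard error of the log OR can be recovered from the interval width. The function name `log_scale_summary` is illustrative only.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return (geometric mean of the bounds, implied SE of log OR).

    Assumes a Wald-type 95% CI: exp(log(OR) +/- z * SE).
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    point = math.exp((log_lo + log_hi) / 2.0)  # midpoint on the log scale
    se = (log_hi - log_lo) / (2.0 * z)         # half-width divided by z
    return point, se


# (OR, lower, upper) values copied from the abstract
reported = {
    "bleeding, step i": (1.93, 0.97, 3.84),
    "bleeding, step vi": (2.51, 1.17, 5.38),
    "CVD re-hospitalisation, step vi": (2.66, 1.02, 6.94),
}

for label, (odds_ratio, lo, hi) in reported.items():
    point, se = log_scale_summary(lo, hi)
    print(f"{label}: reported OR {odds_ratio:.2f}, "
          f"log-scale midpoint {point:.2f}, implied SE(log OR) {se:.3f}")
```

The same check can be applied to any of the other step-wise estimates; a large gap between the reported OR and the log-scale midpoint would suggest that the interval was not built this way.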

    A new visual speech modelling approach for visual speech recognition

    In this paper we propose a new learning-based representation, referred to as a Visual Speech Unit (VSU), for visual speech recognition (VSR). The new Visual Speech Unit concept proposes an extension of the standard viseme model that is currently applied for VSR by including in this representation not only the data associated with the visemes, but also the transitory information between consecutive visemes. The developed speech recognition system consists of several computational stages: (a) lip segmentation, (b) construction of the Expectation-Maximization Principal Component Analysis (EM-PCA) manifolds from the input video image, (c) registration between the models of the VSUs and the EM-PCA data constructed from the input image sequence and (d) recognition of the VSUs using a standard Hidden Markov Model (HMM) classification scheme. In this paper we were particularly interested in evaluating the classification accuracy obtained for our new VSU models when compared with that attained for standard (MPEG-4) viseme models. The experimental results indicate that we achieved a 90% recognition rate when the system was applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only 52%.

    A PCA based manifold representation for visual speech recognition

    In this paper, we discuss a new Principal Component Analysis (PCA)-based manifold representation for visual speech recognition. The real-time input video data is compressed using Principal Component Analysis, and the low-dimensional points calculated for each frame define the manifold. Since the number of frames that form the video sequence depends on the word complexity, the manifolds must be re-sampled into a fixed, pre-defined number of key-points before they can be used for visual speech classification. These key-points are used as input for a Hidden Markov Model (HMM) classification scheme. We have applied the developed visual speech recognition system to a database containing a group of English words, and the experimental data indicate that the proposed approach is able to produce accurate classification results.
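
The two steps described in this abstract, projecting each frame to a low-dimensional point and re-sampling the resulting trajectory to a fixed number of key-points, can be sketched as follows. This is only an illustration under stated assumptions: PCA is computed per sequence with a plain SVD, the re-sampling uses linear interpolation over frame index, and the function names, the choice of 3 components and the choice of 10 key-points are all hypothetical. The abstract does not say which re-sampling scheme or dimensions the authors used.

```python
import numpy as np


def pca_manifold(frames, n_components=3):
    """Project flattened frames (n_frames, n_pixels) onto the top principal components."""
    X = np.asarray(frames, dtype=float)
    X = X - X.mean(axis=0)  # centre each pixel over time
    # Rows of vt are the principal directions of the centred data
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T  # (n_frames, n_components)


def resample_manifold(points, n_keypoints=10):
    """Linearly interpolate a trajectory (n_frames, d) to n_keypoints samples."""
    points = np.asarray(points, dtype=float)
    src = np.linspace(0.0, 1.0, len(points))  # normalised position of each frame
    dst = np.linspace(0.0, 1.0, n_keypoints)  # normalised position of each key-point
    columns = [np.interp(dst, src, points[:, j]) for j in range(points.shape[1])]
    return np.column_stack(columns)


# Hypothetical sequences of different lengths (random stand-ins for mouth-region frames)
rng = np.random.default_rng(0)
short_word = rng.random((17, 64))
long_word = rng.random((42, 64))

short_keys = resample_manifold(pca_manifold(short_word))
long_keys = resample_manifold(pca_manifold(long_word))
print(short_keys.shape, long_keys.shape)
```

Both sequences end up with the same shape regardless of how many frames they started with, which is what a fixed-length input to a downstream classifier such as an HMM requires.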

    Unusual acceleration and size effects in grain boundary migration with shear coupling

    Grain boundary (GB) migration plays a crucial role in the thermal and mechanical responses of polycrystalline materials, particularly in ultrafine-grained and nano-grained materials exhibiting grain size-dependent properties. This study investigates the migration behaviors of a set of GBs in Ni through atomistic simulations, employing synthetic driving forces and shear stress. Surprisingly, the displacements of some shear-coupling GBs do not follow the widely assumed linear or approximately linear relation with time; instead, they exhibit a noticeable acceleration tendency. Furthermore, as the bicrystal size perpendicular to the GB plane increases, the boundary velocity significantly decreases. These observations are independent of the magnitude and type of driving force but are closely linked to temperature, unique to shear-coupling GBs that display a rise in the kinetic energy component along the shear direction. By adopting a specific boundary condition, the acceleration in migration and size effect can be largely alleviated. However, the continuous rise in kinetic energy persists, leading to the true driving force for GB migration being lower than the applied value. To address this, we propose a technique to extract the true driving force based on a quantitative analysis of the work-energy relation in the bicrystal system. The calculated true mobility reveals that the recently proposed mobility tensor may not be symmetric at relatively large driving forces. These discoveries advance our understanding of GB migration and offer a scheme to extract the true mobility, crucial for meso- and continuum-scale simulations of GB migration-related phenomena such as crack propagation, recrystallization, and grain growth. Comment: 28 pages, 10 figures

    Enhanced Thermal Conductivity for Nanofluids Containing Silver Nanowires with Different Shapes

    Nanofluids are used to enhance the heat transfer properties of common fluids, and most thermal additives are spherical nanoparticles. Up to now, 1D thermal additives have not been well exploited. In this paper, silver nanowires (AgNWs) with well-distributed shape and aspect ratio are synthesized. The results show that, for AgNWs prepared with poly-vinyl-pyrrolidone (PVP) with a molecular weight of 40,000, the thermal conductivity enhancement of the resulting nanofluid is as high as 13.42% at a loading of 0.46 vol.% AgNWs, and the value of the thermal conductivity is 0.2843 W/m·K, which is far higher than when loading the same volume of spherical silver particles. In addition, we use the H&C model to fit the experimental results, and the experimental results are consistent with the model.
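
The figures in this abstract can be explored with a short sketch. It rests on several assumptions that are not in the abstract: that "H&C" refers to the Hamilton-Crosser effective-medium model, that the 13.42% enhancement is defined relative to the base fluid so that 0.2843 W/m·K is the nanofluid value, that bulk silver has a conductivity of roughly 429 W/m·K, and that shape factors of 3 (spheres) and 6 (cylinders) are reasonable illustrative choices. The authors' fitted shape factor is not given, so the sketch does not attempt to reproduce their fit.

```python
def hamilton_crosser_ratio(k_particle, k_fluid, phi, n=3.0):
    """Return k_eff / k_fluid from the Hamilton-Crosser model.

    phi is the particle volume fraction; n is the empirical shape factor
    (n = 3 for spheres, n = 3 / sphericity otherwise).
    """
    numerator = (k_particle + (n - 1.0) * k_fluid
                 - (n - 1.0) * phi * (k_fluid - k_particle))
    denominator = k_particle + (n - 1.0) * k_fluid + phi * (k_fluid - k_particle)
    return numerator / denominator


# Values taken from the abstract
k_nanofluid = 0.2843   # W/m·K at 0.46 vol.% AgNWs
enhancement = 0.1342   # 13.42 %
phi = 0.0046           # 0.46 vol.%

# Assumption: enhancement = (k_nanofluid - k_base) / k_base
k_base = k_nanofluid / (1.0 + enhancement)

# Assumptions: bulk silver conductivity and illustrative shape factors
k_silver = 429.0       # W/m·K, approximate literature value
ratio_spheres = hamilton_crosser_ratio(k_silver, k_base, phi, n=3.0)
ratio_cylinders = hamilton_crosser_ratio(k_silver, k_base, phi, n=6.0)

print(f"implied base-fluid conductivity: {k_base:.4f} W/m·K")
print(f"model ratio, spheres (n=3): {ratio_spheres:.4f}")
print(f"model ratio, cylinders (n=6): {ratio_cylinders:.4f}")
```

With n = 3 the expression reduces to the Maxwell model for spherical inclusions, and larger shape factors give a larger predicted enhancement at the same loading, which is the qualitative trend the abstract reports for nanowires versus spherical particles.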

    Dialysate glucose response phenotypes during peritoneal equilibration test and their association with cardiovascular death : a cohort study

    Different measures of rates of transfer of glucose during the peritoneal equilibration test (PET), undertaken during peritoneal dialysis (PD), might provide additional information regarding a patient's risk of future cardiovascular mortality. This study aimed to characterize the heterogeneity of dialysate glucose (DG) response phenotypes during the PET and compare the cardiovascular mortality rates associated with the different phenotypes. Our cohort was derived from the Henan peritoneal dialysis registry. A total of 3477 patients initiating PD in 2007 to 2014 had the DG measured at 0, 2 and 4 hours (D0, D2, and D4 respectively) during the PET for estimation of D2/D0 and D4/D0. Deaths mainly due to CVD within 2 years of the initiation of PD were defined as the outcome. Latent class mixed-effect models were fitted to identify distinct phenotypes of the DG response during the PET. Multivariable unconditional logistic regression models with adjustment for cardiometabolic risk factors were used to compare the 2-year risk of cardiovascular mortality among patients in the different latent classes. Three distinct DG response phenotypes during the PET were identified. Those with consistently high D2/D0 and D4/D0 ratios had a 1.22 [95% confidence interval: 1.02, 1.35] excess risk of a cardiovascular death within 2 years of commencing PD compared with patients with the lowest D2/D0 ratio and decreased D4/D0 ratio after adjustment for cardiometabolic risk factors. Consistently elevated D2/D0 and D4/D0 ratios during the PET are associated with an increased risk of 2-year cardiovascular mortality independent of other cardiometabolic risk factors. In view of the potential bias due to unmeasured confounders (e.g., family history of cardiovascular diseases and dietary patterns), this association should be further validated in other external cohorts.