770 research outputs found

    Ensemble deep learning: A review

    Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architectures show better performance than shallow or traditional classification models. Deep ensemble learning models combine the advantages of both deep learning and ensemble learning, so that the final model has better generalization performance. This paper reviews the state-of-the-art deep ensemble models and hence serves as an extensive summary for researchers. The ensemble models are broadly categorised into ensemble models like bagging, boosting and stacking; negative correlation based deep ensemble models; explicit/implicit ensembles; homogeneous/heterogeneous ensembles; decision fusion strategies; and unsupervised, semi-supervised, reinforcement learning, online/incremental, and multilabel based deep ensemble models. Applications of deep ensemble models in different domains are also briefly discussed. Finally, we conclude this paper with some future recommendations and research directions.

    Enhancing Emotion Classification in Malayalam Accented Speech: An In-Depth Clustering Approach

    Accurate emotion classification in accented speech for the Malayalam language poses a unique challenge in the realm of speech recognition. In this study, we explore the application of various clustering algorithms to this specific dataset, evaluating their effectiveness using the Silhouette Score as a measure of cluster quality. Among the clustering methods, Affinity Propagation emerged as the frontrunner, achieving the highest Silhouette Score of 0.5255. This result indicates superior cluster quality, characterized by well-defined and distinct groups. OPTICS and Mean Shift Clustering also demonstrated strong performance with scores of 0.4029 and 0.2511, respectively, indicating the presence of relatively distinct and well-formed clusters. In addition, we introduced Ensemble Clustering (Majority Voting), which achieved a score of 0.2399, indicating moderate cluster distinction. These findings provide a valuable perspective on the potential advantages of ensemble methods in this context. Our experimental results shed light on the effectiveness of various clustering methods for emotion classification in accented Malayalam speech. This study contributes to the advancement of speech recognition technology and lays the groundwork for further research in this area.
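
For readers unfamiliar with the metric reported in the abstract above, the following is a minimal sketch of how a Silhouette Score is commonly computed. It is not taken from the paper: the function name `silhouette_score`, the Euclidean distance, and the toy points are all assumptions made for illustration. Scores near 1 indicate compact, well-separated clusters; scores near 0 or below indicate overlapping or misassigned points.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Illustrative helper only; the name and signature are assumptions,
    not the code used in the study.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(p, q) for q in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


if __name__ == "__main__":
    # Toy data: two tight groups that are far apart.
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(silhouette_score(pts, [0, 0, 1, 1]))  # labels match the groups
    print(silhouette_score(pts, [0, 1, 0, 1]))  # labels cut across the groups
```

The first call should score close to 1 and the second below 0, which is the sense in which a higher value such as 0.5255 suggests better-separated clusters than 0.2399.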

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.

    Review on Active and Passive Remote Sensing Techniques for Road Extraction

    Digital maps of road networks are a vital part of digital cities and intelligent transportation. In this paper, we provide a comprehensive review of road extraction based on various remote sensing data sources, including high-resolution images, hyperspectral images, synthetic aperture radar images, and light detection and ranging. This review is divided into three parts. Part 1 provides an overview of the existing data acquisition techniques for road extraction, including data acquisition methods, typical sensors, application status, and prospects. Part 2 outlines the main road extraction methods based on the four data sources; the methods for each data source are described and analysed in detail. Part 3 presents the combined application of multisource data for road extraction. Evidently, different data acquisition techniques have unique advantages, and the combination of multiple sources can improve the accuracy of road extraction. The main aim of this review is to provide a comprehensive reference for research on existing road extraction technologies. Peer reviewed.

    Research on Emotion Classification Based on Multi-modal Fusion

    Nowadays, people's expression on the Internet is no longer limited to text, especially with the rise of the short video boom, leading to the emergence of a large amount of modal data such as text, pictures, audio, and video. Compared to single-mode data, multi-modal data always contains massive information. Mining multi-modal information can help computers better understand human emotional characteristics. However, because multi-modal data show obvious dynamic time series features, it is necessary to solve the dynamic correlation problem within a single mode and between different modes in the same application scene during the fusion process. To solve this problem, this paper establishes a feature extraction framework with three-dimensional dynamic expansion based on common multi-modal data, for example video, sound, and text. Based on this framework, a multi-modal fusion-matching framework with spatial and temporal feature enhancement is built to resolve the dynamic correlation within and between modes, respectively, and then to model the short- and long-term dynamic correlation information between different modes. Multiple groups of experiments performed on the MOSI datasets show that the emotion recognition model constructed on the proposed framework can better utilize the more complex complementary information between different modal data. Compared with other multi-modal data fusion models, the spatial-temporal attention-based multimodal data fusion framework proposed in this paper significantly improves the emotion recognition rate and accuracy when applied to multi-modal emotion analysis, so it is more feasible and effective.

    Deep learning for internet of underwater things and ocean data analytics

    The Internet of Underwater Things (IoUT) is an emerging technological ecosystem developed for connecting objects in maritime and underwater environments. IoUT technologies are empowered by an extremely large number of deployed sensors and actuators. In this thesis, multiple IoUT sensory data are augmented with machine intelligence for forecasting purposes.