Feature Selection for Document Classification : Case Study of Meta-heuristic Intelligence and Traditional Approaches

Author: Khin Sandar Kyaw
Publication venue: 'Faculty of Medicine Prince of Songkla University'
Publication date: 01/01/2020

Doctor of Philosophy (Computer Engineering), 2020

News consumption around the world has shifted from paper to electronic formats, and the rate at which newspapers and magazines publish on websites has increased dramatically. At the same time, text feature selection for automatic document classification (ADC) has become a major challenge because of the unstructured nature of text features, known as the “multi-dimension feature problem”. Although powerful schemes for text feature selection continue to be developed, a research gap remains in the “optimization of feature selection problem (OFSP)”, that is, the search for globally optimal features. Meta-heuristic intelligence for the knowledge discovery process (KDP) has become critical to overcoming the NP-hard OFSP, because it offers effective performance with efficient computation time. This research therefore proposes a meta-heuristic approach to feature selection optimization that searches for the globally optimal features for ADC. The thesis presents a case study of meta-heuristic intelligence and traditional approaches to feature selection optimization in document classification. It covers eleven meta-heuristic algorithms for finding the optimal feature subset: Ant Colony search, Artificial Bee Colony search, Bat search, Cuckoo search, Evolutionary search, Elephant search, Firefly search, Flower search, Genetic search, Rhinoceros search, and Wolf search. The results of the proposed model are then compared with three traditional search algorithms: Best First search (BFS), Greedy Stepwise (GS), and Ranker search (RS). A data mining framework is also applied, comprising data preprocessing, feature engineering, building the learning model, and evaluating the performance of the proposed meta-heuristic feature selection using various performance and computational complexity measures. In data preprocessing, tokenization, stop-word handling, stemming and lemmatization, and normalization are applied. In feature engineering, n-gram TF-IDF feature extraction is used to build the feature vector, and both filter and wrapper approaches are applied to observe different cases. Three classifiers, J48, Naïve Bayes, and Support Vector Machine, are used to build the document classification model. According to the results, the proposed system dramatically reduces the number of selected features, removing features that can degrade the performance of the learning model. In addition, the selected global feature subset yields better performance than traditional search under the single objective function of the proposed model.
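The preprocessing and n-gram TF-IDF steps named in this abstract can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the stop-word list, the function names, the plain `tf * log(N / df)` weighting, and the summed-weight ranking are all hypothetical choices, not the thesis's actual implementation, and stemming and lemmatization are omitted.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the thesis's list).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = "".join(c.lower() if c.isalpha() else " " for c in text).split()
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_max=2):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    term_lists = [ngrams(tokenize(d), n_max) for d in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty documents
        vectors.append({t: (c / total) * math.log(n_docs / df[t])
                        for t, c in counts.items()})
    return vectors

def rank_features(vectors, k):
    """Filter-style selection: keep the k terms with the highest summed weight."""
    scores = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            scores[term] += weight
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:k]]
```

A term that occurs in every document receives weight zero, which is why a filter ranking of this kind discards it first. A wrapper approach would instead score candidate feature subsets by training a classifier on each one.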

Invariance in deep representations

Author: Ilse M.
Publication date: 01/01/2022

In this thesis, Invariance in Deep Representations, we propose novel solutions to the problem of learning invariant representations. We adopt two distinct notions of invariance. One is rooted in symmetry groups and the other in causality. Finally, although the two notions were developed independently of each other, we aim to take a first step towards unifying them. The thesis consists of four main sections where: (i) We propose a neural network-based permutation-invariant aggregation operator that corresponds to the attention mechanism. We develop a novel approach for set classification. (ii) We demonstrate that causal concepts can be used to explain the success of data augmentation by describing how they can weaken the spurious correlation between the observed domains and the task labels. We demonstrate that data augmentation can serve as a tool for simulating interventional data. (iii) We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder that lives in the same space as the treatment variable without changing the observational and interventional distributions entailed by the causal model. After the reduction, we parameterize the reduced causal model using a flexible class of transformations, so-called normalizing flows. (iv) We propose the Domain Invariant Variational Autoencoder, a generative model that tackles the problem of domain shifts by learning three independent latent subspaces, one for the domain, one for the class, and one for any residual variations.
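To make the idea of a permutation-invariant, attention-based aggregation in item (i) concrete, here is a minimal sketch in plain Python. It assumes one common form of attention pooling, with scores `w . tanh(V h_k)` passed through a softmax. The function name and the fixed parameters `w` and `v` are hypothetical; in a real model they would be learned, and the thesis's exact operator may differ.

```python
import math

def attention_pool(instances, w, v):
    """Pool a set of feature vectors into one vector using attention weights.

    score_k = w . tanh(V h_k), a = softmax(scores), output = sum_k a_k * h_k.
    Each weight depends only on its own instance and on the sum over the set,
    so reordering the instances does not change the pooled vector.
    """
    def score(h):
        hidden = [math.tanh(sum(v_ij * h_j for v_ij, h_j in zip(row, h)))
                  for row in v]
        return sum(w_i * z_i for w_i, z_i in zip(w, hidden))

    scores = [score(h) for h in instances]
    top = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, instances))
              for d in range(dim)]
    return pooled, weights

# Hypothetical toy set of three 2-dimensional instances and fixed parameters.
bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[0.5, -0.5], [1.0, 1.0]]
w = [1.0, -1.0]
pooled, weights = attention_pool(bag, w, v)
```

Calling `attention_pool` on `bag[::-1]` gives the same pooled vector up to floating-point rounding, which is the invariance property a set classifier relies on.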

UvA-DARE

Leveraging eXtented Reality & Human-Computer Interaction for User Experience in 360° Video

Author: Bala Paulo Alexandre Câmara
Publication date: 01/10/2020

EXtended Reality systems have resurged as a medium for work and entertainment. While 360° video has been characterized as less immersive than computer-generated VR, its realism, ease of use and affordability mean it is in widespread commercial use. Based on the prevalence and potential of the 360° video format, this research focuses on improving and augmenting the user experience of watching 360° video. By leveraging knowledge from eXtended Reality (XR) systems and Human-Computer Interaction (HCI), this research addresses two issues affecting user experience in 360° video: Attention Guidance and Visually Induced Motion Sickness (VIMS). This research relies on the construction of multiple artifacts to answer the defined research questions: (1) IVRUX, a tool for analysis of immersive VR narrative experiences; (2) Cue Control, a tool for creating spatial audio soundtracks for 360° video that also enables the collection and analysis of metrics captured from the user experience; and (3) the VIMS mitigation pipeline, a linear sequence of modules (including optical flow and visual SLAM, among others) that control parameters for visual modifications such as a restricted Field of View (FoV). These artifacts are accompanied by evaluation studies targeting the defined research questions. Through Cue Control, this research shows that non-diegetic music can be spatialized to act as orientation for users. A partial spatialization of music was deemed ineffective when used for orientation. Additionally, our results demonstrate that diegetic sounds are used for notification rather than orientation. Through the VIMS mitigation pipeline, this research shows that a dynamic restricted FoV has a statistically significant effect in mitigating VIMS while maintaining desired levels of Presence. Both Cue Control and the VIMS mitigation pipeline emerged from a Research through Design (RtD) approach, where the IVRUX artifact is the product of design knowledge and gave direction to the research. The research presented in this thesis is of interest to practitioners and researchers working on 360° video and helps delineate future directions in making 360° video a rich design space for interaction and narrative.
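As a rough illustration of how a pipeline module might turn a motion estimate into a dynamic restricted FoV, the sketch below maps a normalized motion magnitude to a field-of-view angle and smooths the change over time. Everything here is an assumption made for illustration: the function name, the linear mapping, the degree values, and the smoothing factor are hypothetical and are not taken from the thesis.

```python
def restricted_fov(motion, prev_fov, full_fov=110.0, min_fov=60.0,
                   motion_max=1.0, smoothing=0.2):
    """Return the next FoV in degrees for a motion estimate in [0, motion_max].

    More motion gives a narrower target FoV. Exponential smoothing moves the
    current FoV only part of the way to the target, avoiding abrupt jumps.
    All default values are illustrative assumptions.
    """
    # Clamp the normalized motion level to [0, 1].
    level = min(max(motion / motion_max, 0.0), 1.0)
    # Linear interpolation between the full and the minimum FoV.
    target = full_fov - level * (full_fov - min_fov)
    return prev_fov + smoothing * (target - prev_fov)

# Hypothetical per-frame motion magnitudes, e.g. derived from optical flow.
fov = 110.0
history = []
for motion in [0.0, 0.4, 0.9, 0.9, 0.2]:
    fov = restricted_fov(motion, fov)
    history.append(fov)
```

In this sketch the FoV narrows gradually while motion is high and widens again when it drops, which is one simple way to restrict the view without a visible snap.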

Speech Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes