13 research outputs found

    On data mining in context : cases, fusion and evaluation

    Data mining can be seen as a process, with modeling as the core step. However, other steps such as planning, data preparation, evaluation and deployment are of key importance for applications. This thesis studies data mining in the context of these other steps, with the goal of improving data mining applicability. We introduce cases that provide an end-to-end overview and serve as motivating examples, and then focus on specific research topics. We discuss the problem of data mining across multiple sources, with data fusion as a potential solution. This is an interesting research topic, as it removes barriers for applications, and data mining itself can be used to carry out the fusion. We then analyze a large-scale experiment in real-world data mining. We use the bias-variance evaluation framework across all steps in the process to investigate the large spread in results for a data mining competition. We conclude with a study advocating model profiling for novel classifiers. Given that a novel classifier is unlikely to outperform all competing classifiers across all problems, it is more interesting to characterize on which problems it performs best and to which other algorithms its behavior is most similar.
    LEI Universiteit Leiden; Algorithms and the Foundations of Software technolog

    Sign and search: sign search functionality for sign language lexica

    Sign language lexica are a useful resource for researchers and people learning sign languages. Current implementations allow a user to search for a sign either by its gloss or by selecting its primary features, such as handshape and location. This study explores a reverse search functionality in which a user signs a query sign in front of a webcam and retrieves a set of matching signs. By extracting different body joint combinations (upper body, dominant hand's arm and wrist) using the pose estimation framework OpenPose, we compare four techniques (PCA, UMAP, DTW and Euclidean distance) as distance metrics between 20 query signs, each performed by eight participants, on a 1200-sign lexicon. The results show that UMAP and DTW can predict a matching sign with 80% and 71% accuracy respectively among the top-20 retrieved signs, using the movement of the dominant hand's arm. Using DTW and adding more sign instances from other participants to the lexicon, the accuracy can be raised to 90% at the top-10 ranking. Our results suggest that our methodology can be used with no training in any sign language lexicon, regardless of its size.
    Computer Systems, Imagery and Medi

    Tracing political positioning of Dutch newspapers

    Computer Systems, Imagery and Medi

    Trust me on this one: conforming to conversational assistants

    Conversational artificial agents and artificially intelligent (AI) voice assistants are becoming increasingly popular. Digital virtual assistants such as Siri, and conversational devices such as Amazon Echo or Google Home, are permeating everyday life and are designed to be more and more humanlike in their speech. This study investigates the effect this can have on one’s conformity with an AI assistant. In the 1950s, Solomon Asch already demonstrated the power and danger of conformity amongst people. In these classical experiments, test persons were asked to answer relatively simple questions, whilst others pretending to be participants tried to convince the test person to give wrong answers. These studies were later replicated with embodied robots, but such physical robots are still rare. In light of our increasing reliance on AI assistants, this study investigates to what extent an individual will conform to a disembodied virtual assistant. We also investigate whether there is a difference between a group that interacts with an assistant that communicates through text, one that has a robotic voice and one that has a humanlike voice. The assistant attempts to subtly influence participants’ final responses in a general knowledge quiz, and we measure how often participants change their answer after having been given advice. Results show that participants conformed significantly more often to the assistant with a human voice than to the one that communicated through text.
    Computer Systems, Imagery and Media; Algorithms and the Foundations of Software technolog

    "Disciples of the Heinous Path: Exploring Label Structure in Heavy Metal Genres"

    Heavy metal is a popular subculture that is itself highly tribalized, which makes it an interesting domain in which to research how cultures and subcultures relate and evolve. To study this, we scrape the Encyclopaedia Metallum heavy metal music archive website to generate a large-scale networked data set. Bands are linked through shared musicians, and each band can be labelled with multiple user-contributed genres. By applying Word2Vec to genre co-occurrences, and hierarchical network clustering to the band collaboration graph, we gain insight into how music genres relate to each other. While the Word2Vec results show some interesting patterns with regard to the observed clusters, the hierarchical clustering proves to be more inconclusive, partially because factors beyond genre also generate the network. From a machine learning point of view, this case is an instance of the more general problem of understanding label structure in networked data.
    Computer Systems, Imagery and Medi

    Signing as input for a dictionary query: matching signs based on joint positions of the dominant hand

    This study presents a new methodology for searching sign language lexica, using a full sign as input for a query. A dictionary user can thus look up information about a sign by signing it to a webcam. The recorded sign is then compared to potential matching signs in the lexicon. As such, it provides a new way of searching sign language dictionaries that complements existing methods based on (spoken language) glosses or phonological features, like handshape or location. The method utilizes OpenPose to extract the body and finger joint positions. Dynamic Time Warping (DTW) is used to quantify the variation of the trajectory of the dominant hand and the average trajectories of the fingers. Ten people with various degrees of sign language proficiency participated in this study. Each subject viewed a set of 20 signs from the newly compiled Ghanaian sign language lexicon and was asked to replicate the signs. The results show that DTW can predict the matching sign with 87% and 74% accuracy at the Top-10 and Top-5 ranking levels respectively, using only the trajectory of the dominant hand. Additionally, more proficient signers obtain 90% accuracy at the Top-10 ranking. The methodology also has the potential to be used as a variation measurement tool to quantify the difference in signing between different signers or sign languages in general.
    Computer Systems, Imagery and Medi
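
    As a rough illustration of the matching step described in this abstract, the sketch below ranks lexicon entries by the DTW distance between hand trajectories and returns the top-k candidates. It is a minimal sketch under stated assumptions, not the authors' implementation: the trajectories are hand-written 2-D points rather than OpenPose output, the glosses (ARC, LINE, DROP) and the function names are hypothetical, and no normalization or finger trajectories are included.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two sequences of 2-D points."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between points
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def top_k_matches(query, lexicon, k=10):
    """Return the k glosses whose stored trajectory is closest to the query."""
    ranked = sorted(lexicon, key=lambda gloss: dtw_distance(query, lexicon[gloss]))
    return ranked[:k]


# Hypothetical lexicon: gloss -> dominant-hand trajectory (illustrative values only).
lexicon = {
    "ARC": [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    "LINE": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "DROP": [(0.0, 0.0), (0.0, -1.0), (0.0, -2.0)],
}

# A query recorded with more frames than the stored signs; DTW absorbs the
# difference in length and speed.
query = [(0.0, 0.0), (0.5, 0.6), (1.0, 1.0), (1.5, 0.6), (2.0, 0.0)]

print(top_k_matches(query, lexicon, k=2))
```

    In this toy setting, Top-k accuracy would be the fraction of queries whose true gloss appears in the returned list.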

    Towards a user-friendly tool for automated sign annotation: identification and annotation of time slots, number of hands, and handshape

    The annotation of sign language corpora in terms of glosses is a highly labor-intensive task, but it is a precondition for reliable quantitative analysis. During the annotation process, the researcher typically defines the precise time slot in which a sign occurs and then enters the appropriate gloss for the sign. The aim of this project is to develop a set of tools to assist the annotation of signs and their formal features in a video, irrespective of its content and quality. Recent advances in the field of deep learning have led to the development of accurate and fast pose estimation frameworks. In this study, such a framework (namely OpenPose) has been used to develop three different methods and tools to facilitate the annotation process. The first tool estimates the span of a sign sequence and creates empty slots in an annotation file. The second tool detects whether a sign is one- or two-handed. The last tool recognizes the different handshapes presented in a video sample. All tools can be easily re-trained to fit the needs of the researcher.