18 research outputs found

    Brute-Force Sentence Pattern Extortion from Harmful Messages for Cyberbullying Detection

    Cyberbullying, or humiliating people using the Internet, has existed almost since the beginning of Internet communication. The relatively recent introduction of smartphones and tablet computers has caused cyberbullying to evolve into a serious social problem. In Japan, members of a parent-teacher association (PTA) attempted to address the problem by scanning the Internet for cyberbullying entries. To help these PTA members and other interested parties confront this difficult task, we propose a novel method for automatic detection of malicious Internet content. This method is based on a combinatorial approach resembling brute-force search algorithms, but applied to language classification. The method extracts sophisticated patterns from sentences and uses them in classification. Experiments performed on actual cyberbullying data reveal an advantage of our method vis-à-vis previous methods. Next, we implemented the method in an application for Android smartphones to automatically detect possibly harmful content in messages. The method performed well in the Android environment, but still needs to be optimized for time efficiency in order to be used in practice.
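To illustrate what a combinatorial, brute-force style of pattern extraction can look like, here is a minimal sketch. It is an assumption made for illustration, not the authors' implementation: the function name `extract_patterns`, the whitespace tokenizer, the `*` gap marker, and the `max_len` cap are all hypothetical choices.

```python
from itertools import combinations


def extract_patterns(sentence, max_len=3):
    """Return every ordered combination of up to max_len tokens.

    A '*' marks a gap where one or more tokens were skipped
    (hypothetical notation chosen for this sketch).
    """
    tokens = sentence.split()
    patterns = set()
    for size in range(1, min(max_len, len(tokens)) + 1):
        # combinations() yields index tuples in increasing order,
        # so the original word order is always preserved.
        for idx in combinations(range(len(tokens)), size):
            parts = [tokens[idx[0]]]
            for prev, cur in zip(idx, idx[1:]):
                if cur - prev > 1:
                    parts.append("*")  # tokens were skipped here
                parts.append(tokens[cur])
            patterns.add(" ".join(parts))
    return patterns
```

The number of patterns grows combinatorially with sentence length, which is consistent with the abstract's remark that time efficiency still needs work; the `max_len` cap is one simple way to bound that growth in a sketch like this.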

    The effect of component recognition on flexibility and speech recognition performance in a spoken question answering system

    A spoken question answering system that recognizes questions as full sentences performs well when users ask one of the defined questions. A system that recognizes component words and finds an equivalent defined question might be more flexible, but is likely to have decreased speech recognition performance, leading to a loss in overall system success. The research described in this document compares the advantage in flexibility to the loss in recognition performance when using component recognition. Questions posed by participants were processed by a system of each type. As expected, the component system made frequent recognition errors while detecting words (word error rate of 31%). In comparison, the full system made fewer errors while detecting full sentences (sentence error rate of 10%). Nevertheless, the component system succeeded in providing proper responses to 76% of the queries posed, while the full system responded properly to only 46%. Four variations of the traditional tf-idf weighting method were compared as applied to the matching of short text strings (fewer than 10 words). It was found that the general approach was successful in finding matches, and that all four variations compensated for the loss in speech recognition performance to a similar degree. No significant difference due to the variations in weighting was detected in the results.
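The general idea of matching recognized component words against defined questions with tf-idf weights can be sketched as follows. This is an illustrative assumption only: the smoothed idf formula used here is one common variant and is not claimed to be any of the four variations compared in the study, and the names `tfidf_vectors`, `cosine`, and `best_match` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build sparse tf-idf vectors (dicts) for a list of short texts."""
    docs = [t.lower().split() for t in texts]
    n = len(docs)
    # Document frequency: in how many texts each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf (an assumed variant), always positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(query, questions):
    """Return the defined question most similar to the recognized words."""
    vectors, idf = tfidf_vectors(questions)
    counts = Counter(query.lower().split())
    # Words unknown to the defined questions carry no weight.
    q_vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    return questions[scores.index(max(scores))]
```

Because the match depends on shared weighted terms rather than an exact sentence, a query with dropped or misrecognized words can still land on the intended question, which is the kind of compensation the abstract reports.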

    Investigating Cultural Dimensions via Developers Artefacts: The Utility of Repository Mining

    A growing body of research is using artefacts from online development communities to explore the impact of developers’ behaviours on the software development process. Although this research has produced many insights, researchers have yet to fully explore the impact of developers’ cultural backgrounds on their behaviours in an online community, although such understandings could be useful for helping the community to understand and plan for team dynamics. This study utilised a pragmatic case study to explore the relationship between culture and online behaviour among developers from the United States (U.S.), China, and Russia, three countries that differ in their orientations as individualistic or collectivist cultures. The data for the study comprised artefacts supplied over an 11-year period by users of Stack Overflow, a popular online programming community that addresses questions from members by providing them with rapid access to the knowledge and expertise of their peers. Artefacts consisted of developers’ questions and answers, personal profiles, Up and Down voting records, online reputations, and earned badges. Data mining techniques, as well as statistical, linguistic, and content analysis were used to compare artefacts from the three groups of developers based on their cultural orientation as individualistic or collectivistic, attitudes, and interaction and knowledge sharing patterns. The findings revealed differences among the three groups that were consistent with their cultural backgrounds. U.S. developers, who are from an individualistic culture, asked and responded to more questions, had higher average reputations, used the pronoun “I” more frequently, and were more task-focused. Conversely, Chinese developers, who are from a collectivistic culture, provided more extensive commenting and editing of posts, used the pronouns “we” and “you” more frequently, and were more likely to engage in information exchange. Russian developers had been using Stack Overflow the longest and were the most reflective. The cultural patterns identified in this study have numerous implications for enhancing in-group interactions and behaviour management among software development communities.

    Geographic information extraction from texts

    A large volume of unstructured texts, containing valuable geographic information, is available online. This information – provided implicitly or explicitly – is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data, to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction.

    Caliphate and Kingship in a Fifteenth-Century Literary History of Muslim Leadership and Pilgrimage

    In Caliphate and Kingship Jo Van Steenbergen presents a revisionist cultural biography, a critical edition and an annotated translation of al-Ḏahab al-Masbūk, a summary history of the ḥağğ and Muslim rule by Egypt’s leading historian al-Maqrīzī (d. 1442 CE). Readership: All interested in the history of the ḥağğ, in Muslim intellectual history and political theory, in premodern Arabic literature, and in medieval and Mamluk political and socio-cultural history.

    Recent Advances in Research on Island Phenomena

    In natural languages, filler-gap dependencies can span an unbounded distance. Since the 1960s, the term “island” has been used to describe syntactic structures from which extraction is impossible or impeded. While examples from English are ubiquitous, attested counterexamples in the Mainland Scandinavian languages have continuously been dismissed as illusory, and alternative accounts for the underlying structure of such cases have been proposed. However, since such extractions are pervasive in spoken Mainland Scandinavian, these languages may not have been given the attention that they deserve in the syntax literature. In addition, recent research suggests that extraction from certain types of island structures in English might not be as unacceptable as previously assumed either. These findings break new empirical ground, question perceived knowledge, and may indeed have substantial ramifications for syntactic theory. This volume provides an overview of state-of-the-art research on island phenomena primarily in English and the Scandinavian languages, focusing on how languages compare to English, with the aim to shed new light on the nature of island constraints from different theoretical perspectives.

    Participative Urban Health and Healthy Aging in the Age of AI

    This open access book constitutes the refereed proceedings of the 19th International Conference on Smart Homes and Health Telematics, ICOST 2022, held in Paris, France, in June 2022. The 15 full papers and 10 short papers presented in this volume were carefully reviewed and selected from 33 submissions. They cover topics such as design, development, deployment, and evaluation of AI for health, smart urban environments, assistive technologies, chronic disease management, and coaching and health telematics systems.

    Learn to automate GUI tasks from demonstration

    This thesis explores and extends Computer Vision applications in the context of Graphical User Interface (GUI) environments to address the challenges of Programming by Demonstration (PbD). It explores PbD challenges that could be addressed through innovations in Computer Vision when GUIs are treated as an application domain, analogous to automotive or factory settings. Existing PbD systems were restricted to particular application domains or special application interfaces. Although they use the term Demonstration, these systems did not actually see what the user performs. Rather, they listen to the demonstrations through internal communications via the operating system. Machine Vision and Human in the Loop Machine Learning are used to circumvent many of these restrictions, allowing the PbD system to watch the demonstration as another human observer would. This thesis will demonstrate that our prototype PbD systems allow non-programmer users to easily create their own automation scripts for their repetitive and looping tasks. Our PbD systems take their input from sequences of screenshots, and sometimes from easily available keyboard and mouse sniffer software. It will also be shown that the problem of inconsistent human demonstration can be remedied with our proposed Human in the Loop Computer Vision techniques. Lastly, the problem is extended to learning from demonstration videos. Due to the sheer complexity of computer desktop GUI manipulation videos, attention is focused on the domain of video game environments. The initial studies illustrate that it is possible to teach a computer to watch gameplay videos and to estimate what buttons the user pressed.