1,039 research outputs found

    Efficient filtering of adult content using textual information

    Adult content now represents a non-negligible proportion of Web content, and it is of the utmost importance to protect children from it. Search engines, as an entry point for Web navigation, are ideally placed to deal with this issue. In this paper, we propose a method that builds a safe, i.e. adult-content-free, index for search engines. The method is based on a filter that uses only textual information from the web page and the associated URL.
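
As an illustration only, a filter that looks at nothing but the page text and the URL could be sketched as follows. The abstract does not describe the actual features or classifier, so the term list, the ratio-based score, the threshold, and the function names (`flagged_ratio`, `is_safe`) are all assumptions made for this sketch.

```python
import re
from urllib.parse import urlparse

# Hypothetical term list; a real filter would learn its vocabulary from labelled pages.
FLAGGED_TERMS = {"adult", "explicit", "xxx"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def url_tokens(url):
    """Tokens taken from the host name and path of a URL."""
    parts = urlparse(url)
    return tokenize(parts.netloc + " " + parts.path)


def flagged_ratio(tokens):
    """Fraction of tokens that appear in the flagged-term list."""
    if not tokens:
        return 0.0
    return sum(t in FLAGGED_TERMS for t in tokens) / len(tokens)


def is_safe(url, page_text, threshold=0.1):
    """Assumed decision rule: safe when both URL and text scores stay below the threshold."""
    score = max(flagged_ratio(url_tokens(url)), flagged_ratio(tokenize(page_text)))
    return score < threshold
```

Under these assumptions, a page would be admitted to the safe index only when `is_safe(url, page_text)` returns `True`; images and other non-textual signals are never consulted.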

    Advanced quantum based neural network classifier and its application for objectionable web content filtering

    © 2013 IEEE. In this paper, an Advanced Quantum-based Neural Network Classifier (AQNN) is proposed and used to build an objectionable Web content filtering system (OWF). The aim is to design a neural network with a small number of hidden-layer neurons, optimal connection weights, and optimal neuron thresholds. The proposed algorithm uses quantum computing and genetic concepts to evolve the connection weights and the neuron thresholds. Quantum computing uses the qubit, its smallest unit of information, as a probabilistic representation. The algorithm also introduces a threshold boundary parameter to find the optimal value of the neuron thresholds. The resulting neural network architecture is used to build an objectionable Web content filtering system that detects objectionable Web requests made by the user. To judge the performance of the proposed AQNN, the contents of 2000 websites (1000 objectionable + 1000 non-objectionable) have been used. The results of AQNN are compared with QNN-F and with well-known classifiers such as backpropagation, support vector machine (SVM), multilayer perceptron, decision tree algorithm, and artificial neural network. The results show that AQNN performs better as a classifier than these existing classifiers. The performance of the proposed OWF is also compared with well-known objectionable Web filtering software and existing models. It is found that the proposed OWF performs better than existing solutions at filtering objectionable content.

    A Survey of Social Network Forensics

    Social networks in any form, specifically online social networks (OSNs), are becoming a part of everyday life in this new millennium, especially with advanced and simple communication technologies available through easily accessible devices such as smartphones and tablets. The data generated through the use of these technologies need to be analyzed for forensic purposes when criminal and terrorist activities are involved. To deal with the forensic implications of social networks, current research on both digital forensics and social networks needs to be incorporated and understood. This will help digital forensics investigators to predict, detect and even prevent criminal activities in their different forms. It will also help researchers to develop new models and techniques in the future. This paper provides a literature review of social network forensics methods, models, and techniques, giving an overview to researchers for their future work and to law enforcement investigators for their investigations of crimes committed in cyberspace. It also provides awareness and defense methods for OSN users to protect them against social attacks.

    Proceedings of the 6th Dutch-Belgian Information Retrieval Workshop


    Information overload in structured data

    Information overload refers to the difficulty of making decisions caused by too much information. In this dissertation, we address the information overload problem in two separate structured domains, namely graphs and text. Graph kernels have been proposed as an efficient and theoretically sound approach to computing graph similarity. They decompose graphs into certain sub-structures, such as subtrees or subgraphs. However, existing graph kernels suffer from a few drawbacks. First, the dimension of the feature space associated with the kernel often grows exponentially as the complexity of the sub-structures increases. One immediate consequence of this behavior is that small, non-informative sub-structures occur more frequently and cause information overload. Second, as the number of features increases, we encounter sparsity: only a few informative sub-structures will co-occur in multiple graphs. In the first part of this dissertation, we propose to tackle the above problems by exploiting the dependency relationship among sub-structures. First, we propose a novel framework that learns latent representations of sub-structures by leveraging recent advancements in deep learning. Second, we propose a general smoothing framework that takes structural similarity into account, inspired by state-of-the-art smoothing techniques used in natural language processing. Both of the proposed frameworks are applicable to popular graph kernel families and achieve significant performance improvements over state-of-the-art graph kernels. In the second part of this dissertation, we tackle information overload in text. We first focus on a popular social news aggregation website, Reddit, and design a submodular recommender system that tailors a personalized frontpage for individual users. Second, we propose a novel submodular framework to summarize videos for which both transcript and comments are available. Third, we demonstrate how to apply filtering techniques to select a small subset of informative features from virtual machine logs in order to predict resource usage.
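
To make the idea of submodular selection concrete, here is a minimal sketch of the standard greedy procedure for a monotone submodular objective under a size budget. The abstract does not state which objective or optimizer the dissertation uses; the set-coverage objective, the item names, and the functions `coverage` and `greedy_select` are assumptions chosen for illustration.

```python
def coverage(selected, item_features):
    """Number of distinct features covered by the selected items (a monotone submodular function)."""
    covered = set()
    for item in selected:
        covered |= item_features[item]
    return len(covered)


def greedy_select(item_features, budget):
    """Pick up to `budget` items, each time adding the item with the largest marginal gain."""
    selected = []
    remaining = set(item_features)
    while remaining and len(selected) < budget:
        base = coverage(selected, item_features)
        # sorted() makes tie-breaking deterministic: the alphabetically first best item wins.
        best = max(
            sorted(remaining),
            key=lambda item: coverage(selected + [item], item_features) - base,
        )
        if coverage(selected + [best], item_features) == base:
            break  # no remaining item adds anything new
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical items (e.g. posts or transcript sentences) mapped to the topics they cover.
items = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2}, "d": {5}}
summary = greedy_select(items, budget=2)
```

With a coverage objective, each pick is rewarded only for what it adds beyond the items already chosen, which is the diminishing-returns property that discourages redundant selections in a frontpage or summary.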

    Semantic discovery and reuse of business process patterns

    Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.

    Computational ethology for primate sociality: a novel paradigm for computer-vision-based analysis of animal behaviour

    Research in the biological and wildlife sciences is increasingly reliant on video data for measuring animal behaviour; however, large-scale analysis is often limited by the time and resources it takes to process video archives. Computer vision holds serious potential to unlock these datasets to analyse behaviour at an unprecedented level of scale, depth and reliability, but thus far a framework for processing and analysing behaviour from large-scale video datasets is lacking. This thesis attempts to solve this problem by developing the theory and methods for capturing long-term sociality of animal populations from longitudinal video archives, laying the foundations for an emerging field: computational ethology of animals in the wild. It makes several key contributions by a) establishing the first unified longitudinal video dataset of wild chimpanzee stone tool use across a 30-year period, and building a framework for collaborative research using cloud technology; b) developing a set of computational tools to allow for processing of large volumes of video data for automated individual identification and behaviour recognition; c) applying these automated methods to validate their use for social network analysis; and d) measuring the social dynamics and behaviour of a group of wild chimpanzees living in the forest of Bossou, Guinea, West Africa. In Chapter 1, I introduce the theoretical and historical context for the thesis, and outline the novel methodological framework for using computer vision to measure animal social behaviour in video. In Chapter 2, I introduce the methodology for processing and managing a longitudinal video archive, and future directions for a new framework for collaborative research workflows in the wildlife sciences using cloud technology. In Chapter 3, I lay the foundations of this framework for analysing behaviour and unlocking video datasets, using deep learning and face recognition. In Chapter 4, I evaluate the robustness of the method for modelling long-term sociality and social networks at Bossou and test whether life history variables predict individual-level sociality patterns. In Chapter 5, I introduce the final component of this framework for measuring long-term animal behaviour, through audiovisual behavioural recognition of chimpanzee nut-cracking. In my final chapter (6), I discuss the main contributions, limitations and future directions for research. Overall, this thesis integrates a diverse range of interdisciplinary methods and concepts from primatology, ethology, engineering, and computer vision to build the foundations for further exploration of cognition, ecology and evolution in wild animals using automated methods.

    Understanding video through the lens of language

    The increasing abundance of video data online necessitates the development of systems capable of understanding such content. However, building these systems poses significant challenges, including the absence of scalable and robust supervision signals, computational complexity, and multimodal modelling. To address these issues, this thesis explores the role of language as a complementary learning signal for video, drawing inspiration from the success of self-supervised Large Language Models (LLMs) and image-language models. First, joint video-language representations are examined under the text-to-video retrieval task. This includes the study of pre-extracted multimodal features, the influence of contextual information, joint end-to-end learning of both image and video representations, and various frame aggregation methods for long-form videos. In doing so, state-of-the-art performance is achieved across a range of established video-text benchmarks. Second, this work explores the automatic generation of audio description (AD) – narrations describing the visual happenings in a video, for the benefit of visually impaired audiences. An LLM, prompted with multimodal information, including past predictions, and pretrained with partial data sources, is employed for the task. In the process, substantial advancements are achieved in the following areas: efficient speech transcription, long-form visual storytelling, referencing character names, and AD time-point prediction. Finally, audiovisual behaviour recognition is applied to the field of wildlife conservation and ethology. The approach is used to analyse vast video archives of wild primates, revealing insights into individual and group behaviour variations, with the potential for monitoring the effects of human pressures on animal habitats