1,009 research outputs found

    Tapjacking Threats and Mitigation Techniques for Android Applications

    With the increased dependency on web applications through mobile devices, malicious attack techniques have shifted from traditional web applications running on desktops or laptops (allowing mouse click-based interactions) to mobile applications running on mobile devices (allowing touch-based interactions). Clickjacking is a type of malicious attack originating in web applications, where victims are lured to click on seemingly benign objects in web pages. However, when clicked, unintended actions are performed without the user’s knowledge. On mobile devices, it has been shown that users can similarly be lured to touch an object in an application, triggering actions they did not intend. This new form of clickjacking on mobile devices is called tapjacking. There is little research that thoroughly investigates tapjacking attacks and mitigation techniques for mobile devices. In this thesis, we identify coding practices that can help software practitioners avoid malicious attacks and define a detection technique to prevent the consequences of malicious attacks for end users. We first determine where the tapjacking attack type falls within the broader literature on malware, in particular Android malware. In this direction, we propose a classification of Android malware. Then, we propose a novel technique based on Kullback-Leibler Divergence (KLD) to identify possible tapjacking behavior in applications. We validate the approach with a set of benign and malicious Android applications. We also implemented a prototype tool for detecting tapjacking attack symptoms using the KLD-based measurement. The evaluation results show that tapjacking can be detected effectively with KLD.
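
For reference, the Kullback-Leibler divergence named in this abstract has the standard definition below for two discrete distributions $P$ and $Q$ over the same set of outcomes. The abstract does not say which distributions the thesis compares, so $P$ and $Q$ are left generic here.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
```

The value is zero when the two distributions are identical and grows as they diverge, so a detector can plausibly compare it against a threshold.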

    Frouros: A Python library for drift detection in machine learning systems

    Frouros is an open-source Python library capable of detecting drift in machine learning systems. It provides a combination of classical and more recent algorithms for drift detection: both concept and data drift. We have designed it with the objective of making it compatible with any machine learning framework and easily adaptable to real-world use cases. The library is developed following a set of best development and continuous integration practices to ensure ease of maintenance and extensibility. The source code is available at https://github.com/IFCA/frouros. Comment: 11 pages, 1 table

    A literature survey of active machine learning in the context of natural language processing

    Active learning is a supervised machine learning technique in which the learner is in control of the data used for learning. That control is utilized by the learner to ask an oracle, typically a human with extensive knowledge of the domain at hand, about the classes of the instances for which the model learned so far makes unreliable predictions. The active learning process takes as input a set of labeled examples, as well as a larger set of unlabeled examples, and produces a classifier and a relatively small set of newly labeled data. The overall goal is to create as good a classifier as possible, without having to mark up and supply the learner with more data than necessary. The learning process aims at keeping the human annotation effort to a minimum, only asking for advice where the training utility of the result of such a query is high. Active learning has been successfully applied to a number of natural language processing tasks, such as information extraction, named entity recognition, text categorization, part-of-speech tagging, parsing, and word sense disambiguation. This report is a literature survey of active learning from the perspective of natural language processing.
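
The process described in this abstract can be sketched as a pool-based loop that repeatedly queries the oracle about the instances the current model is least sure of. This is a minimal illustration, not the survey's method: the least-confidence selection rule, the toy one-dimensional threshold classifier, and every function name below are assumptions made for the example.

```python
import math


def least_confident(probs_by_instance, k):
    """Return the k instances whose highest class probability is lowest."""
    ranked = sorted(probs_by_instance.items(), key=lambda item: max(item[1]))
    return [instance for instance, _ in ranked[:k]]


def active_learning_loop(labeled, unlabeled, train, predict_proba, oracle,
                         rounds, batch_size):
    """Grow the labeled set by querying the oracle on uncertain instances."""
    labeled = dict(labeled)   # instance -> class label
    pool = list(unlabeled)    # instances without labels
    model = train(labeled)
    for _ in range(rounds):
        if not pool:
            break
        probs = {x: predict_proba(model, x) for x in pool}
        for x in least_confident(probs, batch_size):
            labeled[x] = oracle(x)   # ask the human (here: a function)
            pool.remove(x)
        model = train(labeled)       # retrain on the enlarged labeled set
    return model, labeled


# Hypothetical toy learner: a threshold halfway between the two classes.
def train(labeled):
    negatives = [x for x, y in labeled.items() if y == 0]
    positives = [x for x, y in labeled.items() if y == 1]
    return (max(negatives) + min(positives)) / 2.0


def predict_proba(threshold, x):
    p_positive = 1.0 / (1.0 + math.exp(-(x - threshold)))
    return [1.0 - p_positive, p_positive]


def oracle(x):
    # Stand-in for the human annotator; the true boundary is at 5.5.
    return int(x >= 5.5)


model, labeled = active_learning_loop(
    labeled={0.0: 0, 10.0: 1},
    unlabeled=[1.0, 4.5, 6.0, 9.0],
    train=train,
    predict_proba=predict_proba,
    oracle=oracle,
    rounds=2,
    batch_size=1,
)
```

In this toy run only the two points nearest the decision boundary are sent to the oracle, while the points far from it are never labeled, which mirrors the goal of keeping annotation effort low.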

    Using the organizational and narrative thread structures in an e-book to support comprehension.

    Stories, themes, concepts and references are organized structurally and purposefully in most books. A person reading a book needs to understand themes and concepts within the context. Schank's Dynamic Memory theory suggested that building on existing memory structures is essential to cognition and learning. Pirolli and Card emphasized the need to provide people with an independent and improved ability to access and understand information in their information seeking activities. Through a review of users' reading behaviours and of existing e-Book user interfaces, we found that current e-Book browsers provide minimal support for comprehending the content of large and complex books. Readers of an e-Book need user interfaces that present and relate the organizational and narrative structures, and moreover, reveal the thematic structures. This thesis addresses the problem of providing readers with effective scaffolding of multiple structures of an e-Book in the user interface to support reading for comprehension. Recognising a story or topic as the basic unit in a book, we developed novel story segmentation techniques for discovering narrative segments, and adapted story linking techniques for linking narrative threads in semi-structured linear texts of an e-Book. We then designed an e-Book user interface to present the complex structures of the e-Book, as well as to assist the reader to discover these structures. We designed and developed evaluation methodologies to investigate reading and comprehension in e-Books, in order to assess the effectiveness of this user interface. We designed semi-directed reading tasks using a Story-Theme Map, and a set of corresponding measurements for the answers. We conducted user evaluations with book readers. Participants were asked to read stories, to browse and link related stories, and to identify major themes of stories in an e-Book. This thesis reports the experimental design and results in detail. The results confirmed that the e-Book interface helped readers perform reading tasks more effectively. The most important and interesting finding is that the interface proved to be more helpful to novice readers who had little background knowledge of the book. In addition, each component that supported the user interface was evaluated separately in a laboratory setting, and these results are also reported in the thesis.

    Querying knowledge graphs in natural language.

    Knowledge graphs are a powerful concept for querying large amounts of data. These knowledge graphs are typically enormous and are often not easily accessible to end-users because they require specialized knowledge in query languages such as SPARQL. Moreover, end-users need a deep understanding of the structure of the underlying data models, which are often based on the Resource Description Framework (RDF). This drawback has led to the development of Question-Answering (QA) systems that enable end-users to express their information needs in natural language. While existing systems simplify user access, there is still room for improvement in the accuracy of these systems. In this paper we propose a new QA system for translating natural language questions into SPARQL queries. The key idea is to break up the translation process into 5 smaller, more manageable sub-tasks and use ensemble machine learning methods as well as Tree-LSTM-based neural network models to automatically learn and translate a natural language question into a SPARQL query. The performance of our proposed QA system is empirically evaluated using two renowned benchmarks: the 7th Question Answering over Linked Data Challenge (QALD-7) and the Large-Scale Complex Question Answering Dataset (LC-QuAD). Experimental results show that our QA system outperforms the state-of-the-art systems by 15% on the QALD-7 dataset and by 48% on the LC-QuAD dataset. In addition, we make our source code available.

    User profiling and classification for fraud detection in mobile communications networks

    The topic of this thesis is fraud detection in mobile communications networks by means of user profiling and classification techniques. The goal is to first identify relevant user groups based on call data and then to assign a user to a relevant group. Fraud may be defined as a dishonest or illegal use of services, with the intention to avoid service charges. Fraud detection is an important application, since network operators lose a relevant portion of their revenue to fraud. Whereas the intentions of the mobile phone users cannot be observed, it is assumed that the intentions are reflected in the call data. The call data is subsequently used in describing behavioral patterns of users. Neural networks and probabilistic models are employed in learning these usage patterns from call data. These models are used either to detect abrupt changes in established usage patterns or to recognize typical usage patterns of fraud. The methods are shown to be effective in detecting fraudulent behavior by empirically testing the methods with data from real mobile communications networks.

    Drift Detection using Uncertainty Distribution Divergence

    Data generated from naturally occurring processes tends to be non-stationary. Examples include seasonal and gradual changes in climate data and sudden changes in financial data. In machine learning the degradation in classifier performance due to such changes in the data is known as concept drift, and there are many approaches to detecting and handling it. Most approaches to detecting concept drift, however, assume that true classes for test examples will be available at no cost shortly after classification, and base the detection of concept drift on measures relying on these labels. The high labelling cost in many domains provides a strong motivation to reduce the number of labelled instances required to detect and handle concept drift. Triggered detection approaches that do not require labelled instances to detect concept drift show great promise for achieving this. In this paper we present Confidence Distribution Batch Detection (CDBD), an approach that provides a signal correlated to changes in concept without using labelled data. This signal, combined with a trigger and a rebuild policy, can maintain classifier accuracy which, in most cases, matches the accuracy achieved using classification error based detection techniques, but using only a limited amount of labelled data.
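
The idea of a label-free drift signal built from classifier confidences can be illustrated roughly as follows. This is a sketch under stated assumptions rather than the CDBD algorithm itself: the abstract does not specify the divergence measure, the binning, or the trigger, so the Kullback-Leibler divergence, the 10-bin histogram, the smoothing constant, and the threshold used here are all hypothetical choices, as are the function names.

```python
import math


def confidence_histogram(confidences, bins=10):
    """Count confidences (each in [0, 1]) per equal-width bin."""
    counts = [0] * bins
    for c in confidences:
        index = min(int(c * bins), bins - 1)  # put c == 1.0 in the last bin
        counts[index] += 1
    return counts


def kl_from_counts(p_counts, q_counts, eps=1e-6):
    """KL divergence D(P || Q) from two count vectors, with smoothing."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    divergence = 0.0
    for p_count, q_count in zip(p_counts, q_counts):
        p = (p_count + eps) / p_total
        q = (q_count + eps) / q_total
        divergence += p * math.log(p / q)
    return divergence


def drift_signal(reference_confidences, batch_confidences, bins=10):
    """Divergence of the current batch's confidences from the reference."""
    return kl_from_counts(
        confidence_histogram(batch_confidences, bins),
        confidence_histogram(reference_confidences, bins),
    )


# Hypothetical classifier confidences; no true labels are needed.
reference = [0.95, 0.91, 0.92, 0.97, 0.88, 0.93]
drifted_batch = [0.55, 0.62, 0.52, 0.58, 0.61, 0.53]

THRESHOLD = 0.5  # assumed trigger level, would need tuning in practice
drift_detected = drift_signal(reference, drifted_batch) > THRESHOLD
```

A trigger like `drift_detected` would then decide when to request a small number of labels and rebuild the classifier, in the spirit of the trigger and rebuild policy mentioned in the abstract.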