68 research outputs found

    The main features that influence the academic success of bachelors’ students at Nova School of Business and Economics

    Dissertation presented as a partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The prediction of academic success is a major topic in higher education, especially within the academic community. This dissertation presents a data mining approach to identifying the features most relevant to the academic success of students in the Bachelor's programs at Nova School of Business and Economics (Nova SBE). First, a literature review frames academic success and summarizes previous research on educational data mining as applied to assessing student success. The empirical work then begins with the extraction of socio-economic, socio-demographic, and academic student data, which forms the main dataset. After data discovery, cleansing, and transformation, a set of features is selected according to its relevance to the problem. Several predictive data-driven techniques are applied to this feature set, and the resulting models are assessed to determine whether the selected features are sufficient to answer the research question or need to be replaced by other attributes. This iterative process confers credibility and robustness on the model that performs best in classifying students' academic success. Ultimately, the insights extracted from the model are intended to give the school's key stakeholders enough knowledge to take actions that maximize students' learning success.
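
    A minimal sketch of the iterative modelling loop described above, assuming scikit-learn and a hypothetical students.csv extract with preprocessed numeric features and a binary success label (the dissertation does not name its tooling):

    ```python
    # Hypothetical sketch: compare several predictive techniques on the
    # student feature set and keep the best performer, as the abstract
    # describes. Column names and model choices are illustrative assumptions.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    students = pd.read_csv("students.csv")   # hypothetical cleaned extract
    X = students.drop(columns=["success"])   # socio-economic, -demographic, academic features
    y = students["success"]                  # binary academic-success label

    for name, model in [("logistic", LogisticRegression(max_iter=1000)),
                        ("forest", RandomForestClassifier(n_estimators=200))]:
        scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
        print(f"{name}: mean AUC = {scores.mean():.3f}")
    ```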

    Feature based dynamic intra-video indexing

    A thesis submitted in partial fulfillment for the degree of Doctor of Philosophy. With the advent of digital imagery and its widespread application in all walks of life, video has become an important component of the world of communication. Video content, ranging from broadcast news, sports, personal videos, surveillance, movies, and entertainment to similar domains, is increasing exponentially in quantity, and retrieving content of interest from such corpora is becoming a challenge. This has led to increased interest among researchers in video structure analysis, feature extraction, content annotation, tagging, video indexing, querying, and retrieval. However, most previous work is confined to specific domains and constrained by quality, processing, and storage capabilities. This thesis presents a novel framework that agglomerates established approaches, from feature extraction to browsing, into one content-based video retrieval system. The proposed framework fills the identified gap while satisfying the imposed constraints on processing, storage, quality, and retrieval time. The output comprises a framework, a methodology, and a prototype application that allow the user to efficiently and effectively retrieve content of interest, such as age, gender, and activity, by specifying the relevant query. Experiments show plausible results, with an average precision and recall of 0.91 and 0.92 respectively for face detection using a Haar-wavelets-based approach. Precision for age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. Gender recognition gives better precision for males (0.89) than for females, while recall is higher for females (0.92). The activity of the subject is detected using the Hough transform and classified using a Hidden Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research. A Graphical User Interface (GUI) providing a friendly and intuitive front end has been integrated into the system to facilitate retrieval. Intraclass correlation coefficient (ICC) comparisons show that the system's performance closely resembles that of a human annotator. The performance has been optimised for time and error rate.
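
    The face detection step above reports results for a Haar-wavelets-based approach; OpenCV's stock Haar cascade detector, sketched below, is one common way to implement such a step (the frame path and detector parameters are illustrative, not the thesis's actual pipeline):

    ```python
    # Illustrative Haar-cascade face detection on a single extracted frame.
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    frame = cv2.imread("frame.jpg")          # stand-in: one frame from a video
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:               # draw one box per detected face
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    ```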

    Defeat data poisoning attacks on facial recognition applications

    In the modern era, facial photos are used for a wide array of applications, from logging into a smartphone to bragging about a weekend getaway. With such a vast number of use cases for facial images, adversaries will attack these applications for profit. This dissertation focuses on two major applications of facial photos: facial authentication and deepfakes. Facial authentication has become increasingly popular on personal devices. Because of its ease of use, it has great potential to be widely deployed for web-service authentication in the near future, letting people log on to online accounts from different devices without memorizing lengthy passwords. However, the growing number of attacks targeting machine learning, and especially the Deep Neural Networks (DNNs) commonly used for facial recognition, poses big challenges to the successful roll-out of such web-service facial authentication. We demonstrate a new data poisoning attack, called replacement data poisoning, which requires no knowledge of the server side and needs only a handful of malicious photo injections to let an attacker impersonate the victim in existing facial authentication systems. We then propose a novel defensive approach called DEFEAT that leverages deep learning techniques to automatically detect such attacks. Our experiments on real-world datasets achieve a detection accuracy of over 90 percent. Deepfakes target specific individuals to cause shame or spread misinformation, and with the spread of fake news they have become incredibly prevalent in recent years. With deepfakes, an adversary could have photographic or even videographic "proof" of someone, such as a politician, committing a devious act or saying untrue words. Our deepfake work consists of two parts. First, we propose a label-flipping data poisoning attack targeting deepfake detectors. With over a 99 percent poison success rate in most cases, this attack demonstrates the devastating effect a data poisoning attack can have on deepfake detectors and how important it is to defend against such an assault. Our second contribution revolves around defending deepfake detectors from such an attack. We propose several defense strategies, most notably a convolutional neural network (CNN) based strategy to detect poisoned images. Our CNN-based approach achieves a greater than 98 percent poison detection rate while keeping false positives to a minimum, with a precision rate of over 99 percent in most cases. Includes bibliographical references.
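
    As a rough illustration of the CNN-based defense, the toy PyTorch network below classifies face crops as clean or poisoned; the architecture and input size are stand-in assumptions, not the dissertation's actual model:

    ```python
    # Toy poisoned-image detector: binary CNN over 64x64 RGB face crops.
    import torch
    import torch.nn as nn

    class PoisonDetector(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
            # After two 2x poolings a 64x64 input becomes 16x16 with 32 channels.
            self.classifier = nn.Sequential(
                nn.Flatten(), nn.Linear(32 * 16 * 16, 2))  # clean vs. poisoned

        def forward(self, x):
            return self.classifier(self.features(x))

    logits = PoisonDetector()(torch.randn(4, 3, 64, 64))  # smoke test on a random batch
    ```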

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher-dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of a manifold helps describe its essential properties and how they vary in space. When the manifold evolves through time, however, a joint spatio-temporal model is needed to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a supernova.
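
    One way such a first-order Markovian propagation could look in code is sketched below: a Gaussian mixture is fitted to each particle snapshot, warm-started from the previous fit so the spatial model carries over to the next temporal stage. The mixture model and synthetic particle data are assumptions made for illustration, not the paper's actual method:

    ```python
    # Propagate a spatial probabilistic model through time by warm-starting
    # each fit from the previous stage (first-order Markov structure).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    snapshots = [rng.normal(t * 0.1, 1.0, size=(5000, 3)) for t in range(5)]  # stand-in particle positions

    model = GaussianMixture(n_components=4, random_state=0).fit(snapshots[0])
    stages = [model]
    for positions in snapshots[1:]:
        model = GaussianMixture(n_components=4, means_init=model.means_,
                                random_state=0).fit(positions)
        stages.append(model)
    ```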

    Data Driven Mobility

    Techniques for the Analysis of Modern Web Page Traffic using Anonymized TCP/IP Headers

    Analysis of network traffic traces is a methodology that has been widely adopted for studying the Web for several decades. However, due to recent privacy legislation and the increasing adoption of traffic encryption, often only anonymized TCP/IP headers are accessible in traffic traces. For traffic traces to remain useful for analysis, techniques must be developed to glean insight from this limited header information. This dissertation evaluates approaches for classifying individual web page downloads (web page classification) when only anonymized TCP/IP headers are available. The context in which web page classification is defined and evaluated here differs from prior traffic classification methods in three ways. First, the impact of diversity in client platforms (browsers, operating systems, device types, and vantage points) on network traffic is explicitly considered. Second, the challenge of overlapping traffic from multiple web pages is explicitly considered, and demultiplexing approaches are evaluated (web page segmentation). Lastly, unlike prior work on traffic classification, four orthogonal labeling schemes are considered (genre-based, device-based, navigation-based, and video-streaming-based); these are of value in several web-related applications, including privacy analysis, user behavior modeling, traffic forecasting, and potentially behavioral ad targeting. We conduct evaluations using large collections of both synthetically generated data and browsing data from real users. Our analysis shows that the choice of client platform has a statistically significant impact on web traffic. It also shows that change point detection methods, a new class of segmentation approach, outperform existing idle-time-based methods. Overall, this work establishes that web page classification performance can be improved by (i) incorporating client platform differences in the feature selection and training methodology, and (ii) utilizing better-performing web page segmentation approaches. This research raises overall awareness of the challenges associated with the analysis of modern web traffic. It shows and advocates for considering real-world factors, such as client platform diversity and overlapping traffic from multiple streams, when developing and evaluating traffic analysis techniques. Doctor of Philosophy.
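
    To make the change-point-based segmentation idea concrete, the sketch below segments a synthetic per-interval byte-count series with the ruptures library; the detector, penalty, and traffic shape are illustrative stand-ins, not the dissertation's exact method:

    ```python
    # Change point detection over per-100ms byte counts derived from
    # anonymized TCP/IP headers; boundaries approximate page transitions.
    import numpy as np
    import ruptures as rpt

    rng = np.random.default_rng(1)
    bytes_per_100ms = np.r_[rng.poisson(50, 300),   # page 1 download
                            rng.poisson(5, 200),    # idle gap
                            rng.poisson(80, 300)]   # page 2 download

    algo = rpt.Pelt(model="rbf").fit(bytes_per_100ms.astype(float))
    boundaries = algo.predict(pen=10)  # end index of each detected segment
    print(boundaries)
    ```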

    Cyber Law and Espionage Law as Communicating Vessels

    Professor Lubin's contribution is "Cyber Law and Espionage Law as Communicating Vessels," pp. 203-225. Existing legal literature would have us assume that espionage operations and "below-the-threshold" cyber operations are doctrinally distinct. Whereas one is subject to the scant, amorphous, and under-developed legal framework of espionage law, the other is subject to an emerging, ever-evolving body of legal rules known cumulatively as cyber law. This dichotomy, however, is erroneous and misleading. In practice, espionage and cyber law function as communicating vessels, and so are better conceived as two elements of a complex system, Information Warfare (IW). This paper therefore first draws attention to the similarities between the practices: the actors, technologies, and targets are interchangeable, as are the knee-jerk legal reactions of the international community. In light of the convergence between peacetime Low-Intensity Cyber Operations (LICOs) and peacetime Espionage Operations (EOs), the two should be subjected to a single regulatory framework, one which recognizes the role intelligence plays in our public world order and which adopts a contextual and consequential method of inquiry. The paper proceeds in the following order: Part 2 provides a descriptive account of the unique symbiotic relationship between espionage and cyber law, and further explains the reasons for this dynamic. Part 3 places the discussion surrounding this relationship within the broader discourse on IW, arguing that the convergence between EOs and LICOs described in Part 2 can be further explained by an even larger convergence across all the various elements of the informational environment. Parts 2 and 3 then serve as the backdrop for Part 4, which details the attempt by the drafters of the Tallinn Manual 2.0 to compartmentalize espionage law and cyber law, and the deficits of their approach. The paper concludes by proposing an alternative, holistic understanding of espionage law, grounded in general principles of law, which is more practically transferable to the cyber realm.

    An Approach to Guide Users Towards Less Revealing Internet Browsers

    When browsing the Internet, HTTP headers enable both clients and servers to send extra data in their requests or responses, such as the User-Agent string. This string contains information about the sender's device, browser, and operating system. Previous research has shown numerous privacy and security risks that result from exposing sensitive information in the User-Agent string. For example, it enables device and browser fingerprinting as well as user tracking and identification. Our large-scale analysis of thousands of User-Agent strings shows that browsers differ tremendously in the amount of information they include in their User-Agent strings. Our work therefore aims at guiding users towards less revealing browsers. To do so, we propose assigning each browser an exposure score based on the information it exposes and on vulnerability records. Our contribution in this work is thus twofold: first, we provide a full implementation that is ready to be deployed and used; second, we conduct a user study to identify the effectiveness and limitations of the proposed approach. Our implementation is based on more than 52 thousand unique browsers. Our performance and validation analysis shows that our solution is accurate and efficient. The source code and dataset are publicly available, and the solution has been deployed.
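
    A toy rendering of the exposure-score idea follows: count the identifying signals a User-Agent string reveals. The fields, patterns, and equal weighting are illustrative assumptions, not the paper's actual scoring model:

    ```python
    # Toy exposure score: how many identifying fields does a UA string leak?
    import re

    SIGNALS = {
        "os":      re.compile(r"Windows NT [\d.]+|Mac OS X [\d_]+|Android [\d.]+"),
        "browser": re.compile(r"(Chrome|Firefox|Edg)/[\d.]+"),
        "device":  re.compile(r"\((iPhone|iPad|Pixel|SM-\w+)"),
    }

    def exposure_score(user_agent: str) -> int:
        """Number of identifying signals exposed (higher = more revealing)."""
        return sum(1 for p in SIGNALS.values() if p.search(user_agent))

    ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
    print(exposure_score(ua))  # -> 2: OS and browser versions are exposed
    ```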