223 research outputs found

    Contextual Passive DNS Resolution

    Passive DNS is one of the most common tools for analyzing security incidents from telemetry in which IP addresses occur. Without actively querying a DNS resolver, it indicates the most likely domain name that could have been used when accessing an IP address. This work proposes using additional information contained in NetFlow telemetry to extract additional features, and using machine learning methods to improve the accuracy of predicting the most probable domain name. The proposed solution is compared with the most common pDNS approach, which uses only the statistically most probable values.

    Visualization and Machine Learning Techniques for NASA’s EM-1 Big Data Problem

    In this paper, we help NASA solve three Exploration Mission-1 (EM-1) challenges: data storage, computation time, and visualization of complex data. NASA is studying one year of trajectory data (about 90 TB) to determine available launch opportunities. We improve data storage by introducing a cloud-based solution that provides elasticity and server upgrades. This migration will save $120k in infrastructure costs every four years and potentially avoid schedule slips. Additionally, it increases computational efficiency by 125%. We further enhance computation via machine learning techniques that use the classic orbital elements to predict valid trajectories. Our machine learning model decreases trajectory creation time from hours/days to minutes/seconds with an overall accuracy of 98%. Finally, we create an interactive, calendar-based Tableau visualization for EM-1 that summarizes trajectory data and considers multiple constraints on mission availability. The use of Tableau allows visualization dashboards to be shared, and the dashboards would eventually be updated automatically whenever a new set of trajectory data is generated. We therefore conclude that cloud technologies, machine learning, and big data visualization will benefit NASA’s engineering team. Successful implementation will further ensure mission success for the Exploration Program, with a team of 20 people accomplishing what Apollo did with a team of 1000.

    Transforming Graph Representations for Statistical Relational Learning

    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

    Using Supervised Learning to Predict English Premier League Match Results From Starting Line-up Player Data

    Soccer is one of the most popular sports in the world. Many people, whether they are fans of a soccer team, players of online soccer games, or even professional coaches, attempt to use relevant data to predict the result of a match. Many such prediction models are built on data from the match itself, such as the overall number of shots, yellow or red cards, and fouls committed by the home and away teams. This research instead attempted to predict soccer game results (win, draw, or loss) based on data from players in the starting line-up during the first 12 weeks of the 2018-2019 season of the English Premier League.

    A Methodology for Mining Document-Enriched Heterogeneous Information Networks

    A Framework for Leveraging Artificial Intelligence in Project Management

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management. This dissertation aims to support project managers in their daily tasks. As we use artificial intelligence (AI) and machine learning (ML) in everyday life, it is necessary to include them in business and change traditional ways of working. For the purpose of this study, it is essential to understand the challenges and areas of project management and how artificial intelligence can contribute to them. A theoretical overview, applying the knowledge of project management, gives a holistic view of the current situation in enterprises. The research covers artificial intelligence applications in project management, the common activities in project management, the biggest challenges, and how AI and ML can support them. Understanding project managers helps create a framework that will contribute to optimizing their tasks. After the framework for applying artificial intelligence to project management was designed and developed, project managers were asked to evaluate it. This study is essential to increase awareness among stakeholders and enterprises of how the automation of processes can be improved and how AI and ML can decrease risk and cost while improving the happiness and efficiency of employees.

    A New Web Search Engine with Learning Hierarchy

    Most existing web search engines (such as Google and Bing) are keyword-based. Typically, after the user issues a query with keywords, the search engine returns a flat list of results. When the query is related to a topic, keyword matching alone may not accurately retrieve the whole set of webpages on that topic. On the other hand, there exists another type of search system, particularly on e-commerce websites, where the user can search within the categories of different faceted hierarchies (e.g., product types and price ranges). Is it possible to integrate the two types of search systems and build a web search engine with a topic hierarchy? The main difficulty is how to classify the vast number of webpages on the Internet into the topic hierarchy. In this thesis, we leverage machine learning techniques to automatically classify webpages into the categories in our hierarchy, and then use the classification results to build the new search engine SEE. The experimental results demonstrate that SEE achieves better search results than the traditional keyword-based search engine for most queries, particularly when the query is related to a topic. We also conduct a small-scale usability study, which further verifies that SEE is a promising search engine. To further improve SEE, we also propose a new active learning framework with several novel strategies for hierarchical classification.