
    Visual Analysis of Complex Networks for Business Intelligence with Gephi

    Platforms that combine data mining algorithms and interactive visualizations play a key role in the discovery process over complex network data, e.g. Web and Online Social Network data. Here we illustrate the use of Gephi, an open-source software package for the visual exploration of networks, for the visual analysis of Business Intelligence data modeled as complex networks.
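    To illustrate the kind of network Gephi ingests, here is a minimal sketch (not from the paper) that builds a toy business-intelligence graph with the networkx Python library and exports it to GEXF, a format Gephi opens directly; the entities and edge weights are hypothetical.

```python
# A minimal sketch (not from the paper): a small business network built with
# networkx and exported to GEXF, the interchange format Gephi reads natively.
# The nodes, edges, and weights are hypothetical placeholders.
import networkx as nx

G = nx.Graph()
# Hypothetical BI entities: customers linked to products they purchased.
G.add_edge("customer_A", "product_X", weight=3)
G.add_edge("customer_B", "product_X", weight=1)
G.add_edge("customer_B", "product_Y", weight=5)

# Node attributes show up as columns in Gephi's Data Laboratory.
nx.set_node_attributes(
    G,
    {"customer_A": "customer", "customer_B": "customer",
     "product_X": "product", "product_Y": "product"},
    name="kind",
)

# Open the resulting file in Gephi via File > Open for visual exploration.
nx.write_gexf(G, "bi_network.gexf")
```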

    Investigating an open methodology for designing domain-specific language collections

    With this research and design paper, we propose that Open Educational Resources (OERs) and Open Access (OA) publications give increasing access to high-quality online educational and research content for the development of powerful domain-specific language collections, which can be further enhanced linguistically with the Flexible Language Acquisition System (FLAX, http://flax.nzdl.org). FLAX uses the Greenstone digital library system, a widely used open-source software that enables end users to build collections of documents and metadata directly onto the Web (Witten, Bainbridge, & Nichols, 2010). FLAX offers a powerful suite of interactive text-mining tools, based on Natural Language Processing and Artificial Intelligence designs, that enable novice collection builders to link selected language content to large pre-processed linguistic databases. An open methodology trialed at Queen Mary University of London in collaboration with the OER Research Hub at the UK Open University demonstrates how applying open corpus-based designs and technologies can enhance open educational practices among language teachers and subject academics in the preparation and delivery of courses in English for Specific Academic Purposes (ESAP).
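    As a rough illustration of the corpus-based enrichment described above (not FLAX's actual implementation), the sketch below extracts candidate collocations from a toy domain corpus with NLTK; the corpus text and ranking choices are placeholders.

```python
# A minimal sketch (assumed, not FLAX's implementation): ranking candidate
# domain-specific collocations in a tiny toy corpus with NLTK, the kind of
# linguistic enrichment a language collection might receive.
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

corpus = """Open educational resources give learners access to authentic
academic texts. Academic writing in specific disciplines relies on
recurrent lexical bundles and discipline-specific collocations."""

# Naive whitespace tokenization keeps the example self-contained.
tokens = corpus.lower().split()

finder = BigramCollocationFinder.from_words(tokens)
finder.apply_freq_filter(1)  # keep every bigram in this tiny corpus

# Rank candidate collocations by pointwise mutual information.
measures = BigramAssocMeasures()
for bigram in finder.nbest(measures.pmi, 5):
    print(bigram)
```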

    CLUO: Web-Scale Text Mining System for Open Source Intelligence Purposes

    The amount of textual information published on the Internet is considered to be in the billions of web pages, blog posts, comments, social media updates and others. Analyzing such quantities of data requires a high level of distribution – of both data and computing. This is especially true in the case of complex algorithms, often used in text mining tasks. The paper presents a prototype implementation of CLUO – an Open Source Intelligence (OSINT) system which extracts and analyzes significant quantities of openly available information.
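    A minimal sketch of the kind of distributed text-mining step the abstract alludes to, assuming nothing about CLUO's real architecture: term counting over a handful of placeholder documents fanned out across worker processes with Python's multiprocessing.

```python
# A minimal sketch (assumed, not CLUO's architecture): distributing a simple
# text-mining step -- term counting over crawled documents -- across worker
# processes to illustrate data/compute distribution.
from collections import Counter
from multiprocessing import Pool

DOCUMENTS = [  # placeholder stand-ins for crawled web pages and posts
    "open source intelligence relies on openly available information",
    "web scale text mining needs distributed data and distributed computing",
    "blog posts comments and social media updates are analyzed continuously",
]

def count_terms(text: str) -> Counter:
    """Tokenize naively and count term frequencies for one document."""
    return Counter(text.lower().split())

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        partial_counts = pool.map(count_terms, DOCUMENTS)
    # Merge the per-document counts into a global term distribution.
    total = sum(partial_counts, Counter())
    print(total.most_common(5))
```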

    SELENIUM FRAMEWORK FOR WEB AUTOMATION TESTING: A SYSTEMATIC LITERATURE REVIEW

    Software testing plays a crucial role in producing high-quality products. Manual testing is often inaccurate and unreliable, and requires more effort than automated testing. One of the available tools, Selenium, is an open-source framework used together with different programming languages (Python, Ruby, Java, PHP, C#, etc.) to automate test cases for web applications. The purpose of this study is to summarize the research on Selenium automation testing so that readers can design and deliver automated software testing with Selenium. We conducted a standard systematic literature review, manually searching 2,408 papers and applying a set of inclusion/exclusion criteria; the final literature comprised 16 papers published between 2009 and 2020. The results show that Selenium is used not only to test application functionality through the UI, but can also be combined with other methods and algorithms such as data mining, artificial intelligence, and machine learning, and can furthermore be applied to security testing. Future research on the Selenium automation testing framework should focus on the effectiveness and maintainability of applying Selenium within other methodologies, and on improvements better matched to web automation testing.
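    For readers new to the framework, a minimal Selenium test case in Python (one of the language bindings listed above); the URL, element locators, and expected title are hypothetical.

```python
# A minimal sketch of a Selenium UI check in Python. The page URL, form field
# names, and expected title are hypothetical placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # recent Selenium versions resolve the driver automatically
try:
    driver.get("https://example.com/login")                      # hypothetical page
    driver.find_element(By.NAME, "username").send_keys("demo_user")
    driver.find_element(By.NAME, "password").send_keys("demo_pass")
    driver.find_element(By.ID, "submit").click()
    # A simple functional assertion on the resulting page title.
    assert "Dashboard" in driver.title, "login flow did not reach the dashboard"
finally:
    driver.quit()
```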

    Mining Threat Intelligence about Open-Source Projects and Libraries from Code Repository Issues and Bug Reports

    Open-Source Projects and Libraries are being used in software development while also bearing multiple security vulnerabilities. This use of the third-party ecosystem creates a new kind of attack surface for a product in development. An intelligent attacker can attack a product by exploiting one of the vulnerabilities present in linked projects and libraries. In this paper, we mine threat intelligence about open source projects and libraries from bugs and issues reported on public code repositories. We also track library and project dependencies for installed software on a client machine. We represent and store this threat intelligence, along with the software dependencies, in a security knowledge graph. Security analysts and developers can then query and receive alerts from the knowledge graph if any threat intelligence is found about linked libraries and projects utilized in their products.
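    A hedged sketch of one ingredient such a pipeline might use (not the authors' system): querying the public GitHub REST API for a dependency's open issues and flagging titles that mention security-related terms; the repository name and keyword list are illustrative.

```python
# A minimal sketch (assumed, not the paper's pipeline): pulling recent open
# issues for a dependency from the public GitHub REST API and flagging ones
# whose titles mention security-related terms. The repository and keyword
# list are illustrative placeholders; unauthenticated requests are rate-limited.
import requests

SECURITY_TERMS = {"vulnerability", "cve", "exploit", "security", "overflow"}

def flag_security_issues(owner: str, repo: str):
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    resp = requests.get(url, params={"state": "open", "per_page": 50}, timeout=10)
    resp.raise_for_status()
    for issue in resp.json():
        title = issue.get("title", "")
        if any(term in title.lower() for term in SECURITY_TERMS):
            yield issue["number"], title

if __name__ == "__main__":
    for number, title in flag_security_issues("example-org", "example-lib"):
        print(f"#{number}: {title}")
```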

    The Hidden Web, XML and Semantic Web: A Scientific Data Management Perspective

    The World Wide Web no longer consists just of HTML pages. Our work sheds light on a number of trends on the Internet that go beyond simple Web pages. The hidden Web provides a wealth of data in semi-structured form, accessible through Web forms and Web services. These services, as well as numerous other applications on the Web, commonly use XML, the eXtensible Markup Language. XML has become the lingua franca of the Internet that allows customized markups to be defined for specific domains. On top of XML, the Semantic Web grows as a common structured data source. In this work, we first explain each of these developments in detail. Using real-world examples from scientific domains of great interest today, we then demonstrate how these new developments can assist in the management, harvesting, and organization of data on the Web. Along the way, we also illustrate the current research avenues in these domains. We believe that this effort would help bridge multiple database tracks, thereby attracting researchers with a view to extending database technology. Comment: EDBT Tutorial (2011).
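    As a small illustration of the XML layer discussed in the tutorial, the sketch below parses a hypothetical semi-structured record with Python's standard library; the schema is invented for this example.

```python
# A minimal sketch: parsing a small, hypothetical XML record of the kind a
# hidden-Web service might return, using Python's standard library. The
# element names and values are invented for illustration only.
import xml.etree.ElementTree as ET

record = """
<observation>
  <station id="ST-042">Mauna Loa</station>
  <measurement name="co2" unit="ppm">421.3</measurement>
  <measurement name="temperature" unit="C">17.8</measurement>
</observation>
"""

root = ET.fromstring(record)
station = root.find("station")
print("Station:", station.get("id"), station.text)
for m in root.findall("measurement"):
    # Attributes carry the domain-specific markup the abstract refers to.
    print(f"{m.get('name')} = {m.text} {m.get('unit')}")
```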

    A very simple and fast way to access and validate algorithms in reproducible research

    The reproducibility of research in bioinformatics refers to the notion that new methodologies/algorithms and scientific claims have to be published together with their data and source code, so that other researchers may verify the findings and build further knowledge upon them. The replication and corroboration of research results are key to the scientific process, and many journals are discussing the matter nowadays, taking concrete steps in this direction. In this journal itself, a very recent opinion note has appeared highlighting the increasing importance of this topic in bioinformatics and computational biology, inviting the community to further discuss the matter. In agreement with that article, we would like to propose here another step in that direction with a tool that allows the automatic generation of a web interface, named web-demo, directly from source code in a very simple and straightforward way. We believe this contribution can help make research not only reproducible but also more easily accessible. A web-demo associated with a published paper can accelerate an algorithm's validation with real data, spreading its use with just a few clicks.
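    The web-demo tool itself is not reproduced here; the following is only a minimal sketch of the underlying idea, assuming Flask and a toy stand-in algorithm: exposing a published method behind a small web form so readers can try it with their own data.

```python
# A minimal sketch of the web-demo idea (not the authors' tool): wrapping an
# algorithm behind a small web form so readers can validate it with their own
# data. Flask and the toy "algorithm" below are assumptions for illustration.
from flask import Flask, request

app = Flask(__name__)

def algorithm(values):
    """Stand-in for a published method: return the mean of the inputs."""
    return sum(values) / len(values) if values else 0.0

@app.route("/", methods=["GET", "POST"])
def demo():
    if request.method == "POST":
        raw = request.form.get("data", "")
        values = [float(x) for x in raw.split(",") if x.strip()]
        return f"Result: {algorithm(values)}"
    # A bare-bones input form: comma-separated numbers, one click to run.
    return ('<form method="post">'
            '<input name="data" placeholder="1.0, 2.5, 3.7">'
            '<button type="submit">Run</button></form>')

if __name__ == "__main__":
    app.run(debug=True)
```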

    A First Look at the Crypto-Mining Malware Ecosystem: A Decade of Unrestricted Wealth

    Illicit crypto-mining leverages resources stolen from victims to mine cryptocurrencies on behalf of criminals. While recent works have analyzed one side of this threat, i.e., web-browser cryptojacking, only commercial reports have partially covered binary-based crypto-mining malware. In this paper, we conduct the largest measurement of crypto-mining malware to date, analyzing approximately 4.5 million malware samples (1.2 million malicious miners) over a period of twelve years, from 2007 to 2019. Our analysis pipeline applies both static and dynamic analysis to extract information from the samples, such as wallet identifiers and mining pools. Together with OSINT data, this information is used to group samples into campaigns. We then analyze publicly available payments sent to the wallets from mining pools as a reward for mining, and estimate profits for the different campaigns. All of this is done in a fully automated fashion, which enables us to produce measurement-based findings on illicit crypto-mining at scale. Our profit analysis reveals campaigns with multi-million earnings, associating over 4.4% of Monero with illicit mining. We analyze the infrastructure related to the different campaigns, showing that a high proportion of this ecosystem is supported by underground economies such as Pay-Per-Install services. We also uncover novel techniques that allow criminals to run successful campaigns. Comment: A shorter version of this paper appears in the Proceedings of the 19th ACM Internet Measurement Conference (IMC 2019). This is the full version.
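    A simplified sketch of one step such a pipeline might perform (an assumption, not the paper's actual implementation): scanning strings extracted from samples for Monero-style wallet addresses and grouping samples that share a wallet into candidate campaigns.

```python
# A minimal sketch (assumed, not the paper's pipeline): scanning strings dumped
# from malware samples for Monero-style wallet addresses and grouping samples
# that share a wallet into a candidate "campaign". The regex and sample data
# are simplifications for illustration.
import re
from collections import defaultdict

# Standard Monero addresses are 95 base58 characters starting with '4'
# (a simplification; subaddresses and integrated addresses differ).
MONERO_RE = re.compile(r"\b4[1-9A-HJ-NP-Za-km-z]{94}\b")

def extract_wallets(strings_dump: str) -> set:
    """Return the set of wallet-like strings found in one sample's dump."""
    return set(MONERO_RE.findall(strings_dump))

def group_campaigns(samples: dict) -> dict:
    """Map each wallet to the set of sample hashes that embed it."""
    campaigns = defaultdict(set)
    for sample_hash, dump in samples.items():
        for wallet in extract_wallets(dump):
            campaigns[wallet].add(sample_hash)
    return campaigns

if __name__ == "__main__":
    fake_samples = {"sha256_aaa": "xmrig --user 4" + "A" * 94,
                    "sha256_bbb": "no wallet here"}
    print(group_campaigns(fake_samples))
```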