115,043 research outputs found
Visual Analysis of Complex Networks for Business Intelligence with Gephi
Platforms that combine data mining algorithms and interactive visualizations play a key role in the discovery process for complex network data, e.g., Web and Online Social Network data. Here we illustrate the use of Gephi, open-source software for the visual exploration of networks, in the visual analysis of Business Intelligence data modeled as complex networks.
Investigating an open methodology for designing domain-specific language collections
With this research and design paper, we propose that Open Educational Resources (OERs) and Open Access (OA) publications give increasing access to high-quality online educational and research content for the development of powerful domain-specific language collections that can be further enhanced linguistically with the Flexible Language Acquisition System (FLAX, http://flax.nzdl.org). FLAX uses the Greenstone digital library system, widely used open-source software that enables end users to build collections of documents and metadata directly onto the Web (Witten, Bainbridge, & Nichols, 2010). FLAX offers a powerful suite of interactive text-mining tools, using Natural Language Processing and Artificial Intelligence designs, to enable novice collection builders to link selected language content to large pre-processed linguistic databases. An open methodology trialed at Queen Mary University of London in collaboration with the OER Research Hub at the UK Open University demonstrates how applying open corpus-based designs and technologies can enhance open educational practices among language teachers and subject academics for the preparation and delivery of courses in English for Specific Academic Purposes (ESAP).
CLUO: Web-Scale Text Mining System for Open Source Intelligence Purposes
The amount of textual information published on the Internet is estimated at billions of web pages, blog posts, comments, social media updates, and more. Analyzing such quantities of data requires a high level of distribution, of both data and computing. This is especially true for the complex algorithms often used in text mining tasks. This paper presents a prototype implementation of CLUO, an Open Source Intelligence (OSINT) system that extracts and analyzes significant quantities of openly available information.
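The distribution the abstract describes can be illustrated with a minimal map-reduce-style sketch: partition the corpus into shards, count terms per shard in parallel, then merge the partial counts. This is our own illustration of the general pattern, not CLUO's actual architecture; real web-scale systems would use a distributed framework rather than local threads.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_terms(shard):
    """Map step: term frequencies for one shard of documents."""
    counts = Counter()
    for doc in shard:
        counts.update(doc.lower().split())
    return counts

def distributed_term_counts(documents, workers=4):
    """Partition the corpus, count in parallel, merge the partial counts."""
    shards = [documents[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(count_terms, shards)
    # Reduce step: merge the per-shard counters into a global count.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

The same map/reduce split is what lets such pipelines scale horizontally: each shard can be processed on a different machine, and only the compact partial counts travel over the network.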
SELENIUM FRAMEWORK FOR WEB AUTOMATION TESTING: A SYSTEMATIC LITERATURE REVIEW
Software testing plays a crucial role in making high-quality products. Manual testing is often inaccurate, unreliable, and more costly than automated testing. One automation tool, Selenium, is an open-source framework that is used along with different programming languages (Python, Ruby, Java, PHP, C#, etc.) to automate the test cases of web applications. The purpose of this study is to summarize the research in the area of Selenium automation testing to benefit readers in designing and delivering automated software testing with Selenium. We conducted the standard systematic literature review method, employing a manual search of 2,408 papers and applying a set of inclusion/exclusion criteria; the final literature included 16 papers published between 2009 and 2020. The results show that Selenium is used not only to test an application's full functionality through its UI, but can also be combined with other methods and algorithms such as data mining, artificial intelligence, and machine learning. Furthermore, it can be applied to security testing. Future research on the Selenium automation framework should focus on the effectiveness and maintainability of applying Selenium alongside other methodologies, with improvements better matched to web automation testing.
Mining Threat Intelligence about Open-Source Projects and Libraries from Code Repository Issues and Bug Reports
Open-source projects and libraries are widely used in software development,
while also bearing multiple security vulnerabilities. This use of third-party
ecosystems creates a new kind of attack surface for a product in development. An
intelligent attacker can attack a product by exploiting one of the
vulnerabilities present in linked projects and libraries.
In this paper, we mine threat intelligence about open-source projects and
libraries from bugs and issues reported on public code repositories. We also
track library and project dependencies for installed software on a client
machine. We represent and store this threat intelligence, along with the
software dependencies, in a security knowledge graph. Security analysts and
developers can then query and receive alerts from the knowledge graph if any
threat intelligence is found about linked libraries and projects used in
their products.
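The query-and-alert idea can be sketched with a toy in-memory stand-in for such a knowledge graph: one edge set links products to their dependencies, another links libraries to threat intelligence mined from issue trackers, and an alert is simply a join across the two. All names and reports below are hypothetical; the paper's actual system uses a proper graph store, not Python dicts.

```python
# Hypothetical edges: product -> dependencies, library -> mined threat reports.
product_deps = {
    "acme-webapp": ["libssl-clone", "left-pad-ng"],
}
threat_intel = {
    "libssl-clone": ["heap overflow reported in upstream issue tracker"],
}

def alerts_for(product):
    """Join dependencies against threat intelligence: return
    (library, report) pairs for any dependency with known threats."""
    return [(lib, report)
            for lib in product_deps.get(product, [])
            for report in threat_intel.get(lib, [])]
```

An analyst querying `alerts_for("acme-webapp")` would learn that one linked library has a mined threat report, while products with no flagged dependencies return an empty list.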
The Hidden Web, XML and Semantic Web: A Scientific Data Management Perspective
The World Wide Web no longer consists just of HTML pages. Our work sheds
light on a number of trends on the Internet that go beyond simple Web pages.
The hidden Web provides a wealth of data in semi-structured form, accessible
through Web forms and Web services. These services, as well as numerous other
applications on the Web, commonly use XML, the eXtensible Markup Language. XML
has become the lingua franca of the Internet that allows customized markups to
be defined for specific domains. On top of XML, the Semantic Web grows as a
common structured data source. In this work, we first explain each of these
developments in detail. Using real-world examples from scientific domains of
great interest today, we then demonstrate how these new developments can assist
the managing, harvesting, and organization of data on the Web. On the way, we
also illustrate the current research avenues in these domains. We believe that
this effort would help bridge multiple database tracks, thereby attracting
researchers with a view to extending database technology. Comment: EDBT Tutorial (2011).
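The abstract's point about XML carrying customized, domain-specific markup can be made concrete with a short parsing example. The markup vocabulary below is invented for illustration; it shows how semi-structured hidden-Web data becomes directly queryable once expressed in XML.

```python
import xml.etree.ElementTree as ET

# A made-up domain-specific markup for scientific observations.
doc = """<observations>
  <observation id="o1"><species>E. coli</species><count>42</count></observation>
  <observation id="o2"><species>S. aureus</species><count>7</count></observation>
</observations>"""

root = ET.fromstring(doc)
# Harvest the semi-structured records into a plain mapping.
counts = {obs.find("species").text: int(obs.find("count").text)
          for obs in root.iter("observation")}
```

Because the tags are self-describing, a harvester needs no per-site scraping logic, which is exactly what makes XML (and, above it, Semantic Web formats) attractive for managing scientific data on the Web.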
A very simple and fast way to access and validate algorithms in reproducible research
The reproducibility of research in bioinformatics refers to the notion that new methodologies/algorithms and scientific claims have to be published together with their data and source code, in a way that other researchers may verify the findings to further build more knowledge upon them. The replication and corroboration of research results are key to the scientific process, and many journals are discussing the matter nowadays, taking concrete steps in this direction. In this journal itself, a very recent opinion note has appeared highlighting the increasing importance of this topic in bioinformatics and computational biology, inviting the community to further discuss the matter. In agreement with that article, we would like to propose here another step in that direction with a tool that allows the automatic generation of a web interface, named web-demo, directly from source code in a very simple and straightforward way. We believe this contribution can help make research not only reproducible but also more easily accessible. A web-demo associated with a published paper can accelerate an algorithm's validation with real data, spreading its use widely with just a few clicks.
Fil: Stegmayer, Georgina; Pividori, Milton Damián; Milone, Diego Humberto. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional (CONICET, Centro Científico Tecnológico Santa Fe / Universidad Nacional del Litoral, Facultad de Ingeniería y Ciencias Hídricas); Argentina.
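The core idea of generating a web interface "directly from source code" can be sketched by introspecting a function's signature to derive form fields. This is our own minimal illustration of the principle, not the web-demo tool's actual mechanism or output format; the `cluster` function is a hypothetical stand-in for a published algorithm.

```python
import inspect

def form_spec(func):
    """Derive a simple web-form description from a function's signature:
    one field per parameter, carrying its default value if any."""
    sig = inspect.signature(func)
    return [{"name": name,
             "default": (None if param.default is inspect.Parameter.empty
                         else param.default)}
            for name, param in sig.parameters.items()]

def cluster(data, k=3):
    """Hypothetical algorithm a researcher might want to expose as a demo."""
    ...
```

Running `form_spec(cluster)` yields a field for `data` (no default, so the form would require an upload or input) and a field for `k` pre-filled with 3, which is enough to render an input form and dispatch submitted values back to the function.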
A First Look at the Crypto-Mining Malware Ecosystem: A Decade of Unrestricted Wealth
Illicit crypto-mining leverages resources stolen from victims to mine
cryptocurrencies on behalf of criminals. While recent works have analyzed one
side of this threat, i.e., web-browser cryptojacking, only commercial reports
have partially covered binary-based crypto-mining malware. In this paper, we
conduct the largest measurement of crypto-mining malware to date, analyzing
approximately 4.5 million malware samples (1.2 million malicious miners), over
a period of twelve years from 2007 to 2019. Our analysis pipeline applies both
static and dynamic analysis to extract information from the samples, such as
wallet identifiers and mining pools. Together with OSINT data, this information
is used to group samples into campaigns. We then analyze publicly-available
payments sent to the wallets from mining-pools as a reward for mining, and
estimate profits for the different campaigns. All of this is done in a
fully automated fashion, which enables us to leverage measurement-based
findings of illicit crypto-mining at scale. Our profit analysis reveals
campaigns with multi-million earnings, associating over 4.4% of Monero with
illicit mining. We analyze the infrastructure related to the different
campaigns, showing that a high proportion of this ecosystem is supported by
underground economies such as Pay-Per-Install services. We also uncover novel
techniques that allow criminals to run successful campaigns. Comment: A shorter version of this paper appears in the Proceedings of the 19th
ACM Internet Measurement Conference (IMC 2019). This is the full version.
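The step of grouping samples into campaigns via shared wallet identifiers can be sketched as a connected-components problem: samples that share any wallet belong to the same campaign. Below is a simplified union-find illustration of that idea (the paper's actual pipeline also folds in mining pools and OSINT data); all sample and wallet names are hypothetical.

```python
def group_campaigns(samples):
    """Group malware samples that share any wallet identifier into one
    campaign, treating samples and wallets as nodes of a graph and
    computing connected components with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:          # path-halving for near-constant lookups
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for sample, wallets in samples.items():
        for wallet in wallets:
            union(sample, wallet)      # link each sample to its wallets

    campaigns = {}
    for sample in samples:
        campaigns.setdefault(find(sample), []).append(sample)
    return [sorted(members) for members in campaigns.values()]
```

Two samples that never share a wallet directly can still end up in one campaign through a chain of shared identifiers, which is why a transitive grouping like connected components is needed rather than pairwise matching.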