Vision of a Visipedia
The web is not perfect: while text is easily
searched and organized, pictures (the vast majority of the bits
that one can find online) are not. In order to see how one could
improve the web and make pictures first-class citizens of the
web, I explore the idea of Visipedia, a visual interface for
Wikipedia that is able to answer visual queries and enables
experts to contribute and organize visual knowledge. Five
distinct groups of humans would interact through Visipedia:
users, experts, editors, visual workers, and machine vision
scientists. The latter would gradually build automata able to
interpret images. I explore some of the technical challenges
involved in making Visipedia happen. I argue that Visipedia will
likely grow organically, combining state-of-the-art machine
vision with human labor.
Design requirements for generating deceptive content to protect document repositories
For nearly 30 years, fake digital documents have been used to identify external intruders and malicious insider threats. Unfortunately, while fake files hold potential to assist in data theft detection, there is little evidence of their application outside of niche organisations and academic institutions. The barrier to wider adoption appears to be the difficulty of constructing deceptive content. The current generation of solutions principally: (1) use unrealistic random data; (2) output heavily formatted or specialised content that is difficult to apply to other environments; (3) require users to manually build the content, which is not scalable; or (4) employ an existing production file, which creates a protection paradox. This paper introduces a set of requirements for generating automated fake file content: (1) enticing, (2) realistic, (3) minimise disruption, (4) adaptive, (5) scalable protective coverage, (6) minimise sensitive artefacts and copyright infringement, and (7) contain no distinguishable characteristics. These requirements have been drawn from literature on natural science, magical performances, human deceit, military operations, intrusion detection, and previous fake file solutions. These requirements guide the design of an automated fake file content construction system, providing an opportunity for the next generation of solutions to find greater commercial application and widespread adoption.
A Comparison of Reinforcement Learning Frameworks for Software Testing Tasks
Software testing activities scrutinize the artifacts and the behavior of a
software product to find possible defects and ensure that the product meets its
expected requirements. Recently, Deep Reinforcement Learning (DRL) has been
successfully employed in complex testing tasks such as game testing, regression
testing, and test case prioritization to automate the process and provide
continuous adaptation. Practitioners can employ DRL either by implementing a
DRL algorithm from scratch or by using a DRL framework. DRL frameworks offer
well-maintained implementations of state-of-the-art DRL algorithms to
facilitate and speed up the development of DRL applications. Developers have widely used these
frameworks to solve problems in various domains including software testing.
However, to the best of our knowledge, there is no study that empirically
evaluates the effectiveness and performance of implemented algorithms in DRL
frameworks. Moreover, the literature lacks guidelines that would help
practitioners choose one DRL framework over another. In this paper,
we empirically investigate the application of carefully selected DRL
algorithms to two important software testing tasks: test case prioritization in
the context of Continuous Integration (CI) and game testing. For the game
testing task, we conduct experiments on a simple game and use DRL algorithms to
explore the game to detect bugs. Results show that some of the selected DRL
frameworks, such as Tensorforce, outperform recent approaches in the literature.
To prioritize test cases, we run experiments on a CI environment where DRL
algorithms from different frameworks are used to rank the test cases. Our
results show that the performance difference between implemented algorithms in
some cases is considerable, motivating further investigation.
Comment: Accepted for publication at EMSE (Empirical Software Engineering journal) 202
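As a rough illustration of how test case prioritization can be framed for an RL agent, a minimal Gym-style environment might look like the sketch below. This is an assumption-laden illustration, not the paper's actual setup: the class name, the per-test features, and the reward shape are all invented for exposition.

```python
# Hypothetical sketch of a test-case-prioritization environment in the
# style of OpenAI Gym (invented features; not the paper's code).

class TestPrioritizationEnv:
    """At each step the agent picks one not-yet-scheduled test case.
    The reward is positive when the chosen test actually fails, and
    larger the earlier it is scheduled, so an agent learns to place
    likely-failing tests at the front of the CI run."""

    def __init__(self, test_cases):
        # Each test case: (duration, recently_failed_flag, will_fail_flag).
        # The last flag is ground truth used only to compute reward.
        self.test_cases = test_cases
        self.reset()

    def reset(self):
        self.remaining = list(range(len(self.test_cases)))
        self.schedule = []
        return self._observe()

    def _observe(self):
        # Observation: agent-visible features of the remaining tests.
        return [self.test_cases[i][:2] for i in self.remaining]

    def step(self, action):
        # `action` indexes into the list of remaining tests.
        idx = self.remaining.pop(action)
        self.schedule.append(idx)
        will_fail = self.test_cases[idx][2]
        # Earlier detection of a failing test earns a larger reward.
        reward = (1.0 if will_fail else 0.0) * (len(self.remaining) + 1)
        done = not self.remaining
        return self._observe(), reward, done, {}
```

A DRL framework's agent would then interact with this environment through the usual `reset`/`step` loop, and the resulting schedule is the prioritized test ordering.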
The AI Family: The Information Security Manager's Best Frenemy?
In this exploratory study, we deliberately pull apart the Artificial from the Intelligence, the material from the human. We first assess the existing technological controls available to Information Security Managers (ISMs) to support their defence-in-depth strategies. Based on the AI Watch taxonomy, we then discuss each of the 15 technologies and their potential impact on the transformation of jobs in the field of security (i.e., AI trainers, AI explainers, and AI sustainers). Additionally, in a pilot study we collect the evaluations and narratives of the employees (n=6) of a small financial institution in a focus group session. We particularly focus on their perception of the role of AI systems in the future of cyber security.
A Machine Learning-oriented Survey on Tiny Machine Learning
The emergence of Tiny Machine Learning (TinyML) has positively revolutionized
the field of Artificial Intelligence by promoting the joint design of
resource-constrained IoT hardware devices and their learning-based software
architectures. TinyML plays an essential role within the fourth and fifth
industrial revolutions in helping societies, economies, and individuals employ
effective AI-infused computing technologies (e.g., smart cities, automotive,
and medical robotics). Given its multidisciplinary nature, the field of TinyML
has been approached from many different angles: this comprehensive survey
aims to provide an up-to-date overview focused on all the learning algorithms
within TinyML-based solutions. The survey is based on the Preferred Reporting
Items for Systematic Reviews and Meta-Analyses (PRISMA) methodological flow,
allowing for a systematic and complete literature survey. In particular, we
first examine the three different workflows for implementing a
TinyML-based system, i.e., ML-oriented, HW-oriented, and co-design. Second,
we propose a taxonomy that covers the learning panorama under the TinyML lens,
examining in detail the different families of model optimization and design, as
well as the state-of-the-art learning techniques. Third, we present the
distinct features of hardware devices and software tools that represent the
current state of the art for TinyML intelligent edge applications. Finally, we
discuss the challenges and future directions.
Comment: Article currently under review at IEEE Access
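One of the model-optimization families such surveys cover is post-training quantization, which shrinks models to fit resource-constrained TinyML devices. As a framework-free illustration (not taken from the survey; the affine uint8 scheme below is just one common choice), mapping float weights to 8-bit codes can be sketched as:

```python
# Sketch of post-training affine (asymmetric) 8-bit quantization,
# a common TinyML model-optimization step (illustrative only).

def quantize_int8(weights):
    """Map float weights to uint8 codes plus (scale, zero_point)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 if hi != lo else 1.0
    zero_point = round(-lo / scale)
    # Clamp each code into the representable [0, 255] range.
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float weights from the uint8 codes."""
    return [(code - zero_point) * scale for code in q]
```

The memory saving (4 bytes per float32 weight down to 1 byte per code) comes at the cost of the rounding error introduced by `scale`, which is why such surveys also examine quantization-aware training and other optimization families.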
Automating Society : Taking Stock of Automated Decision-Making in the EU
This is the first comprehensive study regarding the state of automated decision-making in Europe. Experts have looked at the situation at the EU level and in 12 Member States: Belgium, Denmark, Finland, France, Germany, Italy, the Netherlands, Poland, Slovenia, Spain, Sweden, and the UK. They assessed the political discussions and initiatives in these countries and also present a section "ADM in Action" for all states, listing examples of automated decision-making already in use.
Automating Society: Taking Stock of Automated Decision-Making in the EU. BertelsmannStiftung Studies 2019
Imagine you’re looking for a job. The company you are applying to says you can have a much easier application process if you provide them with your username and password for your personal email account. They can then just scan all your emails and develop a personality profile based on the result. No need to waste time filling out a boring questionnaire and, because it’s much harder to manipulate all your past emails than to try to give the ‘correct’ answers to a questionnaire, the results of the email scan will be much more accurate and truthful than any conventional personality profiling. Wouldn’t that be great? Everyone wins: the company looking for new personnel, because they can recruit people on the basis of more accurate profiles; you, because you save time and effort and don’t end up in a job you don’t like; and the company offering the profiling service, because they have a cool new business model.
Towards a set of metrics to guide the generation of fake computer file systems
Fake file systems are used in the field of cyber deception to bait intruders and fool forensic investigators. File system researchers also frequently generate their own synthetic document repositories, due to the data privacy and copyright concerns associated with experimenting on real-world corpora. For both of these fields, realism is critical. Unfortunately, after creating a set of files and folders, there are no current testing standards that can be applied to validate their authenticity or, conversely, reliably automate their detection. This paper reviews the previous 30 years of file system surveys on real-world corpora to identify a set of discrete measures for generating synthetic file systems. Statistical distributions, such as the size, age, and lifetime of files, common file types, compression and duplication ratios, and directory distribution and depth (and its relationship with the numbers of files and sub-directories), were identified and their respective merits discussed. Additionally, this paper highlights notable absences from these surveys whose inclusion could be beneficial, such as analysing, en masse, the text content distribution and file naming habits, and comparing file access times against traditional working hours.
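As an illustration of how such measures might drive generation, the sketch below samples file sizes from a log-normal distribution, a shape commonly reported for real file-size data. The distribution parameters, directory-naming scheme, and depth bounds are placeholders invented for this sketch, not values drawn from the cited file-system surveys.

```python
import random

# Hypothetical sketch of a synthetic-file-system generator driven by
# statistical targets of the kind surveyed here (parameters invented).

def generate_tree(n_files, rng, max_depth=4, branching=3):
    """Return (path, size_bytes) pairs whose sizes follow a
    log-normal distribution."""
    entries = []
    for i in range(n_files):
        # Place each file at a random depth in a small directory tree.
        depth = rng.randint(1, max_depth)
        parts = [f"dir{rng.randint(0, branching - 1)}"
                 for _ in range(depth - 1)]
        parts.append(f"file{i}.dat")
        # Log-normal sizes: many small files, a long tail of large ones.
        size = int(rng.lognormvariate(9.0, 2.0))  # median around 8 KB
        entries.append(("/".join(parts), size))
    return entries
```

A fuller generator in the spirit of the paper would also target the other identified measures, such as file-type mix, age and lifetime distributions, and duplication ratios, and validating a generated tree against those same distributions is exactly the testing standard the paper argues is missing.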