A Survey on Forensics and Compliance Auditing for Critical Infrastructure Protection
The broadening dependency and reliance that modern societies have on essential services
provided by Critical Infrastructures is increasing the relevance of their trustworthiness. However, Critical
Infrastructures are attractive targets for cyberattacks, due to the potential for considerable impact, not just
at the economic level but also in terms of physical damage and even loss of human life. Complementing
traditional security mechanisms, forensics and compliance audit processes play an important role in ensuring
Critical Infrastructure trustworthiness. Compliance auditing contributes to checking if security measures are
in place and compliant with standards and internal policies. Forensics assists the investigation of past security
incidents. Since these two areas significantly overlap, in terms of data sources, tools and techniques, they can
be merged into unified Forensics and Compliance Auditing (FCA) frameworks. In this paper, we survey the
latest developments, methodologies, challenges, and solutions addressing forensics and compliance auditing
in the scope of Critical Infrastructure Protection. This survey focuses on relevant contributions, capable of
tackling the requirements imposed by massively distributed and complex Industrial Automation and Control
Systems, in terms of handling large volumes of heterogeneous data (that can be noisy, ambiguous, and
redundant) for analytic purposes, with adequate performance and reliability. The results include
a taxonomy of the FCA field whose key categories reflect the relevant topics in the literature. Also, the
collected knowledge resulted in the establishment of a reference FCA architecture, proposed as a generic
template for a converged platform. These results are intended to guide future research on forensics and
compliance auditing for Critical Infrastructure Protection.
Sentiment analysis of financial Twitter posts with machine learning classifiers
This paper presents a sentiment analysis combining lexicon-based and machine learning (ML)-based approaches in Turkish to investigate the public mood for the prediction of stock market behavior in BIST30, Borsa Istanbul. Our main motivation behind this study is to apply sentiment analysis to financial tweets in Turkish. We import 17,189 tweets posted as "#Borsaistanbul, #Bist, #Bist30, #Bist100" on Twitter between November 7, 2022, and November 15, 2022, via MAXQDA 2020, a qualitative data analysis program. For the lexicon-based side, we use the multilingual sentiment module offered by the Orange program to label the polarities of the 17,189 samples as positive, negative, or neutral. Neutral labels are discarded for the machine learning experiments. For the machine learning side, we select 9,076 positive and negative samples to implement the classification problem with six different supervised machine learning classifiers conducted in Python 3.6 with the sklearn library. In the experiments, 80% of the selected data is used for the training phase and the rest for testing and validation. Results of the experiments show that the Support Vector Machine and Multilayer Perceptron classifiers perform better than the others, with accuracies of 0.89 and 0.88 and AUC values of 0.8729 and 0.8647, respectively. The other classifiers obtain an accuracy rate of approximately 78.5%. It is possible to increase sentiment analysis accuracy through parameter optimization on a larger, cleaner, and more balanced dataset and by changing the pre-processing steps. This work can be expanded in the future to develop better sentiment analysis using deep learning approaches.
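A minimal sketch of the kind of pipeline this abstract describes: lexicon-labelled tweets, TF-IDF features, a supervised sklearn classifier, and an 80/20 train/test split. The toy texts, labels, and classifier settings below are illustrative assumptions, not the paper's actual data or configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

# Toy stand-ins for lexicon-labelled financial tweets (1 = positive, 0 = negative).
texts = ["market up strong gains", "stocks rally buy now", "heavy losses today",
         "index crashes badly", "great earnings report", "sell off continues",
         "bullish momentum builds", "bearish outlook worsens"] * 10
labels = [1, 1, 0, 0, 1, 0, 1, 0] * 10

X = TfidfVectorizer().fit_transform(texts)
# 80% for training, 20% held out, as in the experiments described.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.2, random_state=42, stratify=labels)

clf = SVC(probability=True).fit(X_tr, y_tr)  # Support Vector Machine
acc = accuracy_score(y_te, clf.predict(X_te))
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(acc, auc)
```

On real tweets the reported figures (0.89 accuracy, 0.8729 AUC for the SVM) would of course depend on the actual dataset and preprocessing.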
Location Reference Recognition from Texts: A Survey and Comparison
A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to recognizing location references from texts and identifying their geospatial representations. While geoparsing can benefit many domains, a summary of its specific applications is still missing. Further, there is a lack of a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and core step of geoparsing. To fill these research gaps, this review first summarizes seven typical application domains of geoparsing: geographic information retrieval, disaster management, disease surveillance, traffic management, spatial humanities, tourism management, and crime management. We then review existing approaches for location reference recognition by categorizing them into four groups based on their underlying functional principle: rule-based, gazetteer-matching-based, statistical-learning-based, and hybrid approaches. Next, we thoroughly evaluate the correctness and computational efficiency of the 27 most widely used approaches for location reference recognition based on 26 public datasets with different types of texts (e.g., social media posts and news stories) containing 39,736 location references worldwide. Results from this thorough evaluation can inform future methodological developments and guide the selection of proper approaches based on application needs.
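To make the gazetteer-matching family of approaches concrete, here is a toy sketch: scan text for matches against a small place-name gazetteer and return each match with its coordinates. The gazetteer entries and coordinates are illustrative assumptions, not drawn from the surveyed datasets.

```python
# Tiny illustrative gazetteer: lowercase place name -> (latitude, longitude).
GAZETTEER = {"new york": (40.71, -74.01), "paris": (48.86, 2.35), "tokyo": (35.68, 139.69)}

def recognize_locations(text: str):
    """Return (name, (lat, lon)) pairs for gazetteer entries found in the text."""
    lowered = text.lower()
    return [(name, coords) for name, coords in GAZETTEER.items() if name in lowered]

hits = recognize_locations("Flights from New York to Tokyo resumed.")
print(hits)  # matches "new york" and "tokyo"
```

Real gazetteer matchers add tokenization, longest-match resolution, and disambiguation between identically named places, which this sketch omits.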
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics, and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
More-than-words: Reconceptualising Two-year-old Children’s Onto-epistemologies Through Improvisation and the Temporal Arts
This thesis project takes place at a time of increasing focus upon two-year-old children and the words they speak. On the one hand there is a mounting pressure, driven by the school readiness agenda, to make children talk as early as possible. On the other hand, there is an increased interest in understanding children’s communication in order to create effective pedagogies. More-than-words (MTW) examines an improvised art-education practice that combines heterogeneous elements: sound, movement and materials (such as silk, string, light) to create encounters for young children, educators and practitioners from diverse backgrounds. During these encounters, adults adopt a practice of stripping back their words in order to tune into the polyphonic ways that children are becoming-with the world.
For this research-creation, two MTW sessions for two-year-old children and their carers took place in a specially created installation. These sessions were filmed on a 360° camera, nursery school iPad and on a specially made child-friendly Toddler-cam (Tcam) that rolled around in the installation-event with the children. Through using the frameless technology of 360° film, I hoped to make tangible the relation and movement of an emergent and improvised happening and the way in which young children operate fluidly through multiple modes.
Travelling with posthuman, Deleuzio-Guattarian and feminist vital material philosophy, I wander and wonder speculatively through practice, memory, and film data as a bag lady, a Haraway-ian writer/artist/researcher-creator who resists the story of the wordless child as lacking and tragic; the story that positions the word as heroic. Instead, through returning to the uncertainty of improvisation, I attempt to tune into the savage, untamed and wild music of young children’s animistic onto-epistemologies.
CSTS: Conditional Semantic Textual Similarity
Semantic textual similarity (STS) has been a cornerstone task in NLP that
measures the degree of similarity between a pair of sentences, with
applications in information retrieval, question answering, and embedding
methods. However, it is an inherently ambiguous task, with the sentence
similarity depending on the specific aspect of interest. We resolve this
ambiguity by proposing a novel task called conditional STS (C-STS) which
measures similarity conditioned on an aspect elucidated in natural language
(hereon, condition). As an example, the similarity between the sentences "The
NBA player shoots a three-pointer." and "A man throws a tennis ball into the
air to serve." is higher for the condition "The motion of the ball." (both
upward) and lower for "The size of the ball." (one large and one small).
C-STS's advantages are two-fold: (1) it reduces the subjectivity and ambiguity
of STS, and (2) enables fine-grained similarity evaluation using diverse
conditions. C-STS contains almost 20,000 instances from diverse domains and we
evaluate several state-of-the-art models to demonstrate that even the most
performant fine-tuning and in-context learning models (GPT-4, Flan, SimCSE)
find it challenging, with Spearman correlation scores of <50. We encourage the
community to evaluate their models on C-STS to provide a more holistic view of
semantic similarity and natural language understanding.
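The abstract reports model quality as Spearman correlation between predicted and gold similarity scores (the "<50" figure is on a 0-100 scale). A minimal sketch of that metric, with made-up gold and predicted scores, assuming no tied values:

```python
def spearman(xs, ys):
    """Spearman correlation computed as Pearson correlation of ranks (no tie handling)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

gold = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical annotator similarity ratings
pred = [0.2, 0.1, 0.5, 0.7, 0.6]  # hypothetical model similarity scores
print(round(100 * spearman(gold, pred), 1))  # → 80.0 on the 0-100 scale
```

In practice scipy.stats.spearmanr is typically used, since it also handles ties.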
Software Design Change Artifacts Generation through Software Architectural Change Detection and Categorisation
Unlike other engineering projects, which are mostly built by (non-expert) workers from designs produced by engineers, software is designed, implemented, tested, and inspected solely by experts. Researchers and practitioners have linked software bugs, security holes, problematic integration of changes, complex-to-understand codebases, unwarranted mental pressure, and similar problems in software development and maintenance to inconsistent and complex design and to a lack of ways to easily understand what is going on and what to plan in a software system. The unavailability of proper information and insights needed by development teams to make good decisions makes these challenges worse. Therefore, extracting software design documents and other insightful information is essential to reduce the above-mentioned anomalies. Moreover, extracting architectural design artifacts is required to create developer profiles for the market in many crucial scenarios. To that end, architectural change detection, categorization, and change description generation are crucial because they are the primary artifacts needed to trace other software artifacts.
However, it is not feasible for humans to analyze all the changes in a single release to detect change and impact, because doing so is time-consuming, laborious, costly, and inconsistent. In this thesis, we conduct six studies addressing the mentioned challenges to automate architectural change information extraction and document generation, which could potentially assist development and maintenance teams. In particular, (1) we detect architectural changes using lightweight techniques leveraging textual and codebase properties, (2) categorize them considering intelligent perspectives, and (3) generate design change documents by exploiting precise contexts of components’ relations and change purposes, which were previously unexplored. Our experiments using 4,000+ architectural change samples and 200+ design change documents suggest that the proposed approaches are promising in accuracy and scalable enough to deploy frequently. Our proposed change detection approach can detect up to 100% of the architectural change instances (and is very scalable). On the other hand, our proposed change classifier’s F1 score is 70%, which is promising given the challenges. Finally, our proposed system can produce descriptive design change artifacts with 75% significance. Since most of our studies are foundational, our approaches and prepared datasets can be used as baselines for advancing research in design change information extraction and documentation.
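A toy illustration in the spirit of the lightweight, codebase-property-driven detection the thesis describes (not its actual technique): diff the component dependency maps of two releases to surface added, removed, and rewired components. The module names are invented for the example.

```python
# Each release is modeled as {component: set of components it depends on}.
old_release = {"core": {"db", "auth"}, "auth": {"db"}, "db": set()}
new_release = {"core": {"db", "auth", "cache"}, "auth": {"db"}, "db": set(), "cache": set()}

added   = set(new_release) - set(old_release)          # components introduced
removed = set(old_release) - set(new_release)          # components deleted
rewired = {m for m in set(old_release) & set(new_release)
           if old_release[m] != new_release[m]}        # dependency sets changed

print(sorted(added), sorted(removed), sorted(rewired))  # → ['cache'] [] ['core']
```

Each detected change would then feed the categorization and description-generation steps the abstract outlines.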
Software System Model Correctness using Graph Theory: A Review
The Unified Modeling Language (UML) is the de facto standard for object-oriented software model development. The UML class diagram plays an essential role in the design and specification of software systems. The purpose of a class diagram is to display classes with their attributes and methods, hierarchy (generalization), class relationships and associations, and general aggregation and composition between classes in one model.
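One way to phrase the graph-theoretic correctness checks such a review discusses: model the class diagram as a directed graph and verify structural properties, for instance that the generalization (inheritance) relation contains no cycles. The class names below are illustrative only.

```python
def has_cycle(edges):
    """Detect a cycle in a directed graph given as {node: list of successors}."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited / on current path / finished
    color = {n: WHITE for n in edges}
    def visit(n):
        color[n] = GRAY
        for m in edges.get(n, ()):
            c = color.get(m, WHITE)
            if c == GRAY or (c == WHITE and visit(m)):
                return True        # back edge found: cycle
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and visit(n) for n in edges)

generalization = {"Square": ["Rectangle"], "Rectangle": ["Shape"], "Shape": []}
print(has_cycle(generalization))            # → False: a well-formed hierarchy
print(has_cycle({"A": ["B"], "B": ["A"]}))  # → True: ill-formed generalization
```

Analogous checks (connectivity, absence of dangling associations) map other class-diagram well-formedness rules onto standard graph algorithms.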
Mapping the Focal Points of WordPress: A Software and Critical Code Analysis
Programming languages or code can be examined through numerous analytical lenses. This project is a critical analysis of WordPress, a prevalent web content management system, applying four modes of inquiry. The project draws on theoretical perspectives and areas of study in media, software, platforms, code, language, and power structures. The applied research is based on Critical Code Studies, an interdisciplinary field of study that holds the potential as a theoretical lens and methodological toolkit to understand computational code beyond its function. The project begins with a critical code analysis of WordPress, examining its origins and source code and mapping selected vulnerabilities. An examination of the influence of digital and computational thinking follows this. The work also explores the intersection of code patching and vulnerability management and how code shapes our sense of control, trust, and empathy, ultimately arguing that a rhetorical-cultural lens can be used to better understand code's controlling influence. Recurring themes throughout these analyses and observations are the connections to power and vulnerability in WordPress' code and how cultural, processual, rhetorical, and ethical implications can be expressed through its code, creating a particular worldview. Code's emergent properties help illustrate how human values and practices (e.g., empathy, aesthetics, language, and trust) become encoded in software design and how people perceive the software through its worldview. These connected analyses reveal cultural, processual, and vulnerability focal points and the influence these entanglements have concerning WordPress as code, software, and platform. WordPress is a complex sociotechnical platform worthy of further study, as is the interdisciplinary merging of theoretical perspectives and disciplines to critically examine code.
Ultimately, this project helps further enrich the field by introducing focal points in code, examining sociocultural phenomena within the code, and offering techniques to apply critical code methods.
An Overview of Context Capturing Techniques in NLP
In NLP, context identification has become a prominent way to overcome syntactic and semantic ambiguities. Ambiguities are unsolved problems, but they can be reduced to a certain level. This reduction in ambiguity helps to improve the quality of several NLP processes, such as text translation, text simplification, text retrieval, word sense disambiguation, etc. Context identification, also known as contextualization, takes place in the preprocessing phase of NLP processes. The essence of this identification is to uniquely represent a word or a phrase to improve decision-making during the transfer phase of NLP processes. The improved decision-making helps to improve the quality of the output. This paper tries to provide an overview of different context-capturing mechanisms used in NLP.
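A minimal sketch of one classic context-capturing mechanism such an overview covers: representing a word by the bag of words in a fixed-size window around it, which downstream steps (e.g., word sense disambiguation) can compare against sense signatures. The sentence is an invented example.

```python
from collections import Counter

def context_vector(tokens, index, window=2):
    """Counts of the words within `window` positions of tokens[index], excluding it."""
    lo, hi = max(0, index - window), min(len(tokens), index + window + 1)
    return Counter(t for i, t in enumerate(tokens[lo:hi], start=lo) if i != index)

sent = "the bank raised interest rates".split()
print(context_vector(sent, 1))  # context of "bank"
```

Here the neighbors "raised", "interest" (and "rates" with a wider window) would point to the financial sense of "bank" rather than the riverside one.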