Alʔilbīrī’s Book of the rational conclusions. Introduction, Critical Edition of the Arabic Text and Materials for the History of the Ḫawāṣṣic Genre in Early Andalus
[eng] The Book of the rational conclusions, written perhaps sometime in the 10th c. by a physician from Ilbīrah (Andalus), is a multi-section medical pandect. The author brings together, from a diversity of sources, materials dealing with drug-handling, natural philosophy, therapeutics, the medical applications of the specific properties of things, a regimen, and a dispensatory. This dissertation comprises three parts. First, the transmission of the text, its contents, and its possible context are discussed. Then a critical edition of the Arabic text is offered. Last, but certainly not least, the subject of the specific properties is approached from several points of view. The analysis of Section III of the original book leads to an exploration of the early Andalusī assimilation of this epistemic tradition and to the establishment of a well-defined textual family in which our text must be inscribed. On the other hand, the very concept of ‘specific property’ is often misconstrued and usually made synonymous with magic and superstition. Upon closer inspection, however, the alleged irrationality of the knowledge of these properties appears to be largely the result of anachronistic interpretation. As a complement to this research and as an illustration of the genre, a sample from an ongoing integral commentary on this section of the book is presented.

[cat] The Book of the rational conclusions, by an unknown physician from Ilbīrah (al-Andalus), was probably compiled during the second half of the 10th c. It is a rudimentary yet remarkably complete kunnāš (an epistemic genre often described as a ‘medical encyclopaedia’) in which the author gathers materials borrowed (often verbatim and without attribution) from various genres. The book opens with a section on drug-handling (a sort of apothecaries’ manual) but then turns to the different branches of medicine. Following some philosophical prolegomena, the author copies, with minimal linguistic adaptation, an entire treatise on therapeutics, then another on the medical applications of the specific properties of things, a series of fragments related to dietetics (a regimen, in traditional terms) and, finally, a collection of medical recipes. Each of these sections shows clear intertextual links that point to an intense activity of synthesis of the several traditions allied to medicine in caliphal al-Andalus. The text is, in fact, a magnificent object on which to apply the methodology of textual and source criticism. The critical edition incorporates the chronological dimension into the apparatus, which thus becomes a contextualising element. As for the study of sources, while it plays only a secondary role throughout the first part of this thesis, this discipline takes on an almost absolute prominence in the third part, especially in the chapter devoted to the individual analysis of each passage collected in the section on the specific properties of things.
Measuring the Effects of Stack Overflow Code Snippet Evolution on Open-Source Software Security
This paper assesses the effects of Stack Overflow code snippet evolution on the security of open-source projects. Users on Stack Overflow actively revise posted code snippets, sometimes addressing bugs and vulnerabilities. Accordingly, developers who reuse code from Stack Overflow should treat it like any other evolving code dependency and be vigilant about updates. It is unclear whether developers are doing so, to what extent outdated code snippets from Stack Overflow are present in GitHub projects, and whether developers miss security-relevant updates to reused snippets. To shed light on these questions, we devised a method to 1) detect outdated versions of 1.5M Stack Overflow code snippets in 11,479 popular GitHub projects and 2) detect security-relevant updates to those Stack Overflow code snippets that are not reflected in those GitHub projects. Our results show that developers do not update dependent code snippets when those snippets evolve on Stack Overflow. We found that 2,405 code snippet versions reused in 2,109 GitHub projects were outdated, with 43 projects missing fixes to bugs and vulnerabilities on Stack Overflow. Those 43 projects containing outdated, insecure snippets were forked on average 1,085 times (max. 16,121), indicating that our results are likely a lower bound for affected code bases. An important insight from our work is that treating Stack Overflow code as purely static impedes holistic solutions to the problem of copying insecure code from Stack Overflow. Instead, our results suggest that developers need tools that continuously monitor Stack Overflow for security warnings and code fixes to reused code snippets, rather than tools that only warn during copy-pasting.
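The abstract does not spell out the detection method, but the core idea, matching reused snippets against the revision history of their Stack Overflow source, can be sketched as follows. This is a hypothetical illustration, not the authors' pipeline: `normalize`, `fingerprint`, and the data shapes are invented, and a real system would need token-level clone detection rather than whole-text hashing.

```python
import hashlib
import re

def normalize(code: str) -> str:
    """Crude normalization: drop comments and whitespace so trivial
    reformatting does not defeat the match."""
    code = re.sub(r"#.*", "", code)   # strip Python-style comments
    return "".join(code.split())      # remove all whitespace

def fingerprint(code: str) -> str:
    return hashlib.sha256(normalize(code).encode()).hexdigest()

def find_outdated(project_files: dict[str, str],
                  snippet_versions: dict[str, list[str]]) -> list[tuple[str, str, int]]:
    """Report (file, snippet_id, version_index) for every reused snippet
    version that is not the latest revision on Stack Overflow."""
    # Index every non-latest version by fingerprint.
    stale = {}
    for sid, versions in snippet_versions.items():
        for i, v in enumerate(versions[:-1]):   # all but the newest
            stale[fingerprint(v)] = (sid, i)
    hits = []
    for path, text in project_files.items():
        fp = fingerprint(text)
        if fp in stale:
            sid, i = stale[fp]
            hits.append((path, sid, i))
    return hits
```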
TANDEM: taming failures in next-generation datacenters with emerging memory
The explosive growth of online services, leading to unforeseen scales, has made modern datacenters highly prone to failures. Taming these failures hinges on fast and correct recovery, minimizing service interruptions. To be recoverable, applications must take additional measures to maintain a recoverable state of data and computation logic during failure-free execution. However, these precautionary measures have severe implications for performance, correctness, and programmability, making recovery incredibly challenging to realize in practice.

Emerging memory, particularly non-volatile memory (NVM) and disaggregated memory (DM), offers a promising opportunity to achieve fast recovery with maximum performance. However, incorporating these technologies into datacenter architecture presents significant challenges: their architectural attributes, differing significantly from those of traditional memory devices, introduce new semantic challenges for implementing recovery, complicating correctness and programmability.

Can emerging memory enable fast, performant, and correct recovery in the datacenter? This thesis aims to answer this question while addressing the associated challenges. When architecting datacenters with emerging memory, system architects face four key challenges: (1) how to guarantee correct semantics; (2) how to efficiently enforce correctness with optimal performance; (3) how to validate end-to-end correctness, including recovery; and (4) how to preserve programmer productivity (programmability). This thesis addresses these challenges through three approaches: (a) defining precise consistency models that formally specify correct end-to-end semantics in the presence of failures (consistency models also play a crucial role in programmability); (b) developing new low-level mechanisms to efficiently enforce the prescribed models given the capabilities of emerging memory; and (c) creating robust testing frameworks to validate end-to-end correctness and recovery.

We start our exploration with non-volatile memory (NVM), which offers fast persistence capabilities directly accessible through the processor's load-store (memory) interface. Notably, these capabilities can be leveraged to enable fast recovery for Log-Free Data Structures (LFDs) while maximizing performance. However, due to the complexity of modern cache hierarchies, data hardly persists in any specific order, jeopardizing recovery and correctness. Recovery therefore needs primitives that explicitly control the order of updates to NVM (known as persistency models). We outline the precise specification of a novel persistency model, Release Persistency (RP), which provides a consistency guarantee for LFDs on what remains in non-volatile memory upon failure. To efficiently enforce RP, we propose a novel microarchitecture mechanism, lazy release persistence (LRP). Using standard LFD benchmarks, we show that LRP achieves fast recovery while incurring minimal performance overhead.
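To make the ordering problem concrete, here is a toy model, not the thesis's formal definition of Release Persistency, of how a release-style barrier shrinks the set of states recovery can observe. The `crash_states` helper and the two-write example are invented for illustration; real persistency models are defined over cache lines and program order.

```python
from itertools import chain, combinations

def crash_states(writes, barriers):
    """Enumerate which sets of writes may be found in NVM after a crash.
    Without barriers, caches may evict in any order, so any subset of
    writes can have persisted. A barrier at position k requires that
    writes[:k] be persistent before any write at position >= k persists."""
    n = len(writes)
    states = []
    for subset in chain.from_iterable(
            combinations(range(n), r) for r in range(n + 1)):
        s = set(subset)
        respects_barriers = all(
            set(range(k)) <= s
            for k in barriers
            if any(i >= k for i in s))
        if respects_barriers:
            states.append({writes[i] for i in s})
    return states

# Publishing a data node and then a 'ready' flag guarding it.
# Without a barrier, {'flag'} is a reachable crash state: recovery
# could observe the flag without the data it guards.
print(crash_states(["data", "flag"], barriers=[]))
# With a release-style barrier before 'flag' (position 1), 'flag'
# can only persist after 'data', so that broken state disappears.
print(crash_states(["data", "flag"], barriers=[1]))
```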
We continue with memory disaggregation, which decouples memory from traditional monolithic servers and offers a promising pathway to very high availability in replicated in-memory data stores. Achieving such availability hinges on transaction protocols that can efficiently handle recovery in this setting, where compute and memory are independent. However, there is a challenge: disaggregated memory (DM) does not work with RPC-style protocols, mandating one-sided transaction protocols. Exacerbating the problem, one-sided transactions expose critical low-level ordering to architects, posing a threat to correctness. We present a highly available transaction protocol, Pandora, specifically designed to achieve fast recovery in disaggregated key-value stores (DKVSes). Pandora is the first one-sided transactional protocol that ensures correct, non-blocking, and fast recovery in a DKVS. Our experimental implementation demonstrates that Pandora achieves fast recovery and high availability while causing minimal disruption to services.
Finally, we introduce a novel targeted litmus-testing framework, DART, to validate the end-to-end correctness of transactional protocols with recovery. Using DART's targeted testing capabilities, we found several critical bugs in Pandora, highlighting the need for robust end-to-end testing methods in the design loop to iteratively fix correctness bugs. Crucially, DART is lightweight and black-box, requiring no intervention from programmers.
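The abstract gives no details of DART's design, but the flavour of crash-injection litmus testing can be shown with a toy harness: inject a crash at every step of a logged multi-key transaction, run recovery, and assert atomicity. Everything here (the `ToyStore`, its redo log, the invariant) is invented for illustration; DART itself targets one-sided transaction protocols over disaggregated memory.

```python
class ToyStore:
    """Toy key-value store with a redo log. `log` survives crashes
    (it models durable memory); `data` does not (volatile memory)."""
    def __init__(self):
        self.log = []
        self.data = {}

    def txn_put_all(self, updates, crash_after=None):
        """Log a multi-key transaction, crashing after `crash_after`
        log appends if requested."""
        for i, (key, value) in enumerate(updates):
            if crash_after is not None and i >= crash_after:
                raise RuntimeError("injected crash")
            self.log.append(("PUT", key, value))
        self.log.append(("COMMIT",))

    def recover(self):
        """Replay the log, applying only committed transactions."""
        self.data = {}
        pending = []
        for entry in self.log:
            if entry[0] == "PUT":
                pending.append(entry[1:])
            elif entry[0] == "COMMIT":
                for key, value in pending:
                    self.data[key] = value
                pending = []

# Litmus test: crash at every possible point, then check atomicity.
updates = [("x", 1), ("y", 1)]
for crash_after in range(len(updates) + 1):
    store = ToyStore()
    try:
        store.txn_put_all(updates, crash_after=crash_after)
    except RuntimeError:
        pass
    store.recover()
    seen = [store.data.get(k) for k, _ in updates]
    assert seen in ([1, 1], [None, None]), f"atomicity violated: {seen}"
```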
Deep generative models for network data synthesis and monitoring
Measurement and monitoring are fundamental tasks in all networks, enabling downstream management and optimization of the network. Although networks inherently produce abundant monitoring data, accessing and effectively measuring that data is another story. The challenges exist in many aspects. First, network monitoring data is often inaccessible to external users, and it is hard to provide a high-fidelity dataset without leaking commercially sensitive information. Second, effective data collection covering a large-scale network system can be very expensive, considering the growing size of networks, e.g., the number of cells in a radio network or the number of flows in an Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the resources available in network elements to support measurement functions are too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex structure. Various emerging optimization-based solutions (e.g., compressive sensing) and data-driven solutions (e.g., deep learning) have been proposed for these challenges. However, the fidelity and efficiency of existing methods cannot yet meet current network requirements.
The contributions made in this thesis significantly advance the state of the art in network measurement and monitoring techniques. Throughout the thesis, we leverage cutting-edge machine learning technology: deep generative modeling. First, we design and realize APPSHOT, an efficient city-scale network traffic sharing system based on a conditional generative model, which only requires open-source contextual data during inference (e.g., land use information and population distribution). Second, we develop GENDT, an efficient generative-model-based drive testing system, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time network telemetry system built on latent GANs and spectral-temporal networks. Finally, we propose SPOTLIGHT, an accurate, explainable, and efficient anomaly detection system for the Open RAN (Radio Access Network). The lessons learned through this research are summarized, and interesting topics are discussed for future work in this domain. All proposed solutions have been evaluated with real-world datasets and applied to support different applications in real systems.
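As a rough illustration of the conditional-generation idea behind these systems (not APPSHOT's or DISTILGAN's actual architecture), the sketch below shows a toy PyTorch generator that maps noise plus contextual features, such as land-use or population encodings, to per-cell traffic volumes. All dimensions and layer choices are invented.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Toy conditional generator: maps (noise, context) to traffic
    volumes for a grid of cells. Context could encode land use and
    population features."""
    def __init__(self, noise_dim=32, context_dim=8, n_cells=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + context_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Linear(128, n_cells),
            nn.Softplus(),   # traffic volumes are non-negative
        )

    def forward(self, z, context):
        return self.net(torch.cat([z, context], dim=-1))

# Sampling synthetic traffic for a batch of 4 city snapshots.
g = ConditionalGenerator()
z = torch.randn(4, 32)
ctx = torch.randn(4, 8)        # stand-in for land-use/population features
synthetic_traffic = g(z, ctx)  # shape: (4, 100)
```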
On the real world practice of Behaviour Driven Development
Surveys of industry practice over the last decade suggest that Behaviour Driven Development (BDD) is a popular Agile practice. For example, 19% of respondents to the 14th State of Agile annual survey reported using BDD, placing it in the top 13 practices reported. As well as potential benefits, the adoption of BDD necessarily involves an additional cost of writing and maintaining Gherkin features and scenarios and, if used for acceptance testing, the associated step functions. Yet there is a lack of published literature exploring how BDD is used in practice and the challenges experienced by real-world software development efforts. This gap is significant because without understanding current real-world practice, it is hard to identify opportunities to address and mitigate challenges. In order to address this research gap concerning the challenges of using BDD, this thesis reports on a research project which explored: (a) the challenges of applying agile and undertaking requirements engineering in a real-world context; (b) the challenges of applying BDD specifically; and (c) the application of BDD in open-source projects, to understand challenges in this different context.
For this purpose, we progressively conducted two case studies, two series of interviews, four iterations of action research, and an empirical study. The first case study was conducted in an avionics company to discover the challenges of using an agile process in a large-scale, safety-critical project environment. Since requirements management was found to be one of the biggest challenges during the case study, we decided to investigate BDD because of its reputation for requirements management. The second case study was conducted in the same company with the aim of discovering the challenges of using BDD in real life. The case study was complemented with an empirical study of the practice of BDD in open-source projects, taking a study sample from the GitHub open-source collaboration site.
As a result of this Ph.D. research, we were able to discover: (i) the challenges of using an agile process in a large-scale, safety-critical organisation; (ii) the current state of BDD in practice; (iii) technical limitations of Gherkin (i.e., the language for writing requirements in BDD); (iv) the challenges of using BDD in a real project; and (v) bad smells in the Gherkin specifications of open-source projects on GitHub. We also present a brief comparison between the theoretical description of BDD and BDD in practice. This research therefore presents the lessons learned from BDD in practice, and serves as a guide for software practitioners planning to use BDD in their projects.
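For readers unfamiliar with the mechanics, the sketch below shows how a Gherkin scenario is bound to step functions using the Python behave library. The scenario, the discount rule, and all names are invented for illustration; they are not drawn from the studied projects.

```python
# steps/checkout_steps.py -- step definitions for a scenario like:
#
#   Scenario: Applying a discount code
#     Given a basket containing 2 items at 10.00 each
#     When the customer applies the discount code "SAVE10"
#     Then the basket total is 18.00
#
from behave import given, when, then

@given("a basket containing {count:d} items at {price:f} each")
def step_basket(context, count, price):
    context.basket = {"items": count, "price": price, "discount": 0.0}

@when('the customer applies the discount code "{code}"')
def step_apply_code(context, code):
    # Hypothetical business rule: SAVE10 grants a 10% discount.
    if code == "SAVE10":
        context.basket["discount"] = 0.10

@then("the basket total is {total:f}")
def step_check_total(context, total):
    basket = context.basket
    subtotal = basket["items"] * basket["price"]
    assert abs(subtotal * (1 - basket["discount"]) - total) < 1e-9
```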
Configuration Management of Distributed Systems over Unreliable and Hostile Networks
Economic incentives of large criminal profits and the threat of legal consequences have pushed criminals to continuously improve their malware, especially command and control channels. This thesis applied concepts from successful malware command and control to explore the survivability and resilience of benign configuration management systems.
This work expands on existing stage models of the malware life cycle to contribute a new model for identifying malware concepts applicable to benign configuration management. The Hidden Master architecture is a contribution to master-agent network communication. In the Hidden Master architecture, communication between master and agent is asynchronous and can operate through intermediate nodes. This protects the master secret key, which gives full control of all computers participating in configuration management. Multiple improvements to idempotent configuration were proposed, including the definition of a minimal base resource dependency model, simplified resource revalidation, and the use of an imperative general-purpose language for defining idempotent configuration.
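As a minimal sketch of what idempotent configuration in an imperative general-purpose language can look like (an invented example, not the research prototype's actual base resources), consider a 'file' resource that converges the system to a desired state and is always safe to re-run:

```python
import os

def file_resource(path: str, content: str) -> bool:
    """Idempotent 'file' resource: converge the file at `path` to
    `content`. Returns True if a change was made, False if the system
    was already in the desired state -- re-running is always safe."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return False   # desired state already holds
    with open(path, "w") as f:
        f.write(content)
    return True

# Running twice: the first call converges, the second is a no-op.
changed_first = file_resource("/tmp/motd", "welcome\n")
changed_again = file_resource("/tmp/motd", "welcome\n")
assert changed_again is False
```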
Following the constructive research approach, the improvements to configuration management were designed into two prototypes. This allowed validation in laboratory testing, in two case studies, and in expert interviews. In laboratory testing, the Hidden Master prototype was more resilient than leading configuration management tools under high load and low memory conditions, and against packet loss and corruption. Only the research prototype was adaptable to a network without stable topology, owing to the asynchronous nature of the Hidden Master architecture.

The main case study used the research prototype in a complex environment to deploy a multi-room, authenticated audiovisual system for a client of the organization deploying the configuration. The case studies indicated that an imperative general-purpose language can be used for idempotent configuration in real life, both for defining new configurations in unexpected situations using the base resources and for abstracting those using standard language features, and that such a system seems easy to learn.

Potential business benefits were identified and evaluated using individual semi-structured expert interviews. Respondents agreed that the models and the Hidden Master architecture could reduce costs and risks, improve developer productivity, and allow faster time-to-market. Protection of master secret keys and the reduced need for incident response were seen as key drivers for improved security. Low-cost geographic scaling and leveraging the file-serving capabilities of commodity servers were seen to improve scaling and resiliency. Respondents identified jurisdictional legal limitations on encryption and requirements for cloud operator auditing as factors potentially limiting the full use of some concepts.
Dataflow Programming and Acceleration of Computationally-Intensive Algorithms
The volume of unstructured textual information continues to grow due to recent technological advancements. This has resulted in an exponential growth of information generated in various formats, including blogs, posts, social networking, and enterprise documents. Numerous Enterprise Architecture (EA) documents are also created daily, such as reports, contracts, agreements, frameworks, architecture requirements, designs, and operational guides. The processing and computation of this massive amount of unstructured information necessitate substantial computing capabilities and the implementation of new techniques. It is critical to manage this unstructured information through a centralized knowledge management platform. Knowledge management is the process of managing information within an organization; it involves creating, collecting, organizing, and storing information in a way that makes it easily accessible and usable. The research involved the development of a textual knowledge management system, and two use cases were considered for extracting textual knowledge from documents. The first case study focused on the safety-critical documents of a railway enterprise. Safety is of paramount importance in the railway industry, and several EA documents, including manuals, operational procedures, and technical guidelines, contain critical information. Digitalization of these documents is essential for analysing the vast amount of textual knowledge they contain, in order to improve the safety and security of railway operations. A case study was conducted between the University of Huddersfield and the Rail Safety and Standards Board (RSSB) to analyse EA safety documents using natural language processing (NLP). A graphical user interface was developed that includes various document processing features such as semantic search, document mapping, text summarization, and visualization of key trends. For the second case study, open-source data was utilized, and textual knowledge was extracted. Several features were also developed, including kernel distribution, analysis of key trends, and sentiment analysis of words (such as unique, positive, and negative) within the documents. Additionally, a heterogeneous framework was designed using CPUs/GPUs and FPGAs to analyse the computational performance of document mapping.
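As a toy illustration of the document search feature mentioned above, using TF-IDF keyword matching as a simple stand-in for the semantic search the system provides, one might start with something like the following; the documents and query are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Track maintenance procedure for signalling equipment",
    "Operational guide for rolling stock inspection",
    "Safety requirements for level crossing design",
]

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(documents)

def search(query: str, top_k: int = 2):
    """Rank documents by cosine similarity to the query."""
    q = vectorizer.transform([query])
    scores = cosine_similarity(q, doc_matrix).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [(documents[i], float(scores[i])) for i in ranked]

print(search("inspection of rolling stock"))
```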
Rights on news: expanding copyright on the internet
Defence date: 18 February 2020. Examining Board: Prof. Giovanni Sartor, EUI (Supervisor); Prof. Pier Luigi Parcu, EUI; Prof. Lionel Bently, University of Cambridge; Prof. Christophe Geiger, University of Strasbourg.

The internet and digital technologies have irreversibly changed the way we find and consume news. Legacy news organisations, the publishers of newspapers, have moved to the internet. In the online news environment, however, they are no longer the exclusive suppliers of news. New digital intermediaries have emerged, search engines and news aggregators in particular. These select and display links to and fragments of press publishers' content as part of their services, without seeking the news organisations' prior consent. To shield themselves from exploitation by digital intermediaries, press publishers have begun to seek legal protection and have called for the introduction of a new right under the umbrella of copyright and related rights. Following these calls, the press publishers' right was introduced into the EU copyright framework by the Directive on Copyright in the Digital Single Market in 2019.
Energy storage design and integration in power systems by system-value optimization
Energy storage can play a crucial role in decarbonising power systems by balancing power and energy in time. The wider power system benefits that arise from these balancing technologies include lower grid expansion, reduced renewable curtailment, and lower average electricity costs. However, with the proliferation of new energy storage technologies, it becomes increasingly difficult to identify which technologies are economically viable and how to design and integrate them effectively.

Using large-scale energy system models of Europe, the dissertation shows that relying solely on the Levelized Cost of Storage (LCOS) metric for technology assessments can mislead, and that traditional system-value methods raise important questions about how to assess multiple energy storage technologies. Further, the work introduces a new complementary system-value assessment method, the market-potential method, which provides a systematic deployment analysis for assessing multiple storage technologies under competition. However, integrating energy storage in system models can lead to an unintended storage cycling effect, which occurs in approximately two-thirds of models and significantly distorts results. The thesis finds that traditional approaches to dealing with the issue, such as multi-stage optimization or mixed-integer linear programming, are either ineffective or computationally inefficient. A new approach is suggested that only requires appropriate model parameterization with variable costs while keeping the model convex, reducing the risk of misleading results.
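Schematically, and under assumed notation rather than the thesis's exact formulation, storage cycling arises because a linear program permits simultaneous charging and discharging; with round-trip losses this dissipates energy, which can appear optimal when it avoids other costs such as curtailment penalties. A small variable cost on storage dispatch makes simultaneous operation strictly suboptimal while keeping the problem convex:

```latex
% State of charge with charging g^+_t, discharging g^-_t, and
% efficiencies \eta^+, \eta^- (notation assumed for illustration):
e_t = e_{t-1} + \eta^+ \, g^+_t - \frac{1}{\eta^-} \, g^-_t ,
\qquad 0 \le g^+_t, \; g^-_t \le \bar{g} .

% Penalizing dispatch with a small \epsilon > 0 discourages
% simultaneous charging and discharging without integer variables:
\min_{g^+,\, g^-} \; \sum_t \Big( c_t + \epsilon \, \big( g^+_t + g^-_t \big) \Big)
```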
In addition, to enable energy storage assessments and energy system research around the world, the thesis extends the geographical scope of an existing European open-source model to global coverage. The newly built energy system model, 'PyPSA-Earth', is demonstrated and validated for Africa. Using PyPSA-Earth, the thesis assesses for the first time the system value of 20 energy storage technologies across multiple scenarios in a representative future power system in Africa. The results offer insights into approaches for assessing multiple energy storage technologies under competition in large-scale energy system models. In particular, the dissertation addresses extreme cost uncertainty through a comprehensive scenario tree and finds that, apart from lithium and hydrogen, only seven energy storage technologies are optimization-relevant. The work also discovers that a heterogeneous storage design can increase power system benefits, and that some energy storage technologies are more important than others. Finally, in contrast to traditional methods that consider only a single energy storage technology, the thesis finds that optimizing multiple energy storage options can reduce total system costs by up to 29%.

The presented research findings have the potential to inform decision-making processes for the sizing, integration, and deployment of energy storage systems in decarbonized power systems, contributing to a paradigm shift in scientific methodology and advancing efforts towards a sustainable future.
LIPIcs, Volume 251, ITCS 2023, Complete Volume