WARP Business Intelligence System
Continuous Delivery (CD) streamlines the software release process: continuous integration and deployment pipelines allow software to be tested several times before going into production. In Business Intelligence (BI), releases tend to be manual and deprived of pipelines, and version control is often deficient because of the nature of the projects, which involve data that cannot be versioned. How can CD concepts be applied to an existing BI project with extensive legacy code and no version control over project objects? Only a few organizations have an automated release process for their BI projects, because the data-centric nature of these projects makes it difficult to implement CD to the full extent. The problem was therefore tackled in stages: first, the implementation of a version control scheme that works for the organization; then, the establishment of the environments needed to run the pipelines; and finally, the creation of a test pipeline for one of the BI projects, demonstrating the success of this approach. To evaluate the solution, the main beneficiaries (stakeholders and engineers) answered questionnaires about their experience with the data warehouse before and after the use of CD. Because each release is tested before going into production, the use of CD will improve software quality in the long run and allow software to be released more frequently.
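The staged approach described above (version control, then environments, then a test pipeline that gates each release) can be sketched as a minimal gate-based release flow. The stage names and checks below are hypothetical illustrations, not the thesis's actual pipeline:

```python
# Minimal sketch of a staged release pipeline for a BI project.
# Stage names and checks are invented for illustration.

def run_pipeline(stages, artifact):
    """Run each gate in order; stop and report at the first failure."""
    for name, check in stages:
        if not check(artifact):
            return f"FAILED at {name}"
    return "RELEASED"

# Example gates mirroring the staged approach: the release must be
# under version control, pass its tests, and produce non-empty data.
stages = [
    ("versioned",  lambda a: a.get("in_version_control", False)),
    ("unit_tests", lambda a: a.get("tests_passed", False)),
    ("row_counts", lambda a: a.get("row_count", 0) > 0),
]

release = {"in_version_control": True, "tests_passed": True, "row_count": 1200}
print(run_pipeline(stages, release))  # RELEASED
```

A release that skips any gate is rejected before reaching production, which is exactly the property the questionnaire respondents are asked to evaluate.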
Scaling Data Science Solutions with Semantics and Machine Learning: Bosch Case
Industry 4.0 and Internet of Things (IoT) technologies unlock unprecedented
amount of data from factory production, posing big data challenges in volume
and variety. In that context, distributed computing solutions such as cloud
systems are leveraged to parallelise the data processing and reduce computation
time. As the cloud systems become increasingly popular, there is increased
demand that more users that were originally not cloud experts (such as data
scientists, domain experts) deploy their solutions on the cloud systems.
However, it is non-trivial to address both the high demand for cloud system
users and the excessive time required to train them. To this end, we propose
SemCloud, a semantics-enhanced cloud system, that couples cloud system with
semantic technologies and machine learning. SemCloud relies on domain
ontologies and mappings for data integration, and parallelises the semantic
data integration and data analysis on distributed computing nodes. Furthermore,
SemCloud adopts adaptive Datalog rules and machine learning for automated
resource configuration, allowing non-cloud experts to use the cloud system. The
system has been evaluated in an industrial use case with millions of records,
thousands of repeated runs, and domain users, showing promising results.
Comment: Paper accepted at the ISWC 2023 In-Use track
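The rule-driven resource configuration described above can be illustrated with a toy rule chain that maps data characteristics to a cloud configuration. The rules and thresholds are invented for illustration; SemCloud's actual adaptive Datalog rules and learned parameters are not reproduced here:

```python
# Rough sketch of rule-based resource configuration: pick a cloud
# configuration from data characteristics by firing the first
# matching rule. All thresholds and values are hypothetical.

def configure(data_gb, records):
    # Each rule: (condition, configuration); first match wins.
    rules = [
        (lambda: data_gb > 100,        {"nodes": 8, "mem_gb": 64}),
        (lambda: records > 1_000_000,  {"nodes": 4, "mem_gb": 32}),
        (lambda: True,                 {"nodes": 1, "mem_gb": 8}),  # default
    ]
    for condition, conf in rules:
        if condition():
            return conf

print(configure(data_gb=20, records=5_000_000))  # {'nodes': 4, 'mem_gb': 32}
```

In SemCloud the parameters of such rules are learned from repeated runs rather than hand-set, which is what lets non-cloud experts avoid manual tuning.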
Cost Estimation Approach for Scrum Agile Software Process Model
Software development in industry is moving towards agile because of the advantages provided by the agile development process. The main advantages of agile software development are delivering high-quality software in shorter intervals and embracing change. Testing is a vital activity in the software development process for delivering a high-quality software product. Testing often accounts for more project effort and time than any other software development activity. Software testing cost estimation is one of the most important managerial activities, related to resource allocation, project planning, and controlling the overall cost of software development. Several models have been proposed by various authors to address effort and cost estimation. Most of these models directly or indirectly depend on the source code of the software product. However, the majority of testing in software organizations is done in a black-box environment, where the source code is not available to the testing teams. This paper presents an alternative approach to software testing cost estimation for the Scrum agile software process model, based on the testing activities involved in a black-box testing environment. The proposed approach is applied to four real-world case studies; it provides more accurate estimates of testing effort and cost, and helps software managers control overruns of project schedules and costs.
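An activity-based estimate of the kind described above can be sketched as a sum of per-activity efforts converted to cost. The activity names, unit efforts, and rate below are hypothetical illustrations, not the paper's calibrated model:

```python
# Hypothetical sketch of activity-based test cost estimation for a
# black-box/Scrum setting: effort is summed per testing activity,
# then converted to cost at an hourly rate. All figures are invented.

def estimate_test_cost(activities, hourly_rate):
    """activities: {name: (hours_per_unit, units)} -> (total_hours, total_cost)."""
    hours = sum(h * n for h, n in activities.values())
    return hours, hours * hourly_rate

activities = {
    "test_case_design":  (0.5, 120),   # 0.5 h per test case, 120 cases
    "test_execution":    (0.25, 120),
    "defect_reporting":  (1.0, 15),
    "regression_cycles": (8.0, 3),
}
hours, cost = estimate_test_cost(activities, hourly_rate=40)
print(hours, cost)  # 129.0 5160.0
```

Because every input is an observable black-box activity count, no source code is needed to produce the estimate, which is the key departure from code-based models.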
Automatic Software Repair: a Bibliography
This article presents a survey on automatic software repair. Automatic
software repair consists of automatically finding a solution to software bugs
without human intervention. This article considers all kinds of repairs. First,
it discusses behavioral repair where test suites, contracts, models, and
crashing inputs are taken as oracle. Second, it discusses state repair, also
known as runtime repair or runtime recovery, with techniques such as checkpoint
and restart, reconfiguration, and invariant restoration. The uniqueness of this
article is that it spans the research communities that contribute to this body
of knowledge: software engineering, dependability, operating systems,
programming languages, and security. It provides a novel and structured
overview of the diversity of bug oracles and repair operators used in the
literature.
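The behavioral-repair family surveyed above mostly follows a generate-and-validate loop: repair operators produce candidate patches, and the test suite acts as the oracle that accepts or rejects them. A toy sketch of that loop, with all names and the "program" invented for illustration:

```python
# Sketch of the generate-and-validate loop common to behavioral
# repair: candidates produced by repair operators are kept only if
# the whole test suite (the oracle) passes. Illustrative only.

def repair(program, test_suite, operators):
    for op in operators:
        candidate = op(program)
        if all(test(candidate) for test in test_suite):
            return candidate
    return None  # no operator produced a plausible patch

# Toy buggy "program": an off-by-one in an increment function.
buggy = lambda x: x + 2
tests = [lambda f: f(1) == 2, lambda f: f(5) == 6]
operators = [
    lambda f: (lambda x: x + 1),  # mutate the constant downward
    lambda f: (lambda x: x - 1),
]
fixed = repair(buggy, tests, operators)
```

Real tools search a far larger operator space, and a patch passing the suite is only "plausible", not proven correct, which is why oracle quality dominates the field.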
Supporting the grow-and-prune model for evolving software product lines
Software Product Lines (SPLs) aim at supporting the development of a whole family of software products through systematic reuse of shared assets. To this end, SPL development is separated into two interrelated processes: (1) domain engineering (DE), where the scope and variability of the system are defined and reusable core assets are developed; and (2) application engineering (AE), where products are derived by selecting core assets and resolving variability. Evolution in SPLs is considered more challenging than in traditional systems, as both core assets and products need to co-evolve. The so-called grow-and-prune model has shown great flexibility in incrementally evolving an SPL by letting the products grow, and later pruning the product functionalities deemed useful by refactoring and merging them back into the reusable SPL core-asset base. This thesis aims at supporting the grow-and-prune model in initiating and enacting the pruning. Initiating the pruning requires SPL engineers to conduct customization analysis, i.e. analyzing how products have changed the core assets. Customization analysis aims at identifying interesting product customizations to be ported to the core-asset base. However, existing tools do not fulfill engineers' needs for this practice. To address this issue, this thesis elaborates on SPL engineers' needs when conducting customization analysis and proposes a data-warehouse approach to support the analysis. Once the interesting customizations have been identified, the pruning needs to be enacted. This means that product code needs to be ported to the core-asset realm, while products are upgraded with newer functionalities and bug fixes available in newer core-asset releases. Synchronizing both parties through sync paths is required here. However, state-of-the-art tools are not tailored to SPL sync paths, and this hinders synchronizing core assets and products.
To address this issue, this thesis proposes to leverage existing Version Control Systems (i.e. git/GitHub) to provide sync operations as first-class constructs.
A data mining-based framework for supply chain risk management
Increased risk exposure levels, technological developments, and the growing information overload in supply chain networks drive organizations to embrace data-driven approaches in Supply Chain Risk Management (SCRM). Data Mining (DM) employs multiple analytical techniques for intelligent and timely decision making; however, its potential has not been fully explored for SCRM. This paper aims to develop a DM-based framework for the identification, assessment, and mitigation of different types of risks in supply chains. A holistic approach integrates DM and risk management activities into a single framework for effective risk management. The framework is validated with a case study based on a series of semi-structured interviews, discussions, and a focus group study. The study shows how DM supports the discovery of hidden and useful information in unstructured risk data for making intelligent risk management decisions.
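The assessment step of such a framework can be illustrated with a simple likelihood-times-impact scoring pass that buckets risks for mitigation. The risk records, scores, and threshold below are invented for illustration and are not the paper's framework:

```python
# Illustrative sketch of a risk assessment step: score each supply
# chain risk as likelihood * impact, then bucket it for action.
# Records and the 0.5 threshold are hypothetical.

def assess(risks, high=0.5):
    scored = [(r["name"], r["likelihood"] * r["impact"]) for r in risks]
    return sorted(
        [(name, score, "mitigate" if score >= high else "monitor")
         for name, score in scored],
        key=lambda t: t[1], reverse=True)

risks = [
    {"name": "supplier delay",  "likelihood": 0.7, "impact": 0.9},
    {"name": "demand spike",    "likelihood": 0.4, "impact": 0.6},
    {"name": "customs hold-up", "likelihood": 0.2, "impact": 0.8},
]
for name, score, action in assess(risks):
    print(f"{name}: {score:.2f} -> {action}")
```

In a DM-based framework, the likelihood and impact inputs would come from mined historical and unstructured risk data rather than being entered by hand.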
Improving Automated Software Testing while re-engineering legacy systems in the absence of documentation
Legacy software systems are essential assets that contain an organization's valuable business logic. Because of the outdated technologies and methods used in these systems, they are challenging to maintain and expand. Therefore, organizations need to decide whether to redevelop or re-engineer the legacy system. Although in most cases re-engineering is the safer and less expensive choice, it carries risks such as failure to meet the expected quality and delays due to testing blockades. These risks are even more severe when the legacy system does not have adequate documentation. A comprehensive testing strategy, which includes automated tests and reliable test cases, can substantially reduce the risks. To mitigate the hazards associated with re-engineering, we have conducted three studies in this thesis to improve the testing process.
Our first study introduces a new testing model for the re-engineering process and investigates test automation solutions to detect defects in the early re-engineering stages. We implemented this model on the Cold Region Hydrological Model (CRHM) application and discovered bugs that would not likely have been found manually. Although this approach helped us discover a great number of software defects, designing test cases is very time-consuming due to the lack of documentation, especially for large systems. Therefore, in our second study, we investigated an approach to generate test cases automatically from user footprints. To do this, we extended an existing tool to collect user actions and legacy system reactions, including database and file system changes. We then analyzed the data based on the order and timing of user actions and generated human-readable test cases. Our evaluation shows that this approach can detect more bugs than other existing tools. Moreover, the test cases generated using this approach contain detailed oracles that make them suitable for both black-box and white-box testing. Many scientific legacy systems such as CRHM are data-driven: they take large amounts of data as input and produce massive data after applying mathematical models. Applying test cases and finding bugs is more demanding when we are dealing with large amounts of data. Hence, in our third study, we created a comparative visualization tool (ComVis) to compare a legacy system's output after each change. Visualization helps testers find data issues resulting from newly introduced bugs. Twenty participants took part in a user study in which they were asked to find data issues using ComVis and the embedded CRHM visualization tool. Our user study shows that ComVis can find 51% more data issues than the legacy system's embedded visualization tool. Also, results from the NASA-TLX assessment and thematic analysis of open-ended questions about each task show that users prefer ComVis over the built-in visualization tool. We believe our introduced approaches and developed systems will significantly reduce the risks associated with the re-engineering process.
Proceedings of the 4th Workshop of the MPM4CPS COST Action
Proceedings of the 4th Workshop of the MPM4CPS COST Action, with the presentations delivered during the workshop and papers with extended versions of some of them.
Big data analytics for intra-logistics process planning in the automotive sector
The manufacturing sector is facing an important stage with Industry 4.0. This paradigm
shift impulses companies to embrace innovative technologies and to pursuit near-zero
fault, near real-time reactivity, better traceability, and more predictability, while working
to achieve cheaper product customization.
The scenario presented addresses multiple intra-logistic processes of the automotive factory
Volkswagen Autoeuropa, where different situations need to be addressed. The main
obstacle is the absence of harmonized and integrated data flows between all stages of the
intra-logistic process which leads to inefficiencies. The existence of data silos is heavily
contributing to this situation, which makes the planning of intra-logistics processes a
challenge.
The objective of the work presented here, is to integrate big data and machine learning
technologies over data generated by the several manufacturing systems present, and
thus support the management and optimisation of warehouse, parts transportation, sequencing
and point-of-fit areas. This will support the creation of a digital twin of the
intra-logistics processes. Still, the end goal is to employ deep learning techniques to
achieve predictive capabilities, all together with simulation, in order to optimize processes
planning and equipment efficiency.
The work presented on this thesis, is aligned with the European project BOOST 4.0, with
the objective to drive big data technologies in manufacturing domain, focusing on the
automotive use-case