1,034 research outputs found

    Textual Data Mining For Knowledge Discovery and Data Classification: A Comparative Study

    Get PDF
    Business Intelligence solutions are key to enable industrial organisations (either manufacturing or construction) to remain competitive in the market. These solutions are achieved through analysis of data which is collected, retrieved and re-used for prediction and classification purposes. However many sources of industrial data are not being fully utilised to improve the business processes of the associated industry. It is generally left to the decision makers or managers within a company to take effective decisions based on the information available throughout product design and manufacture or from the operation of business or production processes. Substantial efforts and energy are required in terms of time and money to identify and exploit the appropriate information that is available from the data. Data Mining techniques have long been applied mainly to numerical forms of data available from various data sources but their applications to analyse semi-structured or unstructured databases are still limited to a few specific domains. The applications of these techniques in combination with Text Mining methods based on statistical, natural language processing and visualisation techniques could give beneficial results. Text Mining methods mainly deal with document clustering, text summarisation and classification and mainly rely on methods and techniques available in the area of Information Retrieval (IR). These help to uncover the hidden information in text documents at an initial level. This paper investigates applications of Text Mining in terms of Textual Data Mining (TDM) methods which share techniques from IR and data mining. These techniques may be implemented to analyse textual databases in general but they are demonstrated here using examples of Post Project Reviews (PPR) from the construction industry as a case study. The research is focused on finding key single or multiple term phrases for classifying the documents into two classes i.e. good information and bad information documents to help decision makers or project managers to identify key issues discussed in PPRs which can be used as a guide for future project management process

    Leveraging Intermediate Artifacts to Improve Automated Trace Link Retrieval

    Get PDF
    Software traceability establishes a network of connections between diverse artifacts such as requirements, design, and code. However, given the cost and effort of creating and maintaining trace links manually, researchers have proposed automated approaches using information retrieval techniques. Current approaches focus almost entirely upon generating links between pairs of artifacts and have not leveraged the broader network of interconnected artifacts. In this paper we investigate the use of intermediate artifacts to enhance the accuracy of the generated trace links – focus- ing on paths consisting of source, target, and intermediate artifacts. We propose and evaluate combinations of techniques for computing semantic similarity, scaling scores across multiple paths, and aggregating results from multiple paths. We report results from five projects, including one large industrial project. We find that leverag- ing intermediate artifacts improves the accuracy of end-to-end trace retrieval across all datasets and accuracy metrics. After further analysis, we discover that leveraging intermediate artifacts is only helpful when a project’s artifacts share a common vocabulary, which tends to occur in refinement and decomposition hierarchies of artifacts. Given our hybrid approach that integrates both direct and transitive links, we observed little to no loss of accuracy when intermediate artifacts lacked a shared vocabulary with source or target artifacts

    Fundamental Approaches to Software Engineering

    Get PDF
    This open access book constitutes the proceedings of the 23rd International Conference on Fundamental Approaches to Software Engineering, FASE 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 23 full papers, 1 tool paper and 6 testing competition papers presented in this volume were carefully reviewed and selected from 81 submissions. The papers cover topics such as requirements engineering, software architectures, specification, software quality, validation, verification of functional and non-functional properties, model-driven development and model transformation, software processes, security and software evolution

    Property driven verification framework: application to real time property for UML MARTE software design

    Get PDF
    Les techniques formelles de la famille « vérification de modèles » (« model checking ») se heurtent au problème de l’explosion combinatoire. Ceci limite les perspectives d’exploitation dans des projets industriels. Ce problème est provoqué par la combinatoire dans la construction de l’espace des états possibles durant l’exécution des systèmes modélisés. Le nombre d’états pour des modèles de systèmes industriels réalistes dépasse régulièrement les capacités des ressources disponibles en calcul et stockage. Cette thèse défend l’idée qu’il est possible de réduire cette combinatoire en spécialisant les outils pour des familles de propriétés. Elle propose puis valide expérimentalement un ensemble de méthodes pour le développement de ce type d’outils en suivant une approche guidée par les propriétés appliquée au contexte temps réel. Il s’agit donc de construire des outils d’analyse performants pour des propriétés temps réel qui soient exploitables pour des modèles industriels de taille réaliste. Les langages considérés sont, d’une part UML étendu par le profil MARTE pour la modélisation par les utilisateurs, et d’autre part les réseaux de Petri temporisés comme support pour la vérification. Les propositions sont validées sur un cas d’étude industriel réaliste issu du monde avionique : l’étude de la latence et la fraicheur des données dans un système de gestion des alarmes exploitant les technologies d’Avionique Modulaire Intégrée. Ces propositions ont été mise en oeuvre comme une boite à outils qui intègre les cinq contributions suivantes: la définition de la sémantique d’exécution spécifiques aux propriétés temps réel pour les modèles d’architecture et de comportement spécifiés en UML/MARTE; la spécification des exigences temps réel en s’appuyant sur un ensemble de patrons de vérification atomiques dédiés aux propriété temps réel; une méthode itérative d’analyse à base d’observateurs pour des réseaux de Petri temporisés; des techniques de réduction de l’espace d’états spécifiques aux propriétés temps réel pour des Réseaux de Petri temporisés; une approche pour l’analyse des erreurs détectées par « vérification des modèles » en s’appuyant sur des idées inspirées de la « fouille de données » (« data mining »). ABSTRACT : Automatic formal verification such as model checking faces the combinatorial explosion issue. This limits its application in indus- trial projects. This issue is caused by the explosion of the number of states during system’s execution , as it may easily exceed the amount of available computing or storage resources. This thesis designs and experiments a set of methods for the development of scalable verification based on the property-driven approach. We propose efficient approaches based on model checking to verify real-time requirements expressed in large scale UML-MARTE real-time system designs. We rely on the UML and its profile MARTE as the end-user modeling language, and on the Time Petri Net (TPN) as the verification language. The main contribution of this thesis is the design and implementation of a property-driven verification prototype toolset dedicated to real-time properties verification for UML-MARTE real-time software designs. We validate this toolset using an avionic use case and its user requirements. The whole prototype toolset includes five contributions: definition of real-time property specific execution semantics for UML-MARTE architecture and behavior models; specification of real- time requirements relying on a set of verification dedicated atomic real- time property patterns; real-time property specific observer-based model checking approach in TPN; real-time property specific state space reduction approach for TPN; and fault localization approach in model checking

    A Model Driven Architecture Framework for Robot Design and Automatic Code Generation

    Get PDF
    International audienceThis work presents a research and development experiment in software engineering at the IMT Mines Ales, France. The goal is to define a framework allowing a system controller to be graphically designed and its java code to be automatically generated. This framework is expected to be a support for students following the system engineering curriculum, and who have to program LEGO Mindstorms EV3 robots although they have not already been trained to concurrent Java programming. The experimental methodology focuses on learning and implementing the following paradigms: model driven design, software architecture for event driven systems and reactive system programming using JAVA threads. We present the design framework defined during this experiment, and the feedback of students who have been involved in setting up the state of the art and developing the framework

    Knowledge-based design support and inductive learning

    Get PDF
    Designing and learning are closely related activities in that design as an ill-structure problem involves identifying the problem of the design as well as finding its solutions. A knowledge-based design support system should support learning by capturing and reusing design knowledge. This thesis addresses two fundamental problems in computational support to design activities: the development of an intelligent design support system architecture and the integration of inductive learning techniques in this architecture.This research is motivated by the belief that (1) the early stage of the design process can be modelled as an incremental learning process in which the structure of a design problem or the product data model of an artefact is developed using inductive learning techniques, and (2) the capability of a knowledge-based design support system can be enhanced by accumulating and storing reusable design product and process information.In order to incorporate inductive learning techniques into a knowledge-based design model and an integrated knowledge-based design support system architecture, the computational techniques for developing a knowledge-based design support system architecture and the role of inductive learning in Al-based design are investigated. This investigation gives a background to the development of an incremental learning model for design suitable for a class of design tasks whose structures are not well known initially.This incremental learning model for design is used as a basis to develop a knowledge-based design support system architecture that can be used as a kernel for knowledge-based design applications. This architecture integrates a number of computational techniques to support the representation and reasoning of design knowledge. In particular, it integrates a blackboard control system with an assumption-based truth maintenance system in an object-oriented environment to support the exploration of multiple design solutions by supporting the exploration and management of design contexts.As an integral part of this knowledge-based design support architecture, a design concept learning system utilising a number of unsupervised inductive learning techniques is developed. This design concept learning system combines concept formation techniques with design heuristics as background knowledge to build a design concept tree from raw data or past design examples. The design concept tree is used as a conceptual structure for the exploration of new designs.The effectiveness of this knowledge-based design support architecture and the design concept learning system is demonstrated through a realistic design domain, the design of small-molecule drugs one of the key tasks of which is to identify a pharmacophore description (the structure of a design problem) from known molecule examples.In this thesis, knowledge-based design and inductive learning techniques are first reviewed. Based on this review, an incremental learning model and an integrated architecture for intelligent design support are presented. The implementation of this architecture and a design concept learning system is then described. The application of the architecture and the design concept learning system in the domain of small-molecule drug design is then discussed. The evaluation of the architecture and the design concept learning system within and beyond this particular domain, and future research directions are finally discussed

    A Machine Learning Enhanced Scheme for Intelligent Network Management

    Get PDF
    The versatile networking services bring about huge influence on daily living styles while the amount and diversity of services cause high complexity of network systems. The network scale and complexity grow with the increasing infrastructure apparatuses, networking function, networking slices, and underlying architecture evolution. The conventional way is manual administration to maintain the large and complex platform, which makes effective and insightful management troublesome. A feasible and promising scheme is to extract insightful information from largely produced network data. The goal of this thesis is to use learning-based algorithms inspired by machine learning communities to discover valuable knowledge from substantial network data, which directly promotes intelligent management and maintenance. In the thesis, the management and maintenance focus on two schemes: network anomalies detection and root causes localization; critical traffic resource control and optimization. Firstly, the abundant network data wrap up informative messages but its heterogeneity and perplexity make diagnosis challenging. For unstructured logs, abstract and formatted log templates are extracted to regulate log records. An in-depth analysis framework based on heterogeneous data is proposed in order to detect the occurrence of faults and anomalies. It employs representation learning methods to map unstructured data into numerical features, and fuses the extracted feature for network anomaly and fault detection. The representation learning makes use of word2vec-based embedding technologies for semantic expression. Next, the fault and anomaly detection solely unveils the occurrence of events while failing to figure out the root causes for useful administration so that the fault localization opens a gate to narrow down the source of systematic anomalies. The extracted features are formed as the anomaly degree coupled with an importance ranking method to highlight the locations of anomalies in network systems. Two types of ranking modes are instantiated by PageRank and operation errors for jointly highlighting latent issue of locations. Besides the fault and anomaly detection, network traffic engineering deals with network communication and computation resource to optimize data traffic transferring efficiency. Especially when network traffic are constrained with communication conditions, a pro-active path planning scheme is helpful for efficient traffic controlling actions. Then a learning-based traffic planning algorithm is proposed based on sequence-to-sequence model to discover hidden reasonable paths from abundant traffic history data over the Software Defined Network architecture. Finally, traffic engineering merely based on empirical data is likely to result in stale and sub-optimal solutions, even ending up with worse situations. A resilient mechanism is required to adapt network flows based on context into a dynamic environment. Thus, a reinforcement learning-based scheme is put forward for dynamic data forwarding considering network resource status, which explicitly presents a promising performance improvement. In the end, the proposed anomaly processing framework strengthens the analysis and diagnosis for network system administrators through synthesized fault detection and root cause localization. The learning-based traffic engineering stimulates networking flow management via experienced data and further shows a promising direction of flexible traffic adjustment for ever-changing environments

    Execution platform modeling for system-level architecture performance analysis

    Get PDF
    Today's embedded systems are designed for more complex and more computationally-intensive applications than they were a decade ago. Most if not every embedded system designed today is a sort of parallel computing system - only called differently - a platform. A platform is essentially a heterogeneous system consisting of communicating processing units of different types and mostly distributed memory units. A platform can be anything: from multiprocessors comprising task-dedicated processors and a dedicated communication network, to a (semi-)programmable multiprocessor that can run parallel processes by means of both interleaving and overlapping. However, the specification, exploration and design of application multiprocessor system platforms from user requirements is still a painstaking process that takes too long and is too costly. Our answer to the above mentioned issues is the Archer approach. It embodies: Application representations (Symbolic Programs - SP), a platform-based library of the architecture components and their configurations (all-in-hardware, all-in-software, hybrid multiprocessor, with dedicated network, hared-bus, highway, burst-bus, or hybrid network), and a mapping methodology (managing the aforementioned representations while transforming application SPs to Archer architecture components), that we have been developing and experimenting with.UBL - phd migration 201
    • …
    corecore