
    Human-like summaries from heterogeneous and time-windowed software development artefacts

    First Online: 02 September 2020
    Automatic text summarisation has drawn considerable interest in the area of software engineering. It is challenging to summarise the activities related to a software project, (1) because of the volume and heterogeneity of the involved software artefacts, and (2) because it is unclear what information a developer seeks in such a multi-document summary. We present the first framework for summarising multi-document software artefacts containing heterogeneous data within a given time frame. To produce human-like summaries, we employ a range of iterative heuristics to minimise the cosine similarity between texts and high-dimensional feature vectors. A first study shows that users find the automatically generated summaries most useful when they are generated using word similarity and based on the eight most relevant software artefacts.
    Mahfouth Alghamdi, Christoph Treude, Markus Wagner
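    The artefact-selection idea sketched in this abstract can be illustrated with a rough, assumed example (not the authors' implementation): a greedy loop that, up to a budget of eight artefacts, picks whichever artefact makes the combined bag-of-words vector most cosine-similar to a reference vector, e.g. one built from human-written summaries. All function names, the reference text, and the greedy strategy are assumptions.

```python
# Illustrative sketch only: greedy selection of artefacts by cosine similarity
# to a reference vector. Names, budget handling, and the bag-of-words
# representation are assumptions based on the abstract above.
from collections import Counter
from math import sqrt

def vectorise(text: str) -> Counter:
    """Very simple bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_artefacts(artefacts: list[str], reference: str, budget: int = 8) -> list[str]:
    """Greedily pick artefacts whose combined vector is closest to the reference."""
    target = vectorise(reference)
    selected: list[str] = []
    combined: Counter = Counter()
    for _ in range(min(budget, len(artefacts))):
        best, best_score = None, -1.0
        for art in artefacts:
            if art in selected:
                continue
            score = cosine(combined + vectorise(art), target)
            if score > best_score:
                best, best_score = art, score
        if best is None:          # nothing new left to add
            break
        selected.append(best)
        combined += vectorise(best)
    return selected
```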

    Multi-Document Summarisation from Heterogeneous Software Development Artefacts

    Software engineers create a vast number of artefacts during project development activities, consisting of related information exchanged between developers. Sifting through the large amount of information available within a project repository can be time-consuming. In this dissertation, we propose a method for multi-document summarisation from heterogeneous software development artefacts that supports software developers by automatically generating summaries targeted at their information needs. To achieve this aim, we first had gold-standard summaries created; we then characterised them and used them to identify the main types of software artefacts that describe developers’ activities in GitHub project repositories. This initial step was important for the present study, as we had no prior knowledge about the types of artefacts linked to developers’ activities that could be used as sources of input for our proposed multi-document summarisation techniques. In addition, we used the gold-standard summaries later to evaluate the quality of our summarisation techniques. We then developed extractive multi-document summarisation approaches to automatically summarise software development artefacts within a given time frame, integrating techniques from natural language processing, software repository mining, and data-driven search-based software engineering. The generated summaries were then evaluated in a user study to investigate whether experts considered that the generated summaries mentioned every important project activity that appeared in the gold-standard summaries. The results of the user study showed that generating summaries from different kinds of software artefacts is possible, and that the generated summaries are useful in describing a project’s development activities over a given time frame. Finally, we investigated the potential of using source code comments for summarisation by assessing the documented information of Java primitive variables in comments against three types of knowledge. The results showed that the source code comments did contain additional information and could be useful for summarising developers’ development activities.
    Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202
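    As a minimal illustration of comparing a generated summary against a gold-standard summary, the sketch below computes a ROUGE-1-style unigram recall. The thesis evaluates its summaries through a user study rather than this exact metric, so the function name, metric choice, and example strings are assumptions.

```python
# Illustrative sketch only: ROUGE-1-style unigram recall of a generated
# summary against a gold-standard summary. Not the evaluation used in the
# thesis, which relied on a user study.
from collections import Counter

def unigram_recall(generated: str, gold: str) -> float:
    gen = Counter(generated.lower().split())
    ref = Counter(gold.lower().split())
    overlap = sum(min(gen[t], ref[t]) for t in ref)   # clipped unigram matches
    return overlap / sum(ref.values()) if ref else 0.0

# Hypothetical example strings, for illustration only.
print(unigram_recall("fixed the login bug and merged the change",
                     "merged the pull request that fixed the login bug"))
```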

    Development of a conceptual graphical user interface framework for the creation of XML metadata for digital archives

    This dissertation is motivated by the DFG-sponsored Jonas Cohn Archive digitisation project at the Steinheim-Institut, whose aim was to preserve and provide digital access to structured handwritten historical archive material on Neo-Kantian philosophy scattered across the correspondence, diaries and private journals kept by, written to, and written by Jonas Cohn. The dissertation describes a framework for processing and presenting multi-standard digital archive material. A set of standard markup schemata and semantic bibliographic descriptions was chosen to illustrate the multi-standard, and hence semantically heterogeneous, digital archiving process. The standards include the Text Encoding Initiative (TEI), the Metadata Encoding and Transmission Standard (METS) and the Metadata Object Description Schema (MODS). These standards best illustrate the structural contrast between the systematic archive, the digitised archive and digitised-text standards. Furthermore, the combined digital preservation and presentation approach offers not only the digitised texts but also metadata-structured, variably sized images of the archive documents, enabling virtual visualisation. State-of-the-art applications focus on only one of these structural areas, neglecting the compound idea of a virtual digital archive. This work describes the requirements analysis for managing multi-structured, and therefore multi-standard, digital archival artefacts in textual and image form. In addition to the architecture and design, an infrastructure suitable for processing, managing and presenting such scholarly archives is sought, as a digital framework for the preservation of and access to digitised cultural resources. The proposed solution therefore instruments a combination of existing and novel XML technologies for transformations, based in a centralised application. The archive can then be managed via a client-server application, focusing archival activities on structured data collection and information preservation, illustrated in the dissertation by:
    • the development of a prototype data model allowing the integration of the relevant markup schemata
    • the implementation of a prototype client-server application, based on the data model already mentioned, that handles archive processing, management and presentation
    • the development and implementation of a role-based archive access user interface
    Furthermore, as an infrastructural development serving expert archivists from the humanities, the dissertation explores methods of binding the existing XML metadata creation process to other programming languages. Doing so opens further channels for simplifying the metadata creation process through graphical user interfaces. To this end, the Java programming language, its Swing and AWT graphical user interface libraries, relational persistence and an enterprise client-server architecture provide a suitable environment for integrating XML metadata into mainstream computing. Hence the implementation of Java XML Data Binding as part of the metadata creation framework is part and parcel of the proposed solution.
    This work arises from the DFG-funded project to digitise the Jonas Cohn Archive at the Steinheim-Institut, whose aim is to preserve a structured selection of the philosopher Jonas Cohn's manuscripts in digital form and to facilitate access to them. The dissertation describes a framework for the digital processing and presentation of digitised archive content and its metadata, structured according to more than one description standard. A selection of standard markup schemata and bibliographic-semantic descriptions was made to illustrate the problems that arise from accommodating multiple standards, and thus from the semantic heterogeneity of the digitisation process. This selection includes, among others, the Text Encoding Initiative (TEI), the Metadata Encoding and Transmission Standard (METS) and the Metadata Object Description Schema (MODS) as examples of description standards. These standards are best suited to illustrating the structural and semantic differences between the standards of an archive that is to be digitised both systematically and semantically. In addition, the approach combines the digital preservation and presentation of digitised texts and of metadata-structured images of the archive content, enabling a virtual presentation of the digital archive. A large number of existing digitisation applications pursue only one of the two structuring goals, preservation or presentation, neglecting the idea of a fully virtual digital archive. The focus of this work is the description of a management infrastructure for the capture and markup of multi-standard metadata for digital manuscript collections. In addition to the architecture and design, a suitable infrastructure is sought for the capture, processing and presentation of scholarly archives, as a digital framework for access to and preservation of cultural heritage. The proposed solution therefore uses existing and new XML technologies, combined in a central application. Within the dissertation, the structuring of the archive is carried out via a client-server application and the preservation measures are worked out as a process. The work pursues several objectives:
    • the development of a prototype data model integrating the relevant markup schemata
    • the implementation of a prototype client-server application for processing, capturing and presenting the archives, based on the data model described
    • the development, implementation and assessment, through an expert evaluation, of a user interface for interacting with the framework
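    The dissertation binds XML metadata to objects using Java XML Data Binding. As an illustrative analogue only, written in Python rather than Java and using a small, assumed subset of MODS fields, the sketch below maps a minimal MODS record to a plain data class and back; the record contents are hypothetical.

```python
# Illustrative sketch only: a Python analogue of XML data binding for a
# minimal MODS record. The real framework uses Java XML Data Binding (JAXB);
# the fields chosen here are an assumed subset of MODS.
import xml.etree.ElementTree as ET
from dataclasses import dataclass

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

@dataclass
class ModsRecord:
    title: str
    author: str

def to_xml(rec: ModsRecord) -> ET.Element:
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = rec.title
    name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
    ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = rec.author
    return mods

def from_xml(mods: ET.Element) -> ModsRecord:
    ns = {"m": MODS_NS}
    return ModsRecord(
        title=mods.findtext("m:titleInfo/m:title", default="", namespaces=ns),
        author=mods.findtext("m:name/m:namePart", default="", namespaces=ns),
    )

# Hypothetical record; round-trips through XML and back.
record = ModsRecord(title="Brief an Jonas Cohn", author="Jonas Cohn")
xml_bytes = ET.tostring(to_xml(record), encoding="utf-8")
assert from_xml(ET.fromstring(xml_bytes)) == record
```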

    Pattern identification of biomedical images with time series: contrasting THz pulse imaging with DCE-MRIs

    Objective: We provide a survey of recent advances in biomedical image analysis and classification from emergent imaging modalities such as terahertz (THz) pulse imaging (TPI) and dynamic contrast-enhanced magnetic resonance images (DCE-MRIs), and identify their underlying commonalities. Methods: Both time- and frequency-domain signal pre-processing techniques are considered: noise removal, spectral analysis, principal component analysis (PCA) and wavelet transforms. Feature extraction and classification methods based on feature vectors built with the above processing techniques are reviewed. A tensorial signal-processing de-noising framework suitable for spatiotemporal association between features in MRI is also discussed. Validation: Examples where the proposed methodologies have been successful in classifying TPIs and DCE-MRIs are discussed. Results: Identifying commonalities in the structure of such heterogeneous datasets potentially leads to a unified multi-channel signal processing framework for biomedical image analysis. Conclusion: The proposed complex-valued classification methodology enables fusion of entire datasets from a sequence of spatial images taken at different time stamps; this is of interest from the viewpoint of inferring disease proliferation. The approach is also of interest for other emergent multi-channel biomedical imaging modalities and is of relevance across the biomedical signal processing community.
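    As a small illustration of one of the pre-processing steps surveyed above, the sketch below applies PCA to the per-pixel time series of an image stack to obtain low-dimensional feature vectors. The array shapes, names, and number of components are assumptions, not taken from the paper.

```python
# Illustrative sketch only: PCA over per-pixel time series of a (T, H, W)
# image stack, yielding one low-dimensional feature vector per pixel.
import numpy as np

def pca_features(frames: np.ndarray, n_components: int = 3) -> np.ndarray:
    """frames: (T, H, W) image stack -> (H*W, n_components) per-pixel features."""
    t, h, w = frames.shape
    X = frames.reshape(t, h * w).T                # one time series per pixel
    X = X - X.mean(axis=0, keepdims=True)         # centre each time point
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T                # project onto top components

# Synthetic stack, for illustration only.
rng = np.random.default_rng(0)
features = pca_features(rng.normal(size=(20, 32, 32)))
print(features.shape)  # (1024, 3)
```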

    Improved 3D MR Image Acquisition and Processing in Congenital Heart Disease

    Congenital heart disease (CHD) is the most common type of birth defect, affecting about 1% of the population. MRI is an essential tool in the assessment of CHD, including diagnosis, intervention planning and follow-up. Three-dimensional MRI can provide particularly rich visualization and information. However, it is often complicated by long scan times, cardiorespiratory motion, injection of contrast agents, and complex and time-consuming postprocessing. This thesis comprises four pieces of work that attempt to respond to some of these challenges. The first piece of work aims to enable fast acquisition of 3D time-resolved cardiac imaging during free breathing. Rapid imaging was achieved using an efficient spiral sequence and a sparse parallel imaging reconstruction. The feasibility of this approach was demonstrated on a population of 10 patients with CHD, and areas of improvement were identified. The second piece of work is an integrated software tool designed to simplify and accelerate the development of machine learning (ML) applications in MRI research. It also exploits the strengths of recently developed ML libraries for efficient MR image reconstruction and processing. The third piece of work aims to reduce contrast dose in contrast-enhanced MR angiography (MRA). This would reduce risks and costs associated with contrast agents. A deep learning-based contrast enhancement technique was developed and shown to improve image quality in real low-dose MRA in a population of 40 children and adults with CHD. The fourth and final piece of work aims to simplify the creation of computational models for hemodynamic assessment of the great arteries. A deep learning technique for 3D segmentation of the aorta and the pulmonary arteries was developed and shown to enable accurate calculation of clinically relevant biomarkers in a population of 10 patients with CHD.
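    The "sparse parallel imaging reconstruction" mentioned above can be illustrated, very loosely, by a generic iterative soft-thresholding (ISTA) loop for a sparsity-regularised least-squares problem. The actual spiral, multi-coil reconstruction in the thesis is considerably more involved; the operator, data, regularisation weight, and step size below are assumptions.

```python
# Illustrative sketch only: ISTA for
#     minimise 0.5 * ||A x - y||_2^2 + lam * ||x||_1,
# standing in for the far more involved spiral, parallel-imaging
# reconstruction described above. A, y, and lam are assumptions.
import numpy as np

def ista(A: np.ndarray, y: np.ndarray, lam: float, n_iter: int = 200) -> np.ndarray:
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1 / Lipschitz constant of the data term
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                    # gradient of 0.5 * ||A x - y||^2
        z = x - step * grad
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return x
```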

    Behaviour analysis in binary SoC data


    Diffusion and perfusion MRI and applications in cerebral ischaemia

    Two MRI techniques, namely diffusion and perfusion imaging, are becoming increasingly used for evaluation of the pathophysiology of stroke. This work describes the use of these techniques, together with more conventional MRI modalities (such as T1 and T2 imaging), in the investigation of cerebral ischaemia. The work was performed both in a paediatric population on a whole-body clinical MR system (1.5 T) and in an animal model of focal ischaemia at high magnetic field strength (8.5 T). For the paediatric studies, a single-shot echo planar imaging (EPI) sequence was developed to enable the on-line calculation of maps of the trace of the diffusion tensor. In the process of this development, it was necessary to address two different imaging artefacts in these maps: eddy-current-induced image shifts and residual Nyquist ghost artefacts. Perfusion imaging was implemented using an EPI sequence to follow the passage through the brain of a bolus of a paramagnetic contrast agent. Computer simulations were performed to evaluate the limitations of this technique in the quantification of cerebral blood flow when delay in the arrival and dispersion of the bolus of contrast agent are not accounted for. These MRI techniques were applied to paediatric patients to identify acute ischaemic events, as well as to differentiate between multiple acute events, or between acute and chronic events. Furthermore, the diffusion and perfusion findings were shown to contribute significantly to the management of patients at high risk of stroke and to the evaluation of treatment outcome. In the animal experiments, permanent middle cerebral artery occlusion was performed in rats to investigate longitudinally the acute MRI changes (first 4-6 hours) following an ischaemic event. This longitudinal analysis contributed to the understanding of the evolution of the ischaemic lesion. Furthermore, the findings allowed the acute identification of tissue 'at risk' of infarction.
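    As an illustration of the trace maps mentioned above, the sketch below computes a mean-diffusivity (trace/3) map from a b=0 image and three orthogonal diffusion-weighted images using the standard relation ADC_i = -ln(S_i / S0) / b. The variable names and the b-value are assumptions, not taken from this work.

```python
# Illustrative sketch only: mean diffusivity (one third of the diffusion
# tensor trace) from a b=0 image and three orthogonal DW images, using
# ADC_i = -ln(S_i / S0) / b. The b-value of 1000 s/mm^2 is an assumption.
import numpy as np

def trace_adc_map(s0: np.ndarray, dwi: np.ndarray, b: float = 1000.0) -> np.ndarray:
    """s0: (H, W) b=0 image; dwi: (3, H, W) orthogonal DW images -> mean ADC map."""
    eps = 1e-6                                    # avoid log(0) and division by 0
    ratio = np.clip(dwi / np.maximum(s0, eps), eps, None)
    adc = -np.log(ratio) / b                      # per-direction ADC
    return adc.mean(axis=0)                       # one third of the tensor trace
```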