683 research outputs found

    The anatomy of a search and mining system for digital humanities: Search And Mining Tools for Language Archives (SAMTLA)

    Humanities researchers are faced with an overwhelming volume of digitised primary source material, and "born digital" information, of relevance to their research as a result of large-scale digitisation projects. The current digital tools do not provide consistent support for analysing the content of digital archives that are potentially large in scale, multilingual, and come in a range of data formats. The current language-dependent, or project-specific, approach to tool development often puts the tools out of reach for many research disciplines in the humanities. In addition, the tools can be incompatible with the way researchers locate and compare the relevant sources. For instance, researchers are interested in shared structural text patterns, known as "parallel passages", that describe a specific cultural, social, or historical context relevant to their research topic. Identifying these shared structural text patterns is challenging due to their repeated yet highly variable nature, as a result of differences in the domain, author, language, time period, and orthography. The contribution of the thesis is a novel infrastructure that directly addresses the need for generic, flexible, extendable, and sustainable digital tools that are applicable to a wide range of digital archives and research in the humanities. The infrastructure adopts a character-level n-gram Statistical Language Model (SLM), stored in a space-optimised k-truncated suffix tree data structure, as its underlying data model. A character-level n-gram model is a relatively new approach that is competitive with word-level n-gram models, but has the added advantage that it is domain- and language-independent, requiring little or no preprocessing of the document text, unlike word-level models that require some form of language-dependent tokenisation and stemming. Character-level n-grams capture word-internal features that are ignored by word-level n-gram models, which provides greater flexibility in addressing the information need of the user through tolerant search, and compensation for erroneous query specification or spelling errors in the document text. Furthermore, the SLM provides a unified approach to information retrieval and text mining, where traditional approaches have tended to adopt separate data models that are often ad hoc or based on heuristic assumptions. In addition, the performance of the character-level n-gram SLM was formally evaluated through crowdsourcing, which demonstrates that the retrieval performance of the SLM is close to human-level performance. The proposed infrastructure supports the development of Samtla (Search And Mining Tools for Language Archives), which provides humanities researchers with digital tools for search, browsing, and text mining of digital archives in any domain or language, within a single system. Samtla supersedes many of the existing tools for humanities researchers by supporting the same or similar functionality, but with a domain-independent and language-independent approach. The functionality includes a browsing tool constructed from the metadata and named entities extracted from the document text, and a hybrid recommendation system for recommending related queries and documents. Some tools, however, are novel and were developed in response to the specific needs of the researchers, such as the document comparison tool for visualising shared sequences between groups of related documents. Furthermore, Samtla is the first practical example of a system with an SLM as its primary data model that supports the real research needs of several case studies covering different areas of research in the humanities.
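
    As a rough illustration of why character-level n-grams tolerate spelling variation, the sketch below ranks documents by the smoothed log-likelihood of a query under a per-document character trigram model. It is not the thesis implementation: a plain `Counter` stands in for the k-truncated suffix tree, add-one smoothing is an assumption, and the names `CharNgramModel` and `rank` are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return every overlapping character n-gram in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class CharNgramModel:
    """Character n-gram model with add-one smoothing (illustrative assumption)."""

    def __init__(self, text, n, alphabet_size):
        self.n = n
        self.alphabet_size = alphabet_size
        grams = char_ngrams(self._pad(text), n)
        self.ngram_counts = Counter(grams)
        # Count of each (n-1)-character context that precedes a next character.
        self.context_counts = Counter(g[:-1] for g in grams)

    def _pad(self, text):
        # Left-pad so the first characters also have a full context.
        return " " * (self.n - 1) + text.lower()

    def log_prob(self, query):
        """Smoothed log-likelihood of the query under this document's model."""
        total = 0.0
        for gram in char_ngrams(self._pad(query), self.n):
            numerator = self.ngram_counts[gram] + 1
            denominator = self.context_counts[gram[:-1]] + self.alphabet_size
            total += math.log(numerator / denominator)
        return total


def rank(documents, query, n=3):
    """Order document names from most to least likely to generate the query."""
    alphabet = set(" " + query.lower())
    for text in documents.values():
        alphabet.update(text.lower())
    models = {
        name: CharNgramModel(text, n, len(alphabet))
        for name, text in documents.items()
    }
    return sorted(documents, key=lambda name: models[name].log_prob(query),
                  reverse=True)


# Made-up two-document collection; the query uses a variant spelling.
documents = {"a": "the colour of the sky", "b": "guided waves in plates"}
ranking = rank(documents, "color")
```

    Because "color" and "colour" share the trigrams " co", "col" and "olo", document "a" is ranked first even though the query never matches a whole word in it.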

    Mode-switching: a new technique for electronically varying the agglomeration position in an acoustic particle manipulator

    Acoustic radiation forces offer a means of manipulating particles within a fluid. Much interest in recent years has focussed on the use of radiation forces in microfluidic (or “lab on a chip”) devices. Such devices are well matched to the use of ultrasonic standing waves in which the resonant dimensions of the chamber are smaller than the ultrasonic wavelength in use. However, such devices have typically been limited to moving particles to one or two predetermined planes, whose positions are determined by acoustic pressure nodes/anti-nodes set up in the ultrasonic standing wave. In most cases devices have been designed to move particles to either the centre or (more recently) the side of a flow channel using ultrasonic frequencies that produce a half or quarter wavelength over the channel, respectively. It is demonstrated here that by rapidly switching back and forth between half and quarter wavelength frequencies – mode-switching – a new agglomeration position is established that permits beads to be brought to any arbitrary point between the half and quarter-wave nodes. This new agglomeration position is effectively a position of stable equilibrium. This has many potential applications, particularly in cell sorting and manipulation. It should also enable precise control of agglomeration position to be maintained regardless of manufacturing tolerances, temperature variations, fluid medium characteristics and particle concentration.

    Temperature monitoring of through-thickness temperature gradients in thermal barrier coatings using ultrasonic guided waves

    Ultrasonic guided waves offer a promising method of monitoring the online temperature of plate-like structures in extreme environments, such as aero-engine nozzle guide vanes (NGVs), and can provide the resolution, response rate, and robust operation that is required in aerospace. Previous investigations have shown the potential of such a system but the effect of the complex physical environment on wave propagation is yet to be considered. This article uses a numerical approach to investigate how thermal barrier coatings (TBCs) applied to the surface of many components designed for extreme thermal conditions will affect ultrasonic guided wave propagation, and how a system can be employed to monitor through-thickness temperature changes. The top coat/bond coat boundary in NGVs has been shown to be a temperature-critical point that is difficult to monitor with traditional temperature sensors, which highlights the potential of ultrasonic guided waves. Differences in application method and layer thickness are considered, and analysis of through-thickness displacement profiles and dispersion curves is used to predict signal response and determine the most suitable mode of operation. Heat transfer simulations (COMSOL) have been used to predict temperature gradients within a TBC, and dispersion curves have been produced from the temperature-dependent material properties. Time-dependent simulations of wave propagation are in good agreement with dispersion curve predictions of wave velocity for the two lowest order modes in three thicknesses of TBC top coat (100, 250, and 500 μm). When wave velocity measurements from the simulations are compared to dispersion curves generated at isotropic temperatures, the corresponding temperature represents the average temperature of a gradient system well. Such a measurement system could, in principle, be used in conjunction with surface temperature measurement systems to monitor through-thickness temperature changes.

    Comparing “parallel passages” in digital archives

    Purpose: The purpose of this paper is to present a language-agnostic approach to facilitate the discovery of “parallel passages” stored in historic and cultural heritage digital archives. Design/methodology/approach: The authors explore a novel, and relatively simple, approach using a character-based statistical language model combined with a tailored version of the Basic Local Alignment Search Tool to extract exact and approximate string patterns shared between groups of documents. Findings: The approach is applicable to a wide range of languages, and compensates for variability in the text of the documents as a result of differences in dialect, authorship, language change over time, and errors due to inaccurate transcriptions and optical character recognition during the digitisation process. Research limitations/implications: A number of case studies demonstrate that the approach is practical and generalisable to a wide range of archives with documents in different languages, domains and of varying quality. Practical implications: The approach described can be applied to any digital archive of modern and contemporary texts. This makes the approach applicable to digital archives recording historic texts, but also those composed of more recent news articles, for example. Social implications: The analysis of “parallel passages” enables researchers to quantify the presence and extent of text reuse in a collection of documents, which can provide useful data on author style, text genres and cultural contexts. Originality/value: The approach is novel and addresses a need by humanities researchers for tools that can identify similar documents and local similarities represented by shared text sequences in a potentially vast archive of documents. As far as the authors are aware, no tools currently exist that provide the same level of tolerance to the language of the documents.
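
    The seed-and-extend idea behind alignment tools of this kind can be sketched in a few lines. The version below is a deliberately simplified stand-in, not the authors' tailored tool: it finds only exact shared substrings (no approximate matching or scoring), and the function name `shared_passages` and the `seed_len` parameter are hypothetical.

```python
def shared_passages(a, b, seed_len=5):
    """Return sorted (start_in_a, start_in_b, length) for maximal exact matches.

    Every shared character n-gram of length seed_len acts as a seed, which is
    then extended left and right while the two texts keep agreeing.
    """
    # Index every seed-length n-gram of the first text by its positions.
    index = {}
    for i in range(len(a) - seed_len + 1):
        index.setdefault(a[i:i + seed_len], []).append(i)

    found = set()
    for j in range(len(b) - seed_len + 1):
        for i in index.get(b[j:j + seed_len], []):
            start_a, start_b = i, j
            # Extend the seed to the left.
            while start_a > 0 and start_b > 0 and a[start_a - 1] == b[start_b - 1]:
                start_a -= 1
                start_b -= 1
            end_a, end_b = i + seed_len, j + seed_len
            # Extend the seed to the right.
            while end_a < len(a) and end_b < len(b) and a[end_a] == b[end_b]:
                end_a += 1
                end_b += 1
            found.add((start_a, start_b, end_a - start_a))
    return sorted(found)


# Made-up example texts sharing one long passage.
text_a = "in the beginning was the word"
text_b = "and in the beginning there was light"
matches = shared_passages(text_a, text_b)
longest = max(matches, key=lambda m: m[2])
```

    Seeds that lie on the same diagonal extend to the same maximal match, so the set removes the duplicates. Tolerating spelling variation would require allowing mismatches during extension, which this sketch leaves out.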

    Modelling and validation of a guided acoustic wave temperature monitoring system

    The computer modelling of condition monitoring sensors can aid in their development, improve their performance, and allow for the analysis of sensor impact on component operation. This article details the development of a COMSOL model for a guided wave-based temperature monitoring system, with a view to using the technology in the future for the temperature monitoring of nozzle guide vanes, found in the hot section of aeroengines. The model is based on an experimental test system that acts as a method of validation for the model. Piezoelectric wedge transducers were used to excite the S0 Lamb wave mode in an aluminium plate, which was temperature controlled using a hot plate. Time of flight measurements were carried out in MATLAB and used to calculate group velocity. The results were compared to theoretical wave velocities extracted from dispersion curves. The assembly and validation of such a model can aid in the future development of guided wave-based sensor systems, and the methods provided can act as a guide for building similar COMSOL models. The results show that the model is in good agreement with the experimental equivalent, which is also in line with theoretical predictions.

    A novel bibliometric index with a simple geometric interpretation

    We propose the χ-index as a bibliometric indicator that generalises the h-index. While the h-index is determined by the maximum square that fits under the citation curve of an author when plotting the number of citations in decreasing order, the χ-index is determined by the maximum area rectangle that fits under the curve. The height of the maximum rectangle is the number of citations $c_k$ to the $k$th most-cited publication, where $k$ is the width of the rectangle. The χ-index is then defined as $\sqrt{k \, c_k}$, for convenience of comparison with the h-index and other similar indices. We present a comprehensive empirical comparison between the χ-index and other bibliometric indices, focusing on a comparison with the h-index, by analysing two datasets: a large set of Google Scholar profiles and a small set of Nobel prize winners. Our results show that, although the χ and h indices are strongly correlated, they do exhibit significant differences. In particular, we show that, for these datasets, there are a substantial number of profiles for which χ is significantly larger than h. Furthermore, restricting these profiles to the cases when $c_k > k$ or $c_k < k$ corresponds to, respectively, classifying researchers as either tending to influential, i.e. having many more than h citations, or tending to prolific, i.e. having many more than h publications.
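
    The geometric definitions translate directly into code. The sketch below computes both indices from a list of citation counts, taking χ as the square root of the largest rectangle area k·c_k so that it is on the same scale as h; the function names and the sample profiles are made up for illustration.

```python
import math


def h_index(citations):
    """Side of the largest square under the citation curve."""
    ranked = sorted(citations, reverse=True)
    # c_k >= k holds for a prefix of the ranking, so counting the hits gives h.
    return sum(1 for k, c in enumerate(ranked, start=1) if c >= k)


def chi_index(citations):
    """Square root of the largest rectangle area k * c_k under the curve."""
    ranked = sorted(citations, reverse=True)
    best_area = max((k * c for k, c in enumerate(ranked, start=1)), default=0)
    return math.sqrt(best_area)


# Hypothetical profiles: one highly cited paper versus many lightly cited ones.
influential = [100, 2, 1]
prolific = [2] * 30

h_inf, chi_inf = h_index(influential), chi_index(influential)
h_pro, chi_pro = h_index(prolific), chi_index(prolific)
```

    For the first profile the best rectangle is tall and narrow (k = 1, c_k = 100), and for the second it is short and wide (k = 30, c_k = 2). Both have h = 2, which is the kind of difference between the two indices that the abstract describes.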

    Towards in-flight temperature monitoring for nozzle guide vanes using ultrasonic guided waves

    The temperature monitoring of nozzle guide vanes is a challenging task due to the extreme temperatures, gas pressures, and cramped conditions of aero-engines. Ultrasonic guided waves are an attractive method of temperature monitoring as the sensors can be placed outside of the gas path without influencing component operation. In this paper the suitability of using ultrasonic guided waves in the form of the S0 Lamb wave mode is investigated by comparing experimentally measured wave velocity change with temperature against theoretical wave velocity extracted from dispersion curves. Waves are transmitted through an aluminium plate using a pitch-catch wedge transducer configuration, and wave velocity is measured using a cross-correlation function. Temperature is controlled with a hot plate from room temperature to 100°C, and monitored using thermocouples. Results show that this transducer configuration is capable of monitoring a change in temperature based on a change in wave velocity, showing a good agreement with theoretical predictions, within 4.89 ± 2.27 m/s on average. The temperature sensitivity of the system is 1.26–1.78 m/s/°C over the range 24°C–94°C. This shows the potential for a guided wave-based temperature monitoring system, assuming a suitable transducer configuration can be found that is able to operate at higher temperatures. Further investigation will study the possibility of using Piezoelectric Wafer Active Sensors (PWAS) or waveguides for this application.
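
    A minimal sketch of a cross-correlation velocity measurement is given below, assuming the time of flight is taken as the lag that maximises the correlation between the transmitted burst and the received trace. The signal, sample rate and path length are synthetic values chosen for illustration, not figures from the paper, and the function names are hypothetical.

```python
import math


def tone_burst(n_samples=40, cycles=5):
    """Hann-windowed sine burst (synthetic excitation signal)."""
    return [
        math.sin(2 * math.pi * cycles * i / n_samples)
        * 0.5 * (1 - math.cos(2 * math.pi * i / (n_samples - 1)))
        for i in range(n_samples)
    ]


def best_lag(reference, received):
    """Lag in samples at which the received trace best matches the reference."""
    lag_of_peak, peak = 0, float("-inf")
    for lag in range(len(received) - len(reference) + 1):
        score = sum(r * received[lag + i] for i, r in enumerate(reference))
        if score > peak:
            lag_of_peak, peak = lag, score
    return lag_of_peak


def wave_velocity(reference, received, sample_rate_hz, path_length_m):
    """Velocity from the propagation path length and the time of flight."""
    lag = best_lag(reference, received)
    if lag == 0:
        raise ValueError("zero time of flight; cannot compute a velocity")
    return path_length_m / (lag / sample_rate_hz)


# Synthetic trace: the burst arrives 50 samples late at 1 MHz sampling,
# which is 50 microseconds over an assumed 0.25 m path.
burst = tone_burst()
received = [0.0] * 50 + burst + [0.0] * 30
velocity = wave_velocity(burst, received, sample_rate_hz=1_000_000,
                         path_length_m=0.25)  # approximately 5000 m/s
```

    Repeating the measurement at several plate temperatures and fitting velocity against temperature would give a sensitivity in m/s/°C, the quantity the abstract reports.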

    Temperature hotspot detection on printed circuit boards (PCBs) using ultrasonic guided waves: a machine learning approach

    This paper addresses the challenging issue of achieving high spatial resolution in temperature monitoring of printed circuit boards (PCBs) without compromising the operation of electronic components. Traditional methods involving numerous dedicated sensors such as thermocouples are often intrusive and can impact electronic functionality. To overcome this, this study explores the application of ultrasonic guided waves, specifically utilising a limited number of cost-effective and unobtrusive Piezoelectric Wafer Active Sensors (PWAS). Employing COMSOL Multiphysics, wave propagation is simulated through a simplified PCB while systematically varying the temperature of both components and the board itself. Machine learning algorithms are used to identify hotspots at component positions using a minimal number of sensors. An accuracy of 97.6% is achieved with four sensors, decreasing to 88.1% when utilising a single sensor in a pulse–echo configuration. The proposed methodology not only provides sufficient spatial resolution to identify hotspots but also offers a non-invasive and efficient solution. Such advancements are important for the future electrification of the aerospace and automotive industries in particular, as they contribute to condition-monitoring technologies that are essential for ensuring the reliability and safety of electronic systems.