
    Metric Learning for Temporal Sequence Alignment

    In this paper, we propose to learn a Mahalanobis distance to perform alignment of multivariate time series. The learning examples for this task are time series for which the true alignment is known. We cast the alignment problem as a structured prediction task and propose realistic losses between alignments for which the optimization is tractable. We provide experiments on real data in the audio-to-audio context, where we show that learning a similarity measure leads to improvements in the performance of the alignment task. We also propose to use this metric learning framework to perform feature selection and, from basic audio features, build a combination of these with better performance for the alignment task.
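
    As a rough sketch of this setting (not the authors' actual learning algorithm; the feature dimension, the DTW recursion and all names below are assumptions for illustration), a learned Mahalanobis matrix M can serve as the local cost inside a standard dynamic-time-warping alignment:

        import numpy as np

        def mahalanobis_cost(x, y, M):
            """Squared Mahalanobis distance (x - y)^T M (x - y).

            M must be symmetric positive semidefinite; it is the matrix
            a metric-learning step would fit from example alignments."""
            d = x - y
            return float(d @ M @ d)

        def dtw_align(X, Y, M):
            """Align two multivariate time series (rows = frames) by
            dynamic time warping under the learned metric M."""
            n, m = len(X), len(Y)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    c = mahalanobis_cost(X[i - 1], Y[j - 1], M)
                    D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        # Toy usage: with M = I this reduces to plain squared-Euclidean DTW.
        X = np.random.rand(50, 12)   # e.g. 12 audio features per frame
        Y = np.random.rand(60, 12)
        print(dtw_align(X, Y, np.eye(12)))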

    Site investigation techniques for DNAPL source and plume zone characterisation

    Establishing the location of the Source Area BioREmediation (SABRE) research cell was a primary objective of the site characterisation programme. This bulletin describes the development of a two-stage site characterisation methodology that combined qualitative and quantitative data to guide and inform an assessment of dense nonaqueous phase liquid (DNAPL) distribution at the site. DNAPL site characterisation has traditionally involved multiple phases of site investigation, characterised by rigid sampling and analysis programmes, expensive mobilisations and long decision-making timeframes (Crumbling, 2001a), resulting in site investigations that are costly and long in duration. Here we follow the principles of an innovative framework, termed Triad (Crumbling, 2001a, 2001b; Crumbling et al., 2001; Crumbling et al., 2003), which describes a systematic approach for the characterisation and remediation of contaminated sites.

    The Triad approach to site characterisation focuses on three main components: a) systematic planning, implemented with a preliminary conceptual site model built from existing data, in which the desired outcomes are planned and decision uncertainties are evaluated; b) dynamic work strategies, which provide the flexibility for new information to guide the investigation in real time as site characterisation progresses; and c) real-time measurement technologies, which are critical in making dynamic work strategies possible.

    Key to this approach is the selection of suitable measurement technologies, of which there are two main categories (Crumbling et al., 2003). The first category provides qualitative, spatially dense data, often with detection limits above a preset value. These methods are generally of lower cost, produce real-time data and are primarily used to identify site areas that require further investigation. Examples of such "decision-quality" methods are laser-induced fluorescence (Kram et al., 2001), membrane interface probing (McAndrews et al., 2003) and cone penetrometer testing (Robertson, 1990), all of which produce data in continuous vertical profiles. Because these methods are rapid, many profiles can be generated and hence the subsurface data density is greatly improved. The qualitative results are used to guide the sampling strategy for the second category of technologies, which generate quantitative, precise, analyte-specific data with low detection limits. These methods tend to be high cost, with long turnaround times that preclude on-site decision making; applying them to quantify, rather than to produce, the conceptual model is therefore a key cost saving. Examples include instrumental laboratory analyses such as soil solvent extractions (Parker et al., 2004) and water analyses (USEPA, 1996). Where the two categories of measurement technologies are used in tandem, a more complete and accurate dataset is achieved without additional site mobilisations.

    The aim of the site characterisation programme at the SABRE site was to delineate the DNAPL source zone rapidly and identify a location for the in situ research cell. The site characterisation objectives were to: a) test whether semi-quantitative measurement techniques could reliably determine geological interfaces and contaminant mass distribution and inform the initial site conceptual model; and b) quantitatively determine the DNAPL source zone distribution, guided by the qualitative site conceptual model.

    A Survey on Mapping Semi-Structured Data and Graph Data to Relational Data

    The data produced by various services should be stored and managed in an appropriate format so that valuable knowledge can be gained from it conveniently. This has led to the emergence of various data models, including the relational, semi-structured, and graph models. Given that mature relational databases built on the relational data model still predominate in today's market, there is strong interest in storing and processing semi-structured data and graph data in relational databases, so that the mature and powerful capabilities of relational databases can be applied to these varied kinds of data. In this survey, we review existing methods for mapping semi-structured data and graph data into relational tables, analyze their major features, and give a detailed classification of those methods. We also summarize the merits and demerits of each method, introduce open research challenges, and present future research directions. With this comprehensive investigation of existing methods and open problems, we hope this survey can motivate new mapping approaches by drawing lessons from each model's mapping strategies, as well as a new research topic: mapping multi-model data into relational tables.
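
    To make one family of such mappings concrete (a generic node/edge-table scheme, not any specific method from the survey's classification; all table and column names are illustrative), the sketch below stores a small property graph in relational tables and answers a traversal query with a join:

        import sqlite3

        # A tiny labelled property graph.
        nodes = [(1, "Person", "Alice"), (2, "Person", "Bob"), (3, "City", "Oslo")]
        edges = [(1, 2, "knows"), (1, 3, "lives_in")]

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT, name TEXT);
            CREATE TABLE edge (src INTEGER REFERENCES node(id),
                               dst INTEGER REFERENCES node(id),
                               type TEXT);
        """)
        conn.executemany("INSERT INTO node VALUES (?, ?, ?)", nodes)
        conn.executemany("INSERT INTO edge VALUES (?, ?, ?)", edges)

        # A graph traversal becomes a relational join: whom does Alice know?
        rows = conn.execute("""
            SELECT n2.name FROM edge e
            JOIN node n1 ON n1.id = e.src
            JOIN node n2 ON n2.id = e.dst
            WHERE n1.name = 'Alice' AND e.type = 'knows'
        """).fetchall()
        print(rows)  # [('Bob',)]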

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, the fundamental design decisions in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution to a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL appears to be the technology that best fulfils its requirements. However, every big data application has different data characteristics, and thus its data fits a different data model. This paper presents a feature and use-case analysis and comparison of the four main data models, namely document-oriented, key-value, graph, and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider while making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of the available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.
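
    To make the four data models concrete, the sketch below represents the same hypothetical user record the way each model naturally would; all field names are illustrative and no specific NoSQL product's API is implied:

        # Document model: one self-contained, possibly nested document per entity.
        document = {
            "_id": "user:42",
            "name": "Alice",
            "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
        }

        # Key-value model: an opaque value behind a single lookup key.
        key_value = {"user:42": b'{"name": "Alice"}'}

        # Graph model: entities as nodes, relationships as first-class edges.
        graph_nodes = {"user:42": {"name": "Alice"}, "sku:A1": {"title": "Widget"}}
        graph_edges = [("user:42", "ORDERED", "sku:A1", {"qty": 2})]

        # Wide-column model: rows keyed by a row key, holding sparse,
        # per-row column sets grouped into column families.
        wide_column = {
            "user:42": {
                "profile": {"name": "Alice"},
                "orders": {"A1": 2, "B7": 1},
            }
        }

    Which representation fits best depends on the access pattern: single-key fetches favour key-value stores, nested reads favour documents, relationship traversals favour graphs, and sparse scans over column subsets favour wide-column stores.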

    An efficient parallel immersed boundary algorithm using a pseudo-compressible fluid solver

    We propose an efficient algorithm for the immersed boundary method on distributed-memory architectures, with the computational complexity of a completely explicit method and excellent parallel scaling. The algorithm utilizes the pseudo-compressibility method recently proposed by Guermond and Minev [Comptes Rendus Mathematique, 348:581-585, 2010], which uses a directional splitting strategy to discretize the incompressible Navier-Stokes equations, thereby reducing the linear systems to a series of one-dimensional tridiagonal systems. We perform numerical simulations of several fluid-structure interaction problems in two and three dimensions and study the accuracy and convergence rates of the proposed algorithm. For these problems, we compare the proposed algorithm against other second-order projection-based fluid solvers. Lastly, the strong and weak scaling properties of the proposed algorithm are investigated.
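
    The kernel such a directionally split scheme leans on is a fast solver for one-dimensional tridiagonal systems. The sketch below is the standard Thomas algorithm (a generic O(n) solver, not the paper's implementation):

        import numpy as np

        def thomas(a, b, c, d):
            """Solve a tridiagonal system with sub-diagonal a, diagonal b,
            super-diagonal c and right-hand side d; a[0] and c[-1] are unused.
            Assumes diagonal dominance, as the 1D systems produced by
            directional splitting typically satisfy."""
            n = len(b)
            cp, dp = np.empty(n), np.empty(n)
            cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
            for i in range(1, n):                     # forward elimination
                m = b[i] - a[i] * cp[i - 1]
                cp[i] = c[i] / m
                dp[i] = (d[i] - a[i] * dp[i - 1]) / m
            x = np.empty(n)
            x[-1] = dp[-1]
            for i in range(n - 2, -1, -1):            # back substitution
                x[i] = dp[i] - cp[i] * x[i + 1]
            return x

        # Toy check against a dense solve for a diffusion-like 1D operator.
        n = 6
        a, b, c = np.full(n, -1.0), np.full(n, 2.5), np.full(n, -1.0)
        d = np.arange(1.0, n + 1)
        A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
        assert np.allclose(thomas(a, b, c, d), np.linalg.solve(A, d))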

    Assessment of Semi-Mechanistic Bubble Departure Diameter Modelling for the CFD Simulation of Boiling Flows

    In Eulerian-Eulerian two-fluid computational fluid dynamics (CFD) models, increasingly often applied to the prediction of nucleate boiling in nuclear reactor thermal hydraulics, boiling at the wall is usually accounted for by partitioning the heat flux between the different mechanisms of heat transfer involved. Among the numerous closures required, the bubble departure diameter in particular has a significant influence on the predicted interfacial area concentration and void distribution within the flow. In the present work, following evidence of the limited accuracy and reliability of the empirically based correlations normally applied in CFD models, more mechanistic formulations of bubble departure have been introduced into the STAR-CCM+ code. The performance of these models, based on a balance of the hydrodynamic forces acting on a bubble, and their compatibility with existing implementations in a CFD framework, are assessed against two different data sets for vertically upward subcooled boiling flows. These mechanistic models themselves require a significant amount of modelling, and some recommendations are made on the different modelling choices. The model is extended to include a more physically consistent coupled calculation of the bubble departure frequency, and the modelling of the local subcooling acting on the bubble cap is analyzed. Overall, predictions of void distribution and wall temperature reach a satisfactory accuracy, even if numerous numerical and modelling uncertainties remain. In view of this, several areas for future work and modelling improvement are identified.
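
    For context, the wall heat flux partitioning in such models typically takes the RPI-type form below (a standard formulation; the paper's exact closures may differ), which also makes clear why the bubble departure diameter is so influential, since the evaporative component scales with its cube:

        q''_w = q''_c + q''_q + q''_e, \qquad
        q''_e = \frac{\pi}{6} D_d^{3} \, f \, N_a \, \rho_v \, h_{lv}

    Here q''_c, q''_q and q''_e are the single-phase convective, quenching and evaporative components, D_d the bubble departure diameter, f the departure frequency, N_a the active nucleation site density, rho_v the vapour density and h_lv the latent heat of vaporization.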