
    Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology

    Motivation: High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour in a ‘large P, small n’ setting are required at an increasing rate. Such methods are, in general, developed on an ad hoc basis, requiring repeated development cycles and resulting in a lack of standardization between analyses. Results: We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach builds on our baySeq algorithm, which identifies differential expression in RNA-seq data using a negative binomial distribution and in paired data using a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs. Availability and implementation: The methods are implemented in the R baySeq (v2) package, available on Bioconductor: http://www.bioconductor.org/packages/release/bioc/html/baySeq.html. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online. This work was supported by European Research Council Advanced Investigator Grant ERC-2013-AdG 340642 – TRIBE. This is the author accepted manuscript. The final version is available from Oxford University Press via http://dx.doi.org/10.1093/bioinformatics/btv56
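
    As a rough illustration of the generic empirical Bayes idea described above (a minimal Python sketch, not the actual baySeq implementation), the following compares the marginal likelihood of observed counts under a "no difference" model against a "difference" model, assuming a negative binomial distribution, and weights the two with a prior. The method-of-moments parameter estimates, the fixed prior, and the toy counts are all illustrative assumptions.

```python
# Sketch of empirical Bayes model comparison for count data (illustrative;
# baySeq itself estimates priors empirically across many genes).
import numpy as np
from scipy import stats

def nb_loglik(counts):
    """Negative binomial log-likelihood with method-of-moments parameters."""
    counts = np.asarray(counts, dtype=float)
    mu, var = counts.mean(), counts.var(ddof=1)
    var = max(var, mu + 1e-6)        # NB requires variance > mean
    p = mu / var                     # success probability
    r = mu * p / (1.0 - p)           # dispersion parameter
    return stats.nbinom.logpmf(counts.astype(int), r, p).sum()

def posterior_de(group_a, group_b, prior_de=0.1):
    """Posterior probability that two groups are differentially expressed."""
    log_same = nb_loglik(np.concatenate([group_a, group_b]))
    log_diff = nb_loglik(group_a) + nb_loglik(group_b)
    log_post = np.array([np.log(1 - prior_de) + log_same,
                         np.log(prior_de) + log_diff])
    log_post -= log_post.max()       # stabilise before exponentiating
    post = np.exp(log_post)
    return post[1] / post.sum()

print(posterior_de([5, 7, 6, 8], [40, 55, 48, 60]))  # close to 1.0
```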

    Design and Analysis of Self-Healing Tree-Based Hybrid Spectral Amplitude Coding OCDMA System

    This paper presents an efficient tree-based hybrid spectral amplitude coding optical code division multiple access (SAC-OCDMA) system that provides high-capacity transmission along with fault detection and restoration throughout the passive optical network (PON). An enhanced multidiagonal (EMD) code is adopted to improve the system’s performance, negating multiple access interference and the associated phase-induced intensity noise through its efficient two-matrix structure. Moreover, system connection availability is enhanced through an efficient protection architecture with tree and star-ring topologies at the feeder and distribution levels, respectively. The proposed hybrid architecture aims to provide seamless transmission of information at minimum cost. A mathematical model based on the Gaussian approximation is developed to analyze the performance of the proposed setup, followed by simulation analysis for validation. It is observed that the proposed system supports 64 subscribers operating at data rates of 2.5 Gbps and above. Moreover, survivability and cost analyses in comparison with existing schemes show that the proposed tree-based hybrid SAC-OCDMA system provides the required redundancy at minimum infrastructure and operating cost.
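
    In much of the SAC-OCDMA literature, the Gaussian-approximation step mentioned above reduces to mapping a derived signal-to-noise ratio (SNR) to a bit error rate via the complementary error function. The sketch below shows only that final mapping; the SNR expression itself, which depends on the EMD code structure and the noise terms analysed in the paper, is not reproduced, and the sample SNR value is purely illustrative.

```python
# Final step of a Gaussian-approximation BER analysis (illustrative only;
# the SNR must come from the system's code and noise model).
import math

def ber_gaussian(snr):
    """BER under the Gaussian approximation: Pe = 0.5 * erfc(sqrt(SNR / 8))."""
    return 0.5 * math.erfc(math.sqrt(snr / 8.0))

print(ber_gaussian(150.0))  # SNR of ~21.8 dB -> BER on the order of 1e-10
```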

    Effects of Alternative Traffic Input Levels on Interstate Pavement Performance in New Mexico

    Traffic is one of the key inputs in pavement design. The Mechanistic-Empirical (ME) pavement design method allows three levels of traffic data input, depending on data availability: site-specific data (Level 1), regional data (Level 2), and national data (Level 3). Level 1 inputs (e.g., load magnitude, configuration, and frequency) are generated from a Weigh-in-Motion (WIM) station installed at each site. However, it is not always practical to install a WIM station because of its high cost, so designers often have to rely on Level 2 or Level 3 traffic data. It is not yet known how well the national or regional data compare with New Mexico's site-specific data in predicting interstate pavement performance. To this end, this study examines the effects of different levels of traffic inputs on predicted pavement distresses in New Mexico. Two major interstate highways were considered: Interstate-40 (I-40) and Interstate-25 (I-25). Site-specific inputs were developed using WIM stations installed at the pavement sites. The WIM data were analyzed using updated software developed by UNM researchers, and the traffic data were run through the ME design software to predict pavement performance. Results show that axle load spectra (ALS) and lane distribution have a great influence on predicted interstate pavement performance. Vehicle class distribution (VCD), directional distribution, and the standard deviation of lateral wander have a moderate impact, while the monthly adjustment factor, axles per vehicle, axle spacing, and operational speed have very little effect. Predicted pavement performance is insensitive to hourly distribution and wheelbase distribution. Regional traffic data were therefore developed from ten site-specific datasets using both arithmetic averaging and clustering methods; since ALS and VCD are the two inputs that affect the predicted distresses most significantly, these two were considered for this case. Finally, using the regional, national, and site-specific inputs of VCD and ALS, ME-predicted pavement performances were determined. Results show that performance predicted from the cluster data is much closer to that from the site-specific data, while performance generated from the ME default values differs significantly from that generated from the site-specific or cluster values. When comparing performance from the ME design defaults with that from the statewide average data, the ME default VCD produces less error than the default ALS. This study therefore recommends using clustered data or site-specific WIM data instead of ME default or statewide average values. In addition, a guideline was established to select appropriate axle load spectra inputs based on vehicle class data.
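
    As a hedged sketch of how regional (Level 2) inputs can be derived from site-specific WIM data by clustering (the study's exact procedure and data are not reproduced here), the snippet below groups hypothetical per-site vehicle class distributions with k-means and uses each cluster centroid as the regional input. The VCD values, the number of sites, and the cluster count are all made-up placeholders.

```python
# Illustrative clustering of site-specific vehicle class distributions
# (VCD) into regional Level 2 inputs; all numbers are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

# Rows: WIM sites; columns: share of FHWA vehicle classes 4-13 (rows sum to 1).
vcd = np.array([
    [0.02, 0.10, 0.05, 0.03, 0.04, 0.60, 0.06, 0.05, 0.03, 0.02],
    [0.02, 0.11, 0.05, 0.03, 0.04, 0.59, 0.06, 0.05, 0.03, 0.02],
    [0.05, 0.25, 0.10, 0.05, 0.08, 0.30, 0.07, 0.05, 0.03, 0.02],
    [0.05, 0.24, 0.11, 0.05, 0.08, 0.31, 0.06, 0.05, 0.03, 0.02],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vcd)
for c in range(2):
    sites = np.where(kmeans.labels_ == c)[0].tolist()
    print(f"cluster {c}: sites {sites}, regional VCD = "
          f"{np.round(kmeans.cluster_centers_[c], 3)}")
```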

    Amplifying Pathological Detection in EEG Signaling Pathways through Cross-Dataset Transfer Learning

    Pathology diagnosis based on EEG signals and decoding of brain activity holds immense importance in understanding neurological disorders. With the advancement of artificial intelligence and machine learning techniques, the potential for accurate data-driven diagnoses and effective treatments has grown significantly. However, applying machine learning algorithms to real-world datasets presents diverse challenges at multiple levels. The scarcity of labelled data, especially in low-data regimes where real patient cohorts are limited by the high cost of recruitment, underscores the importance of scaling and transfer learning techniques. In this study, we explore a real-world pathology classification task to highlight the effectiveness of data and model scaling and cross-dataset knowledge transfer. We observe varying performance improvements through data scaling, indicating the need for careful evaluation and labelling. Additionally, we identify the challenges of possible negative transfer and emphasize the significance of several key components for overcoming distribution shifts and potential spurious correlations in order to achieve positive transfer. The performance of the target model on the target dataset (NMT) improves when knowledge from the source dataset (TUAB) is used and only a small amount of labelled data is available. Our findings indicate that a small, generic model (e.g., ShallowNet) performs well on a single dataset, whereas a larger model (e.g., TCN) performs better when transferring from a larger and more diverse dataset.
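
    A minimal sketch of the cross-dataset transfer setup described above, using synthetic stand-ins for the TUAB (source) and NMT (target) cohorts: pretrain a small convolutional classifier on the larger source set, reinitialise the classification head, and fine-tune on the small labelled target set. The architecture, tensor shapes, and training settings are illustrative assumptions, not the study's models.

```python
# Pretrain on a large synthetic "source" EEG cohort, fine-tune on a small
# "target" cohort (illustrative stand-ins for TUAB and NMT).
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(
        nn.Conv1d(21, 16, kernel_size=7), nn.ReLU(),  # 21 EEG channels
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        nn.Linear(16, 2),                             # normal vs pathological
    )

def train(model, x, y, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

xs, ys = torch.randn(512, 21, 256), torch.randint(0, 2, (512,))  # source
xt, yt = torch.randn(32, 21, 256), torch.randint(0, 2, (32,))    # target

model = make_model()
train(model, xs, ys)              # pretrain on the source cohort
model[-1] = nn.Linear(16, 2)      # fresh head for the target task
train(model, xt, yt, lr=1e-4)     # fine-tune on the small target cohort
```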

    Simulation model of load balancing in distributed computing systems

    The availability of high-performance computing, high-speed data transfer over networks, and the widespread use of design and pre-production software in mechanical engineering mean that both large industrial enterprises and small engineering companies now implement complex computer systems for the efficient solution of production and management tasks. Such systems are generally built on distributed heterogeneous computer systems. The analytical problems solved by these systems are the key subjects of research, but the system-wide problems of efficiently distributing (balancing) the computational load and placing input, intermediate, and output databases are no less important. The main tasks of such a balancing system are load and condition monitoring of compute nodes and the selection of a node to which a user's request is dispatched in accordance with a predetermined algorithm. Load balancing is one of the most widely used methods of increasing the productivity of distributed computing systems through the optimal allocation of tasks between the system's nodes. The development of methods and algorithms for computing an optimal schedule in a distributed system whose infrastructure changes dynamically is therefore an important task.
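
    As an illustration of the node-selection task described above (a minimal sketch, not the paper's simulation model), the following dispatches each incoming request to the node with the lowest capacity-weighted load, one simple predetermined algorithm for a heterogeneous cluster. Node capacities and request costs are hypothetical.

```python
# Least-loaded dispatch over heterogeneous nodes, weighted by capacity.
import heapq

class Balancer:
    def __init__(self, capacities):
        self.capacities = capacities
        self.loads = [0.0] * len(capacities)
        # Min-heap of (relative load, node id) for O(log n) selection.
        self.heap = [(0.0, i) for i in range(len(capacities))]
        heapq.heapify(self.heap)

    def dispatch(self, cost):
        _, node = heapq.heappop(self.heap)
        self.loads[node] += cost
        heapq.heappush(self.heap,
                       (self.loads[node] / self.capacities[node], node))
        return node

balancer = Balancer(capacities=[1.0, 2.0, 4.0])  # heterogeneous nodes
print([balancer.dispatch(cost=1.0) for _ in range(14)])
print(balancer.loads)  # faster nodes absorb proportionally more: [2, 4, 8]
```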

    A framework development to predict remaining useful life of a gas turbine mechanical component

    Power-by-the-hour is a performance-based offering for delivering outstanding service to operators of civil aviation aircraft. Operators need to minimise downtime, reduce service costs, and ensure value for money, which requires innovative advanced technology for predictive maintenance. Predictability, availability, and reliability of the engine offer better service for operators, and the need to estimate expected component failures before they occur calls for a proactive approach to predicting the remaining useful life of components within an assembly. This research offers a framework for component remaining-useful-life prediction using assembly-level data. The thesis presents a critical analysis of the literature, identifying the Weibull method, statistical techniques, and data-driven methodologies relating to remaining-useful-life prediction that are used in this research. The AS-IS practice captures relevant information based on an investigation conducted in the aerospace industry. The analysis of maintenance cycles examines high-level events affecting engine availability, and further communication with industry informed a through-life performance timeline visualisation; the overhaul sequence and activities are presented to give insight into this timeline. The thesis covers the development of the framework and its application to a gas turbine single-stage assembly, to the repair and replacement of components in a single-stage assembly, and to a multiple-stage assembly. The framework is demonstrated on aerospace engines and power generation engines, and it enables domain experts to respond quickly to, and prepare for, maintenance and the on-time delivery of spare parts. The results show the probability of failure based on a pair of error values using the corresponding Scale and Shape parameters; this probability of failure is transformed into the remaining useful life, depicting a typical Weibull distribution. The resulting Weibull curves, developed for three scenarios of the case, show component renewals, from which the remaining useful life of the components is established. The framework is validated and verified through a case study with three scenarios and through expert judgement.
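
    As a worked numerical sketch of the Weibull step described above (the fitting procedure and the error-value pairing are the thesis's own; only the standard Weibull reliability formula is used here), the snippet computes conditional reliability from fitted Scale and Shape parameters and derives a simple remaining-useful-life estimate. The parameter values, current age, and reliability target are hypothetical.

```python
# Remaining useful life from a fitted Weibull model (hypothetical values).
import math

def reliability(t, eta, beta):
    """Weibull survival function: R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))

def remaining_useful_life(t_now, eta, beta, target=0.9):
    """Hours beyond t_now until conditional reliability falls to `target`."""
    r_now = reliability(t_now, eta, beta)
    # Solve R(t) / R(t_now) = target for t, then subtract the current age.
    t_target = eta * (-math.log(target * r_now)) ** (1.0 / beta)
    return t_target - t_now

eta, beta = 20_000.0, 2.3  # hypothetical Scale (hours) and Shape
print(remaining_useful_life(t_now=8_000.0, eta=eta, beta=beta))  # ~2500 h
```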

    eComVes: Enhancing ComVes using Data Piggybacking for Resource Discovery at the Network Edge

    Over the past few years, Augmented Reality (AR) and Virtual Reality (VR) have emerged as highly popular technologies that demand rapid, efficient processing of data with low latency and high bandwidth in order to enable seamless real-time interaction between users and the virtual environment. This presents challenges for network infrastructure design, which can be addressed through edge computing. However, edge computing brings its own challenges, such as selecting the appropriate edge server for computing tasks in dynamic networks with rapidly changing resource availability. Named Data Networking (NDN) is a potential future Internet architecture that could provide a balanced distribution of edge services across servers, thereby preventing service disruptions. In this study we propose eComVes, a novel strategy that enhances ComVes for information-centric edge applications by adopting a correction mechanism that ensures service execution on the highest-resourced server. This mechanism allows users and intermediate routers to learn the servers’ resource status directly from the servers without any explicit control messages or probing. We evaluated the performance of eComVes against ComVes and observed an improvement in the success ratio while maintaining a consistent response time, indicating an improvement in load balance across the servers.
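
    A minimal sketch of the piggybacking idea described above (an illustrative simplification, not the eComVes protocol itself): servers attach their current resource status to ordinary data replies, and a forwarding node caches that status to steer the next request toward the best-resourced server, with no separate control messages. The class names, freshness window, and capacity values are assumptions.

```python
# Forwarding node that learns server resource status piggybacked on data
# packets and prefers the highest-resourced server (illustrative only).
import time

class ForwardingNode:
    def __init__(self, freshness=2.0):
        self.status = {}              # server -> (free capacity, timestamp)
        self.freshness = freshness    # seconds a status entry stays valid

    def on_data(self, server, payload, free_capacity):
        """Record the resource status carried on an ordinary data reply."""
        self.status[server] = (free_capacity, time.time())
        return payload

    def choose_server(self, servers):
        """Pick the highest-resourced server with fresh status, else default."""
        now = time.time()
        fresh = {s: cap for s, (cap, ts) in self.status.items()
                 if s in servers and now - ts < self.freshness}
        return max(fresh, key=fresh.get) if fresh else servers[0]

node = ForwardingNode()
node.on_data("edge-A", b"reply", free_capacity=0.2)
node.on_data("edge-B", b"reply", free_capacity=0.7)
print(node.choose_server(["edge-A", "edge-B"]))  # -> edge-B
```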

    Classification and mapping of paddy rice by combining Landsat and SAR time series data

    Rice is an important food resource, and the demand for rice has increased as the population has expanded. Accurate paddy rice classification and monitoring are therefore necessary to identify and forecast rice production. Satellite data have often been used to produce paddy rice maps with a more frequent update cycle (e.g., every year) than field surveys. Many satellite datasets, from both optical and SAR sensors (e.g., Landsat, MODIS, and ALOS PALSAR), have been employed to classify paddy rice. In the present study, time series data from the Landsat, RADARSAT-1, and ALOS PALSAR satellite sensors were used synergistically to classify paddy rice through machine learning approaches over two regions with different climates (sites A and B). Six schemes composed of various combinations of input data by sensor and collection date were evaluated. Scheme 6, which fused optical and SAR sensor time series data at the decision level, yielded the highest accuracy (98.67% for site A and 93.87% for site B). Classification performance was better at site A than at site B, which consists of heterogeneous land cover and has low data availability due to a high cloud cover rate. This study also proposes a Paddy Rice Mapping Index (PMI) that considers the spectral and phenological characteristics of paddy rice. PMI represented the spatial distribution of paddy rice well in both regions, and Google Earth Engine was adopted to produce paddy rice maps over larger areas using the proposed PMI-based approach.
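
    As a hedged sketch of decision-level fusion in the spirit of Scheme 6 (the study's actual features, classifier, and fusion rule are not reproduced), the snippet trains one classifier per sensor stream on synthetic stand-ins for Landsat and SAR time series features and fuses the per-sensor class probabilities by averaging.

```python
# Decision-level fusion of per-sensor classifiers (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 300
labels = rng.integers(0, 2, n)                      # 1 = paddy rice, 0 = other
optical = rng.normal(labels[:, None], 1.0, (n, 6))  # Landsat-like features
sar = rng.normal(labels[:, None], 1.5, (n, 4))      # SAR-like features

clf_opt = RandomForestClassifier(random_state=0).fit(optical[:200], labels[:200])
clf_sar = RandomForestClassifier(random_state=0).fit(sar[:200], labels[:200])

# Fuse at the decision level by averaging per-sensor posterior probabilities.
fused = (clf_opt.predict_proba(optical[200:]) +
         clf_sar.predict_proba(sar[200:])) / 2.0
pred = fused.argmax(axis=1)
print("fused accuracy:", (pred == labels[200:]).mean())
```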

    Managing Well Integrity using Reliability Based Models
