Partial mixture model for tight clustering of gene expression time-course
Background: Tight clustering arose recently from the desire to obtain tighter, and potentially more informative, clusters in gene expression studies: scattered genes with only loose correlations should be excluded from the clusters. However, little work in the literature is dedicated to this area of research. Meanwhile, maximum likelihood techniques have been used extensively for model parameter estimation, while the minimum distance estimator has been largely ignored.
Results: In this paper we show the inherent robustness of the minimum distance estimator, which makes it a powerful tool for parameter estimation in model-based time-course clustering. To apply minimum distance estimation, we formulate a partial mixture model that naturally incorporates replicate information and allows for scattered genes. We provide experimental results on simulated data fitting, where the minimum distance estimator demonstrates performance superior to that of the maximum likelihood estimator. Both biological and statistical validations are conducted on a simulated dataset and two real gene expression datasets. Our proposed partial regression clustering algorithm scores highest in a Gene Ontology-driven evaluation against four other popular clustering algorithms.
Conclusion: For the first time, the partial mixture model is successfully extended to time-course data analysis. The robustness of our partial regression clustering algorithm demonstrates the suitability of combining the partial mixture model with the minimum distance estimator in this field. We show that tight clustering is not only capable of generating a more profound understanding of the dataset under study, in accordance with established biological knowledge, but also presents interesting new hypotheses during interpretation of the clustering results. In particular, we provide biological evidence that scattered genes can be relevant and are interesting subjects for study, in contrast to prevailing opinion.
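To make the estimator concrete: a minimal sketch of minimum distance estimation via the L2E (integrated squared error) criterion for a partial normal model, in Python. This is an illustration of the general technique, not the authors' implementation; the one-dimensional data, mixing weight, and starting values are all hypothetical. Because the weight w is not forced to integrate to one, scattered points can simply be left unmodelled.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
# hypothetical data: 70 "tight cluster" values plus 30 scattered ones
data = np.concatenate([rng.normal(0.0, 0.5, 70), rng.uniform(-6, 6, 30)])

def l2e_loss(params, x):
    """L2E criterion for the partial model w * N(mu, sigma^2).

    Minimizes int (w*phi)^2 dx - 2*w*mean(phi(x_i)); the first term
    has the closed form w^2 / (2*sigma*sqrt(pi)) for a normal density.
    """
    mu, log_sigma, w = params
    sigma = np.exp(log_sigma)          # keep sigma positive
    term1 = w**2 / (2.0 * sigma * np.sqrt(np.pi))
    term2 = -2.0 * w * norm.pdf(x, mu, sigma).mean()
    return term1 + term2

res = minimize(l2e_loss, x0=[data.mean(), 0.0, 0.8], args=(data,),
               method="Nelder-Mead")
mu_hat, sigma_hat, w_hat = res.x[0], np.exp(res.x[1]), res.x[2]
```

The fitted weight `w_hat` ends up near the fraction of data in the tight component, which is exactly how the partial model tolerates scattered points that a full mixture would be forced to explain.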
A comparative study of the AHP and TOPSIS methods for implementing load shedding scheme in a pulp mill system
Technological advancement has enabled the design of useful equipment and devices that users can exploit in a variety of applications. A pulp mill is a heavy industry that consumes a large amount of electricity in production, so any equipment malfunction can cause heavy losses to the company. In particular, the breakdown of one generator would cause the remaining generators to be overloaded; loads are then shed until the remaining generation capacity is sufficient to supply the rest. Once the fault has been fixed, the load shedding scheme can be deactivated. A load shedding scheme is thus the best way to handle such a condition: selected loads are shed under the scheme in order to protect the generators from damage. Multi-Criteria Decision Making (MCDM) can be applied to determine the load shedding scheme in an electric power system. In this thesis, two methods, the Analytic Hierarchy Process (AHP) and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), are introduced and applied, and a series of analyses is conducted and the results compared. Of the two methods, TOPSIS proves to be the better MCDM technique for the load shedding scheme in the pulp mill system, as it achieves the higher percentage effectiveness of load shedding. The results of applying AHP and TOPSIS to the pulp mill system are very promising.
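For readers unfamiliar with the method, here is a minimal, self-contained sketch of the standard TOPSIS procedure in Python. The load-shedding alternatives, criteria, and weights below are hypothetical illustrations, not the thesis's actual data.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Rank alternatives with TOPSIS.

    matrix: (alternatives x criteria) decision matrix
    weights: criterion weights summing to 1
    benefit: True where larger values are better, False for cost criteria
    Returns the relative closeness to the ideal solution (higher = better).
    """
    m = np.asarray(matrix, dtype=float)
    # vector-normalise each criterion column, then apply weights
    v = m / np.sqrt((m**2).sum(axis=0)) * np.asarray(weights)
    benefit = np.asarray(benefit)
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
    d_pos = np.sqrt(((v - ideal)**2).sum(axis=1))   # distance to ideal
    d_neg = np.sqrt(((v - anti)**2).sum(axis=1))    # distance to anti-ideal
    return d_neg / (d_pos + d_neg)

# hypothetical alternatives scored on: shed load (MW), priority of the
# affected loads, restoration time (minutes) -- all cost criteria
scores = topsis([[5, 2, 10], [3, 4, 8], [6, 1, 12]],
                weights=[0.5, 0.3, 0.2],
                benefit=[False, False, False])
ranking = np.argsort(-scores)   # best alternative first
```

The closeness score always lies in [0, 1], which is what makes TOPSIS rankings easy to compare across candidate load-shedding schemes.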
Brain Connectivity Networks for the Study of Nonlinear Dynamics and Phase Synchrony in Epilepsy
Assessing complex brain activity as a function of the type of epilepsy, and in the context of the 3D source of seizure onset, remains a critical and challenging endeavor. In this dissertation, we sought to extract attributes of the epileptic brain by examining modular interactions in scalp electroencephalography (EEG). A classification algorithm is proposed for the connectivity-based separation of interictal epileptic EEG from normal EEG. Connectivity patterns of interictal epileptic discharges are investigated in different types of epilepsy, and the relation between these patterns and the epileptogenic zone is explored in focal epilepsy.
A nonlinear recurrence-based method is applied to scalp EEG recordings to obtain connectivity maps from phase synchronization attributes. The pairwise connectivity measure is obtained from time-domain data without any conversion to the frequency domain. The phase coupling value, which indicates the broadband interdependence of the input data, is used for a graph-theoretic assessment of connectivity at both local and global scales.
The method is applied to a population of pediatric individuals to delineate epileptic cases from normal controls. A probabilistic approach demonstrated a significant difference between the two groups, successfully separating the individuals with an accuracy of 92.8%. Investigating the connectivity patterns of interictal epileptic discharges (IEDs) originating from focal and generalized seizures revealed a significant difference in the connectivity matrices: functional connectivity maps of focal IEDs showed localized activity, while generalized cases showed globally activated areas. Connectivity maps from individuals with temporal lobe epilepsy showed the temporal and frontal areas to be the most affected regions.
In general, functional connectivity measures are higher-order attributes that helped delineate epileptic individuals in the classification process. The functional connectivity patterns of interictal activities can hence serve as indicators of the seizure type and can specify the irritated regions in focal epilepsy. These findings can enhance the diagnostic process with respect to the type of epilepsy and the effects of the relative location of the 3D source of seizure onset on other brain areas.
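The dissertation's recurrence-based measure is not reproduced here, but the underlying idea of pairwise phase synchrony can be sketched with the common Hilbert-transform phase-locking value (PLV), shown on synthetic signals; a full connectivity matrix is just this quantity computed over all channel pairs.

```python
import numpy as np
from scipy.signal import hilbert

def plv(x, y):
    """Phase-locking value between two signals.

    1 = perfect phase synchrony, values near 0 = independent phases.
    """
    phase_x = np.angle(hilbert(x))
    phase_y = np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 2000)
a = np.sin(2 * np.pi * 6 * t) + 0.1 * rng.standard_normal(t.size)
# same oscillation with a constant phase offset: still phase-locked
b = np.sin(2 * np.pi * 6 * t + 0.8) + 0.1 * rng.standard_normal(t.size)
c = rng.standard_normal(t.size)   # unrelated noise channel

sync = plv(a, b)     # high: constant phase difference
unsync = plv(a, c)   # low: phase difference drifts randomly
```

Note that a constant phase lag still yields a PLV near one, which is why PLV captures synchrony rather than mere waveform similarity.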
Machine vibration monitoring for diagnostics through hypothesis testing
Nowadays, machine diagnostics is gathering growing interest in research, as switching from a scheduled to a preventive maintenance regime based on the real health condition of the machine (i.e., condition-based maintenance) can bring great advantages in terms of both safety and cost. Nondestructive tests that monitor the state of health are fundamental for this purpose. An effective form of condition monitoring is vibration monitoring, which exploits inexpensive accelerometers to perform machine diagnostics. In this work, statistics and hypothesis testing are used to build a solid foundation for damage detection by recognizing patterns in a multivariate dataset of simple time features extracted from accelerometric measurements. Data from high-speed aeronautical bearings were analyzed, acquired on a test rig built by the Dynamic and Identification Research Group (DIRG) of the Department of Mechanical and Aerospace Engineering at Politecnico di Torino. The proposed strategy was to reduce the multivariate dataset to a single index from which the health condition can be determined. This dimensionality reduction was initially performed using Principal Component Analysis, which proved to be a lossy compression. An improvement was obtained with Fisher's Linear Discriminant Analysis, which finds the direction of maximum separation between the damaged and healthy indices; this method is still ineffective, however, at highlighting phenomena that develop in directions orthogonal to the discriminant. Finally, a lossless compression was achieved using Mahalanobis distance-based Novelty Indices, which can also compensate for possible latent confounding factors. Considerations about confidence, sensitivity, the curse of dimensionality, and the minimum number of samples were also addressed to ensure statistical significance.
The results obtained were very good, not only in terms of the low rates of missed and false alarms, but also considering the speed of the algorithms, their simplicity, and their full independence from human interaction, which make them suitable for real-time implementation and integration into condition-based maintenance (CBM) regimes.
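A minimal sketch of a Mahalanobis distance-based Novelty Index in Python, on synthetic feature vectors (the actual DIRG features and decision thresholds are not reproduced here): the index measures how far each observation lies from the healthy training distribution, accounting for correlations between features.

```python
import numpy as np

def mahalanobis_ni(train, test):
    """Novelty Index: Mahalanobis distance of each test vector from the
    distribution of healthy training data. Using the full covariance
    (not just per-feature variances) accounts for feature correlations."""
    mu = train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(train, rowvar=False))
    diff = test - mu
    # batched quadratic form: sqrt(diff_i . Sigma^-1 . diff_i)
    return np.sqrt(np.einsum('ij,jk,ik->i', diff, cov_inv, diff))

rng = np.random.default_rng(2)
# hypothetical healthy time features (e.g. RMS, kurtosis, crest factor)
healthy = rng.multivariate_normal([0, 0, 0], np.diag([1.0, 2.0, 0.5]), size=500)
damaged = healthy[:5] + np.array([3.0, 0.0, 0.0])  # drift in one feature

ni_healthy = mahalanobis_ni(healthy, healthy)
ni_damaged = mahalanobis_ni(healthy, damaged)
# in practice a threshold (e.g. a chi-square quantile) separates the two
```

Because the index is a single scalar per observation, thresholding it gives exactly the kind of one-dimensional health indicator the work describes.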
Unsupervised learning for anomaly detection in Australian medical payment data
Fraudulent or wasteful medical insurance claims made by health care providers are costly for insurers: OECD healthcare organisations typically lose 3-8% of total expenditure to fraud. As Australia's universal public health insurer, Medicare Australia spends approximately A$… annually; at those rates, annual losses of A$1-2.7 billion could be expected. However, fewer than 1% of claims to Medicare Australia are detected as fraudulent, well below international benchmarks.
Variation is common in medicine: health conditions, along with their presentation and treatment, are heterogeneous by nature. Increasing data volumes and rapidly changing patterns bring challenges that require novel solutions. Machine learning and data mining are becoming commonplace in this field, but no gold standard is yet available.
In this project, requirements are developed for real-world application to compliance analytics at the Australian Government Department of Health and Aged Care (DoH), covering: unsupervised learning; problem generalisation; human interpretability; context discovery; and cost prediction. Three novel methods are presented that rank providers by potentially recoverable costs. These methods use association analysis, topic modelling, and sequential pattern mining to build interpretable, expert-editable models of typical provider claims. Anomalous providers are identified by comparison to the typical models, using metrics based on the costs of excess or upgraded services. In two of the methods, domain knowledge is incorporated in a machine-friendly way by using the MBS as an ontology. Validation by subject-matter experts and comparison to existing techniques show that the methods perform well. The methods are implemented in a software framework that enables rapid prototyping and quality assurance. The code is deployed at the DoH, and further applications as decision-support systems are in progress. The developed requirements will apply to future work in this field.
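The ranking idea, scoring providers by the cost of services billed in excess of a peer-typical profile, can be sketched on hypothetical claim counts. This is a deliberate simplification, not the thesis's association-analysis, topic-modelling, or sequential-mining methods; all counts and fees below are invented.

```python
import numpy as np

# hypothetical claim counts: 50 providers x 8 billing items, with a fee
# per item; provider 0 bills item 7 far more often than its peers
rng = np.random.default_rng(3)
counts = rng.poisson(5, size=(50, 8)).astype(float)
counts[0, 7] += 40
fees = np.array([30., 45., 80., 25., 60., 110., 40., 150.])

typical = np.median(counts, axis=0)           # peer-typical profile per item
excess = np.clip(counts - typical, 0, None)   # services beyond the norm
recoverable = excess @ fees                    # potentially recoverable cost

ranking = np.argsort(-recoverable)             # most anomalous provider first
```

Ranking by dollar amounts rather than raw anomaly scores matches the compliance goal stated above: reviewer effort is directed where the potentially recoverable cost is highest.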