Clustering of Leukemia Patients via Gene Expression Data Analysis

Author: Zhao Zhiyu
Publication venue: ScholarWorks@UNO
Publication date: 15/12/2006

This thesis attempts to cluster leukemia patients described by gene expression data and to discover the few most discriminating genes responsible for the clustering. A combined approach of Principal Direction Divisive Partitioning (PDDP) and bisect K-means algorithms is applied to the selected leukemia dataset, and both unsupervised and supervised methods are considered in order to get the optimal results. As shown by the experimental results and the predefined reference, the combination of PDDP and bisect K-means successfully clusters the leukemia patients and efficiently discovers some significant genes that can serve as discriminators for the clustering. The combined approach works well for automatically clustering leukemia patients from gene expression information alone, and it has great potential for solving similar problems. The few genes discovered may provide very important information for the diagnosis of leukemia.
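As a rough illustration of the divisive step that PDDP is built on, the sketch below splits a set of samples by the sign of their projection onto the leading principal direction. It is a minimal pure-Python sketch under stated assumptions, not the thesis's implementation: the function names `principal_direction` and `pddp_split`, the use of power iteration, and the toy data are all hypothetical choices made for this example.

```python
import math
import random


def principal_direction(centered_rows, iters=200, seed=0):
    """Approximate the leading principal direction by power iteration.

    centered_rows: list of equal-length lists, already mean-centered.
    The iteration count and seed are arbitrary illustrative choices.
    """
    dim = len(centered_rows[0])
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iters):
        # One multiplication by X^T X (the unscaled covariance matrix).
        proj = [sum(r[j] * v[j] for j in range(dim)) for r in centered_rows]
        w = [sum(proj[i] * centered_rows[i][j] for i in range(len(centered_rows)))
             for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def pddp_split(samples):
    """Split sample indices into two groups by projection sign."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(dim)]
    centered = [[s[j] - mean[j] for j in range(dim)] for s in samples]
    v = principal_direction(centered)
    scores = [sum(c[j] * v[j] for j in range(dim)) for c in centered]
    left = [i for i, s in enumerate(scores) if s <= 0]
    right = [i for i, s in enumerate(scores) if s > 0]
    return left, right, v


# Toy data: two well-separated groups of "patients" with two "genes" each.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, -0.1],
           [10.0, 0.1], [10.2, 0.0], [9.9, -0.1]]
left, right, direction = pddp_split(samples)
```

Applying the split recursively to the larger or more scattered group yields a divisive hierarchy. Under the assumptions of this sketch, the features with the largest absolute weights in `direction` are the ones that drive the split most strongly.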

University of New Orleans

Robust and Efficient Algorithms for Protein 3-D Structure Alignment and Genome Sequence Comparison

Author: Zhao Zhiyu
Publication venue: ScholarWorks@UNO
Publication date: 07/08/2008

Sequence analysis and structure analysis are two of the fundamental areas of bioinformatics research. This dissertation discusses protein structure related problems, including protein structure alignment and query, and genome sequence related problems, including haplotype reconstruction and genome rearrangement. It first presents an algorithm for pairwise protein structure alignment that is tested with structures from the Protein Data Bank (PDB). In many cases it outperforms two other well-known algorithms, DaliLite and CE. The preliminary algorithm is a graph-theory based approach, which uses the concept of stars to reduce the complexity of clique-finding algorithms. The algorithm is then improved by introducing double-center stars in the graph and applying a self-learning strategy. The updated algorithm is tested with a much larger set of protein structures and shown to be an improvement in accuracy, especially in cases of weak similarity. A protein structure query algorithm is designed to search for similar structures in the PDB, using the improved alignment algorithm. It is compared with SSM and shows better performance, with lower maximum and average Q-score for missing proteins. An interesting problem dealing with the calculation of the diameter of a 3-D sequence of points arose, and its connection to sublinear time computation is discussed. The diameter calculation of a 3-D sequence is approximated by a series of sublinear time deterministic, zero-error and bounded-error randomized algorithms, and we have obtained a series of separations about the power of sublinear time computations. This dissertation also discusses two genome sequence related problems. A probabilistic model is proposed for reconstructing haplotypes from SNP matrices with incomplete and inconsistent errors. The experiments with simulated data show both high accuracy and speed, conforming to the theoretically provable efficiency and accuracy of the algorithm.
Finally, a genome rearrangement problem is studied. The concept of non-breaking similarity is introduced. Approximating the exemplar non-breaking similarity to factor n^(1-ε) is proven to be NP-hard. Interestingly, for several practical cases, several polynomial time algorithms are presented.
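To make the diameter problem mentioned in the abstract concrete, the sketch below computes the exact diameter of a 3-D point sequence by brute force and a simple linear-time lower bound. This is only an illustration, not one of the dissertation's sublinear algorithms: the function names are hypothetical, and the bound relies on the triangle inequality, which guarantees that the farthest distance from any fixed point is at least half the diameter.

```python
import math


def diameter_exact(points):
    """Exact diameter: the largest pairwise distance, O(n^2) time."""
    best = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            best = max(best, math.dist(points[i], points[j]))
    return best


def diameter_lower_bound(points):
    """Farthest distance from the first point, O(n) time.

    By the triangle inequality the true diameter D satisfies
    bound <= D <= 2 * bound.
    """
    anchor = points[0]
    return max(math.dist(anchor, p) for p in points)


# Hypothetical toy sequence of 3-D points.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
exact = diameter_exact(points)
bound = diameter_lower_bound(points)
```

The gap between the quadratic exact method and the linear bound hints at why trading accuracy for running time is a natural question for this problem.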

University of New Orleans

Influence of Gate Separation on IGZO Thin Film Transistor Behavior

Author: Zhao Zhiyu
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Metal oxides are attracting great interest in the electronics field as promising active-layer candidates for uses including wearable sensors, flexible displays, and LED displays. Current manufacturing relies on cleanroom processing, which can be time-consuming and costly. Consequently, a repeatable and reliable process for fabricating stable, large-scale TFTs is needed to meet manufacturing and consumer needs. Metal oxides have proven their value for next-generation displays through their high-performance electrical characteristics, abundance, and straightforward fabrication. In particular, the Indium-Gallium-Zinc-Oxide (IGZO) system has demonstrated stability as well as high electrical performance. Since then, IGZO TFTs have prompted extensive research in the solution-processing field. Because conventional methods are limited by sample size and processing time, solution processing has opened a gateway to more flexible, even large-scale fabrication with far fewer steps and less processing time. The major drawbacks of solution processing are its instability, uncertainty, and weaker device performance compared with devices fabricated in a cleanroom environment. In this work, several methods were investigated to improve device performance, including direct light patterning and UV and ozone treatment of the sample surface. A gallium-rich IGZO solution TFT with a 2:2:1 molar ratio was made with the direct light patterning method and compared with a conventionally made IGZO TFT. It is shown that direct light patterning could drastically enhance device stability and performance. Other factors such as cluster size, interface treatment, and etchant composition could greatly affect the outcome as well.

eScholarship - University of California

Efficient protein alignment algorithm for protein search

Author: Bin Fu
Zaixin Lu
Zhiyu Zhao
Publication venue: BMC Bioinformatics, BioMed Central
Publication date: 01/01/2010

CiteSeerX

Crossref

PubMed Central

Benign Adversarial Attack: Tricking Models for Goodness

Author: Lin Zhiyu
Sang Jitao
Zhang Jiaming
Zhao Xian
Publication date: 05/07/2022

In spite of their successful application in many fields, machine learning models today suffer from notorious problems like vulnerability to adversarial examples. Beyond falling into the cat-and-mouse game between adversarial attack and defense, this paper provides an alternative perspective on adversarial examples and explores whether we can exploit them in benign applications. We first attribute adversarial examples to the human-model disparity in employing non-semantic features. While largely ignored in classical machine learning mechanisms, non-semantic features have three interesting characteristics: they are (1) exclusive to the model, (2) critical to inference, and (3) utilizable as features. Inspired by this, we present the brave new idea of benign adversarial attack to exploit adversarial examples for goodness in three directions: (1) adversarial Turing test, (2) rejecting malicious model application, and (3) adversarial data augmentation. Each direction is presented with motivation elaboration, justification analysis and prototype applications to showcase its potential.
Comment: ACM MM2022 Brave New Idea

arXiv.org e-Print Archive

Two-Dimensional Numerical Modelling of a Moored Floating Body under Sloping Seabed Conditions

Author: Feng Aichun
Jiang Zhiyu
Kang Hooi Sang
Zhao Binbin
Publication venue: MDPI AG
Publication date: 01/01/2020

Multidisciplinary Digital Publishing Institute

Universiti Teknologi Malaysia Institutional Repository

NORA - Norwegian Open Research Archives

ScholarBank@NUS

Agder University Research Archive

MGMAE: Motion Guided Masking for Video Masked Autoencoding

Author: Huang Bingkun
Qiao Yu
Wang Limin
Zhang Guozhen
Zhao Zhiyu
Publication date: 21/08/2023

Masked autoencoding has shown excellent performance on self-supervised video representation learning. Temporal redundancy has led to a high masking ratio and customized masking strategy in VideoMAE. In this paper, we aim to further improve the performance of video masked autoencoding by introducing a motion guided masking strategy. Our key insight is that motion is a general and unique prior in video, which should be taken into account during masked pre-training. Our motion guided masking explicitly incorporates motion information to build a temporally consistent masking volume. Based on this masking volume, we can track the unmasked tokens in time and sample a set of temporally consistent cubes from videos. These temporally aligned unmasked tokens will further relieve the information leakage issue in time and encourage the MGMAE to learn more useful structure information. We implement our MGMAE with an efficient online optical flow estimator and a backward masking map warping strategy. We perform experiments on the Something-Something V2 and Kinetics-400 datasets, demonstrating the superior performance of our MGMAE over the original VideoMAE. In addition, we provide visualization analysis to illustrate that our MGMAE can sample temporally consistent cubes in a motion-adaptive manner for more effective video pre-training.
Comment: ICCV 2023 camera-ready version
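The idea of warping a masking map along motion can be sketched with a toy nearest-neighbour backward warp. This is a hedged illustration only, not the MGMAE implementation: the function name `warp_mask_backward`, the integer-valued flow, the border clamping, and the convention that 1 marks a visible (unmasked) token are all assumptions made for this example.

```python
def warp_mask_backward(prev_mask, backward_flow):
    """Warp a 2-D mask from frame t-1 to frame t.

    prev_mask: list of rows holding 0/1 values for frame t-1.
    backward_flow: same shape; backward_flow[y][x] = (dy, dx) points from
        pixel (y, x) in frame t to its source location in frame t-1.
    Source coordinates falling outside the frame are clamped to the border.
    """
    height = len(prev_mask)
    width = len(prev_mask[0])
    warped = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = backward_flow[y][x]
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            warped[y][x] = prev_mask[src_y][src_x]
    return warped


# One visible token in the centre; every pixel's source lies one step to
# the left, so the visible token moves one step to the right.
prev_mask = [[0, 0, 0],
             [0, 1, 0],
             [0, 0, 0]]
flow = [[(0, -1)] * 3 for _ in range(3)]
next_mask = warp_mask_backward(prev_mask, flow)
```

Repeating the warp frame by frame would carry the same visible positions along the motion, which is one simple way to picture a temporally consistent masking volume.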

arXiv.org e-Print Archive