729 research outputs found

Recommended from our members

Computational Strategies for Scalable Genomics Analysis.

Author: Shi Lizhen
Wang Zhong
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms used for genomics analysis. Various big data technologies have been explored to scale current bioinformatics solutions up or out so that they can mine these large genomics datasets. In this review, we survey some of these developments in the application of parallel distributed computing and special hardware to genomics. We comment on the pros and cons of each strategy in terms of ease of development, robustness, scalability, and efficiency. Although this review is written for an audience from the genomics and bioinformatics fields, it may also be informative for computer scientists with an interest in genomics applications.

eScholarship - University of California

An Investigation in Efficient Spatial Patterns Mining

Author: Wang Lizhen

Technical progress in computerized spatial data acquisition and storage has led to the growth of vast spatial databases. Faced with large and increasing amounts of spatial data, end users find it harder to understand the data without useful knowledge extracted from the spatial databases. Thus, spatial data mining has been brought under the umbrella of data mining and is attracting more attention. Spatial data mining presents challenges. Unlike ordinary data, spatial data includes not only positional data and attribute data, but also spatial relationships among spatial events. Further, the instances of spatial events are embedded in a continuous space and share a variety of spatial relationships, so the mining of spatial patterns demands new techniques. In this thesis, several contributions were made. Some new techniques were proposed, i.e., fuzzy co-location mining, CPI-tree (Co-location Pattern Instance Tree), maximal co-location patterns mining, AOI-ags (Attribute-Oriented Induction based on Attributes’ Generalization Sequences), and fuzzy association prediction. Three algorithms were put forward for co-location pattern mining: the fuzzy co-location mining algorithm, the CPI-tree based co-location mining algorithm (CPI-tree algorithm), and the order-clique-based maximal prevalence co-location mining algorithm (order-clique-based algorithm). An attribute-oriented induction algorithm based on attributes’ generalization sequences (AOI-ags algorithm) was also given, which unifies the attribute thresholds and the tuple thresholds. A fuzzy association prediction algorithm was designed and applied to two real-world databases with time-series data. A cell-based spatial object fusion algorithm was also proposed. Two fuzzy clustering methods using domain knowledge were proposed: the Natural Method and the Graph-Based Method, both of which are controlled by a threshold. The threshold was confirmed by polynomial regression.
Finally, a prototype system for spatial co-location pattern mining was developed to show the relative efficiency of the proposed co-location techniques. The techniques presented in the thesis focus on improving the feasibility, usefulness, effectiveness, and scalability of the related algorithms. In the design of the fuzzy co-location mining algorithm, a new data structure, the binary partition tree, was proposed to improve the process of fuzzy equivalence partitioning. A prefix-based approach that partitions the prevalent event set search space into subsets, where each sub-problem can be solved in main memory, was also presented. The scalability of the CPI-tree algorithm is guaranteed since it does not require expensive spatial joins or instance joins for identifying co-location table instances. In the order-clique-based algorithm, the co-location table instances do not need to be stored after computing the Pi value of the corresponding co-location, which dramatically reduces the execution time and space required for mining maximal co-locations. Several techniques, for example, partitions, equivalence partition trees, pruning optimization strategies, and interestingness, were used to improve the efficiency of the AOI-ags algorithm. To implement the fuzzy association prediction algorithm, the “growing window” and proximity computation pruning were introduced to reduce both I/O and CPU costs in computing the fuzzy semantic proximity between time series. For the new techniques and algorithms, theoretical analysis and experimental results on synthetic and real-world datasets were presented and discussed in the thesis.
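
The abstract refers to the "Pi value" of a co-location without defining it. Assuming this denotes the participation index commonly used in the co-location mining literature (the minimum, over the features in a pattern, of the fraction of that feature's instances that appear in the pattern's table instance), a minimal Python sketch of the computation might look as follows. The function name, the data layout, and the toy data are hypothetical illustrations and are not taken from the thesis; none of the thesis's own algorithms (CPI-tree, order-clique-based) are reproduced here.

```python
from typing import Dict, List, Tuple


def participation_index(
    features: Tuple[str, ...],
    table_instance: List[Tuple[str, ...]],
    feature_counts: Dict[str, int],
) -> float:
    """Participation index of one co-location pattern (assumed definition).

    features:        the features in the pattern, e.g. ("A", "B")
    table_instance:  rows of co-located instance ids, one id per feature,
                     in the same order as `features`
    feature_counts:  total number of instances of each feature in the data
    """
    ratios = []
    for col, feature in enumerate(features):
        # Distinct instances of this feature that take part in the pattern.
        participating = {row[col] for row in table_instance}
        ratios.append(len(participating) / feature_counts[feature])
    # The index is the smallest participation ratio among the features.
    return min(ratios)


# Hypothetical toy data: feature A has 4 instances, feature B has 2.
features = ("A", "B")
table_instance = [("A1", "B1"), ("A2", "B1"), ("A2", "B2")]
feature_counts = {"A": 4, "B": 2}

pi = participation_index(features, table_instance, feature_counts)
is_prevalent = pi >= 0.4  # 0.4 is an arbitrary example threshold
```

In this toy case two of the four A instances and both B instances participate, so the ratios are 0.5 and 1.0 and the index is the smaller of the two. Under this definition, a pattern is usually called prevalent when its index meets a user-chosen threshold.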

University of Huddersfield Repository

Deconvolute individual genomes from metagenome sequences through short read clustering.

Author: Deng Li
Li Kexue
Lu Yakang
Shi Lizhen
Wang Lili
Wang Zhong
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Metagenome assembly from short next-generation sequencing data is a challenging process due to its large scale and computational complexity. Clustering short reads by species before assembly offers a unique opportunity for parallel downstream assembly of genomes with individualized optimization. However, current read clustering methods suffer from either false negative (under-clustering) or false positive (over-clustering) problems. Here we extended our previous read clustering software, SpaRC, by exploiting statistics derived from multiple samples in a dataset to reduce the under-clustering problem. Using synthetic and real-world datasets, we demonstrated that this method has the potential to cluster almost all of the short reads from genomes with sufficient sequencing coverage. The improved read clustering in turn leads to improved downstream genome assembly quality.

eScholarship - University of California

Acupuncture for Cancer Patients: Practice and Research

Author: Bao Ting
Wang Lizhen
Publication venue: 'IntechOpen'
Publication date: 06/03/2013

IntechOpen

<Notes> On the Kinds and Idols of Ewenki People's Deities in North-east China

Author: Wang Lizhen
汪立珍
Publication venue: Lecturer, Institute of Ethnic Minority Literature and Art, Central University for Nationalities, China
Publication date: 30/03/2000

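The Ewenki people live mainly in the Hulunbuir League of the Inner Mongolia Autonomous Region of China. According to the Chinese population census conducted in 1992, the word "Ewenki" of the Ewenki people nationwide is the name this people uses for itself, and means "a person who came down from the mountains" or "the people who have come down from the mountains". ...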

Tsukuba Repository

On Power Law Scaling Dynamics for Time-fractional Phase Field Models during Coarsening

Author: Chen Lizhen
Wang Hong
Zhao Jia
Publication date: 14/03/2018

In this paper, we study phase field models of fractional order in time. Phase field models have been widely used to study the coarsening dynamics of material systems with microstructures. It is known that phase field models are usually derived from energy variation, so they intrinsically obey certain energy dissipation laws. Recently, many works have been published on fractional-order phase field models, but little is known about the corresponding energy dissipation laws. We focus on the time-fractional phase field models and report that the effective free energy and roughness obey universal power-law scaling dynamics during coarsening. Specifically, the effective free energy and roughness in the time-fractional phase field models follow a power law similar to that of the integer-order phase field models, where the power is linearly proportional to the fractional order. This universal scaling law is verified numerically against several phase field models, including the Cahn-Hilliard equations with different variable mobilities and molecular beam epitaxy models. This new finding sheds light on potential applications of time-fractional phase field models in studying coarsening dynamics and crystal growth.
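
One way to read the scaling statement in this abstract is sketched below. The symbols are assumptions introduced only for illustration: $Q$ stands for either the effective free energy or the roughness, $p$ for the model-dependent exponent of the integer-order model, and $\alpha$ for the fractional order in time. The abstract gives no numerical exponents, so none are assumed here.

```latex
% Integer-order model (assumed notation): power law with exponent p
Q(t) \sim t^{\,p}
% Time-fractional model of order \alpha: the exponent is scaled linearly by \alpha
\qquad\Longrightarrow\qquad
Q_{\alpha}(t) \sim t^{\,\alpha p}, \qquad 0 < \alpha \le 1 .
```

With $\alpha = 1$ this reduces to the integer-order power law, which is consistent with the abstract's description of the two laws as similar.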

arXiv.org e-Print Archive

DigitalCommons@USU