312 research outputs found

FreePSI: an alignment-free approach to estimating exon-inclusion ratios without a reference transcriptome.

Authors: Jiang Tao, Ma Shining, Wang Dongfang, Zeng Jianyang, Zhou Jianyu
Publication venue: eScholarship, University of California
Publication date: 09/11/2017

Alternative splicing plays an important role in many cellular processes of eukaryotic organisms. The exon-inclusion ratio, also known as percent spliced in, is often regarded as one of the most effective measures of alternative splicing events. The existing methods for estimating exon-inclusion ratios at the genome scale all require a reference transcriptome. In this paper, we propose an alignment-free method, FreePSI, to perform genome-wide estimation of exon-inclusion ratios from RNA-Seq data without relying on the guidance of a reference transcriptome. It uses a novel probabilistic generative model based on k-mer profiles to quantify the exon-inclusion ratios at the genome scale, and an efficient expectation-maximization algorithm based on a divide-and-conquer strategy and an ultrafast conjugate gradient projection descent method to solve the model. We compare FreePSI with the existing methods on simulated and real RNA-Seq data in terms of both accuracy and efficiency and show that it achieves very good performance even though a reference transcriptome is not provided. Our results suggest that FreePSI may have important applications in alternative splicing analysis for organisms that do not have quality reference transcriptomes. FreePSI is implemented in C++ and freely available to the public on GitHub.

Crossref

eScholarship - University of California
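
The FreePSI abstract above refers to k-mer profiles. As a minimal sketch of that notion only (not FreePSI's generative model or its C++ implementation), a k-mer profile can be built by counting every length-k substring across a set of reads. The function name `kmer_profile` and the toy reads are hypothetical, chosen for illustration.

```python
from collections import Counter


def kmer_profile(reads, k):
    """Count every length-k substring (k-mer) across a collection of reads.

    Illustrative helper only; the name and interface are assumptions,
    not part of FreePSI.
    """
    counts = Counter()
    for read in reads:
        # Slide a window of width k along the read, one base at a time.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy reads, invented for this example.
profile = kmer_profile(["ACGTAC", "CGTACG"], k=3)
```

Each read of length n contributes n - k + 1 k-mers, so no alignment to a reference is needed to build the profile.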

Integrated Governance of Scenarized Space and Community — Reform of Beijing Qianggen Community Service Station and Enlightenment

Authors: Jianyang Zhou, Xuemei Wang
Publication venue: The Russian Presidential Academy of National Economy and Public Administration
Publication date: 01/01/2020

Community governance is significant for grass-roots governance in China. Micro-governance and micro-reform starting from community service stations are a meaningful way to explore the improvement of grass-roots governance. Focusing on the reform of community service stations in Beijing, and taking the background of service station reform into account, this paper describes in detail the history, content and characteristics of the reform of the comprehensive setting of Qianggen Community in G Subdistrict of Xicheng District, Beijing, and conducts an in-depth analysis based on “The Theory of Scenes” and “The Theory of Governance”. The author holds that community service stations, having taken on new roles, created new scenarios and shaped new mechanisms through transformation and upgrading, have become governance centers that connect multiple parties, respond better to the needs of residents and improve the effectiveness of community governance. The reform practice is committed to generating scenarized social space and promoting the manifestation of an integrated governance pattern. The author is inspired to consider the issues related to grass-roots governance further and to put forward several suggestions for deepening reform.

Directory of Open Access Journals

SSOAR - Social Science Open Access Repository

Knowledge Prompt-tuning for Sequential Recommendation

Authors: Li Hui, Tian Yonghong, Wang Chang-Dong, Zhai Jianyang, Zheng Xiawu
Publication date: 14/08/2023

Pre-trained language models (PLMs), which are used to extract general knowledge, have demonstrated strong performance in sequential recommendation (SR). However, existing methods still lack domain knowledge and struggle to capture users' fine-grained preferences. Meanwhile, many traditional SR methods address this issue by integrating side information, but suffer from information loss. In summary, we believe that a good recommendation system should utilize both general and domain knowledge simultaneously. Therefore, we introduce an external knowledge base and propose Knowledge Prompt-tuning for Sequential Recommendation (KP4SR). Specifically, we construct a set of relationship templates and transform a structured knowledge graph (KG) into knowledge prompts to solve the problem of the semantic gap. However, knowledge prompts disrupt the original data structure and introduce a significant amount of noise. We further construct a knowledge tree and propose a knowledge tree mask, which restores the data structure in a mask matrix form, thus mitigating the noise problem. We evaluate KP4SR on three real-world datasets, and experimental results show that our approach outperforms state-of-the-art methods on multiple evaluation metrics. Specifically, compared with PLM-based methods, our method improves NDCG@5 and HR@5 by 40.65% and 36.42% on the books dataset, 11.17% and 11.47% on the music dataset, and 22.17% and 19.14% on the movies dataset, respectively. Our code is publicly available at https://github.com/zhaijianyang/KP4SR.

arXiv.org e-Print Archive
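
The KP4SR abstract above reports gains in NDCG@5 and HR@5. As a minimal sketch of how these two metrics are commonly computed, assuming a single ground-truth next item per user (the abstract does not state the exact evaluation protocol), the helper names `hit_rate_at_k` and `ndcg_at_k` and the toy ranking below are hypothetical and are not taken from the KP4SR code.

```python
import math


def hit_rate_at_k(ranked_items, target, k):
    """Return 1.0 if the target item appears in the top-k ranking, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@k under the single-relevant-item assumption.

    With exactly one relevant item, the ideal DCG is 1, so NDCG reduces
    to 1 / log2(rank + 1), where rank is the 1-based position of the target.
    """
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1
    return 1.0 / math.log2(rank + 1)


# Toy ranking for one user, invented for this example.
ranking = ["a", "b", "c", "d", "e", "f"]
hr = hit_rate_at_k(ranking, "c", k=5)
ndcg = ndcg_at_k(ranking, "c", k=5)
```

In practice both values are averaged over all test users. HR@k only records whether the target was retrieved, while NDCG@k also rewards placing it nearer the top.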

DREAM: Efficient Dataset Distillation by Representative Matching

Authors: Gu Jianyang, Jiang Wei, Liu Yanqing, Wang Kai, You Yang, Zhu Zheng
Publication date: 30/08/2023

Dataset distillation aims to synthesize small datasets with little information loss from original large-scale ones in order to reduce storage and training costs. Recent state-of-the-art methods mainly constrain the sample synthesis process by matching synthetic images and the original ones with respect to gradients, embedding distributions, or training trajectories. Although there are various matching objectives, the strategy for selecting original images is currently limited to naive random sampling. We argue that random sampling overlooks the evenness of the selected sample distribution, which may result in noisy or biased matching targets. In addition, random sampling does not constrain sample diversity. Together, these factors lead to optimization instability in the distilling process and degrade training efficiency. Accordingly, we propose a novel matching strategy named Dataset distillation by REpresentAtive Matching (DREAM), where only representative original images are selected for matching. DREAM can be easily plugged into popular dataset distillation frameworks and reduces the distilling iterations by more than 8 times without a performance drop. Given sufficient training time, DREAM further provides significant improvements and achieves state-of-the-art performance. Comment: Efficient matching for dataset distillatio

arXiv.org e-Print Archive

Universal Sleep Decoder: Aligning awake and sleep neural representation across subjects

Authors: Chen Zhongtao, Liu Yunzhe, Wang Haiteng, Zheng Hui, Zheng Lin, Zhou Jianyang
Publication date: 28/09/2023

Decoding memory content from brain activity during sleep has long been a goal in neuroscience. While spontaneous reactivation of memories during sleep in rodents is known to support memory consolidation and offline learning, capturing memory replay in humans is challenging due to the absence of well-annotated sleep datasets and the substantial differences in neural patterns between wakefulness and sleep. To address these challenges, we designed a novel cognitive neuroscience experiment and collected a comprehensive, well-annotated electroencephalography (EEG) dataset from 52 subjects during both wakefulness and sleep. Leveraging this benchmark dataset, we developed the Universal Sleep Decoder (USD) to align neural representations between wakefulness and sleep across subjects. Our model achieves up to 16.6% top-1 zero-shot accuracy on unseen subjects, comparable to the decoding performance obtained using individual sleep data. Furthermore, fine-tuning USD on test subjects raises top-1 decoding accuracy to 25.9%, a substantial improvement over the chance level of 6.7%. Model comparison and ablation analyses reveal that our design choices, including the use of (i) an additional contrastive objective to integrate awake and sleep neural signals and (ii) the pretrain-finetune paradigm to incorporate different subjects, contribute significantly to this performance. Collectively, our findings and methodologies represent a significant advancement in the field of sleep decoding.

arXiv.org e-Print Archive

PepGB: Facilitating peptide drug discovery via graph neural networks

Authors: Fang Meng, Lei Yipin, Li Han, Li Xiang, Wang Xu, Zeng Jianyang
Publication date: 26/01/2024

Peptides offer great biomedical potential and serve as promising drug candidates. Currently, the majority of approved peptide drugs are directly derived from well-explored natural human peptides. Advanced deep learning techniques are needed to identify novel peptide drugs in the vast, unexplored biochemical space. Although various in silico methods have been developed to accelerate early peptide drug discovery, existing models suffer from overfitting and poor generalizability due to the limited size, imbalanced distribution and inconsistent quality of experimental data. In this study, we propose PepGB, a deep learning framework to facilitate early peptide drug discovery by predicting peptide-protein interactions (PepPIs). Employing graph neural networks, PepGB incorporates a fine-grained perturbation module and a dual-view objective with contrastive learning-based peptide pre-trained representation to predict PepPIs. Through rigorous evaluations, we demonstrate that PepGB greatly outperforms baselines and can accurately identify PepPIs for novel targets and peptide hits, thereby contributing to the target identification and hit discovery processes. Next, we derive an extended version, diPepGB, to tackle the bottleneck of modeling the highly imbalanced data prevalent in lead generation and optimization processes. Utilizing directed edges to represent the relative binding strength between two peptide nodes, diPepGB achieves superior performance in real-world assays. In summary, our proposed frameworks can serve as potent tools to facilitate early peptide drug discovery.

arXiv.org e-Print Archive