FreePSI: an alignment-free approach to estimating exon-inclusion ratios without a reference transcriptome.
Alternative splicing plays an important role in many cellular processes of eukaryotic organisms. The exon-inclusion ratio, also known as percent spliced in, is often regarded as one of the most effective measures of alternative splicing events. The existing methods for estimating exon-inclusion ratios at the genome scale all require a reference transcriptome. In this paper, we propose an alignment-free method, FreePSI, to perform genome-wide estimation of exon-inclusion ratios from RNA-Seq data without relying on the guidance of a reference transcriptome. It uses a novel probabilistic generative model based on k-mer profiles to quantify the exon-inclusion ratios at the genome scale, and an efficient expectation-maximization algorithm based on a divide-and-conquer strategy and an ultrafast conjugate gradient projection descent method to solve the model. We compare FreePSI with the existing methods on simulated and real RNA-Seq data in terms of both accuracy and efficiency and show that it achieves very good performance even though a reference transcriptome is not provided. Our results suggest that FreePSI may have important applications in alternative splicing analysis for organisms that do not have quality reference transcriptomes. FreePSI is implemented in C++ and freely available to the public on GitHub.
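For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the usual count-based definition of the exon-inclusion ratio (percent spliced in). It is not FreePSI's k-mer model or its expectation-maximization solver; the function name and the read-count inputs are hypothetical, and the sketch assumes the counts have already been normalized for effective length.

```python
def percent_spliced_in(inclusion_reads: float, exclusion_reads: float) -> float:
    """Textbook exon-inclusion ratio: the share of evidence supporting inclusion.

    Assumption: both arguments are non-negative, length-normalized counts of
    reads supporting the inclusion and the exclusion (skipping) of one exon.
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        # No evidence either way, so the ratio is undefined.
        raise ValueError("no reads support inclusion or exclusion")
    return inclusion_reads / total


# Example: 30 inclusion reads and 10 exclusion reads give a ratio of 0.75.
print(percent_spliced_in(30, 10))
```

A ratio of 1.0 means the exon is always included and 0.0 means it is always skipped; a genome-scale estimator such as the one described above has to infer these ratios jointly rather than from per-exon counts.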
Integrated Governance of Scenarized Space and Community — Reform of Beijing Qianggen Community Service Station and Enlightenment
Community governance is central to grassroots governance in China. Micro-governance and micro-reform that start from the community service station are a meaningful way to explore improvements in grassroots governance. Focusing on the reform of community service stations in Beijing, and set against the background of service station reform, this paper describes in detail the history, content and characteristics of the comprehensive-setting reform of Qianggen Community in G Subdistrict, Xicheng District, Beijing, and conducts an in-depth analysis based on “The Theory of Scenes” and “The Theory of Governance”. The author holds that, after transformation and upgrading, community service stations take on new roles, create new scenarios and shape new mechanisms, becoming governance centers that connect multiple parties, respond better to residents' needs and improve the effectiveness of community governance. The reform practice is committed to generating scenarized social space and promotes the emergence of an integrated governance pattern. On this basis, the author further considers issues related to grassroots governance and puts forward several suggestions for deepening reform.
Knowledge Prompt-tuning for Sequential Recommendation
Pre-trained language models (PLMs), which are used to extract general
knowledge, have demonstrated strong performance in sequential recommendation
(SR). However, existing methods still lack domain knowledge and struggle
to capture users' fine-grained preferences. Meanwhile, many traditional SR
methods mitigate this issue by integrating side information, but suffer from
information loss. In short, we believe that a good recommendation system
should utilize both general and domain knowledge simultaneously. Therefore, we
introduce an external knowledge base and propose Knowledge Prompt-tuning for
Sequential Recommendation (KP4SR). Specifically, we construct a set of
relationship templates and transform a structured knowledge graph (KG) into
knowledge prompts to solve the problem of the semantic gap. However, knowledge
prompts disrupt the original data structure and introduce a significant amount
of noise. We further construct a knowledge tree and propose a knowledge tree
mask, which restores the data structure in a mask matrix form, thus mitigating
the noise problem. We evaluate KP4SR on three real-world datasets, and
experimental results show that our approach outperforms state-of-the-art
methods on multiple evaluation metrics. Specifically, compared with PLM-based
methods, our method improves NDCG@5 and HR@5 by 40.65% and
36.42% on the books dataset, 11.17% and
11.47% on the music dataset, and 22.17% and
19.14% on the movies dataset, respectively. Our code is
publicly available at https://github.com/zhaijianyang/KP4SR.
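The two metrics quoted above can be illustrated with a minimal sketch. It assumes the common sequential-recommendation setup in which each test case has exactly one ground-truth next item, so the ideal DCG equals 1; this is an assumption, not a detail taken from the abstract, and the function names are hypothetical rather than part of the KP4SR code.

```python
import math
from typing import Hashable, Sequence


def hit_ratio_at_k(ranked_items: Sequence[Hashable], target: Hashable, k: int = 5) -> float:
    """HR@k: 1.0 if the ground-truth item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items: Sequence[Hashable], target: Hashable, k: int = 5) -> float:
    """NDCG@k with a single relevant item: 1 / log2(position + 1), position 1-based."""
    top_k = list(ranked_items[:k])
    if target not in top_k:
        return 0.0
    zero_based_rank = top_k.index(target)
    # A hit at the first position scores 1 / log2(2) = 1.0.
    return 1.0 / math.log2(zero_based_rank + 2)


ranking = ["a", "b", "c", "d", "e", "f"]
print(hit_ratio_at_k(ranking, "c"))  # 1.0: "c" is inside the top 5
print(ndcg_at_k(ranking, "c"))       # 0.5: third position gives 1 / log2(4)
print(hit_ratio_at_k(ranking, "f"))  # 0.0: "f" is ranked sixth
```

Dataset-level scores are the mean of these per-case values over all test users, so HR@5 rewards any hit in the top five while NDCG@5 additionally rewards placing the hit higher.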
DREAM: Efficient Dataset Distillation by Representative Matching
Dataset distillation aims to synthesize small datasets with little
information loss from original large-scale ones for reducing storage and
training costs. Recent state-of-the-art methods mainly constrain the sample
synthesis process by matching synthetic images and the original ones regarding
gradients, embedding distributions, or training trajectories. Although there
are various matching objectives, currently the strategy for selecting original
images is limited to naive random sampling.
We argue that random sampling overlooks the evenness of the selected sample
distribution, which may result in noisy or biased matching targets.
Besides, the sample diversity is also not constrained by random sampling.
These factors together lead to optimization instability in the distilling
process and degrade the training efficiency. Accordingly, we propose a novel
matching strategy named Dataset distillation by
REpresentAtive Matching (DREAM), in which only
representative original images are selected for matching. DREAM can be
easily plugged into popular dataset distillation frameworks and reduces the
distilling iterations by more than 8 times without a performance drop. Given
sufficient training time, DREAM further provides significant improvements and
achieves state-of-the-art performance.
Universal Sleep Decoder: Aligning awake and sleep neural representation across subjects
Decoding memory content from brain activity during sleep has long been a goal
in neuroscience. While spontaneous reactivation of memories during sleep in
rodents is known to support memory consolidation and offline learning,
capturing memory replay in humans is challenging due to the absence of
well-annotated sleep datasets and the substantial differences in neural
patterns between wakefulness and sleep. To address these challenges, we
designed a novel cognitive neuroscience experiment and collected a
comprehensive, well-annotated electroencephalography (EEG) dataset from 52
subjects during both wakefulness and sleep. Leveraging this benchmark dataset,
we developed the Universal Sleep Decoder (USD) to align neural representations
between wakefulness and sleep across subjects. Our model achieves up to 16.6%
top-1 zero-shot accuracy on unseen subjects, comparable to decoding
performances using individual sleep data. Furthermore, fine-tuning USD on test
subjects enhances decoding accuracy to 25.9% top-1 accuracy, a substantial
improvement over the baseline chance of 6.7%. Model comparison and ablation
analyses reveal that our design choices, including the use of (i) an additional
contrastive objective to integrate awake and sleep neural signals and (ii) the
pretrain-finetune paradigm to incorporate different subjects, significantly
contribute to these performances. Collectively, our findings and methodologies
represent a significant advancement in the field of sleep decoding.
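To put the reported accuracies in context, the short sketch below expresses them as multiples of the stated 6.7% chance level. The class count it prints is an inference that assumes uniformly likely classes; the abstract does not state the number of classes, and the function names are hypothetical.

```python
def lift_over_chance(accuracy_pct: float, chance_pct: float) -> float:
    """How many times better than chance a top-1 accuracy is."""
    if chance_pct <= 0:
        raise ValueError("chance level must be positive")
    return accuracy_pct / chance_pct


def implied_num_classes(chance_pct: float) -> int:
    """Class count implied by a chance level, assuming equally likely classes."""
    return round(100.0 / chance_pct)


CHANCE_PCT = 6.7  # baseline chance quoted in the abstract

print(round(lift_over_chance(16.6, CHANCE_PCT), 2))  # zero-shot on unseen subjects
print(round(lift_over_chance(25.9, CHANCE_PCT), 2))  # after fine-tuning on test subjects
print(implied_num_classes(CHANCE_PCT))               # inferred, not stated in the source
```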
PepGB: Facilitating peptide drug discovery via graph neural networks
Peptides offer great biomedical potential and serve as promising drug
candidates. Currently, the majority of approved peptide drugs are directly
derived from well-explored natural human peptides. It is quite necessary to
utilize advanced deep learning techniques to identify novel peptide drugs in
the vast, unexplored biochemical space. Although various in silico methods
have been developed to accelerate peptide early drug discovery, existing
models suffer from overfitting and poor generalizability due to the
limited size, imbalanced distribution and inconsistent quality of experimental
data. In this study, we propose PepGB, a deep learning framework to facilitate
peptide early drug discovery by predicting peptide-protein interactions
(PepPIs). Employing graph neural networks, PepGB incorporates a fine-grained
perturbation module and a dual-view objective with contrastive learning-based
peptide pre-trained representation to predict PepPIs. Through rigorous
evaluations, we demonstrated that PepGB greatly outperforms baselines and can
accurately identify PepPIs for novel targets and peptide hits, thereby
contributing to the target identification and hit discovery processes. Next, we
derive an extended version, diPepGB, to tackle the bottleneck of modeling
highly imbalanced data prevalent in lead generation and optimization processes.
Utilizing directed edges to represent relative binding strength between two
peptide nodes, diPepGB achieves superior performance in real-world assays. In
summary, our proposed frameworks can serve as potent tools to facilitate
peptide early drug discovery.