Studying microbial community diversities by developing high-throughput experimental techniques and computational tools
Since the advent of high-throughput technologies, the understanding of microbial biodiversity has been rapidly transformed. Amplicon sequencing of phylogenetic markers, especially 16S rRNA genes, has become a widely adopted tool for discovering microbial taxonomic diversity in virtually all habitats, whether aquatic or terrestrial, local or global. Although high-throughput sequencing, such as Illumina-based technologies (e.g. MiSeq), has revolutionized microbial ecology, applying amplicon sequencing to environmental microbial community analysis is challenging because of the low base diversity of the target region. In this study, a new phasing amplicon sequencing (PAS) approach was developed that shifts the sequencing phases among different community samples from both directions by adding various numbers of bases (0–7) as spacers to both forward and reverse primers. First, our results indicated that the PAS method substantially ameliorated the problem of unbalanced base composition. Second, the PAS method substantially improved sequence read base quality (on average 10 % more bases above Q30). Third, the PAS method effectively increased raw sequence throughput (~15 % more raw reads). In addition, the PAS method significantly increased the number of effective reads (by 9–47 %) and the effective read length (by 16–96 bases) after quality trimming at Q30 with a window of 5. The PAS method also roughly halved the sequencing error rate (0.54–1.1 % lower). Finally, the two-step PCR amplification of the PAS method effectively ameliorated the amplification biases introduced by long-barcoded PCR primers. The developed strategy is robust for 16S rRNA gene amplicon sequencing, and a similar strategy could be used for sequencing other genes important to ecosystem functional processes.
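The phasing idea described above can be illustrated with a minimal Python sketch. It is not the authors' primer design: the primer sequence, the spacer bases, and the function names `phased_primers` and `bases_at_cycle` are hypothetical placeholders, and only the 0–7 spacer range comes from the abstract. The sketch prepends spacers of increasing length to a primer and counts which bases are read at a given sequencing cycle, which shows why phasing raises base diversity at each cycle.

```python
from collections import Counter


def phased_primers(primer: str, max_spacer: int = 7, filler: str = "ACGT") -> list[str]:
    """Return primer variants carrying 0..max_spacer spacer bases at the 5' end.

    The spacer bases are arbitrary placeholders drawn cyclically from `filler`;
    a real design would choose them deliberately.
    """
    variants = []
    for n in range(max_spacer + 1):
        spacer = "".join(filler[(n + i) % len(filler)] for i in range(n))
        variants.append(spacer + primer)
    return variants


def bases_at_cycle(reads: list[str], cycle: int) -> Counter:
    """Count the bases observed at one sequencing cycle (0-based) across reads."""
    return Counter(read[cycle] for read in reads if cycle < len(read))


# Hypothetical primer, used only for illustration.
PRIMER = "GATTACAGATTACA"

unphased = [PRIMER] * 8          # every read starts at the same position
phased = phased_primers(PRIMER)  # reads start 0..7 bases out of phase

print("cycle 0, unphased:", dict(bases_at_cycle(unphased, 0)))
print("cycle 0, phased:  ", dict(bases_at_cycle(phased, 0)))
```

Without spacers every read shows the same base at each cycle; with spacers the same primer position is read at different cycles in different samples, so each cycle sees a mixture of bases.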
To facilitate the analysis of data produced by amplicon sequencing technologies, a data analysis pipeline was developed; it now serves more than 200 users with data processing and preliminary analysis of amplicon sequences. Publicly available pipelines, such as QIIME (Caporaso, Kuczynski et al. 2010, Caporaso, Lauber et al. 2012) and MOTHUR (Schloss, Westcott et al. 2009), are mostly standalone tools and require at least minimal programming skills to perform the analysis. Our pipeline provides a more user-friendly web interface, so users only need to click buttons rather than type command lines to perform basic data analysis. Besides convenient operation, the Galaxy platform provides an organized way to upload, store, track and share the data histories of different projects. The pipeline is also flexible enough to incorporate new programs developed by others, and the data source is not limited to 16S rRNA genes but also includes functional gene amplicon sequences. The pipeline has served the research community for several years, and more than a dozen papers have been published using it.
As a practical application, amplicon sequencing was then used to investigate the biodiversity of soil fungal communities in six North American forests. Fungal biodiversity has been studied across many habitats, but the spatial patterns of fungal diversity and the possible mechanisms behind them still need exploration. In this study, soil fungal samples were collected from six forest sites across a wide range of latitudes in North America, with a nested design at each site, to uncover the diversity patterns of soil fungal communities in forest systems. Fungal richness follows a clear latitudinal gradient, and temperature, precipitation, pH and nitrogen concentration also contribute to predicting the richness of the soil fungal communities. The compositions of the fungal communities are distinct across the six forest sites. The main drivers of the α-diversity of fungi in forest soil are latitude, along with mean annual temperature, precipitation, soil pH, soil total carbon, and soil total nitrogen. These seven variables can be used to predict the α-diversity of the soil fungal communities, and they alone explain more than 70% of the variance. As for β-diversity, the dissimilarity among fungal communities increases significantly as the distance between sampling sites grows. The distance-decay curve captures this pattern and indicates that the turnover rates of fungal species differ between the local and continental scales. We further showed that the key drivers of differences in fungal community composition depend strongly on the spatial scale, and that geographic distance is the major contributor to these differences. In summary, this study of fungal communities in North American forest soils has revealed several patterns along with the possible drivers behind them, which offers insights into the nature of soil fungal communities.
Now that advanced high-throughput technologies have enabled researchers to gain unprecedented insights into the diversity of microbial communities without culturing and identifying individuals, merely knowing the answer to "who is there" is no longer enough; the question now is how to link the "measurable" community structures to ecosystem functioning. If this connection can be established, it becomes possible to understand how the disturbances brought by human activities and global climate change will alter the ecosystem functioning carried out by microbial communities. Functional diversity, which measures the range of things that organisms do in the surrounding ecosystem, has shown its power in linking microbial communities to the dynamics of ecosystems. In the final part of this study, we provide a framework that uses Rao's entropy to quantify microbial functional diversity based on GeoChip (a high-throughput functional gene array), in which the phylogenetic distances between probes are taken into account. This index falls into the category of trait-based functional diversity, with the advantage that key functional traits related to ecosystem functioning are pre-selected in the GeoChip design. The functional diversity index can be partitioned into α- and β-diversity, which extends the understanding of functional diversity patterns to different temporal or spatial scales. Functional redundancy can also be defined following the definition of functional diversity; it is more a measure of gene similarity or convergence than the traditionally defined "functional redundancy" for multiple functionalities in an ecosystem. Given the hypothesis that sequence similarity leads to functional similarity, the new definition of functional redundancy can reveal the level of redundancy of functional traits within the same gene.
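For orientation, Rao's entropy in its standard form is Q = Σ_i Σ_j d_ij p_i p_j, where p_i is the relative abundance of item i and d_ij is the distance between items i and j. The Python sketch below computes this standard quantity; it assumes that probe signal intensities act as abundances and that a symmetric probe-by-probe phylogenetic distance matrix is available. The function name and the example numbers are hypothetical, and any normalization or partitioning specific to the authors' framework is not reproduced.

```python
def rao_quadratic_entropy(abundances: list[float], distances: list[list[float]]) -> float:
    """Standard Rao's quadratic entropy: sum over i, j of d_ij * p_i * p_j.

    abundances: non-negative values, one per probe (assumed to be signal intensities).
    distances:  square, symmetric matrix of pairwise distances with a zero diagonal.
    """
    n = len(abundances)
    if any(len(row) != n for row in distances) or len(distances) != n:
        raise ValueError("distance matrix must be n x n")
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    p = [a / total for a in abundances]  # relative abundances
    return sum(distances[i][j] * p[i] * p[j] for i in range(n) for j in range(n))


# Hypothetical example: three probes of one gene with made-up distances.
intensities = [10.0, 10.0, 10.0]
unit_distances = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
identical = [[0.0] * 3 for _ in range(3)]

q_distinct = rao_quadratic_entropy(intensities, unit_distances)
q_identical = rao_quadratic_entropy(intensities, identical)
print(q_distinct, q_identical)
```

When all pairwise distances equal 1 the index reduces to the Gini-Simpson index (1 − Σ p_i²), and when all probes are phylogenetically identical it is zero, which is the intuition behind reading low values as high similarity among probes of the same gene.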
We applied this functional diversity framework to study the dynamic changes over a 9-month period in the microbial communities of a contaminated groundwater system (containing U(VI), SO₄²⁻, NO₃⁻, etc.) after a one-time amendment with EVO (emulsified vegetable oil), which has been shown to reduce U(VI) effectively for a considerable period (around one year). Using acetate production as the measure of the EVO degradation process, the functional diversity of the key gene responsible for EVO degradation correlated significantly with the function itself (R² = 0.685, p = 0.021), whereas other functional indices such as gene richness did not show such a strong relationship. When functional diversity was used to profile the functional structure of the whole community, statistical tests also showed that changes in environmental variables do shift the community functional structure, while this connection is not as clear when other indices are used to represent the community functional structure. In summary, the new framework of functional diversity integrates both functional traits and their phylogenetic signals, and proved to be a more sensitive indicator of ecological functions than the traditionally used gene richness.
Research on Online Reviews Reliability
This study examines the factors that affect the reliability of online reviews. A theoretical framework was built and empirically tested with a sample of 200 interviewees. Results of the structural equation model show that online review quality and perceived risk have a positive impact on online review reliability. In turn, online review value and number have a positive impact on online review quality, while customer involvement and reviewer acceptance have a positive impact on perceived risk. The results of this study also suggest that the characteristics of the online review and of the reviewer indirectly affect review reliability through these intermediate variables
Transcriptional response of Desulfatibacillum alkenivorans AK-01 to growth on alkanes: insights from RT-qPCR and microarray analyses.
Microbial transformation of n-alkanes in anaerobic ecosystems plays a pivotal role in biogeochemical carbon cycling and bioremediation, but the requisite genetic machinery is not well elucidated. Desulfatibacillum alkenivorans AK-01 utilizes n-alkanes (C13 to C18) and contains two genomic loci encoding alkylsuccinate synthase (ASS) gene clusters. ASS catalyzes alkane addition to fumarate to form methylalkylsuccinic acids. We hypothesized that the genes in the two clusters would be differentially expressed depending on the alkane substrate utilized for growth. RT-qPCR was used to investigate ass gene expression across AK-01's known substrate range, and microarray-based transcriptomic analysis served to investigate whole-cell responses to growth on n-hexadecane versus hexadecanoate. RT-qPCR revealed induction of ass gene cluster 1 during growth on all tested alkane substrates, and the transcriptional start sites in cluster 1 were determined via 5'RACE. Induction of ass gene cluster 2 was not observed under the tested conditions. Transcriptomic analysis indicated the upregulation of genes potentially involved in methylalkylsuccinate metabolism, including methylmalonyl-CoA mutase and a putative carboxyl transferase. These findings provide new directions for studying the transcriptional regulation of genes involved in alkane addition to fumarate, fumarate recycling and the processing of methylalkylsuccinates with regard to isolates, enrichment cultures and ecological datasets
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models
Large Language Models (LLMs) have made significant progress in utilizing
tools, but their ability is limited by API availability and the instability of
implicit reasoning, particularly when both planning and execution are involved.
To overcome these limitations, we propose CREATOR, a novel framework that
enables LLMs to create their own tools using documentation and code
realization. CREATOR disentangles abstract tool creation and concrete decision
execution, resulting in improved performance. We evaluate CREATOR on MATH and
TabMWP benchmarks, respectively consisting of challenging math competition
problems and diverse tabular contents. Remarkably, CREATOR outperforms existing
chain-of-thought, program-of-thought, and tool-using baselines. Additionally,
we introduce the Creation Challenge dataset, featuring 2K diverse questions, to
emphasize the necessity and benefits of LLMs' tool creation ability. Further
research demonstrates that leveraging LLMs as tool creators facilitates
knowledge transfer, and LLMs exhibit varying levels of tool creation abilities,
enabling them to adapt to diverse situations. The tool creation ability
revolutionizes the LLM's problem-solving paradigm, driving us closer to the
next frontier of artificial intelligence. All the codes and data are released
Exploring Format Consistency for Instruction Tuning
Instruction tuning has emerged as a promising approach to enhancing large
language models in following human instructions. It is shown that increasing
the diversity and number of instructions in the training data can consistently
enhance generalization performance, which facilitates a recent endeavor to
collect various instructions and integrate existing instruction tuning datasets
into larger collections. However, different users have their unique ways of
expressing instructions, and there often exist variations across different
datasets in the instruction styles and formats, i.e., format inconsistency. In
this work, we study how format inconsistency may impact the performance of
instruction tuning. We propose a framework called "Unified Instruction Tuning"
(UIT), which calls OpenAI APIs for automatic format transfer among different
instruction tuning datasets. We show that UIT successfully improves the
generalization performance on unseen instructions, which highlights the
importance of format consistency for instruction tuning. To make the UIT
framework more practical, we further propose a novel perplexity-based denoising
method to reduce the noise of automatic format transfer. We also train a
smaller offline model that achieves format transfer capability comparable to
that of the OpenAI APIs, reducing costs in practice
Similarity and Potential Relation Between Periimplantitis and Rheumatoid Arthritis on Transcriptomic Level: Results of a Bioinformatics Study
Background: This bioinformatics study aimed to reveal potential cross-talk genes,
related pathways, and transcription factors between periimplantitis and rheumatoid
arthritis (RA).
Methods: The datasets GSE33774 (seven periimplantitis and eight control samples) and
GSE106090 (six periimplantitis and six control samples) were included from the National
Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO). A
differential expression analysis (p < 0.05 and |logFC (fold change)| ≥ 1) and a functional
enrichment analysis (p < 0.05) were performed. Based on this, a protein–protein
interaction (PPI) network was constructed by Cytoscape. RA-related genes were
extracted from DisGeNET database, and an overlap between periimplantitis-related
genes and these RA-related genes was examined to identify potential cross-talk genes.
Gene expression was merged between two datasets, and feature selection was
performed by the Recursive Feature Elimination (RFE) algorithm. For the selected
cross-talk genes, support vector machine (SVM) models were constructed. The
expression of these feature genes was determined from GSE93272 for RA. Finally, a
network including cross-talk genes, related pathways, and transcription factors
was constructed.
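The differential expression thresholds stated in the Methods (p < 0.05 and |logFC| ≥ 1) can be sketched as a simple filter. This is only an illustration in plain Python: the function name `select_degs`, the gene labels, and all numbers in the example are hypothetical and are not taken from GSE33774 or GSE106090, and the statistical test that produces the p-values is outside the sketch.

```python
def select_degs(rows, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split genes into up- and downregulated DEGs.

    rows: iterable of (gene, log_fold_change, p_value) tuples.
    A gene is kept when p < p_cutoff and |logFC| >= lfc_cutoff,
    matching the thresholds quoted in the Methods.
    """
    up, down = [], []
    for gene, log_fc, p_value in rows:
        if p_value < p_cutoff and abs(log_fc) >= lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return up, down


# Made-up example rows: (gene label, logFC, p-value).
example = [
    ("GENE_A", 1.8, 0.001),   # passes, upregulated
    ("GENE_B", -1.2, 0.030),  # passes, downregulated
    ("GENE_C", 0.6, 0.002),   # fails: |logFC| below 1
    ("GENE_D", 2.4, 0.200),   # fails: p not below 0.05
    ("GENE_E", -1.0, 0.049),  # passes: |logFC| exactly 1 is included
]

up_genes, down_genes = select_degs(example)
print("up:", up_genes, "down:", down_genes)
```

Applying the same filter to each dataset and intersecting the resulting gene sets would give the kind of "common DEGs" list the Results refer to, again under the assumptions stated here.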
Results: Periimplantitis datasets included 138 common differentially expressed genes
(DEGs) including 101 up- and 37 downregulated DEGs. The PPI network of
periimplantitis comprised 1,818 nodes and 2,517 edges. The RFE method selected six
features, i.e., MERTK, CD14, MAPT, CCR1, C3AR1, and FCGR2B, which had the highest
prediction. Out of these feature genes, CD14 and FCGR2B were most highly expressed in
periimplantitis and RA. The final activated pathway–gene network contained 181 nodes
and 360 edges. Nuclear factor (NF) kappa B signaling pathway and osteoclast
differentiation were identified as potentially relevant pathways.
Conclusions: This current study revealed FCGR2B and CD14 as the most relevant
potential cross-talk genes between RA and periimplantitis, which suggests a similarity
between RA and periimplantitis and can serve as a theoretical basis for future research
ProQA: Structural Prompt-based Pre-training for Unified Question Answering
Question Answering (QA) is a longstanding challenge in natural language
processing. Existing QA works mostly focus on specific question types,
knowledge domains, or reasoning skills. The specialty in QA research hinders
systems from modeling commonalities between tasks and generalization for wider
applications. To address this issue, we present ProQA, a unified QA paradigm
that solves various tasks through a single model. ProQA takes a unified
structural prompt as the bridge and improves the QA-centric ability by
structural prompt-based pre-training. Through a structurally designed
prompt-based input schema, ProQA concurrently models the knowledge
generalization for all QA tasks while keeping the knowledge customization for
every specific QA task. Furthermore, ProQA is pre-trained with structural
prompt-formatted large-scale synthesized corpus, which empowers the model with
the commonly-required QA ability. Experimental results on 11 QA benchmarks
demonstrate that ProQA consistently boosts performance in full-data
fine-tuning, few-shot learning, and zero-shot testing scenarios. Furthermore,
ProQA exhibits strong ability in both continual learning and transfer learning
by taking advantage of the structural prompt. Comment: NAACL 202
Altered gene-expression profile in rat plasma and promoted body and brain development by environmental enrichment
Environmental enrichment (EE) refers to the exposure of laboratory animals to physical and social stimulation, which can improve animals' well-being. This study aimed to explore how prenatal EE affects the development, behavior, hormones and gene expression of the offspring. 28 pregnant rats were randomized into an EE group (EEG) housed in cages with EE or a control group (CG) housed in normal cages. Measurements included offspring development parameters (body weight, body length, and tail length), behavior (open-field test, OFT), hormone levels (cortisol, dopamine, 5-HT, and growth hormone) and gene expression profile. Results showed that the development parameters of EEG offspring were statistically superior to those of CG offspring. The OFT count of EEG offspring was higher than that of CG offspring. EEG and CG offspring did not differ in cortisol, dopamine, 5-HT or growth hormone. The gene expression profile chip test showed that 25 genes were up-regulated and 23 genes were down-regulated in the EEG vs CG comparison, among which five GO annotations and four KEGG pathways were annotated. The findings indicate that EE during pregnancy could promote the body and nervous system development of offspring, with evidence of an altered gene expression profile.
Keywords: Environmental enrichment, rats, gene expression, behavior, development
African Journal of Biotechnology Vol. 12(20), pp. 3071-308