897 research outputs found
An ica algorithm for analyzing multiple data sets
In this paper we derive an independent-component analysis (ICA) method for analyzing two or more data sets simultaneously. Our model permits components that are individual to the various data sets as well as components that are common to all the sets. We exploit the assumed time autocorrelation of the independent signal components and base our algorithm on prediction analysis. We illustrate the algorithm with a simple image separation example. Our aim is to apply this method to functional brain mapping using functional magnetic resonance imaging (fMRI).
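To make the separation principle concrete, here is a minimal sketch (not the authors' algorithm) of how sources with distinct time autocorrelations can be recovered: whiten each simulated data set, diagonalize a time-lagged covariance (an AMUSE-style step), and flag the component whose time course matches across the two decompositions as the shared one. The simulated signals, the lag, and the cross-correlation labeling rule are all assumptions made for illustration.

```python
# Minimal sketch: source separation driven by time autocorrelation (AMUSE-style),
# applied independently to two simulated data sets that share one source.
import numpy as np

def lagged_cov(x, lag):
    """Symmetrized covariance between x(t) and x(t - lag); x is (channels, time)."""
    c = x[:, lag:] @ x[:, :-lag].T / (x.shape[1] - lag)
    return (c + c.T) / 2

def amuse_separate(x, lag):
    """Whiten x, then diagonalize its lagged covariance; return estimated sources."""
    x = x - x.mean(axis=1, keepdims=True)
    d, e = np.linalg.eigh(x @ x.T / x.shape[1])          # zero-lag covariance
    whitener = e @ np.diag(1.0 / np.sqrt(d)) @ e.T
    z = whitener @ x
    _, v = np.linalg.eigh(lagged_cov(z, lag))            # sources differ in autocorrelation
    return v.T @ z

rng = np.random.default_rng(0)
t = np.arange(2000)
common = np.sin(0.03 * t)                                # source shared by both sets
s1, s2 = np.cos(0.17 * t), np.sin(0.31 * t + 1.0)        # set-specific sources
x1 = rng.normal(size=(2, 2)) @ np.vstack([common, s1])   # two observed data sets
x2 = rng.normal(size=(2, 2)) @ np.vstack([common, s2])

y1, y2 = amuse_separate(x1, lag=5), amuse_separate(x2, lag=5)
# Components whose time courses agree across the two decompositions are "common";
# the rest are individual to each data set.
cross_corr = np.abs(np.corrcoef(np.vstack([y1, y2]))[:2, 2:])
print(np.round(cross_corr, 2))                           # an entry near 1 marks the shared source
```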
Targeted Insertion of the mPing Transposable Element
Class II DNA transposable elements (TEs) are moved from one location to another in the genome by the action of transposase proteins that bind to repeat sequences at the ends of the elements. Although the location of TE insertion is mostly random, the addition of DNA binding domains to the transposase proteins has allowed for targeted insertion of some elements. In this study, the Gal4 binding domain was added to the transposase proteins, ORF1 and TPase, which mobilize the mPing element from rice. The Gal4:TPase construct was capable of increasing the number of mPing insertions into the Gal2 and Gal4 promoter sequences in yeast. While this confirms that mPing insertion preference can be manipulated, the target specificity is relatively low. Thus, the CRISPR/Cas9 system was tested for its ability to generate targeted insertion of mPing. A dCas9:TPase fusion protein had a low transposition rate, suggesting that the addition of this large protein disrupts TPase function. Unfortunately, the use of an MS2 binding domain to localize the TPase to the MS2 hairpin-containing gRNA failed to produce targeted insertion. Thus, our results suggest that the addition of a small DNA binding domain to the N-terminus of TPase is the best strategy for targeted insertion of mPing.
A Spatially Robust ICA Algorithm for Multiple fMRI Data Sets
In this paper we derive an independent-component analysis (ICA) method for analyzing two or more data sets simultaneously. Our model extracts independent components common to all data sets as well as independent data-set-specific components. We use time-delayed autocorrelations to obtain independent signal components and base our algorithm on prediction analysis. We applied this method to functional brain mapping using functional magnetic resonance imaging (fMRI). The results of our three-subject analysis demonstrate the robustness of the algorithm to the spatial misalignment intrinsic to multiple-subject fMRI data sets.
Can we explain machine learning-based prediction for rupture status assessments of intracranial aneurysms?
Although applying machine learning (ML) algorithms to rupture status assessment of intracranial aneurysms (IAs) has yielded promising results, the opaqueness of some ML methods has limited their clinical translation. We present the first explainability comparison of six commonly used ML algorithms: multivariate logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), multi-layer perceptron neural network (MLPNN), and Bayesian additive regression trees (BART). A total of 112 IAs with known rupture status were selected for this study. The ML-based classification used two anatomical features, nine hemodynamic parameters, and thirteen morphologic variables. We applied permutation feature importance, local interpretable model-agnostic explanations (LIME), and SHapley Additive exPlanations (SHAP) to explain and analyze the six ML algorithms. All models performed comparably, with areas under the curve (AUC) of 0.71 (LR), 0.76 (SVM), 0.73 (RF), 0.78 (XGBoost), 0.73 (MLPNN), and 0.73 (BART). Our interpretability analysis demonstrated consistent results across all the methods; i.e., the utility of the top twelve features was broadly consistent. Furthermore, the contributions of nine important features (aneurysm area, aneurysm location, aneurysm type, maximum wall shear stress during systole, ostium area, the size ratio between aneurysm width and (parent) vessel diameter, one standard deviation of the time-averaged low shear area, and one standard deviation of the temporally averaged low shear area less than 0.4 Pa) were nearly the same. This research suggests that ML classifiers can provide explainable predictions consistent with general domain knowledge concerning IA rupture. With an improved understanding of ML algorithms, clinicians’ trust in them will be enhanced, accelerating their clinical translation.
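As a rough sketch of this style of analysis (synthetic data and hypothetical feature names, not the study's 112-aneurysm cohort), the snippet below trains two of the listed classifiers with scikit-learn, reports their AUCs, and ranks features by permutation importance; SHAP and LIME explanations would be layered on top of the fitted models in the same way.

```python
# Hedged sketch: compare two classifiers on synthetic "aneurysm" features and
# explain them with permutation feature importance (drop in AUC when shuffled).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical names standing in for morphologic/hemodynamic variables.
feature_names = ["aneurysm_area", "size_ratio", "ostium_area", "wss_max_systole",
                 "lsa_sd", "aspect_ratio", "parent_vessel_diameter", "nonsphericity"]
X, y = make_classification(n_samples=112, n_features=len(feature_names),
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

models = {"LR": LogisticRegression(max_iter=1000),
          "RF": RandomForestClassifier(n_estimators=300, random_state=0)}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    imp = permutation_importance(model, X_te, y_te, scoring="roc_auc",
                                 n_repeats=20, random_state=0)
    top = np.argsort(imp.importances_mean)[::-1][:3]
    print(f"{name}: AUC={auc:.2f}, top features:",
          [feature_names[i] for i in top])
```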
Optimizing Preprocessing and Analysis Pipelines for Single-Subject fMRI: 2. Interactions with ICA, PCA, Task Contrast and Inter-Subject Heterogeneity
A variety of preprocessing techniques are available to correct subject-dependent artifacts in fMRI caused by head motion and physiological noise. Although it has been established that the chosen preprocessing steps (or “pipeline”) may significantly affect fMRI results, it is not well understood how preprocessing choices interact with other parts of the fMRI experimental design. In this study, we examine how two experimental factors interact with preprocessing: between-subject heterogeneity and strength of task contrast. Two levels of cognitive contrast were examined in an fMRI adaptation of the Trail-Making Test, with data from young, healthy adults. The importance of standard preprocessing with motion correction, physiological noise correction, motion parameter regression, and temporal detrending was examined for the two task contrasts. We also tested subspace estimation using Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Results were obtained with Penalized Discriminant Analysis, and model performance was quantified with reproducibility (R) and prediction (P) metrics. Simulation methods were also used to test for potential biases from individual-subject optimization. Our results demonstrate that (1) individual pipeline optimization is not significantly more biased than fixed preprocessing. In addition, (2) when applying a fixed pipeline across all subjects, the task contrast significantly affects pipeline performance; in particular, the effects of PCA and ICA models vary with contrast and are not by themselves optimal preprocessing steps. Also, (3) selecting the optimal pipeline for each subject improves within-subject (P, R) and between-subject overlap, with the weaker cognitive contrast being more sensitive to pipeline optimization. These results demonstrate that the sensitivity of fMRI results is influenced not only by preprocessing choices but also by their interactions with other experimental design factors. This paper outlines a quantitative procedure for denoising data that would otherwise be discarded due to artifacts; this is particularly relevant for weak signal contrasts in single-subject, small-sample, and clinical datasets.
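The reproducibility (R) and prediction (P) metrics referred to here come from split-half resampling; the sketch below illustrates the idea on simulated scans, with scikit-learn's shrinkage LDA standing in for the paper's Penalized Discriminant Analysis (the data dimensions, split, and signal model are assumptions for illustration).

```python
# Minimal split-half sketch of reproducibility (R) and prediction (P) metrics,
# with shrinkage LDA standing in for Penalized Discriminant Analysis.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n_scans, n_voxels = 120, 500
labels = np.tile([0, 1], n_scans // 2)                 # alternating task/baseline scans
signal = 0.5 * rng.normal(size=n_voxels)               # hypothetical activation pattern
data = rng.normal(size=(n_scans, n_voxels)) + np.outer(labels, signal)

# Split the scans into two halves; fit on one half, predict the other.
half1, half2 = np.arange(n_scans // 2), np.arange(n_scans // 2, n_scans)
maps, accs = [], []
for train, test in [(half1, half2), (half2, half1)]:
    model = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
    model.fit(data[train], labels[train])
    maps.append(model.coef_.ravel())                   # discriminant "activation map"
    accs.append(model.score(data[test], labels[test]))

R = np.corrcoef(maps[0], maps[1])[0, 1]                # spatial reproducibility of the maps
P = float(np.mean(accs))                               # out-of-half prediction accuracy
print(f"reproducibility R = {R:.2f}, prediction P = {P:.2f}")
```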
The role of "costs" in political choice: a review
Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/67848/2/10.1177_002200276300700209.pd
Validation of Coevolving Residue Algorithms via Pipeline Sensitivity Analysis: ELSC and OMES and ZNMI, Oh My!
Correlated amino acid substitution algorithms attempt to discover groups of residues that co-fluctuate due to either structural or functional constraints. Although these algorithms could inform both ab initio protein folding calculations and evolutionary studies, their utility for these purposes has been hindered by a lack of confidence in their predictions, owing to hard-to-control sources of error. To complicate matters further, naive users are confronted with a multitude of methods to choose from, in addition to the mechanics of assembling and pruning a dataset. We first introduce a new pair scoring method, called ZNMI (Z-scored-product Normalized Mutual Information), which drastically improves the performance of mutual information for co-fluctuating residue prediction. Second, and more importantly, we recast the process of finding coevolving residues in proteins as a data-processing pipeline inspired by the medical imaging literature. We construct an ensemble of alignment partitions that can be used in a cross-validation scheme to assess how choices made during the procedure affect the resulting predictions. This pipeline sensitivity study gives a measure of reproducibility (how similar are the predictions given perturbations to the pipeline?) and accuracy (are residue pairs with large couplings on average close in tertiary structure?). We choose a handful of published methods, along with ZNMI, and compare their reproducibility and accuracy on three diverse protein families. We find that (i) while none of the algorithms tested appear to be both highly reproducible and accurate, ZNMI is one of the most accurate by far, and (ii) while users should be wary of predictions drawn from a single alignment, considering an ensemble of sub-alignments can help to identify couplings that are both highly accurate and reproducible. Our cross-validation approach should be of interest to both developers and end users of algorithms that try to detect correlated amino acid substitutions.
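To make the pair-scoring idea concrete, here is a small sketch that scores alignment column pairs by mutual information normalized by joint entropy and then applies a row/column Z-score product correction. The toy alignment and the exact form of the correction are assumptions for illustration; the paper's ZNMI definition may differ in detail.

```python
# Hedged sketch of a ZNMI-like score: normalized mutual information between
# alignment columns, corrected by the product of row/column Z-scores.
import numpy as np
from collections import Counter

def entropy(col):
    """Shannon entropy (bits) of the symbol distribution in one column."""
    p = np.array(list(Counter(col).values()), dtype=float)
    p /= p.sum()
    return -np.sum(p * np.log2(p))

def nmi(col_i, col_j):
    """Mutual information normalized by the joint entropy of two columns."""
    h_i, h_j = entropy(col_i), entropy(col_j)
    h_ij = entropy(list(zip(col_i, col_j)))
    return (h_i + h_j - h_ij) / h_ij if h_ij > 0 else 0.0

def znmi_like(alignment):
    """Pairwise NMI matrix corrected by the product of row-wise Z-scores."""
    cols = list(zip(*alignment))                   # columns of the alignment
    n = len(cols)
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            m[i, j] = m[j, i] = nmi(cols[i], cols[j])
    mean, sd = m.mean(axis=1), m.std(axis=1) + 1e-12
    z = (m - mean[:, None]) / sd[:, None]          # Z-score within each row
    return z * z.T                                 # product of the two Z-scores

toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MRILG", "MKVLA", "MRILG"]  # made-up sequences
scores = znmi_like(toy_alignment)
i, j = np.unravel_index(np.argmax(np.triu(scores, 1)), scores.shape)
print(f"top coevolving column pair: ({i}, {j}), score = {scores[i, j]:.2f}")
```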
- …