16 research outputs found

    An Imaging Mass Spectrometry Investigation Into the N-linked Glycosylation Landscape of Pancreatic Ductal Adenocarcinoma and the Development of Associated Tools for Enhanced Glycan Separation and Characterization

    The severity of pancreatic ductal adenocarcinoma (PDAC) is largely attributed to a failure to detect the disease before metastatic spread has occurred. CA19-9, a carbohydrate biomarker, is used clinically to monitor disease progression but, owing to its limited specificity, is not suitable for early detection. As CA19-9 and other prospective markers are glycan epitopes, there is great clinical interest in understanding the glycobiology of pancreatic cancer. Unfortunately, few studies have been able to link glycosylation changes directly to pancreatic tumors; most have instead focused on peripheral glycan alterations in the serum of PDAC patients. To address this gap in our understanding, we applied an imaging mass spectrometry (IMS) approach with complementary enzymatic and chemical isomer separation techniques to spatially assess the PDAC N-glycome in a cohort of pancreatic cancer patients. Orthogonally, we characterized the expression of CA19-9 and a new biomarker, sTRA, by multi-round immunofluorescence (IF) in the same cohort. These analyses revealed increased sialylation, fucosylation and branching, amongst other structural themes, in areas of PDAC tumor tissue. CA19-9-expressing tumors were defined by multiply branched, fucosylated bisecting N-glycans, while sTRA-expressing tumors favored tetraantennary N-glycans with polylactosamine extensions. IMS- and IF-derived glycan and biomarker features were used to build classification models that detected PDAC tissue with an AUC of 0.939, outperforming models using either dataset individually. While studying sialylation isomers in our PDAC cohort, we saw an opportunity to enhance the chemical derivatization protocol we were using, to address its shortcomings and expand its functionality. Subsequently, we developed a set of novel amidation-amidation strategies to stabilize and differentially label 2,3 and 2,6-linked sialic acids. 
In our alkyne-based approach, the differential mass shifts induced by the reactions allow for isomeric discrimination in imaging mass spectrometry experiments. This scheme, termed AAXL, was further characterized in clinical tissue specimens, biofluids and cultured cells. Our azide-based approach, termed AAN3, was more suitable for bioorthogonal applications, where the azide tag installed on 2,3 and 2,8-sialic acids could be reacted by click chemistry with a biotin-alkyne for subsequent streptavidin-peroxidase staining. Extending the use of AAN3, we developed two additional techniques to fluorescently label (SAFER) and preferentially enrich (SABER) 2,3 and 2,8-linked sialic acids for more advanced glycomic applications. Initial experiments with these novel approaches have shown successful fluorescent staining and the identification of over 100 sialylated glycoproteins by LC-MS/MS. These four bioorthogonal strategies provide a new glycomic tool set for the characterization of sialic acid isomers in pancreatic and other cancers. Overall, this work furthers our collective understanding of the glycobiology underpinning pancreatic cancer and potentiates the discovery of novel carbohydrate biomarkers for the early detection of PDAC.

    Efficient Knowledge Extraction from Structured Data

    Knowledge extraction from structured data aims to identify valid, novel, potentially useful, and ultimately understandable patterns in the data. The core step of this process is the application of a data mining algorithm in order to produce an enumeration of particular patterns and relationships in large databases. Clustering is one of the major data mining tasks and aims at grouping the data objects into meaningful classes (clusters) such that the similarity of objects within clusters is maximized, and the similarity of objects from different clusters is minimized. In this thesis, we advance the state-of-the-art data mining algorithms for analyzing structured data types. We describe the development of innovative solutions for hierarchical data mining. The EM-based hierarchical clustering method ITCH (Information-Theoretic Cluster Hierarchies) is designed to address four challenges: (1) to guide the hierarchical clustering algorithm to identify only meaningful and valid clusters; (2) to represent each cluster's content in the hierarchy by an intuitive description, e.g. a probability density function; (3) to handle outliers consistently; and (4) to avoid difficult parameter settings. ITCH is built on a hierarchical variant of the information-theoretic principle of Minimum Description Length (MDL). Interpreting the hierarchical cluster structure as a statistical model of the dataset, it can be used for effective data compression by Huffman coding. Thus, the achievable compression rate induces a natural objective function for clustering, which automatically satisfies all four above-mentioned goals. The genetic-based hierarchical clustering algorithm GACH (Genetic Algorithm for finding Cluster Hierarchies) overcomes the problem of getting stuck in a local optimum by a beneficial combination of genetic algorithms, information theory and model-based clustering. 
Besides hierarchical data mining, we also made contributions to more complex data structures, namely objects that consist of mixed-type attributes and skyline objects. The algorithm INTEGRATE performs integrative mining of heterogeneous data, which is one of the major challenges in the next decade, by a unified view on numerical and categorical information in clustering. Once more supported by the MDL principle, INTEGRATE guarantees usability on real-world data. For skyline objects we developed SkyDist, a similarity measure for comparing different skyline objects, which is a first step towards performing data mining on this kind of data structure. Applied in a recommender system, for example, SkyDist can be used to point the user to alternative car types that exhibit a price/mileage behavior similar to that of the original query. For mining graph-structured data, we developed different approaches that can detect patterns in static as well as in dynamic networks. We confirmed the practical feasibility of our novel approaches on large real-world case studies ranging from medical brain data to biological yeast networks. In the second part of this thesis, we focused on boosting the knowledge extraction process. We achieved this objective by an intelligent adoption of Graphics Processing Units (GPUs). GPUs have evolved from simple devices for display signal preparation into powerful coprocessors that not only support typical computer graphics tasks but can also be used for general numeric and symbolic computations. As a major advantage, GPUs provide extreme parallelism combined with a high bandwidth in memory transfer at low cost. In this thesis, we propose algorithms, called CUDA-DClust and CUDA-k-means, for computationally expensive data mining tasks like similarity search and different clustering paradigms, designed for the highly parallel environment of a GPU. 
We define a multi-dimensional index structure which is particularly suited to support similarity queries under the restricted programming model of a GPU. We demonstrate the superiority of our algorithms running on GPU over their conventional counterparts on CPU in terms of efficiency.

    Računalna analiza genotipova i N-glikoma ljudske plazme (Computational analysis of genotypes and the human plasma N-glycome)

    Glycosylation is one of the most extensive protein modifications. Glycans influence both the structure and the function of proteins and are known to have important roles in physiological and pathological processes. The absence of a universal code for glycan synthesis, combined with the technological challenges of glycan quantification, has hindered understanding of the processes regulating the assembly of glycans. Major breakthroughs in analytical procedures made it possible to reliably quantify glycans in a high-throughput manner and allowed the first large-scale studies of the human plasma N-glycome. In order to explore the genomic and environmental regulation of glycosylation, different computational methods were applied to the integrated analysis of glycan, physiological/biochemical and genotype data in three isolated population cohorts. Specific glyco-phenotypes were identified in the general population, and the potential use of glycan modifications as biomarkers was evaluated for the particular case of diabetes. General associations between glycans and phenotypes were observed, and glycan, phenotypic and genotypic patterns capable of discriminating the populations were explored. The analysis of polymorphisms associated with glycosylation replicated previous findings and suggested possible novel associations.

    Coping with new Challenges in Clustering and Biomedical Imaging

    Recent years have seen a tremendous increase in data acquisition in different scientific fields such as molecular biology, bioinformatics or biomedicine. Therefore, novel methods are needed for automatic processing and analysis of this large amount of data. Data mining is the process of applying methods like clustering or classification to large databases in order to uncover hidden patterns. Clustering is the task of partitioning the points of a data set into distinct groups in order to maximize the intra-cluster similarity and to minimize the inter-cluster similarity. In contrast to unsupervised learning like clustering, the classification problem is known as supervised learning; it aims at predicting the group membership of data objects on the basis of rules learned from a training set where the group membership is known. Specialized methods have been proposed for hierarchical and partitioning clustering. However, these methods suffer from several drawbacks. In the first part of this work, new clustering methods are proposed that cope with problems of conventional clustering algorithms. ITCH (Information-Theoretic Cluster Hierarchies) is a hierarchical clustering method based on a hierarchical variant of the Minimum Description Length (MDL) principle, which finds hierarchies of clusters without requiring input parameters. As ITCH may converge only to a local optimum, we propose GACH (Genetic Algorithm for Finding Cluster Hierarchies), which combines the benefits of genetic algorithms with information theory. In this way the search space is explored more effectively. Furthermore, we propose INTEGRATE, a novel clustering method for data with mixed numerical and categorical attributes. Supported by the MDL principle, our method integrates the information provided by heterogeneous numerical and categorical attributes and thus naturally balances the influence of both sources of information. 
A competitive evaluation illustrates that INTEGRATE is more effective than existing clustering methods for mixed-type data. Besides clustering methods for single data objects, we provide a solution for clustering different data sets that are represented by their skylines. The skyline operator is a well-established database primitive for finding database objects which minimize two or more attributes with an unknown weighting between these attributes. In this thesis, we define a similarity measure, called SkyDist, for comparing skylines of different data sets that can directly be integrated into different data mining tasks such as clustering or classification. The experiments show that SkyDist in combination with different clustering algorithms can give useful insights into many applications. In the second part, we focus on the analysis of high-resolution magnetic resonance images (MRI) that are clinically relevant and may allow for an early detection and diagnosis of several diseases. In particular, we propose a framework for the classification of Alzheimer's disease in MR images combining the data mining steps of feature selection, clustering and classification. As a result, a set of highly selective features discriminating patients with Alzheimer's disease from healthy people has been identified. However, the analysis of the high-dimensional MR images is extremely time-consuming. Therefore we developed JGrid, a scalable distributed computing solution designed to allow for a large-scale analysis of MRI and thus an optimized prediction of diagnosis. In another study we apply efficient algorithms for motif discovery to task-fMRI scans in order to identify patterns in the brain that are characteristic of patients with somatoform pain disorder. We find groups of brain compartments that occur frequently within the brain networks and discriminate well between healthy and diseased people.

    Flexibility vs consistency: Quantifying differences in neuromodulatory elicited patterns of activity

    Central pattern generating circuits underlie fundamental behaviors such as respiration or locomotion and are under the influence of neuromodulators. The presence of neuromodulators is thought to confer flexibility on these circuits, allowing them to generate distinct patterns of activity to meet distinct behavioral needs. Network output flexibility can be achieved by distinct classes of neuromodulators: those which have convergent cellular actions but divergent circuit actions, or those which have divergent cellular actions but convergent circuit actions. Both classes of neuromodulator exist in the stomatogastric nervous system of the crab Cancer borealis and influence the activity of a central pattern generating circuit in the stomatogastric ganglion, the pyloric network. The ability of both classes of neuromodulator, when applied individually, to generate qualitatively and quantitatively distinct patterns of activity has been demonstrated with respect to a baseline activity state. While it is assumed that each individual neuromodulator’s activity pattern is distinct, there has yet to be a fully quantitative description of the degree of difference between two modulated activity patterns. It is also unlikely that any single circuit will be under the influence of only a single neuromodulator at any point. Therefore, the possibility of generating distinct network outputs increases with each distinct combination of neuromodulators. While the actions of individual neuromodulators have been explored, the consequences of co-modulation on the pyloric network’s output are less understood. Previous attempts at quantifying the effects of a neuromodulator on the pyloric network output relied on evaluating only a single, often multi-dimensional, attribute of activity at a time and statistically testing the dependent parameters of that attribute with statistics that assume independence. 
This dissertation uses a new approach to quantify and statistically test how different one neuromodulator-elicited pattern of activity is from another, preserving the inherent multi-dimensional nature of the attributes evaluated. The results of this dissertation show that the pyloric network is able to generate statistically distinct outputs with individual neuromodulators; however, flexibility is lost in favor of consistency under co-modulatory conditions.

    Exact algorithms for pairwise protein structure alignment

    Klau, G.W. [Promotor

    Development of a static bioactive stent prototype and dynamic aneurysm-on-a-chip(TM) model for the treatment of aneurysms

    Aneurysms are pockets of blood that collect outside blood vessel walls, forming dilatations and leaving arterial walls very prone to rupture. Current treatments include (1) clipping and (2) coil embolization, including stent-assisted coiling. While these procedures can be effective, it would be advantageous to design a biologically active stent, modified with magnetic stent coatings, allowing cells to be manipulated to heal the arterial lining. Further, velocity, pressure, and wall shear stresses contribute to aneurysmal growth, but the shear force mechanisms affecting wound closure remain elusive. Due to these factors, there is a definite need to develop a new stent device that will aid in healing an aneurysm in situ. To this end, a static bioactive stent device was synthesized. Additionally, to study aneurysm pathogenesis, a lab-on-a-chip device (a dynamic stent device) is the key to discovering the underlying mechanisms of these lesions. A first step toward a true bioactive stent involves the study of cells that can be tested against the biomaterials that constitute the stent itself. The second step is to test particles/cells in a microfluidic environment. Therefore, biocompatibility data were collected against PDMS, bacterial nanocellulose (BNC), and magnetic bacterial nanocellulose (MBNC). Preliminary static bioactive stents were synthesized whereby BNC was grown to cover standard nitinol stents. In an offshoot of the original research, a two-dimensional microfluidic model, the Aneurysm-on-a-Chip(TM) (AOC), was the logical answer to study particle flow within an aneurysm sac - this was the dynamic bioactive stent device. The AOC apparatus can track particles/cells when it is coupled to a particle image velocimetry (PIV) software package. The AOC fluid flow was visualized using standard microscopy techniques with commercial microparticles/cells. Movies were taken during fluid flow experiments and PIV was utilized to monito

    Self Assembly Problems of Anisotropic Particles in Soft Matter.

    Anisotropic building blocks assembled from colloidal particles are attractive components for self-assembled materials because their complex interactions can be exploited to drive self-assembly. In this dissertation we address the self-assembly of anisotropic particles from multiple novel computational and mathematical angles. First, we accelerate algorithms for modeling systems of anisotropic particles via massively parallel GPUs. We provide a scheme for generating statistically robust pseudo-random numbers that enables GPU acceleration of Brownian and dissipative particle dynamics. We also show how rigid body integration can be accelerated on a GPU. Integrating these two algorithms into a GPU-accelerated molecular dynamics code (HOOMD-blue) makes a single GPU the ideal computing environment for modeling the self-assembly of anisotropic nanoparticles. Second, we introduce a new mathematical optimization problem, filling, a hybrid of the familiar shape packing and covering problems, which can be used to model shaped particles. We study the rich mathematical structures of the solution space and provide computational methods for finding optimal solutions for polygons and convex polyhedra. We present a sequence of isosymmetric optimal filling solutions for the Platonic solids. We then consider the filling of a hyper-cone in dimensions two to eight and show the solution remains scale-invariant but dependent on dimension. Third, we study the impact of size variation, polydispersity, on the self-assembly of an anisotropic particle, the polymer-tethered nanosphere, into ordered phases. We show that the local nanoparticle packing motif, icosahedral or crystalline, determines the impact of polydispersity on the energy of the system and on phase transitions. We show how extensions of the Voronoi tessellation can be calculated and applied to characterize such micro-segregated phases. 
By applying a Voronoi tessellation, we show that properties of the individual domains can be studied as a function of system properties such as temperature and concentration. Last, we consider the thermodynamically driven self-assembly of terminal clusters of particles. We predict that clusters related to spherical codes, a mathematical sequence of points, can be synthesized via self-assembly. These anisotropic clusters can be tuned to different anisotropies via the ratio of sphere diameters and temperature. The method suggests a rich new way for assembling anisotropic building blocks. Ph.D., Applied Physics and Scientific Computing, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/91576/1/phillicl_1.pd

    Seventh Biennial Report : June 2003 - March 2005
