An Efficient Approach to Clustering in Large Multimedia Databases with Noise

Author: Alexander Hinneburg
Daniel A Keim
Publication date: 01/01/1998

Several clustering algorithms can be applied to clustering in large multimedia databases. The effectiveness and efficiency of the existing algorithms, however, are somewhat limited, since clustering in multimedia databases requires clustering high-dimensional feature vectors and since multimedia databases often contain large amounts of noise. In this paper, we therefore introduce a new algorithm for clustering in large multimedia databases called DENCLUE (DENsity-based CLUstEring). The basic idea of our new approach is to model the overall point density analytically as the sum of influence functions of the data points. Clusters can then be identified by determining density-attractors, and clusters of arbitrary shape can be easily described by a simple equation based on the overall density function. The advantages of our new approach are (1) it has a firm mathematical basis, (2) it has good clustering properties in data sets with large amounts of noise, (3) it allows a compact mathematical description of arbitrarily shaped clusters in high-dimensional data sets and (4) it is significantly faster than existing algorithms. To demonstrate the effectiveness and efficiency of DENCLUE, we perform a series of experiments on a number of different data sets from CAD and molecular biology. A comparison with DBSCAN shows the superiority of our new approach.
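The idea in the abstract above, density as a sum of influence functions with clusters found at density-attractors, can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian influence function, the parameter names `sigma`, `step` and `max_iter`, and the plain fixed-step hill climb are assumptions chosen for illustration, and the sketch omits the grid-based speed-ups and noise threshold that a full implementation would need.

```python
import math


def influence(x, p, sigma):
    """Gaussian influence of data point p at location x (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def density(x, points, sigma=1.0):
    """Overall density at x: the sum of the influence functions of all points."""
    return sum(influence(x, p, sigma) for p in points)


def gradient(x, points, sigma=1.0):
    """Direction of increasing density at x (weighted sum of offsets to the points)."""
    grad = [0.0] * len(x)
    for p in points:
        w = influence(x, p, sigma)
        for k in range(len(x)):
            grad[k] += (p[k] - x[k]) * w
    return grad


def climb(x, points, sigma=1.0, step=0.05, max_iter=1000):
    """Hill-climb from x towards a local density maximum (a density-attractor)."""
    x = list(x)
    for _ in range(max_iter):
        g = gradient(x, points, sigma)
        norm = math.sqrt(sum(v * v for v in g))
        if norm == 0.0:
            break  # flat spot: nothing to climb
        candidate = [a + step * v / norm for a, v in zip(x, g)]
        if density(candidate, points, sigma) <= density(x, points, sigma):
            break  # density stopped increasing: treat x as the attractor
        x = candidate
    return x


# Two small, well-separated groups of 2-D points (made-up data).
pts = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
attractor_a = climb((0.2, 0.0), pts)
attractor_b = climb((5.2, 5.0), pts)
```

Points whose climbs end at the same attractor would be assigned to the same cluster; here the two starting points end near different groups, so they would fall into different clusters.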

CiteSeerX

On the reliability of the theoretical internal conversion coefficients

Author: Band I M
Borisoglebskii L A
Brabec V
Caso C (Particle Data Group)
Church E L
Coulthard M A
Dragoun O
Firestone R B
Fujioka M
Gasiorowicz S
Hager R S
Hartmann E
Hinneburg D
Krutov V A
Listengarten M A
Lu C C
M Rysavý
Matese J J
Mayol R
O Dragoun
Pauli H C
Raff U
Rose M E
Rysavý M
Rösel F
Sergeev V O
Sevier K D
Sliv L A
Vsevolodov M M
Zilitis V A
Publication venue: 'IOP Publishing'
Publication date: 16/06/2000

Possible sources of uncertainties in the calculations of the internal conversion coefficients are studied. The uncertainties induced by them are estimated. Comment: 16 pages (including 3 figures inserted by 'epsfig' macro).

arXiv.org e-Print Archive

Crossref

PARMA-CC: Parallel Multiphase Approximate Cluster Combining

Author: Arlia D.
Hinneburg Alexander
Hoeflinger Jay P
Keramatian Amir
Najdataei H.
Patwary M. M. A.
Wang Wei
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

Clustering is a common component in data analysis applications. Despite the extensive literature, the continuously increasing volumes of data produced by sensors (e.g. rates of several MB/s by 3D scanners such as LIDAR sensors), and the time-sensitivity of the applications leveraging the clustering outcomes (e.g. detecting critical situations, that are known to be accuracy-dependent), demand novel approaches that respond faster while coping with large data sets. The latter is the challenge we address in this paper. We propose an algorithm, PARMA-CC, that complements existing density-based and distance-based clustering methods. PARMA-CC is based on approximate, data parallel cluster combining, where parallel threads can compute summaries of clusters of data (sub)sets and, through combining, together construct a comprehensive summary of the sets of clusters. By approximating clusters with their respective geometrical summaries, our technique scales well with increased data volumes, and, by computing and efficiently combining the summaries in parallel, it enables latency improvements. PARMA-CC combines the summaries using special data structures that enable parallelism through in-place data processing. As we show in our analysis and evaluation, PARMA-CC can complement and outperform well-established methods, with significantly better scalability, while still providing highly accurate results in a variety of data sets, even with skewed data distributions, which cause the traditional approaches to exhibit their worst-case behaviour. In the paper we also describe how PARMA-CC can facilitate time-critical applications through appropriate use of the summaries.

Crossref

Chalmers Research

High-resolution longitudinal N- and O-glycoprofiling of human monocyte-to-macrophage transition.

Author: Bokil NJ
Hinneburg H
Kawahara R
Pedersen JL
Pralow A
Rapp E
Saunders BM
Schirmeister F
Thaysen-Andersen M
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/03/2021

Protein glycosylation impacts the development and function of innate immune cells. The glycophenotypes and the glycan remodelling associated with the maturation of macrophages from monocytic precursor populations remain incompletely described. Herein, label-free porous graphitised carbon-liquid chromatography-tandem mass spectrometry (PGC-LC-MS/MS) was employed to profile with high resolution the N- and O-glycome associated with human monocyte-to-macrophage transition. Primary blood-derived CD14+ monocytes were differentiated ex vivo in the absence of strong anti- and proinflammatory stimuli using a conventional 7-day granulocyte-macrophage colony-stimulating factor differentiation protocol with longitudinal sampling. Morphology and protein expression monitored by light microscopy and proteomics validated the maturation process. Glycomics demonstrated that monocytes and macrophages display similar N-glycome profiles, comprising predominantly paucimannosidic (Man1-3GlcNAc2Fuc0-1, 22.1-30.8%), oligomannosidic (Man5-9GlcNAc2, 29.8-35.7%) and α2,3/6-sialylated complex-type N-glycans with variable core fucosylation (27.6-39.1%). Glycopeptide analysis validated conjugation of these glycans to human proteins, while quantitative proteomics monitored the glycoenzyme expression levels during macrophage differentiation. Significant interperson glycome variations were observed, suggesting a considerable physiology-dependent or heritable heterogeneity of CD14+ monocytes. Only a few N-glycome changes correlated with the monocyte-to-macrophage transition across donors, including decreased core fucosylation and reduced expression of mannose-terminating (paucimannosidic-/oligomannosidic-type) N-glycans in macrophages, while lectin flow cytometry indicated that more dramatic cell surface glycan remodelling occurs during maturation.
The less heterogeneous core 1-rich O-glycome showed a minor decrease in core 2-type O-glycosylation but otherwise remained unchanged with macrophage maturation. This high-resolution glycome map underpinning normal monocyte-to-macrophage transition, the most detailed to date, aids our understanding of the molecular makeup pertaining to two vital innate immune cell types and forms an important reference for future glycoimmunological studies.

OPUS - University of Technology Sydney

Distinguishing N-acetylneuraminic acid linkage isomers on glycopeptides by ion mobility-mass spectrometry

Author: Altmann F.
Hinneburg H.
Hofmann J.
Kolarich D.
Pagel K.
Seeberger P. H.
Struwe W. B.
Thader A.
Varón Silva D.
Publication date: 01/01/2016

Differentiating the structure of isobaric glycopeptides represents a major challenge for mass spectrometry-based characterisation techniques. Here we show that the regiochemistry of the most common N-acetylneuraminic acid linkages of N-glycans can be identified in a site-specific manner from individual glycopeptides using ion mobility-mass spectrometry analysis of diagnostic fragment ions.

Institutional Repository of the Freie Universität Berlin

Publikationsserver der Universitätsbibliothek Bodenkultur Wien

MPG.PuRe

FEBUKO and MODMEP: Field measurements and modelling of aerosol and cloud multiphase processes

Author: Acker K.
Baechmann K.
Barzaghi P.
Birmili W.
Brueggemann E.
Chemnitzer R.
Collett J.
Diehl K.
Galgon D.
Gnauk T.
Heinold B.
Herrmann H.
Hinneburg D.
Hofmann D.
Jaeschke W.
Knoth O.
Kramberger H.
Lehmann K.
Majdik Z.
Massling A.
Mauersberger G.
Mertes S.
Mueller F.
Mueller K.
Nowak A.
Plewka A.
Rued C.
Schwirn K.
Sehili A.
Simmel M.
Svrcina B.
Tilgner A.
van Pinxteren D.
Wiedensohler A.
Wierprecht W.
Wolke R.
Wurzler S.
Publication venue: 'Elsevier BV'
Publication date: 01/07/2005

An overview of the two FEBUKO aerosol–cloud interaction field experiments in the Thüringer Wald (Germany) in October 2001 and 2002 and the corresponding modelling project MODMEP is given. Experimentally, a variety of measurement methods were deployed to probe the gas phase, particles and cloud droplets at three sites upwind, downwind and within an orographic cloud with special emphasis on the budgets and interconversions of organic gas and particle phase constituents. Out of a total of 14 sampling periods within 30 cloud events, three events (EI, EII and EIII) are selected for detailed analysis. On various occasions an impact of the cloud process on particle chemical composition, such as on the organic compounds content, sulphate and nitrate, and also on particle size distributions and particle mass is observed. Moreover, direct phase transfer of polar organic compounds from the gas phase is found to be very important for the understanding of cloudwater composition. For the modelling side, a main result of the MODMEP project is the development of a cloud model, which combines a complex multiphase chemistry with detailed microphysics. Both components are described in a fine-resolved particle/drop spectrum. New numerical methods are developed for an efficient solution of the entire complex model. A further development of the CAPRAM mechanism has led to a more detailed description of tropospheric aqueous phase organic chemistry. In parallel, effective tools for the reduction of highly complex reaction schemes are provided. Techniques are provided and tested which allow the description of complex multiphase chemistry and of detailed microphysics in multidimensional chemistry-transport models.

MPG.PuRe

Community landscapes: an integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics

Author: A Arenas
A Capocci
A Hinneburg
A Lancichinetti
AK Ramani
C Baerveldt
D Ekman
D Krioukov
DJ Watts
DL Nelson
ER Gansner
F Radicchi
G Palla
G Tibély
H Yu
I Kovacs
I Vragovic
István A. Kovács
J Moody
JB Axelsen
JD Han
JM Kumpula
JM Thevelein
JP Bagrow
JP Eckmann
JW Berry
K Komurov
M Blatt
M Fiedler
M Girvan
M Grendar
M Rosvall
ME Newman
ML Clark
Máté S. Szalay
N Bertin
Olaf Sporns
P Csermely
P Pons
Peter Csermely
PM Kim
Robin Palotai
S Fortunato
T Nepusz
TS Evans
V Latora
VD Blondel
WW Zachary
Y-Y Ahn
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2010

Background: Network communities help the functional organization and evolution of complex networks. However, the development of a method that is both fast and accurate and provides modular overlaps and partitions of a heterogeneous network has proven to be rather difficult. Methodology/Principal Findings: Here we introduce the novel concept of ModuLand, an integrative method family determining overlapping network modules as hills of an influence function-based, centrality-type community landscape, and including several widely used modularization methods as special cases. As various adaptations of the method family, we developed several algorithms, which provide an efficient analysis of weighted and directed networks, and (1) determine pervasively overlapping modules with high resolution; (2) uncover a detailed hierarchical network structure allowing an efficient, zoom-in analysis of large networks; (3) allow the determination of key network nodes and (4) help to predict network dynamics. Conclusions/Significance: The concept opens a wide range of possibilities to develop new approaches and applications including network routing, classification, comparison and prediction. Comment: 25 pages with 6 figures and a Glossary + Supporting Information containing pseudo-codes of all algorithms used, 14 Figures, 5 Tables (with 18 module definitions, 129 different modularization methods, 13 module comparison methods) and 396 references. All algorithms can be downloaded from this web-site: http://www.linkgroup.hu/modules.ph

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

ELTE Digital Institutional Repository (EDIT)

Clustering Algorithms: Their Application to Gene Expression Data

Author: Agrawal R.
Alizadeh A.A.
Bandyopadhyay S.
Bezdek J.C.
Bhargavi M.S.
Blatt M.
Bochkov Y.A.
Brunet J.P.
Bryan K.
Buitinck L.
Bunnik E.M.
Caliński T.
Chandrasekhar T.
Cheng Y.
Costa I.G.
Cover T.M.
D'haeseleer P.
Dave R.N.
Davies D.L.
De Morsier F.
Dempster A.P.
Dharmarajan A.
Dhillon I.S.
Divina F.
Do C.B.
Domany E.
Du Z.
Dunn J.C.
Edla D.R.
Eisen M.B.
Ferguson T.S.
Frey B.J.
Fu L.
Fukuyama Y.
Galluccio L.
Gath I.
Getz G.
Gordon G.J.
Gu J.
Guha S.
Handhayani T.
Handl J.
Hatamlou A.
Heard N.A.
Heyer L.J.
Hinneburg A.
Hu X.
Hubert L.J.
Jain A.K.
Jiang D.
Jiang H.
Joopudi S.
Kao Y.T.
Karmilasari S.W.
Karypis G.
Kaufman L.
Kerr G.
Kluger Y.
Kohonen T.
Krzanowski W.J.
Leone M.
Lu Y.
Ma'sum M.A.
MacQueen J.
Madeira S.C.
Mann A.K.
Masciari E.
Maulik U.
Milligan G.W.
Mitra S.
Moon T.K.
Moore W.C.
Müllner D.
Nagpal A.
Nasser S.
Neal R.M.
Ng R.T.
Pakhira M.K.
Pal N.R.
Pedregosa F.
Pirim H.
Pitman J.
Prelić A.
Qin Z.S.
Raman S.
Rasmussen C.E.
Rezaee B.
Rezaee M.R.
Ruspini E.H.
Saha S.
Sathishkumar K.
Sheikholeslami G.
Sheng Q.
Sirinukunwattana K.
Sokal R.R.
Sun J.
Talaat A.M.
Tamayo P.
Tanay A.
Tang C.
Thalamuthu A.
Tibshirani R.
Wan M.
Wang L.
Wang W.
Williams G.
Wu J.
Wu K.L.
Wu S.
Xie X.L.
Xu R.
Xu Y.
Yu H.
Zhang D.
Zhang T.
Zhang Y.
Zhang Z.Y.
Zhao L.
Zhong C.
Zitnik M.
Řehůřek R.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2016

Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and high degree of accuracy in its analysis procedure.

Covenant University Repository

Crossref

Directory of Open Access Journals

PubMed Central

Relating gene expression data on two-component systems to functional annotations in Escherichia coli

Background: Obtaining physiological insights from microarray experiments requires computational techniques that relate gene expression data to functional information. Traditionally, this has been done in two consecutive steps. The first step identifies important genes through clustering or statistical techniques, while the second step assigns biological functions to the identified groups. Recently, techniques have been developed that identify such relationships in a single step. Results: We have developed an algorithm that relates patterns of gene expression in a set of microarray experiments to functional groups in one step. Our only assumption is that patterns co-occur frequently. The effectiveness of the algorithm is demonstrated as part of a study of regulation by two-component systems in Escherichia coli. The significance of the relationships between expression data and functional annotations is evaluated based on density histograms that are constructed using product similarity among expression vectors. We present a biological analysis of three of the resulting functional groups of proteins, develop hypotheses for further biological studies, and test one of these hypotheses experimentally. A comparison with other algorithms and a different data set is presented. Conclusion: Our new algorithm is able to find interesting and biologically meaningful relationships, not found by other algorithms, in previously analyzed data sets. Scaling of the algorithm to large data sets can be achieved based on a theoretical model.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

SMOTE for high-dimensional class-imbalanced data

Author: A Fallahi
A Hinneburg
B Wallace
C Bunkhumpornpat
C Cortes
C Drummond
C Sotiriou
CM Bishop
DA Cieslak
E Fix
H Han
H He
J Pittman
J Wang
J Xiao
J Zhu
JV Hulse
K Beyer
KD MacIsaac
L Breiman
Lara Lusa
LD Miller
MA Shipp
N Iizuka
NV Chawla
P Radivojac
Q Gu
R Batuwita
R Blagus
R Development Core Team
R Johnson
R Tibshirani
RM Simon
Rok Blagus
S Daskalaki
S Doyle
S Dudoit
S Ramaswamy
SE Ertekin
T Fawcett
TP Speed
Y Guo
Y Liu
Publication venue: 'Springer Science and Business Media LLC'

Crossref