1,523 research outputs found

What are the Best Hierarchical Descriptors for Complex Networks?

Author: Abe S
Clauset A Moore C Newman M E
Costa L da F
Costa L da F Kaiser M Hilgetag C
Duda R O
Erdos P
Han J
Hand D
Luciano da Fontoura Costa
McLachlan G J
Roberto Fernandes Silva Andrade
Wasserman S
Publication venue: 'IOP Publishing'
Publication date: 29/05/2007

This work reviews several hierarchical measurements of the topology of complex networks and then applies feature selection concepts and methods in order to quantify the relative importance of each measurement with respect to the discrimination between four representative theoretical network models, namely Erdős-Rényi, Barabási-Albert, Watts-Strogatz, as well as a geographical type of network. The obtained results confirmed that the four models can be well separated by using a combination of measurements. In addition, the relative contribution of each considered feature to the overall discrimination of the models was quantified in terms of the respective weights in the canonical projection into two dimensions, with the traditional clustering coefficient, hierarchical clustering coefficient and neighborhood clustering coefficient proving particularly effective. Interestingly, the average shortest path length and hierarchical node degrees contributed little to the separation of the four network models. Comment: 9 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Searching for differentially expressed gene combinations

Author: Dettling Marcel
Gabrielson Edward
Parmigiani Giovanni
Publication venue: BioMed Central
Publication date: 01/01/2005

We propose 'CorScor', a novel approach for identifying gene pairs with joint differential expression. This is defined as a situation with good phenotype discrimination in the bivariate distribution, but not in the two marginal distributions. CorScor can be used to detect phenotype-related dependencies and interactions among genes. Our easily interpretable approach is scalable to current microarray dimensions and yields promising results on several cancer gene expression datasets.

Springer - Publisher Connector

PubMed Central

ZHAW digitalcollection

Collection Of Biostatistics Research Archive

Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression

Author: Bühlmann Peter
Meinshausen Nicolai
Michel Loris
Näf Jeffrey
Ćevid Domagoj
Publication date: 28/05/2021

Random Forests (Breiman, 2001) is a successful and widely used regression and classification algorithm. Part of its appeal and versatility comes from its (implicit) construction of a kernel-type weighting function on training data, which can also be used for targets other than the original mean estimation. We propose a novel forest construction for multivariate responses based on their joint conditional distribution, independent of the estimation target and the data model. It uses a new splitting criterion based on the MMD distributional metric, which is suitable for detecting heterogeneity in multivariate distributions. The induced weights define an estimate of the full conditional distribution, which in turn can be used for arbitrary and potentially complicated targets of interest. The method is very versatile and convenient to use, as we illustrate on a wide range of examples. The code is available as the Python and R packages drf.
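To make the MMD idea in the abstract above concrete, the following is a minimal sketch of a squared maximum mean discrepancy estimate between two samples. It is not the drf package's implementation: the Gaussian kernel, the fixed bandwidth, the biased (V-statistic) estimator and the function names gaussian_kernel and mmd_squared are all assumptions chosen for illustration. A value near zero suggests the two samples look alike under the kernel; larger values indicate heterogeneity.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length sequences."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples xs and ys.

    xs and ys are non-empty lists of points; each point is a sequence of floats.
    The kernel choice and bandwidth are illustrative assumptions.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    left = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
    right = [(3.0, 3.1), (2.9, 3.0), (3.2, 2.8)]
    # Compare a sample with itself, then with a clearly shifted sample.
    print(mmd_squared(left, left))
    print(mmd_squared(left, right))
```

In a tree-based setting, a quantity of this kind could be evaluated on the responses falling on either side of a candidate split, with splits that yield a larger discrepancy being preferred; how the actual method scores and approximates this is described in the paper, not here.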

arXiv.org e-Print Archive

Measuring Categorical Perception in Color-Coded Scatterplots

Author: Quadri Ghulam Jilani
Szafir Danielle Albers
Tseng Chin
Wang Zeyu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/03/2023

Scatterplots commonly use color to encode categorical data. However, as datasets increase in size and complexity, the efficacy of these channels may vary. Designers lack insight into how robust different design choices are to variations in category numbers. This paper presents a crowdsourced experiment measuring how the number of categories and the choice of color encodings used in multiclass scatterplots influence viewers' abilities to analyze data across classes. Participants estimated relative means in a series of scatterplots with 2 to 10 categories encoded using ten color palettes drawn from popular design tools. Our results show that the number of categories and color discriminability within a color palette notably impact people's perception of categorical data in scatterplots and that the judgments become harder as the number of categories grows. We examine existing palette design heuristics in light of our results to help designers make robust color choices informed by the parameters of their data. Comment: The paper has been accepted to ACM CHI 2023. 14 pages, 7 figures

arXiv.org e-Print Archive

Phonological awareness in preschool age children with developmental disabilities

Author: Barton-Hulsey Andrea
Publication venue: ScholarWorks @ Georgia State University
Publication date: 12/08/2016

Reading skills are critically important for a child’s development and continued growth in school. The home and school literacy experiences of children who have developmental disabilities have been found to be qualitatively different from the experiences of their same-age peers without disabilities. In addition to access to instruction, a number of intrinsic factors including cognitive ability, receptive language and expressive speech skills have been suggested as factors that may place children with developmental disabilities at a greater risk for limited development of reading skills. Currently, little is understood about how children who have developmental disabilities and may have limitations in productive speech learn to read. This study identifies key intrinsic and extrinsic factors that are related to the development of phonological awareness in 42 children between 4 years and 5 years 9 months of age with developmental disabilities and a range of speech abilities. The aims of this project were to (1) systematically assess children’s intrinsic factors of speech ability, receptive and expressive language and vocabulary, cognitive skills and phonological awareness to determine key intrinsic factors related to phonological awareness and (2) describe the extrinsic factors of home literacy experience and preschool literacy instruction provided to children. Children were found to have frequent and positive home literacy experiences. No significant correlations between speech ability and frequency of shared reading experiences were found. Parents reported low levels of preschool literacy instruction. Significant correlations were found between instruction in decoding and word recognition and children’s sound-symbol awareness. Correlations were found between the use of technology and media and Augmentative and Alternative Communication (AAC) and children’s speech ability.
Positive, significant relationships were found between phonological awareness and all direct assessment measures of developmental skill, speech ability and early reading skills but were not found between phonological awareness and home or school literacy experiences. Speech ability did not predict a significant amount of variance in phonological awareness skill beyond what would be expected by cognitive development, receptive language and orthographic knowledge. This study provides important implications for practitioners and researchers alike concerning the factors related to early reading development in children with limited speech ability.

ScholarWorks @ Georgia State University

Rapid Visual Categorization is not Guided by Early Salience-Based Selection

Author: Kotseruba Iuliia
Tsotsos John K.
Wloka Calden
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019

The current dominant visual processing paradigm in both human and machine research is the feedforward, layered hierarchy of neural-like processing elements. Within this paradigm, visual saliency is seen by many to have a specific role, namely that of early selection. Early selection is thought to enable very fast visual performance by limiting processing to only the most salient candidate portions of an image. This strategy has led to a plethora of saliency algorithms that have indeed improved processing time efficiency in machine algorithms, which in turn have strengthened the suggestion that human vision also employs a similar early selection strategy. However, at least one set of critical tests of this idea has never been performed with respect to the role of early selection in human vision. How would the best of the current saliency models perform on the stimuli used by experimentalists who first provided evidence for this visual processing paradigm? Would the algorithms really provide correct candidate sub-images to enable fast categorization on those same images? Do humans really need this early selection for their impressive performance? Here, we report on a new series of tests of these questions whose results suggest that it is quite unlikely that such an early selection process has any role in human rapid visual categorization. Comment: 22 pages, 9 figures

arXiv.org e-Print Archive

Directory of Open Access Journals

Leaf Morphology, Taxonomy and Geometric Morphometrics: A Simplified Protocol for Beginners

Author: A Bell
A Cardini
A Roth-Nebelsick
AL Albarrán-Lara
Andrea Cardini
B Buchanan
BFJ Manly
C Fadda
Carles Lalueza-Fox
CE Oxnard
CJ Breuker
CP Klingenberg
DC Adams
DC Howell
DW Thompson
FJ Rohlf
FL Bookstein
G Albrecht
G Antonecchia
GN Stone
H Tsukaya
HD Sheets
IL Dryden
J Claude
J McDonald
JM Gómez
JM Peñaloza-Ramirez
K De Queiroz
K Kovarovic
LF Marcus
M Frieß
M-J Fortin
ML Zelditch
N MacLeod
NA Neff
P Legendre
P Mitteroecker
P O'Higgins
P Sanfilippo
PH Harvey
RA Fisher
RJ Jensen
RR Sokal
S Elton
S Huttegger
SJ Gould
SM Stanley
T Van der Niet
V Chickarmane
V Debat
V Viscosi
Vincenzo Viscosi
Ø Hammer
Publication venue: Public Library of Science
Publication date: 01/01/2011

Taxonomy relies greatly on morphology to discriminate groups. Computerized geometric morphometric methods for quantitative shape analysis measure, test and visualize differences in form in a highly effective, reproducible, accurate and statistically powerful way. Plant leaves are commonly used in taxonomic analyses and are particularly suitable for landmark-based geometric morphometrics. However, botanists do not yet seem to have taken advantage of this set of methods in their studies as much as zoologists have done. Using free software and an example dataset from two geographical populations of sessile oak leaves, we describe in detailed but simple terms how to: a) compute size and shape variables using Procrustes methods; b) test measurement error and the main levels of variation (population and trees) using a hierarchical design; c) estimate the accuracy of group discrimination; d) repeat this estimate after controlling for the effect of size differences on shape (i.e., allometry). Measurement error was completely negligible; individual variation in leaf morphology was large and differences between trees were generally bigger than within trees; differences between the two geographic populations were small in both size and shape; despite a weak allometric trend, controlling for the effect of size on shape slightly increased discrimination accuracy. Procrustes-based methods for the analysis of landmarks were highly efficient in measuring the hierarchical structure of differences in leaves and in revealing very small-scale variation. In taxonomy and many other fields of botany and biology, the application of geometric morphometrics helps increase scientific rigour in the description of important aspects of the phenotypic dimension of biodiversity.
Easy-to-follow but detailed step-by-step example studies can promote a more extensive use of these numerical methods, as they provide an introduction to the discipline which, for many biologists, is less intimidating than the often inaccessible specialist literature.
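As a rough illustration of step a) in the abstract above (size and shape variables from Procrustes methods), the sketch below superimposes one two-dimensional landmark configuration onto another. It is not the protocol or software used in the paper: it handles only a pair of configurations (ordinary rather than generalized Procrustes analysis), ignores reflection, and the function names to_shape and procrustes_distance are hypothetical. Landmarks are treated as complex numbers so that the optimal rotation has a closed form.

```python
import math


def to_shape(landmarks):
    """Center 2D landmarks and scale them to unit centroid size.

    landmarks is a list of (x, y) pairs with at least two distinct points.
    Returns (shape, centroid_size), where shape is a list of complex numbers.
    """
    points = [complex(x, y) for x, y in landmarks]
    centroid = sum(points) / len(points)
    centered = [p - centroid for p in points]
    # Centroid size: square root of the summed squared distances to the centroid.
    centroid_size = math.sqrt(sum(abs(p) ** 2 for p in centered))
    return [p / centroid_size for p in centered], centroid_size


def procrustes_distance(reference, target):
    """Partial Procrustes distance after removing position, size and rotation.

    Both inputs must list the same landmarks in the same order.
    """
    ref_shape, _ = to_shape(reference)
    tgt_shape, _ = to_shape(target)
    # The least-squares rotation of target onto reference is the unit complex
    # number pointing in the direction of sum(ref * conj(target)).
    cross = sum(a * b.conjugate() for a, b in zip(ref_shape, tgt_shape))
    rotation = cross / abs(cross) if abs(cross) > 0 else complex(1.0, 0.0)
    aligned = [rotation * b for b in tgt_shape]
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(ref_shape, aligned)))


if __name__ == "__main__":
    leaf_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
    leaf_b = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
    print(to_shape(leaf_a)[1])                 # centroid size of leaf_a
    print(procrustes_distance(leaf_a, leaf_b))  # shape difference
```

Centroid size serves as the size variable and the aligned coordinates as shape variables; a study with many specimens would instead align all configurations iteratively to a mean shape.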

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia