49 research outputs found

Data science

Author: A Silberschatz
AL Samuel
AV Aho
C Shearer
C Ware
CD Manning
CM Bishop
DE Knuth
EF Codd
ER Tufte
G van Rossum
GF Luger
H Chen
H Schütze
I Goodfellow
IH Witten
J von Neumann
J Spohrer
JF Hughes
Kurt Stockinger
L Deng
L Wall
L Wasserman
M Frické
M Stonebraker
R Ramakrishnan
RR Wilcox
S Bateman
S Chaudhuri
SJ Russell
U Fayyad
WH Inmon
ZC Holcomb
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/06/2019

Even though it has only entered public perception relatively recently, the term "data science" already means many things to many people. This chapter explores both top-down and bottom-up views on the field, on the basis of which we define data science as "a unique blend of principles and methods from analytics, engineering, entrepreneurship and communication that aim at generating value from the data itself". The chapter then discusses the disciplines that contribute to this "blend", briefly outlining their contributions and giving pointers for readers interested in exploring their backgrounds further.

Crossref

ZHAW digitalcollection

The Newcomb-Benford Law in Its Relation to Some Common Distributions

Author: A Berger
A Diekmann
AK Adhikari
Anton K. Formann
B Luque
DE Giles
DE Knuth
DN Hales
DV Hinkley
E Ley
F Benford
G Judge
GA Gottwald
H-A Engel
H-J Kim
J Torres
L Dümbgen
L Pietronero
LM Leemis
MJ Nigrini
MW Browne
NL Johnson
P Schatte
RA Raimi
Richard James Morris
RJ Rodriguez
RS Pinkham
S Irmay
S Newcomb
SJ Miller
T el Sehity
T Lolbert
TP Hill
TW Beer
W Hürlimann
WH Furry
É Janvresse
Publication venue: Public Library of Science
Publication date: 07/05/2010

An often reported but nevertheless persistently striking observation, formalized as the Newcomb-Benford law (NBL), is that the frequencies with which the leading digits of numbers occur in a large variety of data are far from uniform. Most striking is that in many data sets the leading digit 1 occurs in nearly one third of all cases. Proposed explanations for this uneven distribution of the leading digits include scale- and base-invariance. Little attention, however, has been paid to the interrelation between the distribution of the significant digits and the distribution of the observed variable. It is shown here by simulation that long right-tailed distributions of a random variable are compatible with the NBL, and that for distributions of the ratio of two random variables the fit generally improves. Distributions that do not put most of their mass on small values of the random variable (e.g. symmetric distributions) fail to fit. Hence, the validity of the NBL requires the predominance of small values and, for real-world data, a majority of small entities. Analyses of data on stock prices, the areas and numbers of inhabitants of countries, and the starting page numbers of papers from a bibliography support this conclusion. In all, these findings may help to understand the mechanisms behind the NBL and the conditions needed for its validity. The law is not only of scientific interest per se; it also has substantial implications, as can be seen from the fields where its practical use has been suggested. These range from the detection of irregularities in data (e.g. economic fraud) to optimizing the architecture of computers with regard to number representation, storage, and round-off errors.
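The contrast the abstract draws between long right-tailed and symmetric distributions can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulation: the distribution parameters (a log-normal with sigma 2, a Gaussian with mean 50 and standard deviation 5), the sample size, and all function names are assumptions chosen for illustration. It compares observed leading-digit frequencies with the NBL probabilities log10(1 + 1/d).

```python
import math
import random


def benford_probs():
    """NBL probabilities for leading digits 1..9: log10(1 + 1/d)."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def leading_digit(x):
    """First non-zero digit of a non-zero float, read from its repr."""
    for ch in repr(float(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("no leading digit for zero or non-finite input")


def digit_freqs(samples):
    """Relative frequencies of leading digits 1..9 among positive samples."""
    counts = [0] * 9
    total = 0
    for x in samples:
        if x > 0:
            counts[leading_digit(x) - 1] += 1
            total += 1
    return [c / total for c in counts]


def max_abs_deviation(freqs):
    """Largest absolute gap between observed frequencies and the NBL."""
    return max(abs(f - p) for f, p in zip(freqs, benford_probs()))


def compare(n=50_000, seed=0):
    """Deviation from the NBL for two illustrative (assumed) distributions."""
    rng = random.Random(seed)
    samples = {
        # Long right tail, most mass on small values.
        "lognormal": [rng.lognormvariate(0.0, 2.0) for _ in range(n)],
        # Symmetric, mass concentrated away from small values.
        "gaussian": [rng.gauss(50.0, 5.0) for _ in range(n)],
    }
    return {name: max_abs_deviation(digit_freqs(xs)) for name, xs in samples.items()}


if __name__ == "__main__":
    for name, dev in compare().items():
        print(f"{name}: max |observed - NBL| = {dev:.4f}")
```

Under these assumed parameters the log-normal sample should track the NBL closely, while the Gaussian sample concentrates on the digits 4 and 5 and deviates strongly. The ratio-of-variables case mentioned in the abstract is not covered by this sketch.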

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Network Compression as a Quality Measure for Protein Interaction Networks

Author: A Barabasi
A Breitkreutz
A Ceol
A Grigoriev
A Kocsor
A Langville
A Shevchenko
A Sorribas
A Whitty
A. Francis Stewart
AC Gavin
B Aranda
B Titz
BD MacArthur
BJ Breitkreutz
C von Mering
CE Shannon
CM Deane
D Hannah
D Minoli
DA Schneider
DE Knuth
DJ LaCount
DJ Watts
DL Lindstrom
E Formstecher
E Torreira
EL Hong
F Jin
G Butland
G Lima-Mendez
GD Bader
GJ Chaitin
GT Hart
H Dortay
H Herzel
H Lu
H Yu
HB Fraser
HW Mewes
I Lee
I Lemmens
J Leskovec
J Sun
J White
J Zhong
JC Claussen
JC Rain
JF Rual
JJ Heymans
JR Parrish
K Anand
K Norlen
K Tarassov
K Venkatesan
KH Randall
L Demetrius
L Giot
L Ji
L Kiemer
L Royer
L Salwinski
LJ Jensen
Loic Royer
M Arifuzzaman
M Dehmer
M Deng
M Harata
M Kao
M Li
M Pellegrini
Matthias Reimann
ME Cusick
MEJ Newman
Michael Schroeder
N Deo
N Simonis
NJ Krogan
O Weiss
P Boldi
P Braun
P Erdős
P Smialowski
P Uetz
Patrick Aloy
PM Kim
PW Holland
R Diestel
R Jansen
R Solé
RJ Deshaies
RM Ewing
S Fields
S Jukna
S Li
S Maslov
S Sato
SR Collins
T Feder
T Ito
T Manke
T Pawson
T Reguly
TSK Prasad
U Stelzl
V Colizza
W Cleveland
WH Wu
WK Huh
WT Tutte
X Shen
X Xin
Y Ho
Publication venue: Public Library of Science
Publication date: 01/01/2012

With the advent of large-scale protein interaction studies, there is much debate about data quality. Can different noise levels in the measurements be assessed by analyzing network structure? Because proteomic regulation is inherently co-operative, modular and redundant, it is compressible when represented as a network. Here we propose that network compression can be used to compare false positive and false negative noise levels in protein interaction networks. We validate this hypothesis by first confirming the detrimental effect of false positives and false negatives. Second, we show that gold standard networks are more compressible. Third, we show that compressibility correlates with co-expression, co-localization, and shared function. Fourth, we also observe correlation with better protein tagging methods, physiological expression in contrast to over-expression of tagged proteins, and smart pooling approaches for yeast two-hybrid screens. Overall, this new measure is a proxy for both sensitivity and specificity and gives complementary information to standard measures such as average degree and clustering coefficients.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Murine monoclonal antibodies specific for conserved and non-conserved antigenic determinants of the human and murine Ku autoantigens

Author: A Dvir
A Porges
AM Francoeur
CH Chou
E de Vries
J Wen
JE Celis
JW Goding
L Abu-Elheiga
M Falzon
M Reichlin
M Shlomchik
M Yaneva
MH Stuiver
MW Knuth
R Bravo
S Lees-Miller
S Paillard
SP Jackson
SP Lees-Miller
T Mimori
TM Gottlieb
WH Reeves
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Vectorized algorithm for multidimensional Monte Carlo integration on modern GPU, CPU and MIC architectures

Author: A Supalov
C Chen
D Szałkowski
DE Knuth
DR Ripoll
DV Pryor
EE Santos
H Niederreiter
RW Hockney
J Dongarra
J Jeffers
JM Bull
Przemysław Stpiczyński
R Chandra
R Hockney
R Rahman
T Hahn
WH Press
Y Li
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Variance component estimation techniques compared for two mating designs with forest genetic architecture through computer simulation

Author: A Olsen
AR Hallauer
B Griffing
CR Henderson
CR Rao
D. A. Huber
DA Harville
DE Knuth
DF Matzinger
EJG Pittman
FE Bridgwater
FG Giesbrecht
G. R. Hodge
GA Milliken
HD Patterson
HO Hartley
JA Loo-Dinkins
JH Goodnight
JH Klotz
JJ Miller
K Meyer
M Singh
MD Wilcox
MG Kendall
RG Shaw
RJ Weir
RR Corbeil
RV Hogg
SR Searle
T. L. White
WH Press
WH Swallow
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A symmetry property for $q$-weighted Robinson–Schensted and other branching insertion algorithms

Author: A Gerasimov
C Greene
C Schensted
DE Knuth
I MacDonald
K Johansson
N O’Connell
P Etingof
R Stanley
S Fomin
SNM Ruijsenaars
T Sasamoto
WH Burge
Yuchen Pei
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Solving parameterised boolean equation systems with infinite data through quotienting

Author: A Bouajjani
D Park
DE Knuth
E Clarke
G Behrmann
Gijs Kant
H Garavel
JF Groote
JJA Keiren
Kathi Fisler
L Lamport
P Fontana
R Alur
RPJ Koolen
S Cranen
S Tripakis
Sjoerd Cranen
T Chen
TAC Willemse
WH Hesselink
Yutaro Nagae
Publication venue: Springer
Publication date: 05/10/2018

Parameterised Boolean Equation Systems (PBESs) can be used to represent many different kinds of decision problems. Most notably, model checking and equivalence problems can be encoded in a PBES. Traditional instantiation techniques cannot deal with PBESs with an infinite data domain. We propose an approach that can solve PBESs with infinite data by computing the bisimulation quotient of the underlying graph structure. Furthermore, we show how this technique can be improved by repeatedly searching for finite proofs. Unlike existing approaches, our technique is not restricted to subfragments of PBESs. Experimental results show that our ideas work well in practice and support a wider range of models and properties than state-of-the-art techniques.
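To make the notion of a bisimulation quotient concrete, here is a minimal sketch of naive partition refinement on a small, explicit, finite graph. It is not the paper's technique, which targets infinite data domains and operates on PBESs. The graph encoding (a label per state, a successor set per state, unlabelled edges) and the function name `bisimulation_blocks` are assumptions made for illustration.

```python
def bisimulation_blocks(states, labels, edges):
    """Group the states of a finite graph into strong-bisimulation classes.

    states: list of hashable state names
    labels: dict mapping each state to its label
    edges:  dict mapping a state to the set of its successors
    Returns a dict mapping each state to an integer block id.
    """
    states = list(states)
    # Start with one block per label.
    block_of = {s: labels[s] for s in states}
    while True:
        # A state's signature is its current block plus the set of
        # blocks reachable in one step.
        signature = {
            s: (block_of[s], frozenset(block_of[t] for t in edges.get(s, ())))
            for s in states
        }
        ids = {}
        refined = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        # Refinement only splits blocks, so an unchanged block count
        # means the partition is stable.
        if len(set(refined.values())) == len(set(block_of.values())):
            return refined
        block_of = refined


if __name__ == "__main__":
    labels = {0: "a", 1: "a", 2: "a", 3: "a", 4: "a"}
    edges = {0: {1}, 1: {0}, 2: {2}, 3: set(), 4: {3}}
    print(bisimulation_blocks(list(labels), labels, edges))
```

In this toy graph, states 0, 1 and 2 can each step forever and end up in one block, state 3 is a deadlock, and state 4 is separated only in the second refinement round because its sole successor is the deadlock.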

Crossref

Pure OAI Repository

Stochastic nonlinear dynamics pattern formation and growth models

Author: A Turing
B Mandelbrot
DE Knuth
H Meinhardt
H-O Peitgen
JD Murray
L Yaroslavsky
Leonid P Yaroslavsky
LP Yaroslavsky
M Argentina
M Eden
M Gardner
Ph Brodatz
VV Lyubarsky
WH Press
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Construction of cospectral graphs

Author: A Peres
AJ Schwenk
Bibhas Adhikari
CD Godsil
CW Wu
DE Knuth
DM Cvetkovic
ER van Dam
F Harary
H Fujii
L Halbeisen
P Horodecki
P Rowlinson
R Hildebrand
RA Horn
S Dutta
SR Garcia
Supriyo Dutta
WH Haemers
Publication venue: 'Springer Science and Business Media LLC'

Crossref