11,929 research outputs found

Exploratory Analysis of Highly Heterogeneous Document Collections

Author: Blei D. M.
Bun K. K.
Maiya A. S.
Manning C. D.
Mihalcea R.
Pecina P.
Ranganathan S. R.
Wagstaff K.
Publication date: 01/01/2013

We present an effective multifaceted system for exploratory analysis of highly heterogeneous document collections. Our system is based on intelligently tagging individual documents in a purely automated fashion and exploiting these tags in a powerful faceted browsing framework. Tagging strategies employed include both unsupervised and supervised approaches based on machine learning and natural language processing. As one of our key tagging strategies, we introduce the KERA algorithm (Keyword Extraction for Reports and Articles). KERA extracts topic-representative terms from individual documents in a purely unsupervised fashion and is revealed to be significantly more effective than state-of-the-art methods. Finally, we evaluate our system in its ability to help users locate documents pertaining to military critical technologies buried deep in a large heterogeneous sea of information. Comment: 9 pages; KDD 2013: 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

arXiv.org e-Print Archive

CiteSeerX

Crossref

Inferential mistakes in population proxies: A response to Torfing's "Neolithic population and summed probability distribution of 14C-dates"

Author: Manning K
Shennan S
Timpson A
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 26/06/2015

In his paper "Neolithic population and summed probability distribution of 14C-dates", Torfing opposes the widely held principle, originally proposed by Rick (1987), that variation through time in the amount of archaeological material discovered in a region will reflect variation in the size of that local human population. His argument illustrates a persistent divide in archaeology between analytical and descriptive approaches when using proxies for past population size. We critically evaluate the numerous inferential mistakes he makes, showing that his conclusion is unjustified.

UCL Discovery

Scaling and Universality in the Counterion-Condensation Transition at Charged Cylinders

Author: A. G. Moreira
Ali Naji
B. H. Zimm
F. Oosawa
G. S. Manning
J. Cardy
K. Binder
M. Le Bret
R. Zana
Roland R. Netz
T. Ohnishi
Y. Y. Suzuki
Publication venue: 'American Physical Society (APS)'
Publication date: 31/08/2005

We address the critical and universal aspects of the counterion-condensation transition at a single charged cylinder in both two and three spatial dimensions using numerical and analytical methods. By introducing a novel Monte-Carlo sampling method in logarithmic radial scale, we are able to numerically simulate the critical limit of infinite system size (corresponding to the infinite-dilution limit) within tractable equilibration times. The critical exponents are determined for the inverse moments of the counterionic density profile (which play the role of the order parameters and represent the inverse localization length of counterions) both within mean-field theory and within Monte-Carlo simulations. In three dimensions (3D), correlation effects (neglected within mean-field theory) lead to an excessive accumulation of counterions near the charged cylinder below the critical temperature (condensation phase), while, surprisingly, the critical region exhibits universal critical exponents in accord with the mean-field theory. In two dimensions (2D), we demonstrate, using both numerical and analytical approaches, that the mean-field theory becomes exact at all temperatures (Manning parameters) when the number of counterions tends to infinity. For finite particle number, however, the 2D problem displays a series of peculiar singular points (with diverging heat capacity), which reflect successive de-localization events of individual counterions from the central cylinder. In both 2D and 3D, the heat capacity shows a universal jump at the critical point, and the energy develops a pronounced peak. The asymptotic behavior of the energy peak location is used to locate the critical temperature, which is also found to be universal and in accordance with the mean-field prediction. Comment: 31 pages, 16 figures

arXiv.org e-Print Archive

Crossref

Sofic-Dyck shifts

Author: A. Manning
C. Reutenauer
G. Keller
J. Berstel
K. Culik II
R. McNaughton
S. Ginsburg
W. Krieger
Publication date: 01/01/2014

We define the class of sofic-Dyck shifts, which extends the class of Markov-Dyck shifts introduced by Inoue, Krieger and Matsumoto. Sofic-Dyck shifts are shifts of sequences whose finite factors form unambiguous context-free languages. We show that they correspond exactly to the class of shifts of sequences whose sets of factors are visibly pushdown languages. We give an expression for the zeta function of a sofic-Dyck shift.

arXiv.org e-Print Archive

Crossref

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Genome-Wide Association with Diabetes-Related Traits in the Framingham Heart Study

Author: Cupples L. Adrienne
Dupuis Josée
Florez Jose C.
Fox Caroline S.
Liu Chunyu
Manning Alisa K.
Meigs James B.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/09/2007

BACKGROUND: Susceptibility to type 2 diabetes may be conferred by genetic variants having modest effects on risk. Genome-wide fixed marker arrays offer a novel approach to detect these variants. METHODS: We used the Affymetrix 100K SNP array in 1,087 Framingham Offspring Study family members to examine genetic associations with three diabetes-related quantitative glucose traits (fasting plasma glucose (FPG), hemoglobin A1c, 28-yr time-averaged FPG (tFPG)), three insulin traits (fasting insulin, HOMA-insulin resistance, and 0–120 min insulin sensitivity index); and with risk for diabetes. We used additive generalized estimating equations (GEE) and family-based association test (FBAT) models to test associations of SNP genotypes with sex-age-age2-adjusted residual trait values, and Cox survival models to test incident diabetes. RESULTS: We found 415 SNPs associated (at p 1%) 100K SNPs in LD (r2 > 0.05) with ABCC8 A1369S (rs757110), KCNJ11 E23K (rs5219), or SNPs in CAPN10 or HNFa. PPARG P12A (rs1801282) was not significantly associated with diabetes or related traits. CONCLUSION: Framingham 100K SNP data is a resource for association tests of known and novel genes with diabetes and related traits posted at. Framingham 100K data replicate the TCF7L2 association with diabetes. National Heart, Lung, and Blood Institute's Framingham Heart Study (N01-HC-25195); National Institutes of Health National Center for Research Resources Shared Instrumentation grant (1S10RR163736-01A1); National Center for Research Resources General Clinical Research Center (M01-RR-01066); American Diabetes Association Career Development Award; GlaxoSmithKline; Merck; Lilly; National Institutes of Health Research Career Award (K23 DK659678-03)

Boston University Institutional Repository (OpenBU)

PubMed Central

Why is the condensed phase of DNA preferred at higher temperature? DNA compaction in the presence of a multivalent cation

Author: Dean J. A.
Gelbart W. M.
Grosberg A. Yu.
Heskins M.
K Yoshikawa
Levin Y.
Manning G. S.
Oosawa F.
T Iwaki
T Saito
Yamasaki Y.
Yoshikawa K.
Publication venue: 'IOP Publishing'
Publication date: 01/11/2004

Upon the addition of multivalent cations, a giant DNA chain exhibits a large discrete transition from an elongated coil into a folded compact state. We performed single-chain observation of long DNAs in the presence of a tetravalent cation (spermine), at various temperatures and monovalent salt concentrations. We confirmed that the compact state is preferred at higher temperatures and at lower monovalent salt concentrations. This result is interpreted in terms of an increase in the net translational entropy of small ions due to ionic exchange between higher and lower valence ions. Comment: 4 pages, 3 figures

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

$^{24}$Mg($p$,$\alpha$)$^{21}$Na reaction study for spectroscopy of $^{21}$Na

Author: Ahn S.
Bardayan D. W.
Cha S. M.
Chae K. Y.
Chipps K. A.
Cizewski J. A.
Howard M. E.
Kim A.
Kozub R. L.
Lee E. J.
Manning B.
Matos M.
O'Malley P. D.
Pain S. D.
Peters W. A.
Pittman S. T.
Ratkiewicz A.
Smith M. S.
Strauss S.
Publication date: 10/08/2015

The $^{24}$Mg($p$,$\alpha$)$^{21}$Na reaction was measured at the Holifield Radioactive Ion Beam Facility at Oak Ridge National Laboratory in order to better constrain spins and parities of energy levels in $^{21}$Na for the astrophysically important $^{17}$F($\alpha$,$p$)$^{20}$Ne reaction rate calculation. 31 MeV proton beams from the 25-MV tandem accelerator and enriched $^{24}$Mg solid targets were used. Recoiling $^{4}$He particles from the $^{24}$Mg($p$,$\alpha$)$^{21}$Na reaction were detected by a highly segmented silicon detector array which measured the yields of $^{4}$He particles over a range of angles simultaneously. A new level at 6661 $\pm$ 5 keV was observed in the present work. The extracted angular distributions for the first four levels of $^{21}$Na and Distorted Wave Born Approximation (DWBA) calculations were compared to verify and extract angular momentum transfer. Comment: 11 pages, 6 figures, proceedings of the 18th International Conference on Accelerators and Beam Utilization (ICABU2014)

arXiv.org e-Print Archive

T${}^2$K${}^2$: The Twitter Top-K Keywords Benchmark

Author: A Guille
AE Gattiker
CD Manning
D Kılınç
DD Lewis
F Ravat
J Darmont
J Ferrarons
J Gray
J O’Shea
JD Cooper
K Spärck Jones
L Wang
S Bringay
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/09/2017

Information retrieval from textual data focuses on the construction of vocabularies that contain weighted term tuples. Such vocabularies can then be exploited by various text analysis algorithms to extract new knowledge, e.g., top-k keywords, top-k documents, etc. Top-k keywords are casually used for various purposes, are often computed on-the-fly, and thus must be efficiently computed. To compare competing weighting schemes and database implementations, benchmarking is customary. To the best of our knowledge, no benchmark currently addresses these problems. Hence, in this paper, we present a top-k keywords benchmark, T${}^2$K${}^2$, which features a real tweet dataset and queries with various complexities and selectivities. T${}^2$K${}^2$ helps evaluate weighting schemes and database implementations in terms of computing performance. To illustrate T${}^2$K${}^2$'s relevance and genericity, we successfully performed tests on the TF-IDF and Okapi BM25 weighting schemes, on one hand, and on different relational (Oracle, PostgreSQL) and document-oriented (MongoDB) database implementations, on the other hand.

arXiv.org e-Print Archive

Understanding Crime Scene Examination through an ethnographic lens

Author: Brewer J. D.
Casey E.
Holdaway S.
Knorr‐Cetina K.
Kruse C.
Latour B.
Manning P. K.
Tilley N.
Williams R.
Wilson‐Kovacs D.
Publication venue: 'Wiley'
Publication date: 30/07/2019

Crossref

King's Research Portal