Towards a Theory of Scale-Free Graphs: Definition, Properties, and Implications (Extended Version)
Although the "scale-free" literature is large and growing, it gives neither
a precise definition of scale-free graphs nor rigorous proofs of many of their
claimed properties. In fact, it is easily shown that the existing theory has
many inherent contradictions and verifiably false claims. In this paper, we
propose a new, mathematically precise, and structural definition of the extent
to which a graph is scale-free, and prove a series of results that recover many
of the claimed properties while suggesting the potential for a rich and
interesting theory. With this definition, scale-free (or its opposite,
scale-rich) is closely related to other structural graph properties such as
various notions of self-similarity (or respectively, self-dissimilarity).
Scale-free graphs are also shown to be the likely outcome of random
construction processes, consistent with the heuristic definitions implicit in
existing random graph approaches. Our approach clarifies much of the confusion
surrounding the sensational qualitative claims in the scale-free literature,
and offers rigorous and quantitative alternatives.
Comment: 44 pages, 16 figures. The primary version is to appear in Internet Mathematics (2005).
Maximum Entropy Models For Natural Language Ambiguity Resolution
This thesis demonstrates that several important kinds of natural language ambiguities can be resolved to state-of-the-art accuracies using a single statistical modeling technique based on the principle of maximum entropy.
We discuss the problems of sentence boundary detection, part-of-speech tagging, prepositional phrase attachment, natural language parsing, and text categorization under the maximum entropy framework. In practice, we have found that maximum entropy models offer the following advantages:
State-of-the-art Accuracy: The probability models for all of the tasks discussed perform at or near state-of-the-art accuracies, or outperform competing learning algorithms when trained and tested under similar conditions. Methods which outperform those presented here require much more supervision in the form of additional human involvement or additional supporting resources.
Knowledge-Poor Features: The facts used to model the data, or features, are linguistically very simple, or knowledge-poor, yet succeed in approximating complex linguistic relationships.
Reusable Software Technology: The mathematics of the maximum entropy framework are essentially independent of any particular task, and a single software implementation can be used for all of the probability models in this thesis.
The experiments in this thesis suggest that experimenters can obtain state-of-the-art accuracies on a wide range of natural language tasks, with little task-specific effort, by using maximum entropy probability models.
Understanding Internet topology: principles, models, and validation
Building on a recent effort that combines a first-principles approach to modeling router-level connectivity with a more pragmatic use of statistics and graph theory, we show in this paper that for the Internet, an improved understanding of its physical infrastructure is possible by viewing the physical connectivity as an annotated graph that delivers raw connectivity and bandwidth to the upper layers in the TCP/IP protocol stack, subject to practical constraints (e.g., router technology) and economic considerations (e.g., link costs). More importantly, by relying on data from Abilene, a Tier-1 ISP, and the Rocketfuel project, we provide empirical evidence in support of the proposed approach and its consistency with networking reality. To illustrate its utility, we: 1) show that our approach provides insight into the origin of high variability in measured or inferred router-level maps; 2) demonstrate that it easily accommodates the incorporation of additional objectives of network design (e.g., robustness to router failure); and 3) discuss how it complements ongoing community efforts to reverse-engineer the Internet.
Looking Beyond the Canonical Formulation and Evaluation Paradigm of Prepositional Phrase Attachment
Prepositional phrase attachment has long been considered one of the most difficult tasks in automated syntactic parsing of natural language text. In this thesis, we examine several aspects of what has become the dominant view of PP attachment in natural language processing with an eye toward extending this view to a more realistic account of the problem. In particular, we take issue with the manner in which most PP attachment work is evaluated, and the degree to which traditional assumptions and simplifications no longer allow for realistically meaningful assessments. We also argue for looking beyond the canonical subset of attachment problems, where almost all attention has been focused, toward a fuller view of the task, both in terms of the types of ambiguities addressed and the contextual information considered.
An assessment of brand experience knowledge literature: using bibliometric data to identify future research direction
There is wide consensus that the brand experience literature (BEL) suffers from a deficit in conceptual works. This study argues that, for brand experience research to overcome its conceptual insipidity, it must reexamine the core of its intellectual structure to rediscover what ‘an experience provided by brands’ truly implies. The purpose of this paper is to reconceptualize and present a future research framework for research into the concept of brand experience, by identifying both the core and peripheral sources of knowledge of the concept and its association with brand meaning. Through a bibliometric process covering 136 articles published between 2002 and 2018, resulting in a database of 2,698 citations, this brand experience conceptual paper fills a critical research gap by providing the first full-scale bibliometric study to date of the BEL, using a combination of high citation and co-citation metrics. Based on this conceptual reorientation, a matrix for future development is presented, enabling the reader to visualize the scope and breadth of potential brand experience research horizons in areas relating to customer experience, consumer-brand relationship, online brand experience and sensory brand experience. The four approaches listed in the matrix – firm-based, social constructionist, virtuality and embodiment – provide a roadmap for future brand experience research undertakings to explore the rich potential of experience evoked by brands
Using the Web to Overcome Data Sparseness
This paper shows that the web can be employed to obtain frequencies for bigrams that are unseen in a given corpus. We describe a method for retrieving counts for adjective-noun, noun-noun, and verb-object bigrams from the web by querying a search engine. We evaluate this method by demonstrating that web frequencies correlate with frequencies obtained from a carefully edited, balanced corpus.