
    Knowledge management and history

    Capitalising on the history of a technology, a technique, or a concept within an industrial company is relevant to historians, but from a Knowledge Management point of view the problem extends well beyond purely historical questions. In this context it can be addressed through specific approaches, notably Knowledge Engineering. Two kinds of difficulty arise: historians of technology have few modelling tools at their disposal, and are often reluctant to use them; and Knowledge Engineering rarely addresses the modelling of historical knowledge, that is, the tracing of how knowledge evolves. It is nevertheless possible to develop robust, validated methods, tools, and techniques that take both approaches into account; working in synergy, they prove rich and fertile.
    Keywords: History, MASK, Knowledge management, Knowledge engineering, History of techniques

    Verbose, Laconic or Just Right: A Simple Computational Model of Content Appropriateness under Length Constraints

    Length constraints impose implicit requirements on the type of content that can be included in a text. Here we propose the first model to computationally assess whether a text deviates from these requirements. Specifically, our model predicts the appropriate length for texts based on content types present in a snippet of constant length. We consider a range of features to approximate content type, including syntactic phrasing, constituent compression probability, presence of named entities, sentence specificity and intersentence continuity. Weights for these features are learned using a corpus of summaries written by experts and on high-quality journalistic writing. During test time, the difference between actual and predicted length allows us to quantify text verbosity. We use data from manual evaluation of summarization systems to assess the verbosity scores produced by our model. We show that the automatic verbosity scores are significantly negatively correlated with manual content quality scores given to the summaries.
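    The scoring idea in this abstract can be sketched in a few lines. This is a toy stand-in, not the paper's model: the single feature (named-entity density) and all training values are invented, and a one-variable least-squares line replaces the paper's learned weights over the full feature set. The point is only the verbosity score itself: actual length minus predicted length.

    ```python
    def fit_length_model(feature_rows, lengths):
        """Fit a one-feature least-squares line length ~ a*x + b
        (a toy stand-in for learned weights over richer content features)."""
        n = len(lengths)
        xs = [row[0] for row in feature_rows]
        mx = sum(xs) / n
        my = sum(lengths) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, lengths))
        var = sum((x - mx) ** 2 for x in xs)
        a = cov / var
        b = my - a * mx
        return a, b

    def verbosity_score(model, features, actual_length):
        """Positive -> longer than the content warrants (verbose);
        negative -> shorter (laconic); near zero -> 'just right'."""
        a, b = model
        predicted = a * features[0] + b
        return actual_length - predicted

    # Hypothetical training data: feature = named-entity density of a snippet.
    train_feats = [[0.1], [0.2], [0.3], [0.4]]
    train_lens = [100.0, 150.0, 200.0, 250.0]
    model = fit_length_model(train_feats, train_lens)
    print(verbosity_score(model, [0.25], 220.0))  # positive: the text runs long
    ```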

    Quantity, Quality, and Relevance: Central Bank Research, 1990-2003

    The authors document the research output of 34 central banks from 1990 to 2003, and use proxies of research inputs to measure the research productivity of central banks over this period. Results are obtained with and without controlling for quality and for policy relevance. The authors find that, overall, central banks have been hiring more researchers and publishing more research since 1990, with the United States accounting for more than half of all published central bank research output, although the European Central Bank is rapidly establishing itself as an important research centre. When controlling for research quality and relevance, the authors generally find that there is no clear relationship between the size of an institution and its productivity. They also find preliminary evidence of positive correlations between the policy relevance and the scientific quality of central bank research. There is only very weak evidence of a positive correlation between the quantity of external partnerships and the productivity of researchers in central banks.
    Keywords: Central bank research

    Structured and Unstructured Cache Models for SMT Domain Adaptation

    We present a French to English translation system for Wikipedia biography articles. We use training data from out-of-domain corpora and adapt the system for biographies. We propose two forms of domain adaptation. The first biases the system towards words likely in biographies and encourages repetition of words across the document. Since biographies in Wikipedia follow a regular structure, our second model exploits this structure as a sequence of topic segments, where each segment discusses a narrower subtopic of the biography domain. In this structured model, the system is encouraged to use words likely in the current segment’s topic rather than in biographies as a whole. We implement both systems using cache-based translation techniques. We show that a system trained on Europarl and news can be adapted for biographies with 0.5 BLEU score improvement using our models. Further, the structure-aware model outperforms the system which treats the entire document as a single segment.
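    The cache idea behind both adaptation models can be sketched as follows. All numbers (decay rate, interpolation weight, background probabilities) are illustrative, not the paper's settings, and a unigram cache stands in for the full translation and language model caches: words used recently in the document get a boosted probability, interpolated with a static background model.

    ```python
    from collections import defaultdict

    class CacheLM:
        """Toy decaying-cache language model for document-level adaptation."""

        def __init__(self, background, decay=0.95, lam=0.3):
            self.background = background   # dict: word -> background probability
            self.cache = defaultdict(float)
            self.decay = decay             # older cache entries fade away
            self.lam = lam                 # weight of the cache component

        def observe(self, word):
            """Decay all cache scores, then reward the word just produced."""
            for w in self.cache:
                self.cache[w] *= self.decay
            self.cache[word] += 1.0

        def prob(self, word):
            """Interpolate the cache distribution with the background model."""
            total = sum(self.cache.values())
            cache_p = self.cache.get(word, 0.0) / total if total > 0 else 0.0
            return (1 - self.lam) * self.background.get(word, 1e-6) + self.lam * cache_p

    bg = {"the": 0.05, "senator": 0.001, "born": 0.001}   # hypothetical values
    lm = CacheLM(bg)
    lm.observe("senator")   # 'senator' appeared earlier in the biography
    # The cache now boosts words that recur in the document:
    assert lm.prob("senator") > bg["senator"]
    ```

    The structured variant in the abstract would reset or reweight the cache at topic-segment boundaries rather than maintaining one cache per document.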

    Which Step Do I Take First? Troubleshooting with Bayesian Models

    Online discussion forums and community question-answering websites provide one of the primary avenues for online users to share information. In this paper, we propose text mining techniques that help users navigate troubleshooting-oriented data such as questions asked on forums and their suggested solutions. We introduce Bayesian generative models of the troubleshooting data and apply them to two interrelated tasks: (a) predicting the complexity of the solutions (e.g., plugging a keyboard into the computer is easier than installing a special driver) and (b) presenting them in ranked order from least to most complex. Experimental results show that our models are on par with human performance on these tasks, while outperforming baselines based on solution length or readability.
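    The ranking task (b) reduces to sorting solutions by a predicted complexity score. The sketch below shows only that step; the word weights are invented for illustration, whereas the paper derives complexity from Bayesian generative models of the forum text rather than a hand-built lexicon.

    ```python
    # Hypothetical complexity lexicon (the paper learns this, it is not hand-written).
    HARD_WORDS = {"driver": 3.0, "bios": 4.0, "registry": 4.0}
    EASY_WORDS = {"plug": -2.0, "restart": -1.0, "cable": -2.0}

    def complexity(solution):
        """Score a suggested solution; higher means harder to carry out."""
        score = 0.0
        for tok in solution.lower().split():
            score += HARD_WORDS.get(tok, 0.0) + EASY_WORDS.get(tok, 0.0)
        return score

    def rank_solutions(solutions):
        """Return solutions ordered from least to most complex."""
        return sorted(solutions, key=complexity)

    steps = ["install a special driver", "plug the keyboard cable in"]
    print(rank_solutions(steps))  # easiest fix first
    ```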

    Conversation Trees: A Grammar Model for Topic Structure in Forums

    Online forum discussions proceed differently from face-to-face conversations, and a single thread on an online forum often contains posts on several different subtopics. This work aims to characterize the content of a forum thread as a conversation tree of topics. We present models that jointly perform two tasks: segment a thread into subparts, and assign a topic to each part. Our core idea is a definition of topic structure using probabilistic grammars. By leveraging the flexibility of two grammar formalisms, Context-Free Grammars and Linear Context-Free Rewriting Systems, our models create desirable structures for forum threads: our topic segmentation is hierarchical, links non-adjacent segments on the same topic, and jointly labels the topic during segmentation. We show that our models outperform a number of tree generation baselines.
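    The target representation can be made concrete with a small data structure (labels and post indices here are invented, and this shows only the output structure, not the grammar-based inference): a thread's posts are grouped into topic segments arranged as a tree, where a topic node may cover non-adjacent spans of posts. That discontiguity is the property that motivates using Linear Context-Free Rewriting Systems rather than a plain CFG.

    ```python
    class TopicNode:
        """A node in a conversation tree: a topic label over spans of posts."""

        def __init__(self, label, spans, children=()):
            self.label = label
            self.spans = spans          # (start, end) post index pairs; may be discontiguous
            self.children = list(children)

        def posts(self):
            """All post indices this topic covers, in thread order."""
            return sorted(i for s, e in self.spans for i in range(s, e))

    # Hypothetical 7-post thread: posts 0-2 and 5-6 discuss installation,
    # while posts 3-4 digress into licensing.
    tree = TopicNode("thread", [(0, 7)], [
        TopicNode("installation", [(0, 3), (5, 7)]),   # non-adjacent, same topic
        TopicNode("licensing", [(3, 5)]),
    ])
    print(tree.children[0].posts())   # -> [0, 1, 2, 5, 6]
    ```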

    Creating Local Coherence: An Empirical Assessment

    Two of the mechanisms for creating natural transitions between adjacent sentences in a text, resulting in local coherence, involve discourse relations and switches of focus of attention between discourse entities. These two aspects of local coherence have traditionally been discussed and studied separately, but empirical studies have provided strong evidence of the need to understand how the two types of coherence-creating devices interact. Here we present a joint corpus study of discourse relations and entity coherence exhibited in news texts from the Wall Street Journal and test several hypotheses expressed in earlier work about their interaction.

    A Bayesian Method to Incorporate Background Knowledge during Automatic Text Summarization

    In order to summarize a document, it is often useful to have a background set of documents from the domain to serve as a reference for determining new and important information in the input document. We present a model based on Bayesian surprise which provides an intuitive way to identify surprising information from a summarization input with respect to a background corpus. Specifically, the method quantifies the degree to which pieces of information in the input change one’s beliefs about the world represented in the background. We develop systems for generic and update summarization based on this idea. Our method provides competitive content selection performance with particular advantages in the update task, where systems are given a small and topical background corpus.
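    The surprise computation can be sketched crudely. Assumption: point estimates of smoothed unigram distributions stand in for the paper's proper Dirichlet-multinomial treatment, and the toy corpus is invented. A sentence's surprise is the KL divergence between the word distribution after adding the sentence's counts to the background (posterior) and the background distribution alone (prior); sentences carrying novel content move the distribution more.

    ```python
    import math
    from collections import Counter

    def unigram(counts, vocab, alpha=1.0):
        """Add-alpha smoothed unigram distribution over a fixed vocabulary."""
        total = sum(counts.get(w, 0) for w in vocab) + alpha * len(vocab)
        return {w: (counts.get(w, 0) + alpha) / total for w in vocab}

    def surprise(sentence, background, vocab):
        """KL(posterior || prior): how much the sentence shifts our beliefs."""
        prior = unigram(background, vocab)
        posterior = unigram(background + Counter(sentence.split()), vocab)
        return sum(p * math.log(p / prior[w]) for w, p in posterior.items())

    # Hypothetical background corpus and input sentences.
    background = Counter("the economy grew the economy slowed".split())
    vocab = set(background) | {"volcano", "erupted"}
    novel = "volcano erupted"        # unseen relative to the background
    familiar = "the economy grew"    # restates what the background already says
    assert surprise(novel, background, vocab) > surprise(familiar, background, vocab)
    ```

    A summarizer built on this idea would then greedily select high-surprise sentences, updating the background counts after each selection.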