87 research outputs found

    Accurate user directed summarization from existing tools

    This paper describes a set of experimental results produced from the TIPSTER SUMMAC initiative on user-directed summaries: document summaries generated in the context of an information need expressed as a query. The summarizer that was evaluated was based on a set of existing statistical techniques that had been applied successfully to the INQUERY retrieval system. The techniques proved to have a wider utility, however, as the summarizer was one of the better performing systems in the SUMMAC evaluation. The design of this summarizer is presented with a range of evaluations: both those provided by SUMMAC and a set of preliminary, more informal evaluations that examined additional aspects of the summaries. Amongst other conclusions, the results reveal that users can judge the relevance of documents from their summaries almost as accurately as if they had access to the documents' full text.

    The Dangers of Automated Gunshot Detection


    An Experimental Digital Library Platform - A Demonstrator Prototype for the DigLib Project at SICS

    Within the framework of the Digital Library project at SICS, this thesis describes the implementation of a demonstrator prototype of a digital library (DigLib): an experimental platform integrating several functions in one common interface. It includes descriptions of the structure and formats of the digital library collection, the tailoring of the search engine Dienst, the construction of a keyword extraction tool, and the design and development of the interface. The platform was realised through sicsDAIS, an agent interaction and presentation system, and is to be used for testing and evaluating various tools for information seeking. The platform supports various user interaction strategies by providing: search in bibliographic records (Dienst); an index of keywords (the Keyword Extraction Function (KEF)); and browsing through the hierarchical structure of the collection. KEF was developed for this thesis work, and extracts and presents keywords from Swedish documents. Although based on a comparatively simple algorithm, KEF meets a long-felt need in the area of Information Retrieval. Evaluations of the tasks and the interface still remain to be done, but the digital library is up and running. Because the platform is implemented through sicsDAIS, DigLib can deploy additional tools and search engines without interfering with already running modules. If desired, agents providing services other than those SICS can supply can also be plugged in.

    Wrongly Accused Redux: How Race Contributes to Convicting the Innocent: The Informants Example

    This article analyzes five forces that may raise the risk of convicting the innocent based upon the suspect's race: the selection, ratchet, procedural justice, bystanders, and aggressive-suspicion effects. In other words, subconscious forces press police to focus more attention on racial minorities; the ratchet makes this focus ever-increasing; the resulting sense by the community of unfair treatment raises its involvement in crime while lowering its willingness to aid the police in resisting crime; innocent persons suffer when their skin color becomes associated with criminality; and the police use more aggressive techniques on racial minorities in a way that raises the risk of reliance upon a false confession, a mistaken eyewitness identification, or another flawed investigation method that raises the risk of error. The piece explains how these five forces interact to promote racially biased wrongful convictions. The final major portion of the article uses the plight of informants as a case study to demonstrate the interaction of these forces to promote further mistakes.

    Justice Ruth Bader Ginsburg: 20 Years of Supreme Court Jurisprudence


    A semantic-based approach to information processing

    The research reported in this thesis is centred around the development of a semantic-based approach to information processing. Traditional word-based pattern matching approaches to information processing suffer from both the richness and the ambiguity of natural language. Although the retrieval performance of traditional systems can be satisfactory in many situations, it is commonly held that the traditional approach has reached the peak of its potential and that any substantial improvements will be very difficult to achieve [Smea91]. Word-based pattern matching retrieval systems are devoid of the semantic power necessary either to distinguish between different senses of homonyms or to identify the similar meanings of related terms. Our proposed semantic information processing system was designed to tackle these problems, among others (we also wanted to allow phrasal as well as single-word terms to describe concepts). Our prototype system comprises a WordNet-derived, domain-independent knowledge base (KB) and a concept-level semantic similarity estimator. The KB, which is rich in noun phrases, is used as a controlled vocabulary which effectively addresses many of the problems posed by ambiguities in natural language. Similarly, both proposals for the semantic similarity estimator tackle issues regarding the richness of natural language, in particular the multitude of ways of expressing the same concept. A semantic-based document retrieval system is developed as a means of evaluating our approach. However, many other information processing applications are discussed, with particular attention directed towards the application of our approach to locating and relating information in a large-scale Federated Database System (FDBS).
    The document retrieval evaluation application operates by obtaining KB representations of both the documents and the queries and using the semantic similarity estimators as the comparison mechanism to determine the degree of relevance of a document to a query. The construction of KB representations for documents and queries is a completely automatic procedure and, among other steps, includes a sense disambiguation phase. The sense disambiguator developed for this research also represents a departure from existing approaches to sense disambiguation. In our approach, four individual disambiguation mechanisms are used to individually weight different senses of ambiguous terms. This allows for the possibility of there being more than one correct sense. Our evaluation mechanism employs the Wall Street Journal text corpus and a set of TREC queries along with their relevance assessments in an overall document retrieval application. A traditional pattern matching tf*IDF system is used as a baseline system in our evaluation experiments. The results indicate firstly that our WordNet-derived KB is capable of being used as a controlled vocabulary and secondly that our approaches to estimating semantic similarity operate well at their intended concept level. However, it is more difficult to arrive at conclusive interpretations of the results with regard to the application of our semantic-based systems to the complex task of document retrieval. A more complete evaluation is left as a topic for future research.
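
    For orientation, a word-based tf*IDF baseline of the general kind mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the whitespace tokenizer, the raw-count term frequency, the log(N/df) weighting, the cosine comparison, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`) are all choices made here for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop list.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf*idf vector (dict) per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection get weight 0.
    q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    collection = [
        "stock market prices fall",
        "court ruling on patent",
        "market prices rise again",
    ]
    for score, index in rank("patent court", collection):
        print(f"{score:.3f}  {collection[index]}")
```

    Because matching is purely on surface word forms, a query using a synonym of a document term scores zero against that document, which is the limitation the semantic approach in the abstract is meant to address.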

    COSPO/CENDI Industry Day Conference

    The conference's objective was to provide a forum where government information managers and industry information technology experts could have an open exchange and discuss their respective needs and compare them to the available, or soon to be available, solutions. Technical summaries and points of contact are provided for the following sessions: secure products, protocols, and encryption; information providers; electronic document management and publishing; information indexing, discovery, and retrieval (IIDR); automated language translators; IIDR - natural language capabilities; IIDR - advanced technologies; IIDR - distributed heterogeneous and large database support; and communications - speed, bandwidth, and wireless

    Winona Daily News


    Automatic bilingual text document summarization.

    Lo Sau-Han Silvia. Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 137-143). Abstracts in English and Chinese.

    1 Introduction
        1.1 Definition of a summary
        1.2 Definition of text summarization
        1.3 Previous work
            1.3.1 Extract-based text summarization
            1.3.2 Abstract-based text summarization
            1.3.3 Sophisticated text summarization
        1.4 Summarization evaluation methods
            1.4.1 Intrinsic evaluation
            1.4.2 Extrinsic evaluation
            1.4.3 The TIPSTER SUMMAC text summarization evaluation
            1.4.4 Text Summarization Challenge (TSC)
        1.5 Research contributions
            1.5.1 Text summarization based on thematic term approach
            1.5.2 Bilingual news summarization based on an event-driven approach
        1.6 Thesis organization
    2 Text Summarization based on a Thematic Term Approach
        2.1 System overview
        2.2 Document preprocessor
            2.2.1 English corpus
            2.2.2 English corpus preprocessor
            2.2.3 Chinese corpus
            2.2.4 Chinese corpus preprocessor
        2.3 Corpus thematic term extractor
        2.4 Article thematic term extractor
        2.5 Sentence score generator
        2.6 Chapter summary
    3 Evaluation for Summarization using the Thematic Term Approach
        3.1 Content-based similarity measure
        3.2 Experiments using content-based similarity measure
            3.2.1 English corpus and parameter training
            3.2.2 Experimental results using content-based similarity measure
        3.3 Average inverse rank (AIR) method
        3.4 Experiments using average inverse rank method
            3.4.1 Corpora and parameter training
            3.4.2 Experimental results using AIR method
        3.5 Comparison between the content-based similarity measure and the average inverse rank method
        3.6 Chapter summary
    4 Bilingual Event-Driven News Summarization
        4.1 Corpora
        4.2 Topic and event definitions
        4.3 Architecture of bilingual event-driven news summarization system
        4.4 Bilingual event-driven approach summarization
            4.4.1 Dictionary-based term translation applying on English news articles
            4.4.2 Preprocessing for Chinese news articles
            4.4.3 Event clusters generation
            4.4.4 Cluster selection and summary generation
        4.5 Evaluation for summarization based on event-driven approach
        4.6 Experimental results on event-driven summarization
            4.6.1 Experimental settings
            4.6.2 Results and analysis
        4.7 Chapter summary
    5 Applying Event-Driven Summarization to a Parallel Corpus
        5.1 Parallel corpus
        5.2 Parallel documents preparation
        5.3 Evaluation methods for the event-driven summaries generated from the parallel corpus
        5.4 Experimental results and analysis
            5.4.1 Experimental settings
            5.4.2 Results and analysis
        5.5 Chapter summary
    6 Conclusions and Future Work
        6.1 Conclusions
        6.2 Future work
    Bibliography
    A English Stop Word List
    B Chinese Stop Word List
    C Event List Items on the Corpora
        C.1 Event list items for the topic "Upcoming Philippine election"
        C.2 Event list items for the topic "German train derail"
        C.3 Event list items for the topic "Electronic service delivery (ESD) scheme"
    D The sample of an English article (9505001.xml)