    A Similar Legal Case Retrieval System by Multiple Speech Question and Answer

    In real life, people often lack legal knowledge and legal awareness, while lawyers are busy and consulting them is time-consuming and expensive. As a result, a series of legal consultation problems cannot be properly handled. Although some legal robots can answer basic legal questions that are matched by keywords, they cannot perform similar case retrieval and sentencing prediction from criminal facts described in natural language. To overcome this difficulty, we propose a similar case retrieval system based on natural language understanding. The system uses IFLYTEK's online speech synthesis and speech reading and writing technology, and integrates natural language semantic processing with a multi-round question-and-answer dialogue mechanism to realise legal knowledge question answering with memory-based context processing. It finally retrieves the case most similar to the criminal facts that the user consulted about. In trial use, the system showed a good level of human-computer interaction and high accuracy of information retrieval, and can largely satisfy people's consulting needs for legal issues.

    Linking Parliamentary Minutes and Videos in the Japanese Diet

    This paper offers an overview of the video retrieval system we have developed for the Japanese Diet. With our video retrieval system one can directly retrieve the video feed segment of interest, gain a visual understanding of the flow of parliamentary debate, and check the facial expressions and body language of the speaker. In this paper, we demonstrate how one can retrieve video streaming on user terminals that do not support Japanese language input, and suggest a variety of ways in which our video retrieval system can be utilized. Also, we report a preliminary analysis on the correspondence between the official minutes and the results of speech recognition of recordings of parliamentary meetings. We believe that our system encourages research on the utilization of visual information in policy-making and marks a step toward the provision of universal access to policy information. This work is supported by JSPS Kakenhi Grant Number 15H05727 and the Diet Archives Project funded by the GRIPS Policy Research Center. An earlier version (Masuyama 2016b) was presented at the 2016 General Conference of the European Consortium for Political Research, Charles University, Prague, Czech Republic, September 7-10, 2016. http://www.grips.ac.jp/list/jp/facultyinfo/masuyama_mikitaka

    Heat of Discussion: A New Approach to Understanding Parliamentary Discussion

    This paper offers an overview of the video retrieval system we have developed for the Japanese Diet. With our video retrieval system one can directly retrieve the video feed segment of interest, gain a visual understanding of the flow of parliamentary debate, and check the facial expressions and body language of the speaker. In this paper, we demonstrate how one can retrieve video streaming on user terminals that do not support Japanese language input, and suggest a variety of ways in which our video retrieval system can be utilized. Also, we report a first systematic analysis on the correspondence between the official minutes and the results of speech recognition of recordings of parliamentary meetings. Departing from the tradition of focusing on written official minutes, we investigate the variation in the rate of correspondence and seek to understand the complex and multifaceted nature of parliamentary discussion. We believe that our system encourages research on the utilization of visual information in policymaking and marks a step toward the provision of universal access to policy information. This work is supported by JSPS Kakenhi Grant Number 15H05727 and based on a paper prepared for presentation at the 25th IPSA World Congress of Political Science, Brisbane, Australia, July 21-26, 2018. http://www.grips.ac.jp/list/jp/facultyinfo/masuyama_mikitaka

    Improved Chinese Language Processing for an Open Source Search Engine

    Natural Language Processing (NLP) is the analysis of human languages by computers. NLP spans many areas, including speech recognition, natural language understanding, and natural language generation. Information retrieval and natural language processing for Asian languages have their own unique set of challenges not present for Indo-European languages. Some of these are text segmentation, named entity recognition in unsegmented text, and part-of-speech tagging. In this report, we describe our implementation of and experiments with improving the Chinese language processing sub-component of an open source search engine, Yioop. In particular, we rewrote and improved the following sub-systems of Yioop to try to make them as state-of-the-art as possible: Chinese text segmentation, part-of-speech (POS) tagging, Named Entity Recognition (NER), and the question answering system. Compared to the previous system, we achieved a 9% improvement in Chinese word segmentation accuracy. We built a POS tagger with 89% accuracy and implemented an NER system with 76% accuracy.

    Temporal and Lexical Context of Diachronic Text Documents for Automatic Out-Of-Vocabulary Proper Name Retrieval

    Proper name recognition is a challenging task in information retrieval from large audio/video databases. Proper names are semantically rich and are usually key to understanding the information contained in a document. Our work focuses on increasing the vocabulary coverage of a speech transcription system by automatically retrieving proper names from contemporary diachronic text documents. We proposed methods that dynamically augment the automatic speech recognition (ASR) system vocabulary using lexical and temporal features in diachronic documents. We also studied different metrics for proper name selection in order to limit the vocabulary augmentation and therefore its impact on ASR performance. Recognition results show a significant reduction of the proper name error rate using an augmented vocabulary.

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Multimedia information technology and the annotation of video

    The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning

    Language-based multimedia information retrieval

    This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE was building on subtitles or captions as the prime language key for disclosing video fragments, OLIVE is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.

    Information Compression, Intelligence, Computing, and Mathematics

    This paper presents evidence for the idea that much of artificial intelligence, human perception and cognition, mainstream computing, and mathematics, may be understood as compression of information via the matching and unification of patterns. This is the basis for the "SP theory of intelligence", outlined in the paper and fully described elsewhere. Relevant evidence may be seen: in empirical support for the SP theory; in some advantages of information compression (IC) in terms of biology and engineering; in our use of shorthands and ordinary words in language; in how we merge successive views of any one thing; in visual recognition; in binocular vision; in visual adaptation; in how we learn lexical and grammatical structures in language; and in perceptual constancies. IC via the matching and unification of patterns may be seen in both computing and mathematics: in IC via equations; in the matching and unification of names; in the reduction or removal of redundancy from unary numbers; in the workings of Post's Canonical System and the transition function in the Universal Turing Machine; in the way computers retrieve information from memory; in systems like Prolog; and in the query-by-example technique for information retrieval. The chunking-with-codes technique for IC may be seen in the use of named functions to avoid repetition of computer code. The schema-plus-correction technique may be seen in functions with parameters and in the use of classes in object-oriented programming. And the run-length coding technique may be seen in multiplication, in division, and in several other devices in mathematics and computing. The SP theory resolves the apparent paradox of "decompression by compression". And computing and cognition as IC is compatible with the uses of redundancy in such things as backup copies to safeguard data and understanding speech in a noisy environment