9,475 research outputs found

    Chinese Speech Information Retrieval for Questions on Mobile Phone Operation

    Get PDF
    PACLIC 20 / Wuhan, China / 1-3 November, 200

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    A Similar Legal Case Retrieval System by Multiple Speech Question and Answer

    Get PDF
    In real life, on the one hand people often lack legal knowledge and legal awareness; on the other hand lawyers are busy, timeconsuming, and expensive. As a result, a series of legal consultation problems cannot be properly handled. Although some legal robots can answer some problems about basic legal knowledge that are matched by keywords, they cannot do similar case retrieval and sentencing prediction according to the criminal facts described in natural language. To overcome the difficulty, we propose a similar case retrieval system based on natural language understanding. The system uses online speech synthesis of IFLYTEK and speech reading and writing technology, integrates natural language semantic processing technology and multiple rounds of question-and-answer dialogue mechanism to realise the legal knowledge question and answer with the memory-based context processing ability, and finally retrieves a case that is most similar to the criminal facts that the user consulted. After trial use, the system has a good level of human-computer interaction and high accuracy of information retrieval, which can largely satisfy people\u27s consulting needs for legal issues

    Construction and Application of Learner Corpus for Chinese Language E-Learning Systems

    Get PDF
    制度:新 ; 報告番号:甲3354号 ; 学位の種類:博士(国際情報通信学) ; 授与年月日:2011/2/23 ; 早大学位記番号:新567

    Multi-view Semantic Matching of Question retrieval using Fine-grained Semantic Representations

    Full text link
    As a key task of question answering, question retrieval has attracted much attention from the communities of academia and industry. Previous solutions mainly focus on the translation model, topic model, and deep learning techniques. Distinct from the previous solutions, we propose to construct fine-grained semantic representations of a question by a learned importance score assigned to each keyword, so that we can achieve a fine-grained question matching solution with these semantic representations of different lengths. Accordingly, we propose a multi-view semantic matching model by reusing the important keywords in multiple semantic representations. As a key of constructing fine-grained semantic representations, we are the first to use a cross-task weakly supervised extraction model that applies question-question labelled signals to supervise the keyword extraction process (i.e. to learn the keyword importance). The extraction model integrates the deep semantic representation and lexical matching information with statistical features to estimate the importance of keywords. We conduct extensive experiments on three public datasets and the experimental results show that our proposed model significantly outperforms the state-of-the-art solutions.Comment: 10 page

    UmobiTalk: Ubiquitous Mobile Speech Based Learning Language Translator for Sesotho Language

    Get PDF
    Published ThesisThe need to conserve the under-resourced languages is becoming more urgent as some of them are becoming extinct; natural language processing can be used to redress this. Currently, most initiatives around language processing technologies are focusing on western languages such as English and French, yet resources for such languages are already available. The Sesotho language is one of the under-resourced Bantu languages; it is mostly spoken in Free State province of South Africa and in Lesotho. Like other parts of South Africa, Free State has experienced high number of migrants and non-Sesotho speakers from neighboring provinces and countries; such people are faced with serious language barrier problems especially in the informal settlements where everyone tends to speak only Sesotho. Non-Sesotho speakers refers to the racial groups such as Xhosas, Zulus, Coloureds, Whites and more, in which Sesotho language is not their native language. As a solution to this, we developed a parallel corpus that has English as source and Sesotho as a target language and packaged it in UmobiTalk - Ubiquitous mobile speech based learning translator. UmobiTalk is a mobile-based tool for learning Sesotho for English speakers. The development of this tool was based on the combination of automatic speech recognition, machine translation and speech synthesis

    A comparative study of Chinese and European Internet companies' privacy policy based on knowledge graph

    Get PDF
    Privacy policy is not only a means of industry self-discipline, but also a way for users to protect their online privacy. The European Union (EU) promulgated the General Data Protection Regulation (GDPR) on May 25th, 2018, while China has no explicit personal data protection law. Based on knowledge graph, this thesis makes a comparative analysis of the Chinese and European Internet companies’ privacy policies, and combines with the relevant provisions of GDPR, puts forward suggestions on the privacy policy of Internet companies, so as to solve the problem of personal in-formation protection to a certain extent. Firstly, this thesis chooses the process and methods of knowledge graph construction and analysis. The process of constructing and analyzing the knowledge graph is: data preprocessing, entity extraction, storage in graph database and query. Data preprocessing includes word segmentation and part-of-speech tagging, as well as text format adjustment. Entity extraction is the core of knowledge graph construction in this thesis. Based on the principle of Conditional Random Fields (CRF), CFR++ toolkit is used for the entity extraction. Subsequently, the extracted entities are transformed into “.csv” format and stored in the graph database Neo4j, so the knowledge graph is generated. Cypher query statements can be used to query information in the graph database. The next part is about comparison and analysis of the Internet companies’ privacy policies in China and Europe. After sampling, the overall characteristics of the privacy policies of Chinese and European Internet companies are compared. According to the process of constructing knowledge graphs mentioned above, the “collected information” and “contact us” parts of the privacy policy are used to construct the knowledge graphs. Finally, combined with the relevant content of GDPR, the results of the comparative analysis are further discussed, and suggestions are proposed. Although Chinese Internet companies’ privacy policies have some merits, they are far inferior to those of European Internet companies. China also needs to enact a personal data protection law according to its national conditions. This thesis applies knowledge graph to the privacy policy research, and analyses Internet companies’ privacy policies from a comparative perspective. It also discusses the comparative results with GDPR and puts forward suggestions, and provides reference for the formulation of China's personal information protection law
    corecore