
    Constructing Concept Space from Social Collaborative Editing

    With the rise of social applications and Social Networking Services (SNS), crowd-created information has grown rapidly and now directly shapes how users receive information and make decisions, including real-world consumption behavior. To exploit and analyze this online text, we need a concept space that adapts to changes in application domain and keeps growing as new vocabulary appears, serving as the base knowledge for semantic computation. Automatic ontology construction is therefore an important related issue. Previous studies follow three directions: constructing ontologies from unstructured (free-format) text, enriching existing ontologies from web or corpus sources, and constructing ontologies from semi-structured corpora; the third has been the prevailing approach. In this thesis, we develop an adaptive framework that automatically constructs a concept space by treating social collaborative editing sites as semi-structured corpora, with a particular focus on semi-structured text mining. The framework comprises five stages: named entity detection in a document, named entity filtering, disambiguation, named entity expansion, and ranking of the related named entities. We describe the framework in detail, propose a method for each stage, and review the evaluation metrics used in previous studies alongside the one we adopt. We then discuss the evolution and quality of the concept space. The proposed framework shows that automatic construction is feasible on real-world, real-time data streams, and it generates a dynamic concept space. Compared with previous studies, it covers a wider range of languages and domains and handles real-time data; and unlike earlier work that tied a single method to a single application, it is flexible enough to support more practical real-world applications.
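The five stages named in the abstract (detection, filtering, disambiguation, expansion, ranking) can be illustrated with a toy sketch. This is not the thesis's method: the abstract does not specify any algorithm, so every choice below is an assumption made for illustration. The `PAGES` and `ALIASES` tables are hypothetical stand-ins for a collaboratively edited, semi-structured corpus; disambiguation by word overlap and ranking by inbound link count are simple placeholder techniques.

```python
import re
from collections import Counter

# Hypothetical toy corpus: page title -> (page text, outgoing links).
PAGES = {
    "Python (language)": ("programming language code interpreter",
                          ["Guido van Rossum", "Interpreter", "Programming language"]),
    "Python (snake)": ("reptile snake species", ["Reptile"]),
    "Interpreter": ("program that executes code",
                    ["Python (language)", "Programming language"]),
    "Guido van Rossum": ("programmer who created a language", ["Python (language)"]),
    "Programming language": ("formal language for writing programs", []),
    "Reptile": ("animal class", []),
}
# Hypothetical surface form -> candidate pages (empty list = no usable sense).
ALIASES = {
    "python": ["Python (language)", "Python (snake)"],
    "interpreter": ["Interpreter"],
    "the": [],
}

def tokens(text):
    """Lowercase alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())

def detect(doc):
    """Stage 1: keep every token that matches a known surface form."""
    return [t for t in tokens(doc) if t in ALIASES]

def filter_mentions(mentions):
    """Stage 2: drop duplicates and mentions without candidate pages."""
    seen, kept = set(), []
    for m in mentions:
        if m not in seen and ALIASES[m]:
            seen.add(m)
            kept.append(m)
    return kept

def disambiguate(mention, doc):
    """Stage 3: pick the candidate whose page text overlaps the document most."""
    context = set(tokens(doc))
    return max(ALIASES[mention],
               key=lambda page: len(context & set(tokens(PAGES[page][0]))))

def expand(entities):
    """Stage 4: follow outgoing links to pages not already detected."""
    related = Counter()
    for entity in entities:
        for link in PAGES[entity][1]:
            if link not in entities:
                related[link] += 1
    return related

def rank(related):
    """Stage 5: order related pages by link count, then by title."""
    return [page for page, _ in sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))]

def build_concepts(doc):
    """Run all five stages; return (resolved entities, ranked related entities)."""
    entities = [disambiguate(m, doc) for m in filter_mentions(detect(doc))]
    return entities, rank(expand(entities))

print(build_concepts("The python interpreter runs code"))
```

In this sketch the ambiguous mention "python" resolves differently depending on the surrounding words, which is the behavior a disambiguation stage is meant to provide; a real system would replace each placeholder with a method suited to its corpus.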
