29 research outputs found

    Disaster Damage Prediction System Using Big Data

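    A disaster damage prediction technology that can suggest compensation criteria for disaster damage by estimating, from actual data collected when a disaster occurs, the indirect operational loss caused by the operation-suspension period at the block or building level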

    Media-aware quantitative trading based on public Web information

    Recent studies in behavioral finance have found that the emotional impulses of stock investors affect stock prices. The challenge lies in quantifying such sentiment to predict stock market movements. In this article, we propose a media-aware quantitative trading strategy that uses the sentiment information of Web media. We capture public mood from the interactive behaviors of investors on social media and study the impact of firm-specific news sentiment on stocks together with that public mood. In our experiments on the CSI 100 stocks over a three-month period, the predicted prices reach a root mean squared error of 0.612 against the actual future stock prices, the predicted direction of price movement matches the actual direction 55.08% of the time, and the simulated trading return is up to 166.11%
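
    The two prediction metrics named in this abstract, root mean squared error and agreement on the direction of price movement, can be sketched as follows. This is only an illustration of the standard definitions, not the authors' evaluation code; the function names, the use of the previous price as the reference for "direction", and the sample numbers are all assumptions.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length price series."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def _sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def directional_accuracy(previous, predicted, actual):
    """Fraction of points where the predicted and actual prices move in the
    same direction relative to the previous price (assumed reference)."""
    if not (len(previous) == len(predicted) == len(actual)) or not previous:
        raise ValueError("series must be non-empty and of equal length")
    hits = sum(
        1
        for prev, p, a in zip(previous, predicted, actual)
        if _sign(p - prev) == _sign(a - prev)
    )
    return hits / len(previous)


# Made-up prices, purely for illustration.
previous = [10.0, 10.0, 10.0, 10.0]
predicted = [11.0, 9.0, 12.0, 9.0]
actual = [12.0, 8.0, 9.0, 9.5]
error = rmse(predicted, actual)
agreement = directional_accuracy(previous, predicted, actual)
```

    With this convention, a day on which both the predicted and the actual price stay flat counts as agreement; a different study might exclude such days.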

    A Study on Representation Difficulty Assessment of Educational Presentation Materials

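    In higher education settings such as universities, slides are a medium widely used in classes and seminars. Recently, dedicated slide-sharing platforms such as SlideShare have emerged, and more and more educational slides are accumulating online. This study proposes a technique for automatically measuring the representation difficulty of such slide-format educational materials, that is, how easy or hard they are to perceive. We build a machine learning model using 60 proposed features and effectively distinguish slides of high representation difficulty from those of low representation difficulty. Precisely identified difficulty information can greatly improve user convenience in content selection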

    Domain Terminology Collection for Semantic Interpretation of Sensor Network Data

    Many studies have investigated the management of data delivered over sensor networks and attempted to standardize their relations. Sensor data come from numerous tangible and intangible sources, and existing work has focused on the integration and management of the sensor data itself. The data should be interpreted according to the sensor environment and related objects, even though the data type, and even the value, is exactly the same. This means that the sensor data should have semantic connections with all objects, and so a knowledge base that covers all domains should be constructed. In this paper, we suggest a method of domain terminology collection based on Wikipedia category information in order to prepare seed data for such knowledge bases. However, Wikipedia has two weaknesses, namely, loops and unreasonable generalizations in the category structure. To overcome these weaknesses, we utilize a horizontal bootstrapping method for category searches and domain-term collection. Both the category-article and article-link relations defined in Wikipedia are employed as terminology indicators, and we use a new measure to calculate the similarity between categories. By evaluating various aspects of the proposed approach, we show that it outperforms the baseline method, having wider coverage and higher precision. The collected domain terminologies can assist the construction of domain knowledge bases for the semantic interpretation of sensor data

    Activity Inference for Constructing User Intention Model

    User intention modeling is a key component for providing appropriate services within ubiquitous and pervasive computing environments. Intention modeling should be concentrated on inferring user activities based on the objects a user approaches or touches. In order to support this kind of modeling, we propose the creation of object-activity pairs based on relatedness in a general domain. In this paper, we show our method for achieving this and evaluate its effectiveness

    An Intensive Case Study on Kernel-based Relation Extraction

    Relation extraction refers to a method of efficiently detecting and identifying predefined semantic relationships within a set of entities in text documents. Numerous relation extraction techniques have been developed thus far, owing to their innate importance in the domain of information extraction and text mining. The majority of the relation extraction methods proposed to date are based on supervised learning, which requires the use of learning collections; such learning methods can be classified into feature-based, semi-supervised, and kernel-based techniques. This paper carries out a case analysis of kernel-based relation extraction, which is considered the most successful of the three approaches. Although some survey papers on this topic have been published, they failed to select the most essential of the currently available kernel-based relation extraction approaches or to provide an in-depth comparative analysis of them. Unlike existing case studies, the study described in this paper is based on a close analysis of the operation principles and individual characteristics of five representative kernel-based relation extraction methods. We also present the results of an in-depth comparative analysis of these methods. Finally, to support further research on higher-performance kernel-based relation extraction and on general high-level kernel studies for linguistic processing and text mining, we introduce some additional approaches, including feature-based methods based on various criteria

    A Study on a Term Weighting Method Based on the Inherent Discrimination Power of Query Terms

    Doctoral dissertation - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2011.2, [ 88 p. ] Term weighting for document ranking and retrieval has been an important research topic in Information Retrieval for decades. We propose a novel term weighting method that exploits the availability of past retrieval results, consisting of the queries that contain a particular term, the retrieved documents, and their relevance judgments. A term's evidential weight, which we call DP (Discrimination Power), depends on the degree to which the mean weighting scores of the relevant and non-relevant document distributions differ in the relevance-judged past document collection. It also takes into account the rankings and similarity values of the relevant and non-relevant documents to compensate for incorrect positions or scores in the retrieved document list. The experiments were performed using two well-known open-source search engines, Terrier and Indri, and four different ranking models including TFIDF, DFR (Divergence From Randomness) BM25, Hiemstra Language Model, and Indri Language Model. Our experimental results on a standard test collection (TREC-3, 4, and 5) show that a term weighting scheme that incorporates the notion of evidential weights outperforms the four baseline schemes. It is interesting to note that we obtained the performance increase with only a small number of terms found in a relatively small number of past queries. An additional analysis of how the effectiveness changes as the number of terms having a DP value increases shows that DP has strong applicability given a large set of queries, because the effect of DP is in proportion to the number of DP terms. Further analysis shows that the notion of evidential weight, based not on the entire collection but on the relevance-judged documents, is clearly distinct from IDF. In addition, an experiment on the TREC Web Blogs collection showed a significant result, indicating that the proposed method is feasible to apply to general Web search. As a result, we designed a new te... Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science
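
    The abstract describes DP only in words: it depends on how far the mean weighting scores of relevant and non-relevant documents differ in past, relevance-judged results. The sketch below illustrates that single idea with a plain difference of means. The actual formula is not given in the abstract, so this stand-in, the function name `discrimination_power`, and the sample scores are all assumptions; the rank and similarity compensation mentioned in the abstract is not modeled.

```python
from statistics import mean


def discrimination_power(relevant_scores, nonrelevant_scores):
    """Hypothetical stand-in for a term's evidential weight: the mean
    weighting score of the term in relevant past documents minus its mean
    score in non-relevant past documents. Returns 0.0 when either group
    is empty, since there is then no evidence to compare."""
    if not relevant_scores or not nonrelevant_scores:
        return 0.0
    return mean(relevant_scores) - mean(nonrelevant_scores)


# Made-up scores of one query term in past relevance-judged documents.
relevant = [0.8, 0.6]
nonrelevant = [0.2, 0.4]
dp = discrimination_power(relevant, nonrelevant)
```

    A positive value suggests the term scores higher in documents judged relevant than in those judged non-relevant; how such a value would be folded into a ranking model is left open here.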