Development of a Disease Relation and Prediction Model Using NHIRDB as an Example

By 汪皖翔 (Wan-Hsiang Wang)


Motivation: A growing number of researchers use the National Health Insurance Research Database (NHIRDB), a large-scale data resource unique to Taiwan, for research and analysis. Because the data volume is so large, analysing it is time-consuming and labour-intensive. Researchers may also need to learn Structured Query Language (SQL), other programming languages, or statistical software before they can run an analysis, which is a basic technical barrier for users without an information technology background.

Objective: To build a big data analytics platform. Given a disease code (ICD9CM) entered by the user, the system retrieves the relevant data from NHIRDB, uses LASSO regression to automatically identify feature diseases in the data, and builds an artificial neural network disease prediction model from those features. As a demonstration, this study queries lung cancer (ICD9CM=162.9), identifies its feature diseases on the platform, and builds a neural network model for predicting lung cancer.

Method: The platform's functions and visual interface were developed in R as a web application with shiny. The back end connects to a Microsoft SQL Server research database holding the consolidated outpatient prescription and treatment detail files (CD) for 2000 to 2009.

Preliminary results: The platform is simple to operate: users only need to enter the ICD9CM code of the disease of interest. The system automatically retrieves the relevant data and converts it into statistics on the other diseases that patients had previously. Users can download the resulting data set for further analysis or build a prediction model on the platform, which simplifies the previously complex steps of analysing NHIRDB data.
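The feature-selection step described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the platform itself was written in R, and the abstract gives no implementation details. The sketch assumes a linear LASSO fitted by coordinate descent, a toy table in which every combination of five hypothetical prior-diagnosis flags (`dx_A` to `dx_E`) occurs once, and an outcome that depends only on `dx_A` and `dx_C`. The function names and the penalty value are likewise illustrative assumptions.

```python
from itertools import product


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the L1 proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_select(rows, outcome, names, penalty, sweeps=50):
    """Fit a linear LASSO by coordinate descent; return the nonzero coefficients.

    Minimises (1/2n) * ||y - Xb||^2 + penalty * ||b||_1 on centred data.
    """
    n, p = len(rows), len(names)
    # Centre the outcome and every column so no intercept term is needed.
    y_mean = sum(outcome) / n
    resid = [y - y_mean for y in outcome]
    cols = []
    for j in range(p):
        mean_j = sum(r[j] for r in rows) / n
        cols.append([r[j] - mean_j for r in rows])

    coef = [0.0] * p
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # a constant column carries no information
            # Covariance of column j with the residual, adding back its own fit.
            rho = sum(v * (r + coef[j] * v) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, penalty) / scale
            # Keep the residual in sync with the updated coefficient.
            delta = new - coef[j]
            resid = [r - delta * v for r, v in zip(resid, col)]
            coef[j] = new
    return {name: c for name, c in zip(names, coef) if c != 0.0}


# Toy data (hypothetical): each row is one patient's prior-diagnosis flags.
names = ["dx_A", "dx_B", "dx_C", "dx_D", "dx_E"]
rows = [list(bits) for bits in product([0, 1], repeat=5)]
# Assumed outcome: the target disease occurs when dx_A or dx_C is present.
outcome = [1 if r[0] or r[2] else 0 for r in rows]

selected = lasso_select(rows, outcome, names, penalty=0.05)
print(selected)
```

On this toy table only `dx_A` and `dx_C` keep nonzero coefficients; the L1 penalty sets the other three flags to exactly zero. In the workflow the abstract describes, such selected flags would then serve as inputs to the neural network model.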

Topics: Big Data; NHIRDB; R language; Artificial Neural Network; LASSO Regression, [[classification]]15
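Publisher: Graduate Institute of Information Management, National Taipei University of Nursing and Health Sciences (國立台北護理健康大學資訊管理研究所)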
Year: 2020
OAI identifier: oai:NTUNHSIR:987654321/6955