    Data Pipeline Architecture with Near Real-Time Streaming Multiple Source Indonesian Online News Data Lake

    The rapid growth of information has increased the demand for online news. Online news attracts readers by presenting stories from many fields quickly and conveniently. However, the large amount of online news (volume) that spreads in a short time (velocity), together with the public's need to consume news from varied sources (variety), can affect people's lives. The government, as regulator, and news agencies therefore need to monitor the online news in circulation. To address these problems, the researcher proposes a data lake architecture that is suited to online news and can run in real time. Data lakes can address the main problems of Big Data (volume, velocity, variety). To develop this architecture, the researcher conducted a literature study and analyzed the flow of a data lake architecture as it applies to online news. The researcher will then use the architecture to combine and standardize the data structure of online news from several online news channels and stream it in real time to fill the data lake. The resulting data will be stored in MongoDB, which serves as the database for all data in both the short and long term. Finally, this data lake will serve as a means to store, explore, and analyze circulating online news data. Keywords – Data Lake, Online News, Real-Tim
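
The step of combining records from several news channels into one uniform structure before loading them into the data lake could look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's implementation: the source names, field names, and the `normalize` and `stream_into_lake` helpers are hypothetical, and a plain Python list stands in for the MongoDB collection the abstract mentions.

```python
from datetime import datetime, timezone

# Hypothetical per-source field mappings (shared field -> source-specific key).
# Neither the source names nor the field names come from the paper.
FIELD_MAPS = {
    "source_a": {"title": "headline", "body": "content", "published": "pub_date", "url": "link"},
    "source_b": {"title": "judul", "body": "isi", "published": "tanggal", "url": "url"},
}


def normalize(source, raw):
    """Map one source-specific record onto the shared schema."""
    mapping = FIELD_MAPS[source]
    # Missing source fields become None so every document has the same keys.
    doc = {field: raw.get(src_key) for field, src_key in mapping.items()}
    doc["source"] = source
    doc["ingested_at"] = datetime.now(timezone.utc).isoformat()
    return doc


def stream_into_lake(records, lake):
    """Consume (source, raw) pairs one at a time and append them to the lake.

    `lake` is any object with an `append` method; here a list stands in
    for a database collection.
    """
    for source, raw in records:
        lake.append(normalize(source, raw))
    return lake
```

Because `records` can be any iterable, including a generator fed by crawlers, documents are processed one at a time as they arrive rather than in a batch. Swapping the list for a real database write would change only the `lake.append` call.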

    Scalable Architecture for Personalized Healthcare Service Recommendation Using Big Data Lake

    Presented at the Sixth Australasian Symposium on Service Research and Innovation, 2017. (This collection of papers also includes the 5th Australasian Symposium, ASSRI 2015, Sydney, NSW, Australia, November 2–3, 2015.) Title in Libraries Australia: Service research and innovation : 5th and 6th Australian Symposium, ASSRI 2015 and ASSRI 2017, Sydney, NSW, Australia, November 2-3, 2015, and October 19-20 2017 ; revised selected paper