7 research outputs found

    Situation fencing: making geo-fencing personal and dynamic

    Geo-fencing has recently been applied in many areas, including media recommendation, advertising, wildlife monitoring, and recreational activities. However, current geo-fencing systems work with static geographical boundaries. Situation fencing allows these boundaries to vary automatically based on situations derived from a combination of global and personal data streams. We present a generic approach to situation fencing and demonstrate how it can be operationalized in practice. The results obtained in a personalized allergy alert application are encouraging and open the door to building thousands of similar applications on the same framework in the near future.
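The idea of a boundary that varies with the situation can be illustrated with a minimal sketch. This is not the authors' method: the circular fence, the `SituationFence` class, the two inputs `global_level` and `personal_sensitivity` (both assumed to lie in [0, 1]), and the scaling rule are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


@dataclass
class SituationFence:
    """Hypothetical circular fence whose radius depends on the current situation."""
    center_lat: float
    center_lon: float
    base_radius_km: float

    def radius_km(self, global_level, personal_sensitivity):
        # Assumed rule: the fence grows with the product of a global signal
        # (e.g. a regional pollen level) and a personal signal (e.g. sensitivity).
        return self.base_radius_km * (1.0 + global_level * personal_sensitivity)

    def contains(self, lat, lon, global_level, personal_sensitivity):
        distance = haversine_km(self.center_lat, self.center_lon, lat, lon)
        return distance <= self.radius_km(global_level, personal_sensitivity)


fence = SituationFence(center_lat=0.0, center_lon=0.0, base_radius_km=100.0)
# A point roughly 133 km from the centre: outside the static fence,
# inside once the situation expands the radius to 150 km.
outside_when_calm = fence.contains(0.0, 1.2, global_level=0.0, personal_sensitivity=0.5)
inside_when_severe = fence.contains(0.0, 1.2, global_level=1.0, personal_sensitivity=0.5)
```

With a static geo-fence only `base_radius_km` would matter; here the same location can fall inside or outside the fence depending on the data streams feeding the two situation inputs.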

    Mining the Characteristics of Jupyter Notebooks in Data Science Projects

    Nowadays, numerous industries have exceptional demand for data science skills such as data analysis, data mining, and machine learning. The computational notebook (e.g., Jupyter Notebook) is a well-known data science tool adopted in practice. Kaggle and GitHub are two platforms that data science communities use for knowledge sharing, skill practice, and collaboration. While tutorials and guidelines for novice data scientists are available on both platforms, few Jupyter Notebooks receive high numbers of votes from the community. A high-voted notebook is considered well documented, easy to understand, and an application of best practices in data science and software engineering. In this research, we aim to understand the characteristics of high-voted Jupyter Notebooks on Kaggle and of popular Jupyter Notebooks for data science projects on GitHub. We plan to mine and analyse the Jupyter Notebooks on both platforms. We will perform exploratory analytics, data visualization, and feature importance analysis to understand the overall structure of these notebooks and to identify common patterns and best-practice features that separate low-voted from high-voted notebooks. Upon completion of this research, the discovered insights can serve as training guidelines for aspiring data scientists and machine learning practitioners looking to progress from a novice-ranked Jupyter Notebook on Kaggle to a deployable project on GitHub.

    Situation Recognition from Multi-Resolution Event Streams

    No full text
    The nature of data has changed over the past decades. A large volume of spatio-temporal data is now produced, collected, and shared to observe events in the physical world at different spatial scales and temporal frequencies. A multitude of data streams, such as weather patterns, stock prices, social media, traffic information, and disease incidents, can be used to recognize evolving real-world situations. These situations vary and affect multiple aspects of people’s lives; examples include traffic, flash floods, economic recession, and epidemic diseases. Detecting situations in time to take appropriate action can help save lives and resources. Building upon situation recognition and the social network concept, in which people can easily connect with others online, the Social Life Network (SLN) concept is introduced. The goal is to connect people with the right resources efficiently, effectively, and promptly, depending on the evolving situation. To achieve this goal, a novel platform for situation recognition is required. In this dissertation, we propose a generic framework for situation recognition from heterogeneous event streams. First, to ingest and store massive data streams, we implement an archival system using the pub/sub messaging system Kafka and the big data management system AsterixDB. Second, to handle data and events at different granularities, multi-resolution processing is introduced at every step, including data ingestion, query processing, and data visualization. To aid users in finding the right resolution, statistical information about the estimated error and response time of each situation model at different resolutions is provided. The system can select an appropriate resolution automatically, or users can interactively select the resolution that best satisfies their needs. The idea that action drives design is a key component in creating a situation model. Finally, to fuse disparate data sources, a micro-reports concept is proposed. These micro-reports are compelling, universal, spontaneous, and objective. They can be used to solve problems of existing micro-blogs and replace them. The applicability of this framework is presented in a healthcare application for asthma risk management, disaster rescue for flood and hurricane mitigation, and smart city innovation for trash management in Downtown Washington, D.C.
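To make the notion of multi-resolution processing concrete, the sketch below aggregates the same point events onto grids of different cell sizes. It is an illustration under stated assumptions, not the dissertation's implementation: the `aggregate` function, the `(lat, lon, value)` event tuples, and the choice of a per-cell mean are hypothetical.

```python
import math
from collections import defaultdict


def aggregate(events, cell_deg):
    """Average event values per grid cell of side `cell_deg` degrees.

    `events` is an iterable of (lat, lon, value) tuples (assumed format).
    Returns {(row, col): mean_value}, where row and col are integer cell indices.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for lat, lon, value in events:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[cell][0] += value
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


events = [(0.1, 0.1, 10.0), (0.6, 0.1, 20.0), (1.5, 1.5, 30.0)]
coarse = aggregate(events, cell_deg=1.0)  # fewer, larger cells
fine = aggregate(events, cell_deg=0.5)    # more, smaller cells
```

The coarse grid merges the first two events into one cell, while the fine grid keeps them apart. A coarser grid is cheaper to query and render but hides local variation, which is the kind of trade-off the abstract describes when it mentions estimated error and response time per resolution.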

    Integration of Diverse Data Sources for Spatial PM2.5 Data Interpolation

    No full text

    CAMELON: A System for Crime Metadata Extraction and Spatiotemporal Visualization From Online News Articles

    No full text
    Crimes not only cause losses to individuals but also hinder national economic growth. While crime rates have been reported to decrease in developed countries, underdeveloped and developing nations still suffer from prevalent crime, especially those undergoing rapid urbanization. The ability to monitor and assess trends in different types of crime at both regional and national levels could assist local police and national-level policymakers in proactively devising means to prevent criminal incidents and address their root causes. Furthermore, such a system could prove useful to individuals seeking to evaluate criminal activity when making travel, investment, and relocation decisions. Recent literature has opted to use online news articles as a reliable and timely source of information on criminal activity. However, most crime monitoring systems fueled by such news sources merely classify crimes into different types and visualize individual crimes on a map using extracted geolocations, and so lack information that stakeholders need to make relevant, informed decisions. To better serve the unique needs of the target user groups, this paper proposes a novel, comprehensive crime visualization system that mines relevant information from large-scale online news articles. The system features automatic crime-type classification and metadata extraction from news articles. The crime classification and metadata schemes are designed to serve the information needs of law enforcement and policymakers, as well as general users. Novel interactive spatiotemporal designs are integrated into the system, with the ability to assess the severity and intensity of crimes in each region through the novel Criminometer index. Owing to its use of deep learning cross-lingual language models, the system is designed to generalize to different countries with diverse prevalent crime types and news languages. The experimental results reveal that the proposed system yielded 86%, 51%, and 67% F1 in the crime type classification, metadata extraction, and closed-form metadata extraction tasks, respectively. Additionally, the results of the system usability tests indicated a notable level of satisfaction among the target user groups. The findings not only offer insights into the possible applications of interactive spatiotemporal crime visualization tools for proactive policymaking and predictive policing but also serve as a foundation for future research that uses online news articles for intelligent monitoring of real-world phenomena.