133 research outputs found

    Social Media Analytics Reporting Toolkit

    Get PDF
    With the rapid growth of social media services, vast amounts of user-generated content with time-space stamps are produced every day. A considerable amount of these data is publicly available online, some of which collectively conveys information of interest to data analysts. Social media data are dynamic and unstructured by nature, which makes it very hard for analysts to retrieve useful information efficiently and effectively. The Social Media Analytics Reporting Toolkit (SMART), a system developed at the Purdue VACCINE lab, aims to support such analysis. The current framework collects real-time Twitter messages and visualizes volume densities on a map. It uses Latent Dirichlet Allocation (LDA) to extract regional topics and can optionally apply Seasonal-Trend decomposition using Loess (STL) to detect abnormal events. While Twitter has a fair number of active users, they account for a small portion of all active social media users, and data generated by many other social media services are not currently utilized by SMART. Therefore, my work focused on expanding the data sources of the SMART system by creating means to collect data from other services such as Facebook and Instagram. During a test run searching on a collection of 88 specified keywords, over two million Facebook posts were collected in one week. In addition, the current SMART framework utilizes only one topic model, LDA, which is considered slower than the Non-negative Matrix Factorization (NMF) model, so I also put effort into integrating the NMF algorithm into the system. The improved SMART system can be used for a variety of analysis tasks, such as monitoring regional social media responses from different sources during disastrous events and detecting user-reported crimes. SMART is an ongoing and promising project that can be further improved by integrating new features.

    Defining the Resolution of a Network for Transportation Analyses: a New Method to Improve Transportation Planning Decisions

    Get PDF
    Travel demand models are important tools used in the analysis of transportation plans, projects, and policies. The modeling results are useful for transportation planners making transportation decisions and for policy makers developing transportation policies. Defining the level of detail (i.e., the number of roads) of the transport network consistently with the travel demand model's zone system is crucial to the accuracy of modeling results. However, travel demand modelers have not had tools to determine how much detail is needed in a transport network for a travel demand model. This dissertation seeks to fill this knowledge gap by (1) providing a methodology to define an appropriate level of detail for a transport network in a given travel demand model; (2) implementing this methodology in a travel demand model in the Baltimore area; and (3) identifying how this methodology improves modeling accuracy. All analyses identify that the spatial resolution of the transport network has a great impact on the modeling results. For example, when compared to the observed traffic data, a very detailed network underestimates traffic congestion in the Baltimore area, while a network developed by this dissertation provides a more accurate modeling result of the traffic conditions. Through the evaluation of the impacts a new transportation project has on both networks, the differences in their analysis results point out the importance of having an appropriate level of network detail for making improved planning decisions. The results corroborate a suggested guideline concerning the development of a transport network consistent with the travel demand model's zone system. To conclude this dissertation, limitations are identified in data sources and methodology, based on which a plan of future studies is laid out.

    Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations

    Full text link
    Multimodal demonstrations provide robots with an abundance of information to make sense of the world. However, such abundance may not always lead to good performance when it comes to learning sensorimotor control policies from human demonstrations. Extraneous data modalities can lead to state over-specification, where the state contains modalities that are not only useless for decision-making but can also change the data distribution across environments. State over-specification leads to issues such as the learned policy not generalizing outside of the training data distribution. In this work, we propose Masked Imitation Learning (MIL) to address state over-specification by selectively using informative modalities. Specifically, we design a masked policy network with a binary mask to block certain modalities. We develop a bi-level optimization algorithm that learns this mask to accurately filter over-specified modalities. We demonstrate empirically that MIL outperforms baseline algorithms in simulated domains including MuJoCo and a robot arm environment using the Robomimic dataset, and effectively recovers the environment-invariant modalities on a multimodal dataset collected on a real robot. Our project website presents supplemental details and videos of our results at: https://tinyurl.com/masked-il Comment: 13 pages
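    The core masking idea can be sketched in a few lines: a learned binary mask zeroes out entire modalities before they reach the policy. The modality names, dimensions, and the linear "policy head" below are assumptions for illustration, not the MIL architecture.

```python
# Illustrative sketch of a masked state: a per-modality binary mask blocks
# over-specified modalities before concatenation into the policy input.
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical modalities: proprioception (4-d), camera embedding (8-d),
# audio (2-d).
modalities = {"proprio": rng.normal(size=4),
              "camera": rng.normal(size=8),
              "audio": rng.normal(size=2)}

# A learned mask might keep proprio and camera but block audio as
# over-specified (environment-dependent).
mask = {"proprio": 1, "camera": 1, "audio": 0}

def masked_state(modalities, mask):
    # Each modality vector is scaled by its 0/1 mask entry, then concatenated.
    return np.concatenate([mask[k] * v for k, v in modalities.items()])

state = masked_state(modalities, mask)
W = rng.normal(size=(2, state.size))  # toy linear policy head, 2-d action
action = W @ state
print(state[-2:])  # the blocked audio slice is zeroed
```

    In MIL the mask itself is a decision variable of a bi-level optimization, updated in the outer loop while the policy weights are trained in the inner loop.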

    Autoregressive Diffusion Model for Graph Generation

    Full text link
    Diffusion-based graph generative models have recently obtained promising results for graph generation. However, existing diffusion-based graph generative models are mostly one-shot generative models that apply Gaussian diffusion in the dequantized adjacency matrix space. Such a strategy can suffer from difficulty in model training, slow sampling speed, and incapability of incorporating constraints. We propose an \emph{autoregressive diffusion} model for graph generation. Unlike existing methods, we define a node-absorbing diffusion process that operates directly in the discrete graph space. For forward diffusion, we design a \emph{diffusion ordering network}, which learns a data-dependent node-absorbing ordering from graph topology. For reverse generation, we design a \emph{denoising network} that uses the reverse node ordering to efficiently reconstruct the graph by predicting, one node at a time, the type of the new node and its edges to previously denoised nodes. Based on the permutation invariance of graphs, we show that the two networks can be jointly trained by optimizing a simple lower bound of the data likelihood. Our experiments on six diverse generic graph datasets and two molecule datasets show that our model achieves generation performance better than or comparable to the previous state of the art, and meanwhile enjoys fast generation speed. Comment: 18 pages
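    The node-absorbing forward process can be illustrated on a tiny graph: nodes are removed one at a time along an ordering, producing a sequence of shrinking subgraphs that reverse generation walks backwards. The fixed ordering and path graph below are assumptions; in the paper the ordering is learned by the diffusion ordering network.

```python
# Toy sketch of node-absorbing forward diffusion in discrete graph space.
import numpy as np

# 4-node path graph as an adjacency matrix: 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

order = [3, 2, 1, 0]  # assumed absorbing order for illustration

def forward_absorb(A, order):
    """Return the sequence of shrinking subgraphs along the absorbing order."""
    keep = list(range(A.shape[0]))
    states = [A.copy()]
    for node in order[:-1]:
        keep.remove(node)
        states.append(A[np.ix_(keep, keep)])  # induced subgraph on kept nodes
    return states

states = forward_absorb(A, order)
# Reverse generation traverses this list backwards, predicting each new
# node's type and its edges to previously denoised nodes, one node per step.
print([s.shape[0] for s in states])  # [4, 3, 2, 1]
```

    Because the process stays in the discrete graph space, validity constraints (e.g. valence rules for molecules) can be checked at each insertion step, which one-shot Gaussian diffusion in a dequantized space cannot easily do.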

    Rethinking Resource Management in Edge Learning: A Joint Pre-training and Fine-tuning Design Paradigm

    Full text link
    In some applications, edge learning is experiencing a shift in focus from conventional learning from scratch to a new two-stage paradigm unifying pre-training and task-specific fine-tuning. This paper considers the problem of joint communication and computation resource management in a two-stage edge learning system. In this system, model pre-training is first conducted at an edge server via centralized learning on local pre-stored general data, and then task-specific fine-tuning is performed at edge devices based on the pre-trained model via federated edge learning. For the two-stage learning model, we first analyze the convergence behavior (in terms of the average squared gradient norm bound), which characterizes the impact of various system parameters, such as the number of learning rounds and the batch sizes in the two stages, on the convergence rate. Based on our analytical results, we then propose a joint communication and computation resource management design to minimize an average squared gradient norm bound, subject to constraints on the transmit power, overall system energy consumption, and training delay. The decision variables include the number of learning rounds, batch sizes, clock frequencies, and transmit power control for both the pre-training and fine-tuning stages. Finally, numerical results are provided to evaluate the effectiveness of our proposed design. It is shown that the proposed joint resource management over the pre-training and fine-tuning stages well balances the system performance trade-off among training accuracy, delay, and energy consumption. The proposed design is also shown to effectively leverage the inherent trade-off between pre-training and fine-tuning, which arises from the differences in data distribution between pre-stored general data and real-time task-specific data, thus efficiently optimizing overall system performance.

    End-to-End Stochastic Optimization with Energy-Based Model

    Full text link
    Decision-focused learning (DFL) was recently proposed for stochastic optimization problems that involve unknown parameters. By integrating predictive modeling with an implicitly differentiable optimization layer, DFL has shown superior performance to the standard two-stage predict-then-optimize pipeline. However, most existing DFL methods are only applicable to convex problems or a subset of nonconvex problems that can be easily relaxed to convex ones. Further, they can be inefficient in training due to the requirement of solving and differentiating through the optimization problem in every training iteration. We propose SO-EBM, a general and efficient DFL method for stochastic optimization using energy-based models. Instead of relying on KKT conditions to induce an implicit optimization layer, SO-EBM explicitly parameterizes the original optimization problem using a differentiable optimization layer based on energy functions. To better approximate the optimization landscape, we propose a coupled training objective that uses a maximum likelihood loss to capture the optimum location and a distribution-based regularizer to capture the overall energy landscape. Finally, we propose an efficient training procedure for SO-EBM with a self-normalized importance sampler based on a Gaussian mixture proposal. We evaluate SO-EBM in three applications: power scheduling, COVID-19 resource allocation, and a non-convex adversarial security game, demonstrating the effectiveness and efficiency of SO-EBM. Comment: NeurIPS 2022 Oral
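    The self-normalized importance sampler with a Gaussian mixture proposal can be sketched in one dimension: expectations under the energy-based distribution p(y) ∝ exp(-E(y)) are estimated from proposal samples reweighted by normalized importance weights. The quadratic energy and mixture parameters below are illustrative assumptions, not SO-EBM's learned energy.

```python
# Sketch of self-normalized importance sampling (SNIS) with a two-component
# Gaussian mixture proposal, the sampling device used to train SO-EBM's
# energy-based layer. Energy and proposal are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)

def energy(y):
    # Quadratic energy: p(y) ∝ exp(-E(y)) is N(2, 1).
    return 0.5 * (y - 2.0) ** 2

# Gaussian mixture proposal q(y).
means = np.array([0.0, 3.0])
stds = np.array([2.0, 1.0])
weights = np.array([0.5, 0.5])

def sample_q(n):
    comp = rng.choice(2, size=n, p=weights)
    return rng.normal(means[comp], stds[comp])

def log_q(y):
    comps = np.stack([
        np.log(w) - 0.5 * ((y - m) / s) ** 2 - np.log(s * np.sqrt(2 * np.pi))
        for w, m, s in zip(weights, means, stds)])
    return np.logaddexp(comps[0], comps[1])

y = sample_q(200_000)
log_w = -energy(y) - log_q(y)      # unnormalized log importance weights
w = np.exp(log_w - log_w.max())
w /= w.sum()                        # self-normalization

est_mean = np.sum(w * y)            # estimate of E_p[y], close to 2
print(round(est_mean, 2))
```

    Self-normalization sidesteps the intractable partition function of the energy model: only unnormalized densities exp(-E(y)) are ever evaluated.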

    Two-Phase Multi-Dose-Level PET Image Reconstruction with Dose Level Awareness

    Full text link
    To obtain high-quality positron emission tomography (PET) while minimizing radiation exposure, a range of methods have been designed to reconstruct standard-dose PET (SPET) from corresponding low-dose PET (LPET) images. However, most current methods merely learn the mapping between single-dose-level LPET and SPET images, and omit the dose disparity of LPET images in clinical scenarios. In this paper, to reconstruct high-quality SPET images from multi-dose-level LPET images, we design a novel two-phase multi-dose-level PET reconstruction algorithm with dose-level awareness, containing a pre-training phase and a SPET prediction phase. Specifically, the pre-training phase is devised to explore both fine-grained discriminative features and effective semantic representation. The SPET prediction phase adopts a coarse prediction network utilizing the pre-learned dose-level prior to generate a preliminary result, and a refinement network to precisely preserve the details. Experiments on the MICCAI 2022 Ultra-low Dose PET Imaging Challenge Dataset have demonstrated the superiority of our method. Comment: Accepted by ISBI202

    Combining river replenishment and restrictions on groundwater pumping to achieve groundwater balance in the Juma River Plain, North China Plain

    Get PDF
    In recent years, to alleviate the decline in groundwater levels, extensive restrictions on groundwater pumping have been implemented in the North China Plain (NCP). In September 2018, a large-scale ecological water replenishment project was executed involving 22 rivers and lakes. How to adjust the layout of reductions in groundwater pumping within the context of ecological water replenishment is a key issue to be addressed in the study of groundwater level recovery in the NCP. This study adopted the Juma River Plain in Baoding city as a case study, established a numerical model of river replenishment of groundwater, predicted groundwater level changes over the next 15 years (2021–2035), and quantitatively calculated the impact of river replenishment on groundwater levels. To achieve the goal of an overall groundwater balance by 2035, a suitable groundwater pumping restriction scenario was defined based on the impact of river replenishment on groundwater levels. The results indicated that by 2035, the relative rise in groundwater levels attributed to river replenishment and restrictions on groundwater pumping could reach 3.51 and 2.28 m, respectively. River replenishment significantly impacts groundwater levels, especially those near the river. Under the current groundwater exploitation conditions, river replenishment could ensure groundwater level recovery near the river, which accounts for 15% of the total study area. The goal of an overall groundwater balance by 2035 could be achieved if restrictions on groundwater pumping were superimposed, with an average annual reduction of 56 million m³. This study provides valuable insights into groundwater management across the NCP. The proposed methods are useful for the management of other depleted aquifers recharged via ecological water replenishment.