    Learning and Designing Stochastic Processes from Logical Constraints

    Stochastic processes offer a flexible mathematical formalism to model and reason about systems. Most analysis tools, however, start from the premise that models are fully specified, so that any parameters controlling the system's dynamics must be known exactly. As this is seldom the case, many methods have been devised over the last decade to infer (learn) such parameters from observations of the state of the system. In this paper, we depart from this approach by assuming that our observations are qualitative properties encoded as satisfaction of linear temporal logic formulae, as opposed to quantitative observations of the state of the system. An important feature of this approach is that it naturally unifies the system identification and the system design problems, where the properties, instead of observations, represent requirements to be satisfied. We develop a principled statistical estimation procedure based on maximising the likelihood of the system's parameters, using recent ideas from statistical machine learning. We demonstrate the efficacy and broad applicability of our method on a range of simple but non-trivial examples, including rumour spreading in social networks and hybrid models of gene regulation.

    Bayesian Logistic Regression Model for Siting Biomass-using Facilities

    Key sources of oil for western markets are located in complex geopolitical environments that increase economic and social risk. The amalgamation of economic, environmental, social and national security concerns for petroleum-based economies has created a renewed emphasis on alternative sources of energy, which include biomass. The stability of sustainable biomass markets hinges on improved methods to predict and visualize business risk and cost to the supply chain. This thesis develops Bayesian logistic regression models, with comparisons to classical maximum likelihood models, to quantify significant factors that influence the siting of biomass-using facilities and to predict potential locations in the 13-state Southeastern United States for three types of biomass-using facilities. Group I combines all biomass-using mills, biorefineries using agricultural residues and wood-using bioenergy/biofuels plants. Group II includes pulp and paper mills, and biorefineries that use agricultural and wood residues. Group III includes food processing mills and biorefineries that use agricultural and wood residues. The resolution of this research is the 5-digit ZIP Code Tabulation Area (ZCTA), and there are 9,416 ZCTAs in the 13-state Southeastern study region. For both the classical and Bayesian approaches, the data were split into a training set and a separate validation (hold-out) set using a pseudo-random number-generating function in SAS® Enterprise Miner. Four predefined priors are constructed. Bayesian estimation assuming a Gaussian prior distribution provides the highest correct classification rate of 86.40% for Group I; Bayesian estimation assuming the non-informative uniform prior has the highest correct classification rate of 95.97% for Group II; and Bayesian estimation assuming a Gaussian prior gives the highest correct classification rate of 92.67% for Group III.
Given the comparatively low sensitivity for Group II and Group III, a hybrid model that integrates classification trees and local Bayesian logistic regression was developed as part of this research to further improve the predictive power. The hybrid model increases the sensitivity for Group II from 58.54% to 64.40%, and improves both the specificity and the sensitivity for Group III, from 98.69% to 99.42% and from 39.35% to 46.45%, respectively. Twenty-five optimal locations for the biomass-using facility groupings at the 5-digit ZCTA resolution, based upon the best fitted Bayesian logistic regression model and the hybrid model, are predicted and plotted for the 13-state Southeastern study region.
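
The abstract above reports three metrics: correct classification rate, sensitivity and specificity. As a reading aid, the minimal sketch below shows how these are conventionally computed from binary labels (1 = a facility is present in the ZCTA, 0 = absent). The function name `classification_metrics` and the toy labels are illustrative assumptions; they are not taken from the thesis and do not reproduce its data or its SAS workflow.

```python
import math


def classification_metrics(y_true, y_pred):
    """Return sensitivity, specificity and correct classification rate.

    Hypothetical helper for illustration only. Labels are assumed to be
    1 (positive, e.g. facility present) or 0 (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Share of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        # Share of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
        # Share of all cases classified correctly.
        "correct_classification_rate": (tp + tn) / total if total else math.nan,
    }


# Toy example (made-up labels): 3 positives, 5 negatives.
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 0, 0, 0, 0, 0, 0, 1],
)
print(metrics)
```

When positives are rare, as facility sites are among several thousand ZCTAs, a model can reach a high correct classification rate while missing many positives, which is consistent with the abstract reporting high classification rates alongside low sensitivity for Groups II and III.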

    Disguise without Disruption: Utility-Preserving Face De-Identification

    With the rise of cameras and smart sensors, humanity generates an exponentially growing amount of data. This valuable information, including underrepresented cases like AI in medical settings, can fuel new deep-learning tools. However, data scientists must prioritize ensuring privacy for individuals in these untapped datasets, especially for images or videos with faces, which are prime targets for identification methods. Proposed solutions to de-identify such images often compromise non-identifying facial attributes relevant to downstream tasks. In this paper, we introduce Disguise, a novel algorithm that seamlessly de-identifies facial images while ensuring the usability of the modified data. Unlike previous approaches, our solution is firmly grounded in the domains of differential privacy and ensemble-learning research. Our method involves extracting and substituting depicted identities with synthetic ones, generated using variational mechanisms to maximize obfuscation and non-invertibility. Additionally, we leverage supervision from a mixture-of-experts to disentangle and preserve other utility attributes. We extensively evaluate our method using multiple datasets, demonstrating a higher de-identification rate and superior consistency compared to prior approaches in various downstream tasks. Comment: Accepted at AAAI 2024. Paper + supplementary material

    Message Passing-based Inference in Hierarchical Autoregressive Models
