
    Encoding Markov Logic Networks in Possibilistic Logic

    Markov logic uses weighted formulas to compactly encode a probability distribution over possible worlds. Despite the use of logical formulas, Markov logic networks (MLNs) can be difficult to interpret, because the meaning of their weights is often counter-intuitive. To address this issue, we propose a method to construct a possibilistic logic theory that exactly captures what can be derived from a given MLN using maximum a posteriori (MAP) inference. Unfortunately, the size of this theory is exponential in general. We therefore also propose two methods that derive compact theories which still capture MAP inference, but only for specific types of evidence. These theories can be used, among other things, to make explicit the hidden assumptions underlying an MLN or to explain the predictions it makes. Comment: Extended version of a paper appearing in UAI 201
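The idea of weighted formulas defining a distribution over possible worlds, and of MAP inference picking the most probable world given evidence, can be illustrated with a minimal brute-force sketch. Everything in it is an assumption made for illustration: the atoms, the three formulas and their weights are hypothetical and are not taken from the paper, and exhaustive enumeration stands in for whatever inference procedure the authors use.

```python
import itertools
import math

# Hypothetical toy MLN over three propositional atoms (not from the paper).
ATOMS = ["bird", "penguin", "flies"]

# Each weighted formula is (weight, predicate over a world dict).
# The weights are illustrative assumptions.
FORMULAS = [
    (1.5, lambda w: (not w["bird"]) or w["flies"]),            # bird -> flies
    (4.0, lambda w: (not w["penguin"]) or (not w["flies"])),   # penguin -> not flies
    (4.0, lambda w: (not w["penguin"]) or w["bird"]),          # penguin -> bird
]


def score(world):
    """Sum of the weights of the formulas satisfied in `world`.

    This is the log of the unnormalised probability of the world.
    """
    return sum(weight for weight, holds in FORMULAS if holds(world))


def worlds(evidence=None):
    """Enumerate all truth assignments that agree with the evidence."""
    evidence = evidence or {}
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        world = dict(zip(ATOMS, values))
        if all(world[atom] == value for atom, value in evidence.items()):
            yield world


def probability(world):
    """Normalised probability of a world: exp(score) / Z."""
    z = sum(math.exp(score(w)) for w in worlds())
    return math.exp(score(world)) / z


def map_worlds(evidence=None):
    """Return every world with maximal score among those consistent with the evidence."""
    candidates = list(worlds(evidence))
    best = max(score(w) for w in candidates)
    return [w for w in candidates if math.isclose(score(w), best)]


if __name__ == "__main__":
    print(map_worlds({"bird": True}))
    print(map_worlds({"penguin": True}))
```

With these assumed weights, the evidence `bird` alone yields a MAP world in which `flies` holds, while the evidence `penguin` yields one in which it does not. The weights themselves do not make this default-and-exception behaviour obvious, which is the interpretability problem the abstract describes.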

    Problems of Bank Lending in Bulgaria: Information Asymmetry and Institutional Learning

    Why are there such severe problems in lending in the transition countries? This research took a microeconomic and institutional look at part of the problem. We conducted interviews in Bulgaria and Hungary and sought answers to two questions. First, how do banks making "normal" loans ensure that they are making "good" loans? Second, how do banks get their money back on loans that have turned bad? Clearly, weaknesses at either stage could explain both past loan failures and present reluctance to lend. The bankers we spoke to reported significant difficulties at both stages of the credit process. First, the bankers reported difficulties in accumulating the information needed to evaluate borrowers and their projects. The bankers also reported problems with encouraging borrowers to repay, and difficulties with seizing collateral and using legal action to collect bad debts. Although many of the problems are universal problems of bank lending, many seemed specific to transition economies in general and Bulgaria in particular. We identified specific problems with obtaining and using the evidence about borrowers that might have been available. Bulgarian bankers were often less than fully effective in collecting all available information, or in considering later how they could improve their methods of evaluating clients. One method that more banks might usefully adopt is systematic review of loan losses and the incorporation of lessons learned into the training of new loan officers. In addition, there were serious difficulties in sharing information about borrowers among bankers and between bankers and other firms. Some relaxation of bank secrecy would be appropriate. We also identified policy areas where improvement appears appropriate. Reputation can be effective in ensuring that borrowers fulfill their contracts. However, there is a general lack of credit reporting institutions to share information about credit-worthiness; this needs to be remedied. The heavy reliance on collateral imposes high costs on borrowers and lenders. For collateral to work properly, banks must be able to perfect the collateral and to dispose of it quickly. Finally, fraud against banks was common, but typically went unpunished; prosecutors were apparently not interested in such cases. Bankers and prosecutors must make the prosecution of bank fraud a priority. We base our findings on the 24 banking interviews we conducted in Bulgaria. We also conducted 12 interviews in Hungary. Bankers were surprisingly candid in describing most of their problems.

    Fusing data mining, machine learning and traditional statistics to detect biomarkers associated with depression

    BACKGROUND: Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. METHODS: The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Survey (2009-2010). Depression was measured using the Patient Health Questionnaire-9 and 67 biomarkers were analysed. Covariates in this study included gender, age, race, smoking, food security, Poverty Income Ratio, Body Mass Index, physical activity, alcohol use, medical conditions and medications. The final imputed weighted multiple logistic regression model included possible confounders and moderators. RESULTS: After the creation of 20 imputation data sets from multiple chained regression sequences, machine learning boosted regression initially identified 21 biomarkers associated with depression. Using traditional logistic regression methods, including controlling for possible confounders and moderators, a final set of three biomarkers was selected. The final three biomarkers from the novel hybrid variable selection methodology were red cell distribution width (OR 1.15; 95% CI 1.01, 1.30), serum glucose (OR 1.01; 95% CI 1.00, 1.01) and total bilirubin (OR 0.12; 95% CI 0.05, 0.28). Significant interactions were found between total bilirubin and the Mexican American/Hispanic group (p = 0.016), and between total bilirubin and current smokers (p < 0.001). CONCLUSION: The systematic use of a hybrid methodology for variable selection, fusing data mining techniques using a machine learning algorithm with traditional statistical modelling, accounted for missing data and complex survey sampling methodology and was demonstrated to be a useful tool for detecting three biomarkers associated with depression for future hypothesis generation: red cell distribution width, serum glucose and total bilirubin.
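To read the reported odds ratios, it can help to move back to the log-odds scale on which logistic regression coefficients live. The sketch below recovers a coefficient and an approximate standard error from each reported OR and 95% CI. It assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the function names are hypothetical, and because the published figures are rounded (notably for serum glucose) the recovered standard errors are rough.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def log_odds_summary(odds_ratio, ci_low, ci_high):
    """Recover the log-odds coefficient and an approximate standard error.

    Assumes a Wald-type 95% interval, symmetric on the log scale
    (an assumption, not something stated in the abstract).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def odds_multiplier(odds_ratio, delta):
    """Multiplicative change in the odds for a change of `delta` units in the predictor."""
    return odds_ratio ** delta


# Odds ratios and 95% intervals as reported in the abstract.
REPORTED = {
    "red cell distribution width": (1.15, 1.01, 1.30),
    "serum glucose": (1.01, 1.00, 1.01),
    "total bilirubin": (0.12, 0.05, 0.28),
}

if __name__ == "__main__":
    for name, (odds_ratio, low, high) in REPORTED.items():
        beta, se = log_odds_summary(odds_ratio, low, high)
        print(f"{name}: beta = {beta:.3f}, approx. SE = {se:.3f}")
```

An odds ratio below 1, as for total bilirubin, corresponds to a negative coefficient, meaning higher values are associated with lower odds of depression in the fitted model. `odds_multiplier` shows how the odds scale for changes of more than one unit in a predictor.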