CONSTRAINED MULTI-GROUP PROJECT ALLOCATION USING MAHALANOBIS DISTANCE
Optimal allocation is one of the most active research areas in operations research using binary integer variables. The allocation of multi-constrained projects among several available options over a given planning horizon is an especially significant problem in the general area of item classification. The main goal of this dissertation is to develop an analytical approach for selecting the projects that are most attractive from an economic point of view and allocating them among several options, such as in-house engineers and private contractors (in transportation projects). In addition to the availability of funds, a relevant limiting resource is the availability of in-house manpower.
In this thesis, the concept of Mahalanobis distance (MD) will be used as the classification criterion. This is a generalization of the Euclidean distance that takes into account the correlation of the characteristics defining the scope of a project. The desirability of a given project to be allocated to an option is defined in terms of its MD to that particular option. Ideally, each project should be allocated to its closest option. This, however, may not be possible because of the available levels of each relevant resource.
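As a rough illustration of this classification criterion (a sketch only; the feature set, option histories, and numbers below are hypothetical rather than taken from the dissertation), the MD of a project to an option can be computed from that option's centroid and covariance, and each project assigned to its closest option when resources permit:

```python
# Minimal sketch (not the dissertation's implementation): compute the
# Mahalanobis distance from each project to each allocation option and pick
# the closest option, ignoring resource constraints for now.
# Feature names and data are hypothetical.
import numpy as np
from scipy.spatial.distance import mahalanobis

# Hypothetical project scope features: [cost ($M), duration (months), complexity score]
projects = np.array([
    [2.5, 18, 3.0],
    [0.8,  6, 1.5],
    [5.0, 30, 4.2],
])

# Hypothetical historical projects previously handled by each option,
# used to estimate that option's centroid and covariance.
history = {
    "in_house":   np.array([[1.0, 8, 1.8], [0.7, 5, 1.2], [1.4, 10, 2.0], [0.9, 7, 1.5]]),
    "contractor": np.array([[4.5, 28, 4.0], [6.0, 36, 4.5], [3.8, 24, 3.6], [5.2, 31, 4.1]]),
}

for p in projects:
    best = min(
        history,
        key=lambda opt: mahalanobis(
            p,
            history[opt].mean(axis=0),
            # pinv guards against a singular covariance estimated from few samples
            np.linalg.pinv(np.cov(history[opt], rowvar=False)),
        ),
    )
    print(p, "-> closest option:", best)
```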
The allocation process is formulated mathematically using two Binary Integer Programming (BIP) models. The first formulation maximizes the dollar value of benefits derived by the traveling public from the projects being implemented, subject to budget, total-MD, and in-house manpower constraints. The second formulation minimizes the total sum of MD, subject to budget and in-house manpower constraints.
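A minimal sketch of a formulation in this spirit (using the open-source PuLP modeller; the coefficients, caps, and names are invented for illustration, and the dissertation's exact models may differ) is:

```python
# Sketch of the first BIP formulation under assumed data: maximize public
# benefits of implemented projects subject to budget, total-MD, and in-house
# manpower limits. All coefficients are hypothetical.
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, value

projects = ["P1", "P2", "P3"]
options = ["in_house", "contractor"]

benefit  = {"P1": 1.2, "P2": 0.7, "P3": 2.1}            # $M benefit if implemented
cost     = {"P1": 0.9, "P2": 0.4, "P3": 1.6}            # $M cost
md       = {("P1", "in_house"): 1.1, ("P1", "contractor"): 2.3,   # Mahalanobis distances
            ("P2", "in_house"): 0.6, ("P2", "contractor"): 3.0,
            ("P3", "in_house"): 2.8, ("P3", "contractor"): 0.9}
manpower = {"P1": 3, "P2": 1, "P3": 5}                  # person-months if done in-house

BUDGET, MD_CAP, MANPOWER_CAP = 2.2, 4.0, 6

x = LpVariable.dicts("x", (projects, options), cat=LpBinary)

model = LpProblem("project_allocation", LpMaximize)
model += lpSum(benefit[p] * x[p][o] for p in projects for o in options)

# Each project is allocated to at most one option.
for p in projects:
    model += lpSum(x[p][o] for o in options) <= 1

model += lpSum(cost[p] * x[p][o] for p in projects for o in options) <= BUDGET
model += lpSum(md[p, o] * x[p][o] for p in projects for o in options) <= MD_CAP
model += lpSum(manpower[p] * x[p]["in_house"] for p in projects) <= MANPOWER_CAP

model.solve()
for p in projects:
    for o in options:
        if value(x[p][o]) == 1:
            print(p, "->", o)
```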
The proposed solution methodology for the BIP models is based on the branch-and-bound method. In particular, one of the contributions of this dissertation is the development of a strategy for branching variables and node selection that is consistent with allocation priorities based on MD, which improves branch-and-bound performance and allows the method to handle large-scale applications. The suggested allocation process includes: (a) multiple allocation groups; (b) multiple constraints; and (c) different BIP models. Numerical experiments with different projects and options illustrate the application of the proposed approach.
GlySpy: A software suite for assigning glycan topologies from sequential mass spectral data
GlySpy is a suite of algorithms used to determine the structure of glycans. Glycans, which are orderly aggregations of monosaccharides such as glucose, mannose, and fucose, are often attached to proteins and lipids, and provide a wide range of biological functions. Previous biomolecule-sequencing algorithms have operated on linear polymers such as proteins or DNA but, because glycans form complicated branching structures, new approaches are required. GlySpy uses data derived from sequential mass spectrometry (MSn), in which a precursor molecule is fragmented to form products, each of which may then be fragmented further, gradually disassembling the glycan. GlySpy resolves the structures of the original glycans by examining these disassembly pathways.
The four main components of GlySpy are: (1) OSCAR (the Oligosaccharide Subtree Constraint Algorithm), which accepts analyst-selected MSn disassembly pathways and produces a set of plausible glycan structures; (2) IsoDetect, which reports the MSn disassembly pathways that are inconsistent with a set of expected structures, and which therefore may indicate the presence of alternative isomeric structures; (3) IsoSolve, which attempts to assign the branching structures of multiple isomeric glycans found in a complex mixture; and (4) Intelligent Data Acquisition (IDA), which provides automated guidance to the mass spectrometer operator, selecting glycan fragments for further MSn disassembly.
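To illustrate the kind of reasoning involved (a simplified sketch, not GlySpy's algorithms; the topology and residue masses below are approximate and invented), a glycan topology can be modelled as a tree whose single-cleavage fragments constrain which structures are consistent with an observed disassembly pathway:

```python
# Simplified sketch of the core idea behind checking MSn disassembly pathways
# against a candidate topology: model the glycan as a tree of monosaccharide
# residues and enumerate the masses of the fragments obtained by cleaving one
# glycosidic bond. Residue masses are approximate; the topology is invented.
RESIDUE_MASS = {"Hex": 162.053, "HexNAc": 203.079, "Fuc": 146.058}  # Da, approximate

# A toy branched topology: node = (residue, [children]).
glycan = ("HexNAc", [
    ("HexNAc", [
        ("Hex", [
            ("Hex", [("Hex", [])]),
            ("Hex", [("Hex", [])]),
        ]),
    ]),
    ("Fuc", []),
])

def subtree_mass(node):
    residue, children = node
    return RESIDUE_MASS[residue] + sum(subtree_mass(c) for c in children)

def single_cleavage_fragments(node):
    """Masses of the subtrees released by cleaving one glycosidic bond."""
    _, children = node
    fragments = []
    for child in children:
        fragments.append(subtree_mass(child))            # the lost branch
        fragments.extend(single_cleavage_fragments(child))
    return fragments

total = subtree_mass(glycan)
for frag in sorted(set(round(m, 3) for m in single_cleavage_fragments(glycan))):
    # A proposed topology is consistent with an observed product ion only if
    # some cleavage can account for the observed mass loss.
    print(f"branch fragment {frag:8.3f} Da  (complement {total - frag:8.3f} Da)")
```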
This dissertation provides a primer for the underlying interdisciplinary topics (carbohydrates, glycans, MSn, and so on) and also presents a survey of the relevant literature with a focus on currently available tools. Each of GlySpy's four algorithms is described in detail, along with results from their application to biologically derived glycan samples. A summary enumerates GlySpy's contributions, which include de novo glycan structural analysis, favorable performance characteristics, interpretation of higher-order MSn data, and the automation of both data acquisition and analysis.
Unsupervised multilingual learning
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 241-254).
For centuries, scholars have explored the deep links among human languages. In this thesis, we present a class of probabilistic models that exploit these links as a form of naturally occurring supervision. These models allow us to substantially improve performance on core text processing tasks, such as morphological segmentation, part-of-speech tagging, and syntactic parsing. Beyond these traditional NLP tasks, we also present a multilingual model for lost language decipherment. We test this model on the ancient Ugaritic language. Our results show that we can automatically uncover much of the historical relationship between Ugaritic and Biblical Hebrew, a known related language.
by Benjamin Snyder. Ph.D.
Real Time Crime Prediction Using Social Media
Crime is on the increase and has a detrimental influence on a nation's economy, despite several studies on crime prediction aimed at minimising crime rates. Historically, data mining techniques for crime prediction have relied on historical information and are mostly country-specific; in fact, only a few of the earlier studies on crime prediction follow a standard data mining procedure. Given the current worldwide trend in which criminals routinely publish their criminal intent on social media and invite others to view and/or engage in different crimes, an alternative and more dynamic strategy is needed. The goal of this research is to improve the performance of crime prediction models. This thesis therefore explores the potential of using information from social media (Twitter) for crime prediction in combination with historical crime data. Using data mining techniques, it also identifies the feature engineering most relevant to a United Kingdom dataset and most likely to improve crime prediction model performance. Additionally, this study presents a function that could be used across the United Kingdom for data cleansing, pre-processing, and feature engineering. A Shiny app was also used to display tweet sentiment trends so that crime can be prevented in near-real time.
Exploratory analysis is essential for revealing the data pre-processing and feature engineering needed before the data are fed into a machine learning model. Based on earlier documented studies, this is the first research to carry out a full exploratory analysis of historical British crime statistics using the historical stop-and-search dataset. Based on the findings of the exploratory study, an algorithm was created to clean the data and prepare it for further analysis and model creation. This provides a ready-to-use dataset for future research, particularly for non-experts constructing models to forecast crime or conducting investigations across around 32 police districts of the United Kingdom.
Moreover, this is the first study to present a complete collection of geospatial parameters for training a crime prediction model by combining demographic data from the same United Kingdom source with hourly sentiment polarity that was not restricted to a Twitter keyword search. Six base models frequently mentioned in the previous literature were selected, trained on the historical stop-and-search crime dataset, evaluated on test data, and finally validated on the London and Kent crime datasets. Two different datasets were created from Twitter and historical data (historical crime data with a Twitter sentiment score, and historical data without it). Six of the most prevalent machine learning classifiers (Random Forest, Decision Tree, K-nearest neighbour, support vector machine, neural network, and naïve Bayes) were trained and tested on these datasets. Additionally, the hyperparameters of each of the six models were tuned using a random grid search.
Voting classifiers and a logistic-regression stacked ensemble of different models were also trained and tested on the same datasets to enhance individual model performance. In addition, two stacked ensembles of multiple models were constructed so that the most suitable model for crime prediction on the UK dataset could be selected on the basis of performance. In terms of interpretation, this research differs from most earlier studies that employed Twitter data in that several methodologies were used to show how each attribute contributed to the construction of the model, and the findings were discussed and interpreted in the context of the study. Further, a Shiny app visualisation tool was designed to display the tweets' sentiment scores, text, users' screen names, and vicinity, which allows any criminal actions to be investigated in near-real time. The evaluation of the models revealed that Random Forest, Decision Tree, and K-nearest neighbour outperformed the other models; of these, Decision Tree and Random Forest performed consistently better when evaluated on test data.
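As a rough sketch of the modelling pipeline described above (with synthetic data and assumed feature names rather than the thesis's actual datasets), scikit-learn supports both the randomized hyperparameter search and the voting ensemble:

```python
# Minimal sketch under assumed features: combine historical crime features with
# an hourly tweet-sentiment score, tune a random forest with a randomized grid
# search, and compare against a soft-voting ensemble of several base classifiers.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "hour": rng.integers(0, 24, n),                  # hour of day
    "district": rng.integers(0, 32, n),              # hypothetical police district id
    "past_week_count": rng.poisson(3, n),            # recent crime count in the area
    "sentiment": rng.normal(0, 1, n),                # hourly tweet sentiment polarity
})
y = (data["past_week_count"] + (data["sentiment"] < -0.5)).gt(3).astype(int)  # toy label

X_train, X_test, y_train, y_test = train_test_split(data, y, test_size=0.2, random_state=0)

# Randomized hyperparameter search for the random forest.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [100, 200, 400], "max_depth": [None, 5, 10]},
    n_iter=5, cv=3, random_state=0,
)
search.fit(X_train, y_train)
print("tuned RF accuracy:", search.score(X_test, y_test))

# Soft-voting ensemble over three of the base learners named above.
vote = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("dt", DecisionTreeClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    voting="soft",
)
vote.fit(X_train, y_train)
print("voting ensemble accuracy:", vote.score(X_test, y_test))
```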
Detecting and mapping forest nutrient deficiencies: eucalyptus hybrid (Eucalyptus grandis x Eucalyptus urophylla) trees in KwaZulu-Natal, South Africa.
Doctoral Degree. University of KwaZulu-Natal, Pietermaritzburg. Abstract available in PDF.
Simulation of population changes of western dwarf mistletoe on Ponderosa pine
Western dwarf mistletoe (Arceuthobium campylopodum Engelm. f. campylopodum) is a parasite of ponderosa pine (Pinus ponderosa Laws.). The objectives of this investigation are: (a) to formulate a mathematical description of the process of dwarf mistletoe disease spread in a pine forest, (b) to use this description to predict the spread in a few cases of interest, and (c) from the results to make some general hypotheses concerning the process. The simulation is based on a young-growth, managed ponderosa pine stand, where the trees are evenly spaced (9 to 18 feet apart), are of uniform height (10 to 25 feet), and have a light to moderate infection level.
The model consists of four major submodels: tree growth, mistletoe seed production, seed dispersal, and infection establishment. The tree growth submodel provides information concerning the size, position, and number of susceptible branches. The seed production submodel relates the amount of inoculum present to plant age. The process of disease spread is partitioned into a series of sequentially operating events. The probabilities associated with the events from mistletoe seed production to seed interception by a susceptible branch are computed in the seed dispersal submodel; the probabilities of the subsequent events leading to infection are computed in the infection establishment submodel. Each submodel provides information for the next one, forming an interlocking set.
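The sequential-event structure can be illustrated with a small sketch (the probabilities and counts below are invented placeholders, not the fitted values used in the thesis):

```python
# Sketch of the sequential-event structure described above: the chance that one
# dispersed mistletoe seed produces a new infection is the product of the
# conditional probabilities of the events from dispersal through establishment.
seed_events = {
    "seed_survives_dispersal": 0.40,               # seed dispersal submodel
    "intercepted_by_susceptible_branch": 0.10,
    "overwinters_on_branch": 0.50,                 # infection establishment submodel
    "germinates_and_penetrates": 0.15,
}

p_infection_per_seed = 1.0
for event, p in seed_events.items():
    p_infection_per_seed *= p

seeds_per_plant = 100          # hypothetical annual seed crop (seed production submodel)
plants_per_infected_tree = 2   # one of the simulated infection levels

expected_new_infections_per_tree = (
    seeds_per_plant * plants_per_infected_tree * p_infection_per_seed
)
print(f"P(infection per seed) = {p_infection_per_seed:.4f}")
print(f"expected new infections per infected tree per year = "
      f"{expected_new_infections_per_tree:.2f}")
```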
Seven cases are examined using the complete simulation model. These include three tree spacings (9, 13, and 18 feet) with two moderate levels of infection (2 and 4 plants per infected tree), simulated for five years, and one case with a heavy infection level (15 plants and 9-foot spacing), simulated for ten years. The results are examined to assess changes in (a) the probability of infection with respect to tree spacing within a hypothetical stand, branchlet height, infection level, and time, and (b) the expected number of new infections.
The model shows that the probability of reinfection decreases as the crown volume around a given height becomes larger and the foliage becomes sparser. The probability of infection due to contagion is found to decrease by about half for an increase in stand spacing of five feet. In a stand with an initial infection rate of 0.60 and a spacing of 9 feet, the expected number of new infections per 100 trees at the end of the fifth year is found to be 283 plants where there is an initial level of 2 plants per infected tree, and 644 plants where there is a level of 4 plants per infected tree.
Based on examination of the behavior of the model, five hypotheses concerning the disease spread process are formulated. (1) Plants high in the crown of the pine trees are the most important ones with respect to disease spread. (2) Where infection levels are moderate (fewer than 5 infections per tree) and spacing is greater than 8 feet, vertical spread is accomplished primarily by reinfection. (3) It is possible for a tree to "outgrow" its infections. (4) In stands with spacing distances greater than 8 feet and a sparse mistletoe population, new infections are more likely to occur as a result of reinfection than as a result of contagion. (5) Increasing the spacing between trees reduces the probability of mistletoe infection from both reinfection and contagion. These hypotheses have practical importance for the management of young pine forests. They indicate that selective thinning should discriminate against trees with infections at the greatest heights. Also, in young stands with moderate infection levels, the chances are favorable for the trees to outgrow their infections if they are spaced such that growth conditions are optimum.
The role of structured induction in expert systems
A "structured induction" technique was developed and tested using a
rules- from -examples generator together with a chess -specific application
package. A drawback of past experience with computer induction, reviewed
in this thesis, has been the generation of machine -oriented rules opaque to
the user. By use of the structured approach humanly understandable rules
were synthesized from expert supplied examples. These rules correctly performed chess endgame classifications of sufficient complexity to be regarded as difficult by international master standard players. Using the "Interactive ID3" induction tools developed by the author, chess experts, with
a little programming support, were able to generate rules which solve problems considered difficult or impossible by conventional programming techniques. Structured induction and associated programming tools were
evaluated using the chess endgames Icing and Pawn vs. King (Black -tomove) and King and Pawn vs. King and Rook (White -to -move, White Pawn on
a7) as trial problems of measurable complexity.Structured solutions to both trial problems are presented, and implications of this work for the design of expert systems languages are assessed
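As a brief illustration of the induction step underlying rules-from-examples generators such as "Interactive ID3" (a generic textbook-style sketch with toy examples, not the thesis's code or endgame data):

```python
# Compact sketch of the core ID3 step: choose the attribute with the highest
# information gain over a table of expert-supplied examples, then recurse.
# The attributes and examples below are invented illustrations.
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def best_attribute(rows, attributes, label):
    base = entropy([r[label] for r in rows])
    def gain(attr):
        remainder = 0.0
        for v in {r[attr] for r in rows}:
            subset = [r[label] for r in rows if r[attr] == v]
            remainder += len(subset) / len(rows) * entropy(subset)
        return base - remainder
    return max(attributes, key=gain)

def id3(rows, attributes, label):
    labels = [r[label] for r in rows]
    if len(set(labels)) == 1:
        return labels[0]                                  # leaf: all examples agree
    if not attributes:
        return Counter(labels).most_common(1)[0][0]       # leaf: majority class
    attr = best_attribute(rows, attributes, label)
    rest = [a for a in attributes if a != attr]
    return {attr: {v: id3([r for r in rows if r[attr] == v], rest, label)
                   for v in {r[attr] for r in rows}}}

# Toy expert-supplied examples with invented attributes.
examples = [
    {"pawn_on_7th": "yes", "own_king_near": "yes", "outcome": "won"},
    {"pawn_on_7th": "yes", "own_king_near": "no",  "outcome": "drawn"},
    {"pawn_on_7th": "no",  "own_king_near": "yes", "outcome": "drawn"},
    {"pawn_on_7th": "no",  "own_king_near": "no",  "outcome": "drawn"},
]
print(id3(examples, ["pawn_on_7th", "own_king_near"], "outcome"))
```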