    Random Forests model selection

    Random Forests (RF) of tree classifiers are a popular ensemble method for classification. RF have been shown to be effective in many different real-world classification problems and are nowadays considered one of the best learning algorithms in this context. In this paper we discuss the effect of the hyperparameters of the RF on the accuracy of the final model, with particular reference to different theoretically grounded weighting strategies for the trees in the forest. In this way we challenge the common misconception that RF is a hyperparameter-free learning algorithm. Results on a series of benchmark datasets show that performing an accurate Model Selection procedure can greatly improve the accuracy of the final RF classifier
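    As a rough illustration of what weighting the trees in a forest can mean, the sketch below combines per-tree class votes with one weight per tree instead of counting every vote equally. The function name `weighted_vote` and the example weights are hypothetical; the abstract does not specify the weighting strategies the paper studies, so this is only a generic weighted majority vote under that assumption.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting trees carry the largest total weight.

    predictions: one predicted label per tree (hypothetical input format).
    weights: one non-negative weight per tree; equal weights reduce this
             to a plain majority vote.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per tree prediction")
    if not predictions:
        raise ValueError("need at least one tree prediction")

    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Pick the label with the highest accumulated weight.
    return max(totals, key=totals.get)


# Equal weights: two of three trees vote "b".
majority = weighted_vote(["a", "b", "b"], [1.0, 1.0, 1.0])
# Unequal weights: the single heavily weighted tree outvotes the other two.
reweighted = weighted_vote(["a", "b", "b"], [0.9, 0.3, 0.3])
```

    In such a scheme the weights themselves become hyperparameters (or are derived from them), which is one reason a model selection step can matter.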

    Predicting Post-Fire Change in West Virginia, USA from Remotely-Sensed Data

    Prescribed burning is used in West Virginia, USA to return the important disturbance process of fire to oak and oak-pine forests. Species composition and structure are often the main goals for re-establishing fire, with less emphasis on fuel reduction or reducing catastrophic wildfire. In planning prescribed fires, land managers could benefit from the ability to predict mortality of overstory trees. In this study, wildfires and prescribed fires in West Virginia were examined to determine if specific landscape and terrain characteristics were associated with patches of high/moderate post-fire change. Using the ensemble machine learning approach of Random Forest, we determined that linear aspect was the most important variable associated with high/moderate post-fire change patches, followed by hillshade, aspect as class, heat load index, slope/aspect ratio (sine transformed), average roughness, and slope in degrees. These findings were then applied to a statewide spatial model for predicting post-fire change. Our results will help land managers contemplating the use of prescribed fire to spatially target landscape planning and restoration sites and better estimate potential post-fire effects

    Marine safety and data analytics: vessel crash stop maneuvering performance prediction

    Crash stop maneuvering performance is one of the key indicators of the vessel safety properties for a shipbuilding company. Many different factors affect this performance, from the vessel design to the environmental conditions, hence it is not trivial to assess it accurately during the preliminary design stages. Several first-principles methods are available to estimate the crash stop maneuvering performance, but unfortunately, these methods usually are either too costly or not accurate enough. To overcome these limitations, the authors propose a new data-driven method, based on the popular Random Forests learning algorithm, for predicting the crash stop maneuvering performance. Results on real-world data provided by DAMEN Shipyards show the effectiveness of the proposal

    Characterizing boreal peatland plant composition and species diversity with hyperspectral remote sensing

    Peatlands, which account for approximately 15% of land surface across the arctic and boreal regions of the globe, are experiencing a range of ecological impacts as a result of climate change. Factors that include altered hydrology resulting from drought and permafrost thaw, rising temperatures, and elevated levels of atmospheric carbon dioxide have been shown to cause plant community compositional changes. Shifts in plant composition affect the productivity, species diversity, and carbon cycling of peatlands. We used hyperspectral remote sensing to characterize the response of boreal peatland plant composition and species diversity to warming, hydrologic change, and elevated CO2. Hyperspectral remote sensing techniques offer the ability to complete landscape-scale analyses of ecological responses to climate disturbance when paired with plot-level measurements that link ecosystem biophysical properties with spectral reflectance signatures. Working within two large ecosystem manipulation experiments, we examined climate controls on composition and diversity in two types of common boreal peatlands: a nutrient rich fen located at the Alaska Peatland Experiment (APEX) in central Alaska, and an ombrotrophic bog located in northern Minnesota at the Spruce and Peatland Responses Under Changing Environments (SPRUCE) experiment. We found a strong effect of plant functional cover on spectral reflectance characteristics. We also found a positive relationship between species diversity and spectral variation at the APEX field site, which is consistent with other recently published findings. Based on the results of our field study, we performed a supervised land cover classification analysis on an aerial hyperspectral dataset to map peatland plant functional types (PFTs) across an area encompassing a range of different plant communities. 
    Our results underscore recent advances in the application of remote sensing measurements to ecological research, particularly in far northern ecosystems

    Effects of bark beetle outbreaks on forest landscape pattern in the Southern Rocky Mountains, U.S.A.

    Since the late 1990s, extensive outbreaks of native bark beetles (Curculionidae: Scolytinae) have affected coniferous forests throughout Europe and North America, driving changes in carbon storage, wildlife habitat, nutrient cycling, and water resource provisioning. Remote sensing is a crucial tool for quantifying the effects of these disturbances across broad landscapes. In particular, Landsat time series (LTS) are increasingly used to characterize outbreak dynamics, including the presence and severity of bark beetle-caused tree mortality, though broad-scale LTS-based maps are rarely informed by detailed field validation. Here we used spatial and temporal information from LTS products, in combination with extensive field data and Random Forest (RF) models, to develop 30-m maps of the presence (i.e., any occurrence) and severity (i.e., cumulative percent basal area mortality) of beetle-caused tree mortality 1997–2019 in subalpine forests throughout the Southern Rocky Mountains, USA. Using resultant maps, we also quantified spatial patterns of cumulative tree mortality throughout the region, an important yet poorly understood concept in beetle-affected forests. RF models using LTS products to predict presence and severity performed well, with 80.3% correctly classified (Kappa = 0.61) and R² = 0.68 (RMSE = 17.3), respectively. We found that ≥10,256 km² of subalpine forest area (39.5% of the study area) was affected by bark beetles and 19.3% of the study area experienced ≥70% tree mortality over the twenty-three-year period. Variograms indicated that severity was autocorrelated at scales < 250 km. Interestingly, cumulative patch-size distributions showed that areas with a near-total loss of the overstory canopy (i.e., ≥90% mortality) were relatively small (< 0.24 km²) and isolated throughout the study area.
    Our findings help to inform an understanding of the variable effects of bark beetle outbreaks across complex forested regions and provide insight into patterns of disturbance legacies, landscape connectivity, and susceptibility to future disturbance
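    The accuracy figures quoted in this abstract (percent correctly classified, Kappa, R², RMSE) follow standard definitions, which the sketch below spells out in plain Python. The function names and the toy inputs are hypothetical and not taken from the study, and R² is computed here as the coefficient of determination (1 - SS_res / SS_tot), which is an assumption about the form the authors report.

```python
import math
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    # Undefined (division by zero) when expected == 1, i.e. a single class.
    return (observed - expected) / (1 - expected)


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy presence/absence labels (1 = mortality present), purely illustrative.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
# Toy severity values in percent, purely illustrative.
error = rmse([10.0, 20.0, 30.0], [10.0, 20.0, 50.0])
fit = r_squared([10.0, 20.0, 30.0], [10.0, 20.0, 30.0])
```

    Kappa is reported alongside raw accuracy because it discounts the agreement that would be expected from the class frequencies alone.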

    Coupling spectral imaging and laboratory analyses to digitally map sediment parameters and stratigraphic layers in Yeha, Ethiopia

    Quantitative analyses of soil and sediment samples are often used to complement stratigraphic interpretations in archaeological and geoscientific research. The outcome of such analyses is often confined to small parts of the examined profiles, as only a limited number of samples can be extracted and processed. Recent laboratory studies show that such selectively measured soil and sediment characteristics can be spatially extrapolated using spectral image data, resulting in reliable maps of a variety of parameters. However, on-site usage of this method has not been examined. We therefore explore whether image data (RGB data and visible and near infrared hyperspectral data), acquired under regular fieldwork conditions during an archaeological excavation, in combination with a sampling strategy that is close to common practice, can be used to produce maps of soil organic matter, hematite, calcite, several weathering indices and grain size characteristics throughout complex archaeological profiles. We examine two profiles from an archaeological trench in Yeha (Tigray, Ethiopia). Our findings show a promising performance of RGB data and its derivative CIELAB as well as hyperspectral data for the prediction of parameters via random forest regression. By including two individual profiles we are able to assess the accuracy and reproducibility of our results, and illustrate the advantages and drawbacks of a higher spectral resolution and the necessary additional effort during fieldwork. The produced maps of the parameters examined allow us to critically reflect on the stratigraphic interpretation and offer a more objective basis for layer delineation in general. Our study therefore promotes more transparent and reproducible documentation for often destructive archaeological fieldwork

    Risk factors for antimicrobial use in Dutch pig farms: A cross-sectional study

    BACKGROUND: Antimicrobial use (AMU) has decreased significantly in Dutch pig farms since 2009. However, this decrease has stagnated recently, with relatively high AMU levels persisting mainly among weaners. The aim of this study was to identify farm-level characteristics associated with: i) total AMU and ii) use of specific antimicrobial classes. METHODS: In 2020, cross-sectional data from 154 Dutch pig farms were collected, including information on AMU and farm characteristics. A mixed-effects conditional Random Forest analysis was applied to select the subset of features that was best associated with AMU. RESULTS: The main risk factors for total AMU in weaners were vaccination for PRRS in sucklings, being a conventional farm (vs. not), high within-farm density, and early weaning. The main protective factors for total AMU in sows/sucklings were E. coli vaccination in sows and having boars for estrus detection from own production. Regarding antimicrobial class-specific outcomes, several risk factors overlapped for weaners and sows/sucklings, such as farmer's non-tertiary education, not having free-sow systems during lactation, and conventional farming. An additional risk factor for weaners was having fully slatted floors. For fatteners, the main risk factor for total AMU was PRRS vaccination in sucklings. CONCLUSIONS: Several factors were found to be associated with AMU. Some were known but others were novel, such as farmer's tertiary education, low pig aggression and free-sow systems, which were all associated with lower AMU. These factors provide targets for developing tailor-made interventions, as well as an evidence-based selection of features for further causal assessment and mediation analysis

    Deep Multi Temporal Scale Networks for Human Motion Analysis

    The movement of human beings appears to respond to a complex motor system that contains signals at different hierarchical levels. For example, an action such as "grasping a glass on a table" represents a high-level action, but to perform this task, the body needs several motor inputs that include the activation of different joints of the body (shoulder, arm, hand, fingers, etc.). Each of these different joints/muscles has a different size, responsiveness, and precision with a complex non-linearly stratified temporal dimension where every muscle has its temporal scale. Parts such as the fingers respond much faster to brain input than more voluminous body parts such as the shoulder. The cooperation we have when we perform an action produces smooth, effective, and expressive movement in a complex multiple temporal scale cognitive task. Following this layered structure, the human body can be described as a kinematic tree, consisting of connected joints. Although it is nowadays well known that human movement and its perception are characterised by multiple temporal scales, very few works in the literature are focused on studying this particular property. In this thesis, we will focus on the analysis of human movement using data-driven techniques. In particular, we will focus on the non-verbal aspects of human movement, with an emphasis on full-body movements. The data-driven methods can interpret the information in the data by searching for rules, associations or patterns that can represent the relationships between input (e.g. the human action acquired with sensors) and output (e.g. the type of action performed). Furthermore, these models may represent a new research frontier as they can analyse large masses of data and focus on aspects that even an expert user might miss. The literature on data-driven models proposes two families of methods that can process time series and human movement.
    The first family, called shallow models, extracts features from the time series that can help the learning algorithm find associations in the data. These features are identified and designed by domain experts who can identify the best ones for the problem faced. On the other hand, the second family avoids this phase of extraction by the human expert since the models themselves can identify the best set of features to optimise the learning of the model. In this thesis, we will provide a method that can apply the multi-temporal scales property of the human motion domain to deep learning models, the only data-driven models that can be extended to handle this property. We will ask ourselves two questions: what happens if we apply knowledge about how human movements are performed to deep learning models? Can this knowledge improve current automatic recognition standards? In order to prove the validity of our study, we collected data and tested our hypothesis in specially designed experiments. Results support both the proposal and the need for the use of deep multi-scale models as a tool to better understand human movement and its multiple time-scale nature

    Data-Driven and Hybrid Methods for Naval Applications

    The goal of this PhD thesis is to study, design and develop data analysis methods for naval applications. Data analysis is improving our ways to understand complex phenomena by profitably taking advantage of the information lying behind a collection of data. In fact, by adopting algorithms coming from the world of statistics and machine learning it is possible to extract valuable information, without requiring specific domain knowledge of the system generating the data. The application of such methods to marine contexts opens new research scenarios, since typical naval problems can now be solved with higher accuracy rates with respect to more classical techniques, based on the physical equations governing the naval system. During this study, some major naval problems have been addressed adopting state-of-the-art and novel data analysis techniques: condition-based maintenance, consisting of asset monitoring, maintenance planning, and real-time anomaly detection; energy and consumption monitoring, in order to reduce vessel consumption and gas emissions; system safety for maneuvering control and collision avoidance; components design, in order to detect possible defects at design stage. A review of the state-of-the-art of data analysis and machine learning techniques together with the preliminary results of the application of such methods to the aforementioned problems show a growing interest in these research topics and that effective data-driven solutions can be applied to the naval context. Moreover, for some applications, data-driven models have been used in conjunction with domain-dependent methods, modelling physical phenomena, in order to exploit both mechanistic knowledge of the system and available measurements.
    These hybrid methods are shown to provide more accurate and interpretable results than either the purely physical or the purely data-driven approaches taken individually, thus showing that in the naval context it is possible to offer new valuable methodologies by either providing novel statistical methods or improving the state-of-the-art ones