Machine Learning Methods with Noisy, Incomplete or Small Datasets
In this article, we present a collection of fifteen novel contributions on machine learning methods with low-quality or imperfect datasets, which were accepted for publication in the special issue “Machine Learning Methods with Noisy, Incomplete or Small Datasets”, Applied Sciences (ISSN 2076-3417). These papers provide a variety of novel approaches to real-world machine learning problems where available datasets suffer from imperfections such as missing values, noise or artefacts. Contributions in applied sciences include medical applications, epidemic management tools, methodological work, and industrial applications, among others. We believe that this special issue will bring new ideas for solving this challenging problem, and will provide clear examples of application in real-world scenarios. Affiliation: Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina. Affiliation: Zhe, Sun. Lab. Adaptive Intelligence - Riken; Japan. Affiliation: Tanaka, Toshihisa. Tokyo University of Agriculture and Technology; Japan. Affiliation: Marti Puig, Pere. University of Vic; Spain. Affiliation: Solé Casals, Jordi. University of Vic; Spain
From progesterone in biopsies to estimates of pregnancy rates: Large scale reproductive patterns of two sympatric species of common dolphin, Delphinus spp. off California, USA and Baja, Mexico
Blubber progesterone levels were measured in biopsy samples and used to predict the pregnancy status of 507 female common dolphins (204 long-beaked common dolphins, Delphinus capensis, and 303 short-beaked common dolphins, D. delphis). Samples were collected in the coastal waters of the eastern North Pacific between central California, USA and the southern end of Baja California, Mexico. The percentage of females pregnant was similar between the two species: 22.1% (n = 45) of D. capensis and 28.1% (n = 85) of D. delphis. For both species we found strong geographic patterns in pregnancy, suggesting that some areas were more conducive for pregnant females. A sizable drop in percent pregnant from early (38.8%, n = 133) to late (25.3%, n = 91) autumn was found in D. delphis but not in D. capensis. The potential for sample selectivity was examined via biopsies collected either from a large research ship or from a small, rigid-hull inflatable boat (RHIB) launched from the larger ship. An analysis of “Tandem Biopsy Sampling”, replicate biopsy effort on the same schools from each vessel/platform, yielded little evidence that disproportionately more pregnant female common dolphins were biopsied from one platform versus the other. This result plus an analysis of pregnancy status relative to the duration of biopsy operations failed to uncover strong evidence of unaccounted sampling bias with respect to pregnancy state. In total, these results demonstrate the utility of blubber progesterone concentrations to assess pregnancy status in free-ranging cetaceans and they highlight potential factors associated with population-level variation in dolphin pregnancy rates
The changing ecology of the Old Crow Flats wetlands
"This work grew from concern expressed principally by elders of the Vuntut Gwitchin First Nation, that the wetlands of the Crow Flats upon which generations have depended, are showing distressing changes. The thought was that, remembering several citizens were involved in wetland research about 40 years ago, a new but similar effort could document and perhaps explain those changes." -- from pg. 2
Modeling Susceptibility of Forests to Hurricane Damage Based on Forest Ownership, Age, and Type
This study examined the severity of wind damage caused by Hurricane Katrina in southeast Mississippi to determine how the disturbance was influenced by fragmentation across different forest ownership groups (non-corporate private forest, corporate private forest, and public forest). MODIS-NDVI percent change products were coupled with ownership, rainfall, and Landsat-based thematic maps depicting forest age and forest type, using GIS techniques, to examine potential factors contributing to damage in the study area. Multiple linear and binary logistic regression methods were used to explain the relationship between severity of damage and forest age, forest type, ownership, and rainfall. Results indicate that the NDVI percent change had a negative relationship with forest age diversity and a positive relationship with forest type diversity and rainfall. There was no clear, consistent relationship between NDVI percent change and forest ownership
Mapping Coral Reef Habitats in Southeast Florida Using a Combined Technique Approach
To create maps of nearshore benthic habitats of Broward County, Florida, from 0 to 35 m depth, we combined laser bathymetry, acoustic ground discrimination, subbottom profiling, and aerial photography data in a geographic information system (GIS). A mosaic of interpolated, sun-shaded, laser bathymetry data served as the foundation upon which acoustic ground discrimination, limited subbottom profiling and aerial photography, and groundtruthing data aided in interpretation of habitats. Mapping criteria similar to NOAA biogeographic Caribbean mapping were used to allow for a comparable output. Expert-driven visual interpretation outlined geomorphological features at a scale of 1 : 6000 with a minimum mapping unit of 1 acre. Acoustic data were then used to differentiate areas of similar geomorphology by their acoustic diversity into areas of high and low scatter, which could be equated to rugosity created by either the substratum or benthic fauna. Of the approximately 112 km² mapped, 56.62 km² were coral reef and colonized hard bottom (50.42%), 54.78 km² were unconsolidated sediments (46.80%), and 0.43 km² were other categories (2.78%). Three linear reef complexes exist. The outermost linear reef has a mature windward reef morphology including a drowned spur and groove system, which was absent on the other two reef lines. The acoustic ground discrimination and groundtruthing showed different benthic habitats on the outer vs. middle and inner reefs. Higher acoustic scatter could be related to taller benthos and more rugose substratum. A considerable amount of colonized pavement (nearshore hard grounds) was found inshore. The map of Broward County yielded a high overall accuracy of 89.6%, only slightly less than the photo-interpreted NOAA Caribbean maps (overall accuracy of 91.1%). User and producer accuracies within each category were also similar. 
The combined technique approach was effective and accurate, and similar methodology can be used in other areas where photo interpretation is not feasible because of turbidity or depth limitations
Machine Learning Methods with Noisy, Incomplete or Small Datasets
In many machine learning applications, available datasets are sometimes incomplete, noisy or affected by artifacts. In supervised scenarios, it could happen that label information has low quality, which might include unbalanced training sets, noisy labels and other problems. Moreover, in practice, it is very common that available data samples are not enough to derive useful supervised or unsupervised classifiers. All these issues are commonly referred to as the low-quality data problem. This book collects novel contributions on machine learning methods for low-quality datasets, to contribute to the dissemination of new ideas to solve this challenging problem, and to provide clear examples of application in real scenarios
Beyond taxonomic identification: integration of ecological responses to a soil bacterial 16S rRNA gene database
High-throughput sequencing 16S rRNA gene surveys have enabled new insights into the diversity of soil bacteria, and furthered understanding of the ecological drivers of abundances across landscapes. However, current analytical approaches are of limited use in formalizing syntheses of the ecological attributes of taxa discovered, because derived taxonomic units are typically unique to individual studies and sequence identification databases only characterize taxonomy. To address this, we used sequences obtained from a large nationwide soil survey (GB Countryside Survey, henceforth CS) to create a comprehensive soil specific 16S reference database, with coupled ecological information derived from survey metadata. Specifically, we modeled taxon responses to soil pH at the OTU level using hierarchical logistic regression (HOF) models, to provide information on both the shape of landscape scale pH-abundance responses, and pH optima (pH at which OTU abundance is maximal). We identify that most of the soil OTUs examined exhibited a non-flat relationship with soil pH. Further, the pH optima could not be generalized by broad taxonomy, highlighting the need for tools and databases synthesizing ecological traits at finer taxonomic resolution. We further demonstrate the utility of the database by testing against geographically dispersed query 16S datasets; evaluating efficacy by quantifying matches, and accuracy in predicting pH responses of query sequences from a separate large soil survey. We found that the CS database provided good coverage of dominant taxa; and that the taxa indicating soil pH in a query dataset corresponded with the pH classifications of top matches in the CS database. Furthermore we were able to predict query dataset community structure, using predicted abundances of dominant taxa based on query soil pH data and the HOF models of matched CS database taxa. 
The database with associated HOF model outputs is released as an online portal for querying single sequences of interest (https://shiny-apps.ceh.ac.uk/ID-TaxER/), and flat files are made available for use in bioinformatic pipelines. The further development of advanced informatics infrastructures incorporating modeled ecological attributes along with new functional genomic information will likely facilitate large scale exploration and prediction of soil microbial functional biodiversity under current and future environmental change scenarios