Effective sampling for large-scale automated writing evaluation systems
Automated writing evaluation (AWE) has been shown to be an effective
mechanism for quickly providing feedback to students. It has already seen wide
adoption in enterprise-scale applications and is starting to be adopted in
large-scale contexts. Training an AWE model has historically required a single
batch of several hundred writing examples and human scores for each of them.
This requirement limits large-scale adoption of AWE since human-scoring essays
is costly. Here we evaluate algorithms for ensuring that AWE models are
consistently trained using the most informative essays. Our results show how to
minimize training set sizes while maximizing predictive performance, thereby
reducing cost without unduly sacrificing accuracy. We conclude with a
discussion of how to integrate this approach into large-scale AWE systems.
Automated Crowdturfing Attacks and Defenses in Online Review Systems
Malicious crowdsourcing forums are gaining traction as sources of spreading
misinformation online, but are limited by the costs of hiring and managing
human workers. In this paper, we identify a new class of attacks that leverage
deep learning language models (Recurrent Neural Networks or RNNs) to automate
the generation of fake online reviews for products and services. Not only are
these attacks cheap and therefore more scalable, but they can control the rate of
content output to eliminate the signature burstiness that makes crowdsourced
campaigns easy to detect.
Using Yelp reviews as an example platform, we show how a two-phased review
generation and customization attack can produce reviews that are
indistinguishable by state-of-the-art statistical detectors. We conduct a
survey-based user study to show these reviews not only evade human detection,
but also score high on "usefulness" metrics by users. Finally, we develop novel
automated defenses against these attacks, by leveraging the lossy
transformation introduced by the RNN training and generation cycle. We consider
countermeasures against our mechanisms, show that they produce unattractive
cost-benefit tradeoffs for attackers, and that they can be further curtailed by
simple constraints imposed by online service providers.
Data assimilation of in situ soil moisture measurements in hydrological models: first annual doctoral progress report, work plan and achievements
Water scarcity and the presence of water of good quality are a serious public concern, since they determine the availability of water to society. Water scarcity, especially in arid climates and due to extreme droughts related to climate change, drives water use technologies such as irrigation to become more efficient and sustainable. Plant root water and nutrient uptake is one of the most important processes in subsurface unsaturated flow and transport modeling, as root uptake controls actual plant evapotranspiration, water recharge and nutrient leaching to the groundwater, and exerts a major influence on predictions of global climate models. To improve irrigation strategies, water flow needs to be accurately described using advanced monitoring and modeling. Our study focuses on the assimilation of hydrological data in hydrological models that predict water flow, solute (pollutants and salts) transport and water redistribution in agricultural soils under irrigation. Field plots of a potato farmer in a sandy region in Belgium were instrumented to continuously monitor soil moisture and water potential before, during and after irrigation in dry summer periods. The aim is to optimize the irrigation process by assimilating online sensor field data into process-based models.
Over the past year, we demonstrated the calibration and optimization of the Hydrus 1D model for an irrigated grassland on sandy soil. Direct and inverse calibration and optimization were applied for both heterogeneous and homogeneous conceptualizations. Results show that Hydrus 1D closely simulated soil water content at five depths, as compared to water content measurements from soil moisture probes, through stepwise calibration, local sensitivity analysis and optimization of the Ks, n and α values. The errors of the model, expressed as deviations between observed and modeled soil water content, were, however, different for each individual depth. The smallest differences between observed and simulated soil water content were attained when using an automated inverse optimization method. The choice of the initial parameter values can be optimized using a stepwise approach. Our results show that statistical evaluation coefficients (R2, Ce and RMSE) are suitable benchmarks to evaluate the performance of the model in reproducing the data. The degree of water stress simulated with Hydrus 1D suggested increasing irrigation at least once, i.e. at the beginning of the simulation period, and distributing the irrigation amount further over the growing season, instead of applying a large amount of irrigation later in the season.
In the next year, we will look for the best method (using soft data and methods, for instance PTFs, EMI and penetrometer measurements) to derive and predict the spatial variability of soil hydraulic properties (saturated hydraulic conductivity) and link it to crop yield at the field scale. Linear and non-linear pedotransfer functions (PTFs) have been assessed to predict penetrometer resistance of soils from their water status (matric potential, ψ, and degree of saturation, S), bulk density, ρb, and some other soil properties such as sand content and Ks. The geophysical EMI (electromagnetic induction) technique provides a versatile and robust field instrument for determining apparent soil electrical conductivity (ECa). ECa, a quick and reliable measurement, is one of the ancillary properties (secondary information) of soil and can improve the spatial and temporal estimation of soil characteristics, e.g. salinity, water content, texture, porosity and bulk density, at different scales and depths. Following previous literature on penetrometer measurements, we determined the effective stress and used several models to find the relationships between soil properties, especially Ks, and penetrometer resistance, as one of the prediction methods for Ks. The initial results obtained in the first year showed that a new data set would be necessary to validate the results of this part.
In the third year, quasi-3D modelling of water flow at the field scale will be conducted. In this modeling set-up, the field will be modeled as a collection of 1D columns representing the different field conditions (combination of soil properties, groundwater depth, root zone depth). The measured soil properties are extrapolated over the entire field by linking them to the available spatially distributed data (such as the EMI images). The data set of predicted Ks and other soil properties for the whole field constructed in the previous steps will be used for parameterising the model. Sensitivity analysis (SA) is essential to the model optimization or parametrization process. To avoid overparameterization, the use of global sensitivity analysis will be investigated. In order to include multiple objectives (irrigation management parameters, costs, …) in the parameter optimization strategy, multi-objective techniques such as AMALGAM have been introduced. We will investigate multi-objective strategies in the irrigation optimization.
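As an illustration of the evaluation coefficients named above, the following is a minimal sketch in Python. It assumes that R2 is the squared Pearson correlation and that Ce denotes the Nash-Sutcliffe coefficient of efficiency; the abstract does not define either, so both readings are assumptions. The function names and the soil water content values are hypothetical and are not taken from the report.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nash_sutcliffe(observed, simulated):
    """Coefficient of efficiency (assumed here to be what 'Ce' refers to).

    1 means a perfect fit; 0 means the model is no better than the
    mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Squared Pearson correlation (assumed here to be what 'R2' refers to)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


if __name__ == "__main__":
    # Hypothetical volumetric soil water contents (cm3/cm3) at one depth.
    observed = [0.12, 0.15, 0.21, 0.18, 0.14]
    simulated = [0.13, 0.14, 0.20, 0.19, 0.15]
    print("RMSE:", rmse(observed, simulated))
    print("Ce:  ", nash_sutcliffe(observed, simulated))
    print("R2:  ", r_squared(observed, simulated))
```

Computing the three coefficients separately for each depth would show the depth-dependent model errors that the report describes.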
The Hierarchic treatment of marine ecological information from spatial networks of benthic platforms
Measuring biodiversity simultaneously in different locations, at different temporal scales, and over wide spatial scales is of strategic importance for the improvement of our understanding of the functioning of marine ecosystems and for the conservation of their biodiversity. Monitoring networks of cabled observatories, along with other docked autonomous systems (e.g., Remotely Operated Vehicles [ROVs], Autonomous Underwater Vehicles [AUVs], and crawlers), are being conceived and established at a spatial scale capable of tracking energy fluxes across benthic and pelagic compartments, as well as across geographic ecotones. At the same time, optoacoustic imaging is sustaining an unprecedented expansion in marine ecological monitoring, enabling the acquisition of new biological and environmental data at an appropriate spatiotemporal scale. At this stage, one of the main problems for an effective application of these technologies is the processing, storage, and treatment of the acquired complex ecological information. Here, we provide a conceptual overview on the technological developments in the multiparametric generation, storage, and automated hierarchic treatment of biological and environmental information required to capture the spatiotemporal complexity of a marine ecosystem. In doing so, we present a pipeline of ecological data acquisition and processing in different steps and prone to automation. We also give an example of population biomass, community richness and biodiversity data computation (as indicators for ecosystem functionality) with an Internet Operated Vehicle (a mobile crawler). Finally, we discuss the software requirements for that automated data processing at the level of cyber-infrastructures with sensor calibration and control, data banking, and ingestion into large data portals. Peer reviewed. Postprint (published version).
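The richness and biodiversity computation mentioned above can be sketched roughly as follows. This is only an illustration: the abstract does not say which biodiversity index is used, so the Shannon index here is an assumption, and the taxon names and counts are hypothetical rather than data from the paper.

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts.values() if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over observed taxa.

    The choice of this index is an assumption made for illustration.
    """
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )


if __name__ == "__main__":
    # Hypothetical counts of individuals per taxon from one crawler transect.
    transect = {"taxon_a": 12, "taxon_b": 7, "taxon_c": 1, "taxon_d": 0}
    print("Richness:", richness(transect))
    print("Shannon H':", shannon_index(transect))
```

In an automated pipeline of the kind described, the counts would come from an image-classification step rather than being entered by hand.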
Psychometrics in Practice at RCEC
A broad range of topics is dealt with in this volume: from combining the psychometric generalizability and item response theories to the ideas for an integrated formative use of data-driven decision making, assessment for learning and diagnostic testing. A number of chapters pay attention to computerized (adaptive) and classification testing. Other chapters treat the quality of testing in a general sense, but for topics like maintaining standards or the testing of writing ability, the quality of testing is dealt with more specifically.
All authors are connected to RCEC as researchers. They present one of their current research topics and provide some insight into the focus of RCEC. The topics were selected and edited so that the book is of special interest to educational researchers, psychometricians and practitioners in educational assessment.
Interactive retrieval of video using pre-computed shot-shot similarities
A probabilistic framework for content-based interactive video retrieval is described. The developed indexing of video fragments originates from the probability of the user's positive judgment about key-frames of video shots. Initial estimates of the probabilities are obtained from low-level feature representation. Only statistically significant estimates are retained; the rest are replaced by an appropriate constant, which allows efficient access at search time without loss of search quality and leads to improvement in most experiments. With time, these probability estimates are updated from the relevance judgments of users performing searches, resulting in further substantial increases in mean average precision.
Exploiting low-cost 3D imagery for the purposes of detecting and analyzing pavement distresses
Road pavement conditions have significant impacts on safety, travel times, costs, and environmental effects. It is the responsibility of road agencies to ensure these conditions are kept in an acceptable state. To this end, agencies are tasked with implementing pavement management systems (PMSs) which effectively allocate resources towards maintenance and rehabilitation. These systems, however, require accurate data. Currently, most agencies rely on manual distress surveys and as a result, there is significant research into quick and low-cost pavement distress identification methods. Recent proposals have included the use of structure-from-motion techniques based on datasets from unmanned aerial vehicles (UAVs) and cameras, producing accurate 3D models and associated point clouds. The challenge with these datasets is then identifying and describing distresses. This paper focuses on utilizing images of pavement distresses in the city of Palermo, Italy, produced by mobile phone cameras. The work aims at assessing the accuracy of using mobile phones for these surveys. It also aims at identifying strategies to segment the generated 3D imagery, considering 3D image segmentation algorithms that detect shapes from point clouds to enable measurement of physical parameters and severity assessment. Case studies are considered for pavement distresses defined by the measurement of the area affected, such as different types of cracking and depressions. The use of mobile phones and the identification of these patterns on the 3D models provide further steps towards low-cost data acquisition and analysis for a PMS.