Ensuring Access to Safe and Nutritious Food for All Through the Transformation of Food Systems
An exploration of the language within Ofsted reports and their influence on primary school performance in mathematics: a mixed methods critical discourse analysis
This thesis contributes to the understanding of the language of Ofsted reports, their similarity to one another, and associations between different terms used within ‘areas for improvement’ sections and subsequent outcomes for pupils. The research responds to concerns from serving headteachers that Ofsted reports are overly similar, do not capture the unique story of their school, and are unhelpful for improvement. In seeking to answer ‘how similar are Ofsted reports’, the study uses two tools, plagiarism detection software (Turnitin) and a discourse analysis tool (NVivo), to identify trends within and across a large corpus of reports.
The approach is based on critical discourse analysis (Van Dijk, 2009; Fairclough, 1989) but shaped in the form of practitioner enquiry seeking power in the form of impact on pupils and practitioners, rather than a more traditional, sociological application of the method.
The research found that in 2017, primary school section 5 Ofsted reports had more than half of their content exactly duplicated within other primary school inspection reports published that same year. Discourse analysis showed the quality assurance process overrode variables such as inspector designation, gender, or team size, leading to three distinct patterns of duplication: block duplication, self-referencing, and template writing. The most distinctive part of a report was found to be the ‘area for improvement’ section, which was tracked to externally verified outcomes for pupils using terms linked to ‘mathematics’. Schools required to improve mathematics in their areas for improvement improved progress and attainment in mathematics significantly more than national rates. These findings indicate that there was a positive correlation between the inspection reporting process and a beneficial impact on pupil outcomes in mathematics, and that the significant similarity of one report to another had no bearing on the usefulness of the report for school improvement purposes within this corpus.
The Viability and Potential Consequences of IoT-Based Ransomware
With the increased threat of ransomware and the substantial growth of the Internet of Things (IoT) market, there is significant motivation for attackers to carry out IoT-based ransomware campaigns. In this thesis, the viability of such malware is tested.
As part of this work, various techniques that could be used by ransomware developers to attack commercial IoT devices were explored. First, methods that attackers could use to communicate with the victim were examined, such that a ransom note was able to be reliably sent to a victim. Next, the viability of using "bricking" as a method of ransom was evaluated, such that devices could be remotely disabled unless the victim makes a payment to the attacker. Research was then performed to ascertain whether it was possible to remotely gain persistence on IoT devices, which would improve the efficacy of existing ransomware methods, and provide opportunities for more advanced ransomware to be created. Finally, after successfully identifying a number of persistence techniques, the viability of privacy-invasion based ransomware was analysed.
For each assessed technique, proofs of concept were developed. A range of devices -- with various intended purposes, such as routers, cameras and phones -- were used to test the viability of these proofs of concept. To test communication hijacking, devices' "channels of communication" -- such as web services and embedded screens -- were identified, then hijacked to display custom ransom notes. During the analysis of bricking-based ransomware, a working proof of concept was created, which was then able to remotely brick five IoT devices. After analysing the storage design of an assortment of IoT devices, six different persistence techniques were identified, which were then successfully tested on four devices, such that malicious filesystem modifications would be retained after the device was rebooted. When researching privacy-invasion based ransomware, several methods were created to extract information from data sources that can be commonly found on IoT devices, such as nearby WiFi signals, images from cameras, or audio from microphones. These were successfully implemented in a test environment such that ransomable data could be extracted, processed, and stored for later use to blackmail the victim.
Overall, IoT-based ransomware has not only been shown to be viable but also highly damaging to both IoT devices and their users. While the use of IoT-ransomware is still very uncommon "in the wild", the techniques demonstrated within this work highlight an urgent need to improve the security of IoT devices to avoid the risk of IoT-based ransomware causing havoc in our society. Finally, during the development of these proofs of concept, a number of potential countermeasures were identified, which can be used to limit the effectiveness of the attacking techniques discovered in this PhD research
Qluster: An easy-to-implement generic workflow for robust clustering of health data
The exploration of health data by clustering algorithms makes it possible to better describe populations of interest by identifying the sub-profiles that compose them. This in turn reinforces medical knowledge, whether about a disease or a targeted population in real life. Nevertheless, in contrast to conventional biostatistical methods, for which numerous guidelines exist, the standardization of data science approaches in clinical research remains little discussed. This results in significant variability in the execution of data science projects, whether in terms of the algorithms used or the reliability and credibility of the designed approach. Taking the path of a parsimonious and judicious choice of both algorithms and implementations at each stage, this article proposes Qluster, a practical workflow for performing clustering tasks. This workflow strikes a compromise between (1) genericity of applications (e.g. usable on small or big data, on continuous, categorical or mixed variables, on databases of high dimensionality or not), (2) ease of implementation (need for few packages, few algorithms, few parameters, ...), and (3) robustness (e.g. use of proven algorithms and robust packages, evaluation of the stability of clusters, management of noise and multicollinearity). The workflow can be easily automated and/or routinely applied to a wide range of clustering projects. It can be useful both for data scientists with little experience in the field, to make data clustering easier and more robust, and for more experienced data scientists who are looking for a straightforward and reliable solution for routine preliminary data mining. A synthesis of the literature on data clustering and the scientific rationale supporting the proposed workflow are also provided. Finally, a detailed application of the workflow to a concrete use case is provided, along with a practical discussion for data scientists.
An implementation on the Dataiku platform is available upon request from the authors.
Wildlife trade in Latin America: people, economy and conservation
Wildlife trade is among the main threats to biodiversity conservation and may pose a risk to human health because of the spread of zoonotic diseases. To avoid social, economic and environmental consequences of illegal trade, it is crucial to understand the factors influencing the wildlife market and the effectiveness of policies already in place. I aim to unveil the biological and socioeconomic factors driving wildlife trade, the health risks posed by the activity, and the effectiveness of certified captive-breeding as a strategy to curb the illegal market in Latin America through a multidisciplinary approach. I assess socioeconomic correlates of the emerging international trade in wild cat species from Latin America using a dataset of >1,000 seized cats, showing that high levels of corruption and Chinese private investment and low income per capita were related to higher numbers of jaguar seizures. I assess the effectiveness of primate captive-breeding programmes as an intervention to curb wildlife trafficking. Illegal sources held >70% of the primate market share. Legal primates are more expensive, and production is not sufficiently high to fulfil the demand. I assess the scale of the illegal trade and ownership of venomous snakes in Brazil. Venomous snake taxa responsible for higher numbers of snakebites were those most often kept as pets. I uncover how online wildlife pet traders and consumers responded to campaigns associating the origin of the COVID-19 pandemic with the wildlife trade. Of 20,000 posts on Facebook groups, only 0.44% mentioned COVID-19 and several stimulated the trade in wild species during lockdown. Despite the existence of international and national wildlife trade regulations, I conclude that illegal wildlife trade is still an issue that needs further addressing in Latin America. I identify knowledge gaps and candidate interventions to close the current loopholes and reduce wildlife trafficking.
My aspiration with this thesis is to provide useful information that can inform better strategies to tackle illegal wildlife trade in Latin America.
Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond
This thesis is framed at the intersection between modern Machine Learning techniques, such as Deep Neural Networks, and reliable probabilistic modeling. In many machine learning applications, we care not only about the prediction made by a model (e.g. this lung image presents cancer) but also about how confident the model is in making this prediction (e.g. this lung image presents cancer with 67% probability). In such applications, the model assists the decision-maker (in this case a doctor) in making the final decision. As a consequence, the probabilities provided by a model must reflect the real proportions present in the set to which those probabilities have been assigned; otherwise the model is useless in practice. When they do, we say that a model is perfectly calibrated.
In this thesis, three ways of providing better-calibrated models are explored. First, it is shown how to implicitly calibrate models that have been decalibrated by data augmentation techniques. A cost function is introduced that resolves this decalibration, taking as a starting point ideas derived from decision making with Bayes' rule. Second, it is shown how to calibrate models using a post-calibration stage implemented with a Bayesian neural network. Finally, based on the limitations studied in the Bayesian neural network, which we hypothesize stem from a misspecified prior, a new stochastic process is introduced that serves as a prior distribution in a Bayesian inference problem.
Maroñas Molano, J. (2022). Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/181582
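The notion of calibration described in this abstract can be made concrete with a small sketch. The code below computes the expected calibration error (ECE), a commonly used summary of the gap between stated confidence and observed accuracy. It is an illustration only and is not taken from the thesis; the function name, the equal-width binning, and the toy inputs are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over confidence bins.

    confidences: predicted probabilities in [0, 1] for the chosen class.
    correct: booleans, True where the prediction turned out to be right.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(accuracy - mean_conf)
    return ece


# Hypothetical data: predictions made with 75% confidence that are right
# 3 times out of 4 are calibrated on this sample.
calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])

# Predictions made with 90% confidence that are right only 6 times out of 10
# are overconfident, so the error is clearly above zero.
overconfident = expected_calibration_error([0.9] * 10, [True] * 6 + [False] * 4)
```

A value near zero means the stated probabilities match the observed proportions on the evaluated sample; it says nothing about how the model was calibrated.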
Image classification over unknown and anomalous domains
A longstanding goal in computer vision research is to develop methods that are simultaneously applicable to a broad range of prediction problems. In contrast to this, models often perform best when they are specialized to some task or data type. This thesis investigates the challenges of learning models that generalize well over multiple unknown or anomalous modes and domains in data, and presents new solutions for learning robustly in this setting.
Initial investigations focus on normalization for distributions that contain multiple sources (e.g. images in different styles like cartoons or photos). Experiments demonstrate the extent to which existing modules, batch normalization in particular, struggle with such heterogeneous data, and a new solution is proposed that can better handle data from multiple visual modes, using differing sample statistics for each.
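To illustrate why pooled normalization statistics can struggle with heterogeneous data, the sketch below compares normalizing a feature with one set of statistics against normalizing it separately per visual mode. This is not the module proposed in the thesis, which the abstract does not specify; the function names, the toy values, and the assumption that mode labels are known are all hypothetical.

```python
from statistics import fmean, pvariance


def normalize(values, eps=1e-5):
    """Standardize values with their own mean and (population) variance."""
    mean = fmean(values)
    var = pvariance(values, mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def normalize_per_mode(values, modes, eps=1e-5):
    """Standardize each value using the statistics of its own mode only."""
    indices_by_mode = {}
    for i, mode in enumerate(modes):
        indices_by_mode.setdefault(mode, []).append(i)

    out = [0.0] * len(values)
    for indices in indices_by_mode.values():
        normed = normalize([values[i] for i in indices], eps)
        for i, z in zip(indices, normed):
            out[i] = z
    return out


# Hypothetical feature values from two modes with very different ranges.
values = [0.0, 2.0, 100.0, 102.0]
modes = ["photo", "photo", "cartoon", "cartoon"]

pooled = normalize(values)                    # one mean and variance for all
per_mode = normalize_per_mode(values, modes)  # separate statistics per mode
```

With pooled statistics the gap between the two modes dominates the variance, so the difference between samples within the same mode nearly vanishes. With per-mode statistics each mode is rescaled on its own terms.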
While ideas to counter the overspecialization of models have been formulated in sub-disciplines of transfer learning, e.g. multi-domain and multi-task learning, these usually rely on the existence of meta information, such as task or domain labels. Relaxing this assumption gives rise to a new transfer learning setting, called latent domain learning in this thesis, in which training and inference are carried out over data from multiple visual domains, without domain-level annotations. Customized solutions are required for this, as the performance of standard models degrades: a new data augmentation technique that interpolates between latent domains in an unsupervised way is presented, alongside a dedicated module that sparsely accounts for hidden domains in data, without requiring domain labels to do so.
In addition, the thesis studies the problem of classifying previously unseen or anomalous modes in data, a fundamental problem in one-class learning, and anomaly detection in particular. While recent ideas have been focused on developing self-supervised solutions for the one-class setting, in this thesis new methods based on transfer learning are formulated. Extensive experimental evidence demonstrates that a transfer-based perspective benefits new problems that have recently been proposed in anomaly detection literature, in particular challenging semantic detection tasks
A productive response to legacy system petrification
Requirements change. The requirements of a legacy information system change, often in unanticipated ways, and at a more rapid pace than the rate at which the information system itself can be evolved to support them. The capabilities of a legacy system progressively fall further and further behind their evolving requirements, in a degrading process termed petrification. As systems petrify, they deliver diminishing business value, hamper business effectiveness, and drain organisational resources. To address legacy systems, the first challenge is to understand how to shed their resistance to tracking requirements change. The second challenge is to ensure that a newly adaptable system never again petrifies into a change resistant legacy system. This thesis addresses both challenges. The approach outlined herein is underpinned by an agile migration process - termed Productive Migration - that homes in upon the specific causes of petrification within each particular legacy system and provides guidance upon how to address them. That guidance comes in part from a personalised catalogue of petrifying patterns, which capture recurring themes underlying petrification. These steer us to the problems actually present in a given legacy system, and lead us to suitable antidote productive patterns via which we can deal with those problems one by one. To prevent newly adaptable systems from again degrading into legacy systems, we appeal to a follow-on process, termed Productive Evolution, which embraces and keeps pace with change rather than resisting and falling behind it. Productive Evolution teaches us to be vigilant against signs of system petrification and helps us to nip them in the bud. The aim is to nurture systems that remain supportive of the business, that are adaptable in step with ongoing requirements change, and that continue to retain their value as significant business assets
AI Fairness at Subgroup Level – A Structured Literature Review
AI applications in practice often fail to gain the required acceptance by stakeholders due to unfairness issues. Research has primarily investigated AI fairness at the individual or group level. However, a growing body of research indicates shortcomings in this two-fold view. In particular, neglecting the heterogeneity within different groups leads to increasing demand for specific fairness consideration at the subgroup level. Subgroups emerge from the conjunction of several protected attributes. An equal distribution of classified individuals between subgroups is the fundamental goal. This paper analyzes the fundamentals of subgroup fairness and its integration into group and individual fairness. Based on a literature review, we analyze the existing concepts of subgroup fairness in research. Our paper raises awareness of this largely neglected topic in IS research and contributes to the understanding of AI subgroup fairness by providing a deeper understanding of the underlying concepts and their implications for AI development and operation in practice.
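A small sketch can show why group-level checks may miss disparities that appear once protected attributes are combined. The code below computes the rate of positive classifications per subgroup, where a subgroup is the conjunction of the chosen protected attributes. The function names, the record layout, and the toy data are assumptions for illustration and do not come from the paper; positive-rate parity is only one possible reading of "equal distribution".

```python
from collections import defaultdict


def positive_rates(records, protected_attrs):
    """Share of positive classifications for each combination of attributes."""
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [positives, total]
    for rec in records:
        key = tuple(rec[attr] for attr in protected_attrs)
        counts[key][0] += int(rec["positive"])
        counts[key][1] += 1
    return {key: pos / tot for key, (pos, tot) in counts.items()}


def max_rate_gap(rates):
    """Largest difference in positive rate between any two (sub)groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical classifier outputs: two protected attributes, two values each.
records = [
    {"gender": "F", "age": "young", "positive": True},
    {"gender": "F", "age": "young", "positive": True},
    {"gender": "F", "age": "old", "positive": False},
    {"gender": "F", "age": "old", "positive": False},
    {"gender": "M", "age": "young", "positive": False},
    {"gender": "M", "age": "young", "positive": False},
    {"gender": "M", "age": "old", "positive": True},
    {"gender": "M", "age": "old", "positive": True},
]

gap_by_gender = max_rate_gap(positive_rates(records, ["gender"]))
gap_by_age = max_rate_gap(positive_rates(records, ["age"]))
gap_by_subgroup = max_rate_gap(positive_rates(records, ["gender", "age"]))
```

In this toy data each single attribute looks balanced (both gaps are zero), while the combined subgroups receive positive classifications at rates of either 0 or 1.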
AIUCD 2022 - Proceedings
The eleventh edition of the National Conference of AIUCD (Associazione di Informatica Umanistica) is titled Culture digitali. Intersezioni: filosofia, arti, media (Digital Cultures. Intersections: Philosophy, Arts, Media). The title explicitly calls for methodological and theoretical reflection on the interrelation between digital technologies, information sciences, philosophical disciplines, the art world, and cultural studies.