    The Viability and Potential Consequences of IoT-Based Ransomware

    With the increased threat of ransomware and the substantial growth of the Internet of Things (IoT) market, there is significant motivation for attackers to carry out IoT-based ransomware campaigns. In this thesis, the viability of such malware is tested. As part of this work, various techniques that could be used by ransomware developers to attack commercial IoT devices were explored. First, methods that attackers could use to communicate with the victim were examined, such that a ransom note was able to be reliably sent to a victim. Next, the viability of using "bricking" as a method of ransom was evaluated, such that devices could be remotely disabled unless the victim makes a payment to the attacker. Research was then performed to ascertain whether it was possible to remotely gain persistence on IoT devices, which would improve the efficacy of existing ransomware methods, and provide opportunities for more advanced ransomware to be created. Finally, after successfully identifying a number of persistence techniques, the viability of privacy-invasion based ransomware was analysed. For each assessed technique, proofs of concept were developed. A range of devices -- with various intended purposes, such as routers, cameras and phones -- were used to test the viability of these proofs of concept. To test communication hijacking, devices' "channels of communication" -- such as web services and embedded screens -- were identified, then hijacked to display custom ransom notes. During the analysis of bricking-based ransomware, a working proof of concept was created, which was then able to remotely brick five IoT devices. After analysing the storage design of an assortment of IoT devices, six different persistence techniques were identified, which were then successfully tested on four devices, such that malicious filesystem modifications would be retained after the device was rebooted. 
When researching privacy-invasion based ransomware, several methods were created to extract information from data sources commonly found on IoT devices, such as nearby WiFi signals, images from cameras, or audio from microphones. These were successfully implemented in a test environment such that ransomable data could be extracted, processed, and stored for later use to blackmail the victim. Overall, IoT-based ransomware has been shown to be not only viable but also highly damaging to both IoT devices and their users. While the use of IoT ransomware is still very uncommon "in the wild", the techniques demonstrated within this work highlight an urgent need to improve the security of IoT devices to avoid the risk of IoT-based ransomware causing havoc in our society. Finally, during the development of these proofs of concept, a number of potential countermeasures were identified, which can be used to limit the effectiveness of the attacking techniques discovered in this PhD research.

    BotMoE: Twitter Bot Detection with Community-Aware Mixtures of Modal-Specific Experts

    Twitter bot detection has become a crucial task in efforts to combat online misinformation, mitigate election interference, and curb malicious propaganda. However, advanced Twitter bots often attempt to mimic the characteristics of genuine users through feature manipulation and disguise themselves to fit in diverse user communities, posing challenges for existing Twitter bot detection models. To this end, we propose BotMoE, a Twitter bot detection framework that jointly utilizes multiple user information modalities (metadata, textual content, network structure) to improve the detection of deceptive bots. Furthermore, BotMoE incorporates a community-aware Mixture-of-Experts (MoE) layer to improve domain generalization and adapt to different Twitter communities. Specifically, BotMoE constructs modal-specific encoders for metadata features, textual content, and graphical structure, which jointly model Twitter users from three modal-specific perspectives. We then employ a community-aware MoE layer to automatically assign users to different communities and leverage the corresponding expert networks. Finally, user representations from metadata, text, and graph perspectives are fused with an expert fusion layer, combining all three modalities while measuring the consistency of user information. Extensive experiments demonstrate that BotMoE significantly advances the state-of-the-art on three Twitter bot detection benchmarks. Studies also confirm that BotMoE captures advanced and evasive bots, alleviates the reliance on training data, and better generalizes to new and previously unseen user communities. Comment: Accepted at SIGIR 202
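The abstract above describes routing user representations through a Mixture-of-Experts layer. As a rough illustration of that general idea only, the following is a minimal sketch of a generic soft MoE layer in plain Python. It is not the BotMoE implementation: the abstract does not specify the gating function or the expert architecture, so the softmax gate, the linear experts, and every name and weight here (`moe_layer`, `gate_weights`, `expert_weights`) are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def moe_layer(x, gate_weights, expert_weights):
    """Generic soft MoE (assumed design): gate-weighted sum of expert outputs."""
    gate = softmax(linear(gate_weights, x))            # one weight per expert
    outputs = [linear(w, x) for w in expert_weights]   # one vector per expert
    dim = len(outputs[0])
    mixed = [sum(g * out[i] for g, out in zip(gate, outputs)) for i in range(dim)]
    return mixed, gate


# Hypothetical toy input: a 2-dimensional user representation and two experts.
x = [1.0, 0.0]
gate_weights = [[2.0, 0.0], [0.0, 2.0]]
expert_weights = [
    [[1.0, 0.0], [0.0, 1.0]],  # expert 0: identity map
    [[0.0, 1.0], [1.0, 0.0]],  # expert 1: swaps the two coordinates
]
mixed, gate = moe_layer(x, gate_weights, expert_weights)
```

In this toy setting the gate favours expert 0 because the input's first coordinate dominates the gate scores. A community-aware variant would presumably learn the gate so that users from the same community share experts, but how BotMoE does this is not stated in the abstract.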

    Breast mass segmentation from mammograms with deep transfer learning

    Abstract. Mammography is an X-ray imaging method used in breast cancer screening, which is a time-consuming process. Many different computer-assisted diagnosis systems have been created to hasten the image analysis. Deep learning is the use of multilayered neural networks for solving different tasks. Deep learning methods are becoming more advanced and popular for segmenting images. One deep transfer learning method is to use these neural networks with pretrained weights, which typically improves the neural networks' performance. In this thesis, deep transfer learning was used to segment cancerous masses from mammography images. The convolutional neural networks used were pretrained and fine-tuned, and they had an encoder-decoder architecture. The ResNet22 encoder was pretrained with mammography images, while the ResNet34 encoder was pretrained with various color images. These encoders were paired with either a U-Net or a Feature Pyramid Network decoder. Additionally, a U-Net model with random initialization was also tested. The five different models were trained and tested on the Oulu Dataset of Screening Mammography (9204 images) and on the Portuguese INbreast dataset (410 images) with two different loss functions: binary cross-entropy loss with soft Jaccard loss, and a loss function based on the focal Tversky index. The best models were trained on the Oulu Dataset of Screening Mammography with the focal Tversky loss. The best segmentation result achieved was a Dice similarity coefficient of 0.816 on correctly segmented masses and a classification accuracy of 88.7% on the INbreast dataset. On the Oulu Dataset of Screening Mammography, the best results were a Dice score of 0.763 and a classification accuracy of 83.3%. The results between the pretrained models were similar, and the pretrained models had better results than the non-pretrained models. 
In conclusion, deep transfer learning is very suitable for mammography mass segmentation and the choice of loss function had a large impact on the results. Finnish title: Rinnan massojen segmentointi mammografiakuvista syvä- ja siirto-oppimista hyödyntäen (Breast mass segmentation from mammography images using deep and transfer learning). Finnish abstract (Tiivistelmä), translated. Mammography is an X-ray imaging method used in breast cancer screening. Screening mammography images is time-consuming, and several computer-aided solutions have been developed to assist in their analysis. Deep learning refers to the use of multilayer neural networks to solve various tasks. Deep learning methods have advanced over time and become popular for image segmentation. One way to combine deep and transfer learning is to use neural networks with pretrained weights, which helps improve the networks' performance. This master's thesis investigated the use of deep and transfer learning for segmenting cancerous masses from mammography images. The convolutional neural networks used were pretrained and fine-tuned. In addition, they had an encoder-decoder architecture. The ResNet22 encoder was pretrained on mammography images, whereas the ResNet34 encoder was pretrained on a wide variety of color images. These encoders were combined with either a U-Net decoder or a feature pyramid network decoder. A U-Net model without pretraining was also used. These five different models were trained and tested on both the Oulu Dataset of Screening Mammography (9204 images) and the Portuguese INbreast dataset (410 images) using two different loss functions: binary cross-entropy combined with the soft Jaccard index, and a loss function based on the focal Tversky index. The best models were trained on the Oulu dataset with the focal Tversky loss. The best results were a Dice coefficient of 0.816 on true-positive segmentations and a classification accuracy of 88.7% on the INbreast dataset. The pretrained models gave better results than the models that were not pretrained. 
On the Oulu dataset, the best results were a Dice coefficient of 0.763 and a classification accuracy of 83.3%. There was no large difference in results between the pretrained models. Based on the results, deep and transfer learning are well suited to segmenting masses from mammography images. In addition, the choice of loss function had a large effect on the results.
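The abstract above reports Dice scores and names a loss based on the focal Tversky index. For readers unfamiliar with these quantities, here is a minimal sketch in plain Python over flattened masks. It is not the thesis code: the weights `alpha` and `beta`, the exponent `gamma`, and the smoothing term `eps` are hypothetical values, since the abstract gives none, and here `alpha` is assumed to weight false negatives. Conventions differ between papers (some write the focal exponent as 1/γ), so treat this as one possible formulation.

```python
def confusion_counts(pred, target):
    """Soft true-positive, false-positive and false-negative counts.

    pred holds probabilities in [0, 1]; target holds 0/1 labels.
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return tp, fp, fn


def dice(pred, target, eps=1e-7):
    """Dice similarity coefficient: 2TP / (2TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    return (2 * tp + eps) / (2 * tp + fp + fn + eps)


def tversky(pred, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Tversky index: TP / (TP + alpha*FN + beta*FP); alpha, beta are assumed."""
    tp, fp, fn = confusion_counts(pred, target)
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)


def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75):
    """Focal Tversky loss as (1 - TI) ** gamma; gamma is a hypothetical value."""
    ti = tversky(pred, target, alpha, beta)
    return max(0.0, 1.0 - ti) ** gamma


# Toy 4-pixel example: one true positive, one false positive, one false negative.
pred = [1.0, 1.0, 0.0, 0.0]
target = [1, 0, 1, 0]
toy_dice = dice(pred, target)
toy_loss = focal_tversky_loss(pred, target)
perfect_loss = focal_tversky_loss([1.0, 0.0], [1, 0])
```

With `alpha = beta = 0.5` the Tversky index reduces to the Dice coefficient, which is why the two are often discussed together. Raising `alpha` above `beta` penalises missed mass pixels more heavily than false alarms.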

    Unmanned Aerial Vehicles (UAVs) to compare foraging sea turtle density and distribution of sea turtles in two contrasting habitats in the Chagos Archipelago

    Unmanned Aerial Vehicles (UAVs) facilitate observation of elusive species or remote locations, and are increasingly used to survey marine habitats. Marine Protected Areas (MPAs) are a conservation tool used to protect marine species, and regular population assessments can establish if MPAs are effectively facilitating the recovery of endangered species. Sea turtles in the Western Indian Ocean have been historically exploited through trade and by-catch, causing a reduction in numbers. Here, UAVs were utilised to assess the population density and distribution of green (Chelonia mydas) and hawksbill (Eretmochelys imbricata) turtles between ocean and lagoon environments in the Chagos Archipelago. Analysis protocols were developed to process UAV imagery, including carapace-measurement techniques, and certainty-classing turtle observations (Definite, Probable or Possible). Along 20 km of coastline, 5.13 km² was surveyed across 11 days between July 2019 and February 2021, resulting in a high-certainty estimate of 381 turtles and a low-certainty estimate of 660. Species and life-stage identification implicate Chagos as developmental habitat for immature hawksbill turtles: 78.47% (n = 299/381) of identified definite turtles were immature, of which 66.55% (n = 199/299) were hawksbill. Diego Garcia Ocean Site 1, West sites and Turtle Cove were significant turtle hotspots (high-certainty results: 257.19 individuals/km², 146.15 individuals/km², and 135.08 individuals/km², respectively), while Marina sites were least dense (0 to 4.87 individuals/km²). Results for low-certainty data were comparable: 325.27 individuals/km² in Diego Garcia Site 1, followed by 309.27 and 292.67 individuals/km² in Turtle Cove. Population density decreased significantly with increasing distance from the shore, and decreased with increasing distance from Turtle Cove. Green turtles were smaller (50.33 ± 17.65 cm straight-carapace length, SCL) than hawksbill turtles (53.16 ± 11.17 cm SCL). 
This study highlights the Chagos Archipelago as developmental habitat for immature turtles, and demonstrates the applicability of UAVs for in-situ population monitoring to infer conservation status of marine megafauna.
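The densities and percentages in the abstract above follow from simple ratios of counts to surveyed area. The sketch below shows that arithmetic using only figures quoted in the abstract (381 and 660 turtles, 5.13 km² surveyed, 299 immature of 381, 199 hawksbill of 299). The helper names are hypothetical, and the survey-wide densities it computes are illustrative: the abstract reports per-site densities only, so these pooled values are not results from the study.

```python
def density_per_km2(count, area_km2):
    """Individuals per square kilometre for a surveyed area."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return count / area_km2


def percentage(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


SURVEYED_AREA_KM2 = 5.13  # total area surveyed, from the abstract

# Pooled densities across all sites (illustrative, not reported in the abstract).
high_certainty_density = density_per_km2(381, SURVEYED_AREA_KM2)
low_certainty_density = density_per_km2(660, SURVEYED_AREA_KM2)

# Life-stage and species shares, using the counts given in the abstract.
immature_share = percentage(299, 381)
hawksbill_share_of_immature = percentage(199, 299)
```

Because turtles cluster at hotspots such as Turtle Cove, a pooled density like this hides most of the spatial pattern, which is presumably why the study reports densities per site.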

    Learning disentangled speech representations

    A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time such as speaker identity, spoken content, and speaker prosody. The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and independent of the task at hand. The learned representations should also be able to answer counter-factual questions. In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks. 
This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically.