5 research outputs found

    Automated Crowdturfing Attacks and Defenses in Online Review Systems

    Malicious crowdsourcing forums are gaining traction as sources of online misinformation, but are limited by the costs of hiring and managing human workers. In this paper, we identify a new class of attacks that leverage deep learning language models (Recurrent Neural Networks or RNNs) to automate the generation of fake online reviews for products and services. Not only are these attacks cheap and therefore more scalable, but they can also control the rate of content output to eliminate the signature burstiness that makes crowdsourced campaigns easy to detect. Using Yelp reviews as an example platform, we show how a two-phase review generation and customization attack can produce reviews that state-of-the-art statistical detectors cannot distinguish from real ones. We conduct a survey-based user study to show that these reviews not only evade human detection, but also score high on "usefulness" metrics by users. Finally, we develop novel automated defenses against these attacks by leveraging the lossy transformation introduced by the RNN training and generation cycle. We consider countermeasures against our mechanisms, show that they produce unattractive cost-benefit tradeoffs for attackers, and show that they can be further curtailed by simple constraints imposed by online service providers.

    Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets

    While Deep Neural Networks (DNNs) have achieved tremendous success for large vocabulary continuous speech recognition (LVCSR) tasks, training of these networks is slow. One reason is that DNNs are trained with a large number of training parameters (i.e., 10-50 million). Because networks are trained with a large number of output targets to achieve good performance, the majority of these parameters are in the final weight layer. In this paper, we propose a low-rank matrix factorization of the final weight layer. We apply this low-rank technique to DNNs for both acoustic modeling and language modeling. We show on three different LVCSR tasks ranging between 50-400 hrs that a low-rank factorization reduces the number of parameters of the network by 30-50%. This results in roughly an equivalent reduction in training time, without a significant loss in final recognition accuracy, compared to a full-rank representation. Index Terms: Deep Neural Networks, Speech Recognition
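
    The parameter saving described in this abstract can be illustrated with a minimal sketch. It assumes the final weight layer of shape hidden x outputs is replaced by two factors of shapes hidden x rank and rank x outputs, with biases ignored. The function names and the layer sizes in the usage lines are hypothetical and are not taken from the paper.

```python
def full_rank_params(hidden: int, outputs: int) -> int:
    """Weights in a single dense final layer of shape hidden x outputs."""
    return hidden * outputs


def low_rank_params(hidden: int, outputs: int, rank: int) -> int:
    """Weights when the layer is split into (hidden x rank) and (rank x outputs)."""
    return rank * (hidden + outputs)


def max_saving_rank(hidden: int, outputs: int) -> int:
    """Largest rank for which the factorized layer has strictly fewer weights."""
    return (hidden * outputs - 1) // (hidden + outputs)


# Hypothetical sizes chosen only for illustration.
hidden, outputs, rank = 1024, 2000, 128
full = full_rank_params(hidden, outputs)
low = low_rank_params(hidden, outputs, rank)
print(full, low, max_saving_rank(hidden, outputs))
```

    With these illustrative sizes the factorized layer is smaller whenever the rank stays below hidden * outputs / (hidden + outputs). The saving applies to the final layer only, so the network-wide reduction depends on how much of the model that layer accounts for.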

    Unlimited vocabulary speech recognition for agglutinative languages

    It is practically impossible to build a word-based lexicon for speech recognition in agglutinative languages that would cover all the relevant words. The problem is that words are generally built by concatenating several prefixes and suffixes to the word roots. Together with compounding and inflections, this leads to millions of different, but still frequent, word forms. Due to inflections, ambiguity and other phenomena, it is also not trivial to automatically split the words into meaningful parts. Rule-based morphological analyzers can perform this splitting, but because their rules are handcrafted, they also suffer from an out-of-vocabulary problem. In this paper we apply a recently proposed, fully automatic and fairly language- and vocabulary-independent method to build subword lexica for three different agglutinative languages. We also demonstrate language portability by building a successful large vocabulary speech recognizer for each language, and show superior recognition performance compared to the corresponding word-based reference systems.

    An observational, multicenter, registry-based cohort study of Turkish Neonatal Society in neonates with Hypoxic ischemic encephalopathy

    BACKGROUND: Hypoxic ischemic encephalopathy (HIE) is a significant cause of mortality and short- and long-term morbidities. Therapeutic hypothermia (TH) has become the standard of care for HIE in infants ≥36 weeks gestational age (GA), as it has been demonstrated to reduce the rates of mortality and adverse neurodevelopmental outcomes. This study aims to determine the incidence of HIE in our country, to assess TH management in infants with HIE, and to present the short-term outcomes of these infants. METHODS: The Turkish Hypoxic Ischemic Encephalopathy Online Registry database was established for this multicenter, prospective, observational, nationally-based cohort study to evaluate the data of infants born at ≥34 weeks GA who displayed evidence of neonatal encephalopathy (NE) between March 2020 and April 2022. RESULTS: The incidence of HIE among infants born at ≥36 weeks GA (n = 965) was 2.13 per 1000 live births (517:242440), accounting for 1.55% (965:62062) of all neonatal intensive care unit admissions. The rates of mild, moderate and severe HIE were 25.5% (n = 246), 58.9% (n = 568), and 15.6% (n = 151), respectively. Infants with severe HIE had higher rates of abnormal magnetic resonance imaging (MRI) findings, and mortality (p6 h) (p>0.05). TH was administered to 85 (34.5%) infants with mild HIE, and of those born at 34-35 weeks GA, 67.4% (n = 31) received TH. A total of 58 (6%) deaths were reported, with a higher mortality rate in infants born at 34-35 weeks GA (OR 3.941, 95% CI 1.446-10.7422, p = 0.007). CONCLUSION: The incidence of HIE remained similar over time, with a reduction in mortality rate. The timing of TH initiation, whether <3 or 3-6 h, did not result in lower occurrences of brain lesions on MRI or mortality. An increasing number of infants with mild HIE and late preterm infants with HIE are receiving TH; however, the indications for TH require further clarification. Longer follow-up studies are necessary for this vulnerable population.