206 research outputs found

    User Behavior-Based Implicit Authentication

    In this work, we proposed dynamic retraining (RU), the wind vane module (WVM), BubbleMap (BMap), and reinforcement authentication (RA) to improve the efficacy of implicit authentication (IA). Motivated by the great potential of implicit and seamless user authentication, we built an implicit authentication system with adaptive sampling that automatically selects dynamic sets of activities for user behavior extraction. Various activities, such as user location, application usage, user motion, and battery usage, have been popular choices for generating behaviors (soft biometrics) for implicit authentication. Unlike password-based or hard-biometric-based authentication, implicit authentication requires neither explicit user action nor expensive hardware. However, user behaviors can change unpredictably, which makes it challenging to develop systems that depend on them. In addition to dynamic behavior extraction, the proposed implicit authentication system differs from existing systems in its energy efficiency on battery-powered mobile devices. Since implicit authentication systems rely on machine learning, the expensive training process needs to be outsourced to a remote server. However, mobile devices may not always have reliable network connections for sending real-time data to the server for training. In addition, IA systems are still in their infancy and exhibit many limitations, one of which is how to determine the best retraining frequency when updating the user behavior model. Another limitation is how to gracefully degrade user privileges when authentication fails to identify legitimate users (i.e., false negatives) in a practical IA system. To address the retraining problem, we proposed an algorithm that utilizes the Jensen-Shannon (JS) distance to determine the optimal retraining frequency, which is discussed in Chapter 2.
We overcame the limitations of traditional IA by proposing the W-layer, an overlay that provides a practical and energy-efficient solution for implicit authentication on mobile devices. The W-layer is discussed in Chapters 3 and 4. In Chapter 5, a novel privilege-control mechanism, BubbleMap (BMap), is introduced to provide fine-grained privileges to users based on their behavioral scores. In the same chapter, we describe reinforcement authentication (RA), which achieves more reliable authentication.
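The retraining trigger described above can be illustrated with a short sketch. The JS distance computation itself is standard; the behavior distributions and the 0.3 threshold below are illustrative assumptions, not values taken from the dissertation:

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions
    (square root of the JS divergence, base-2 logs, range [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return math.sqrt((kl(p, m) + kl(q, m)) / 2)

def needs_retraining(profile, recent, threshold=0.3):
    """Retrain the behavior model once the recent activity distribution
    drifts past a threshold (the threshold value is illustrative)."""
    return js_distance(profile, recent) > threshold
```

With base-2 logarithms, identical distributions give a distance of 0 and disjoint ones give 1, so a retraining threshold can be chosen on a fixed [0, 1] scale.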

    Spam elimination and bias correction: ensuring label quality in crowdsourced tasks.

    Crowdsourcing has been proposed as a powerful mechanism for accomplishing large-scale tasks via anonymous workers online. It has been demonstrated as an effective and important approach for collecting labeled data in application domains that require human intelligence, such as image labeling, video annotation, and natural language processing. Despite its promise, one big challenge remains in crowdsourcing systems: the difficulty of controlling the quality of crowds. Workers usually have diverse education levels, personal preferences, and motivations, leading to unknown work performance while completing a crowdsourced task. Some are reliable, and some might provide noisy feedback. It is therefore essential to apply worker filtering, which recognizes and handles noisy workers, to crowdsourcing applications in order to obtain high-quality labels. This dissertation discusses this area of research and proposes efficient probabilistic worker filtering models to distinguish various types of poor-quality workers. Most existing work on worker filtering either concentrates only on binary labeling tasks, or fails to separate low-quality workers whose label errors can be corrected from the other spam workers (whose label errors cannot be corrected). We therefore first propose a Spam Removing and De-biasing Framework (SRDF) to handle worker filtering in labeling tasks with numerical label scales. The developed framework can detect spam workers and biased workers separately. Biased workers are defined as those who tend to provide higher (or lower) labels than the truths; their errors can be corrected. To tackle the biasing problem, an iterative bias detection approach is introduced to recognize biased workers.
The spam filtering algorithm eliminates three types of spam workers: random spammers, who provide random labels; uniform spammers, who give the same label for most items; and sloppy workers, who offer low-accuracy labels. Integrating the spam filtering and bias detection approaches into aggregating algorithms, which infer truths from labels obtained from crowds, leads to high-quality consensus results. The common characteristic of random spammers and uniform spammers is that they provide useless feedback without making an effort on the labeling task, so it is not necessary to distinguish them from each other. In addition, within the SRDF framework, the removal of sloppy workers has a great impact on the detection of biased workers. To combat these problems, a different worker classification is presented in this dissertation: biased workers are classified as a subcategory of sloppy workers. Finally, an ITerative Self Correcting - Truth Discovery (ITSC-TD) framework is proposed, which can reliably recognize biased workers in ordinal labeling tasks based on a probabilistic bias detection model. ITSC-TD estimates true labels by applying an optimization-based truth discovery method, which minimizes overall label errors by assigning different weights to workers. The typical tasks posted on popular crowdsourcing platforms, such as MTurk, are simple tasks, which are low in complexity, independent, and require little time to complete. Complex tasks, however, often require crowd workers to possess specialized skills in the task domain. As a result, this type of task is more prone to poor-quality feedback from crowds than simple tasks are. We therefore propose a multiple-views approach for obtaining high-quality consensus labels in complex labeling tasks.
In this approach, each view is defined as a labeling critique or rubric that guides workers to become aware of the desirable work characteristics or goals. Combining the view labels yields the overall estimated label for each item. The multiple-views approach is developed under the hypothesis that workers' performance might differ from one view to another; varied weights are then assigned to different views for each worker. Additionally, the ITSC-TD framework is integrated into the multiple-views model to achieve high-quality estimated truths for each view. Next, we propose a Semi-supervised Worker Filtering (SWF) model to eliminate spam workers, who assign random labels to each item. The SWF approach conducts worker filtering with a limited set of gold truths available a priori. Each worker is associated with a spammer score, estimated via the developed semi-supervised model, and low-quality workers are efficiently detected by comparing the spammer score with a predefined threshold value. The efficiency of all the developed frameworks and models is demonstrated on simulated and real-world data sets. By comparing the proposed frameworks to a set of state-of-the-art methodologies in the crowdsourcing domain, such as the expectation-maximization-based aggregating algorithm, GLAD, and an optimization-based truth discovery approach, up to 28.0% improvement can be obtained in the accuracy of true label estimation.
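The optimization-based truth discovery step that ITSC-TD builds on alternates between estimating item truths and reweighting workers. The sketch below is a generic CRH-style iteration (weighted truth update, then log-ratio weight update); the dissertation's exact weighting scheme may differ, and the worker data are hypothetical:

```python
import math

def truth_discovery(labels, iters=20):
    """Iteratively estimate item truths and worker weights.

    labels: dict worker -> dict item -> numeric label.
    A worker's weight shrinks as their labels deviate from the current
    truth estimates; the update rules here are illustrative, not the
    dissertation's exact formulation.
    """
    workers = list(labels)
    items = sorted({i for w in workers for i in labels[w]})
    weights = {w: 1.0 for w in workers}
    truths = {}
    for _ in range(iters):
        # Truth update: weighted average of the labels for each item.
        for i in items:
            num = sum(weights[w] * labels[w][i] for w in workers if i in labels[w])
            den = sum(weights[w] for w in workers if i in labels[w])
            truths[i] = num / den
        # Weight update: -log of each worker's share of the total error.
        errs = {w: sum((labels[w][i] - truths[i]) ** 2 for i in labels[w])
                for w in workers}
        total = sum(errs.values()) or 1e-12
        weights = {w: -math.log((errs[w] + 1e-12) / total) for w in workers}
    return truths, weights
```

Reliable workers end up with larger weights, so the final truth estimates track their labels rather than a spammer's.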

    Enhancing disaster situational awareness through scalable curation of social media

    Online social media is used today during humanitarian disasters by victims, responders, journalists and others to publicly exchange accounts of ongoing events, requests for help, aggregate reports, reflections and commentary. In many cases, incident reports become available on social media before being picked up by traditional information channels, and often include rich evidence such as photos and video recordings. However, individual messages are sparse in content, and message inflow rates can reach hundreds of thousands of items per hour during large-scale events. Current information management methods struggle to make sense of this vast body of knowledge, due to limitations in the accuracy and scalability of processing, summarization capabilities, organizational acceptance and even basic understanding of users' needs. If solutions to these problems can be found, social media can be mined to offer disaster responders unprecedented levels of situational awareness. This thesis provides a first comprehensive overview of humanitarian disaster stakeholders and their information needs, against which the utility of the proposed and future information management solutions can be assessed. The research then shows how automated online text clustering techniques can provide report de-duplication, timely event detection, ranking and summarization of content in rapid social media streams. To identify and filter out reports that correspond to the information needs of specific stakeholders, crowdsourced information extraction is combined with supervised classification techniques to generalize human annotation behaviour and scale up processing capacity by several orders of magnitude. These hybrid processing techniques are implemented in CrisisTracker, a novel software tool, and evaluated through deployment in a large-scale multi-language disaster information management setting.
Evaluation shows that the proposed techniques can effectively make social media an accessible complement to currently relied-on information collection methods, enabling disaster analysts to detect and comprehend unfolding events more quickly, more deeply and with greater coverage.
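The report de-duplication and event-grouping step can be approximated with a single-pass cosine-similarity clustering over bag-of-words vectors, which scales linearly with stream length. This is only a minimal sketch: the whitespace tokenization, the 0.5 threshold, and the greedy centroid update are illustrative assumptions, not CrisisTracker's actual algorithm:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_stream(messages, threshold=0.5):
    """Greedy online clustering: attach each message to the most similar
    existing cluster centroid, or start a new cluster. One linear scan
    per message keeps it usable on high-rate streams."""
    clusters = []  # list of (centroid Counter, [member messages])
    for msg in messages:
        vec = Counter(msg.lower().split())
        best, best_sim = None, threshold
        for centroid, members in clusters:
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = (centroid, members), sim
        if best:
            best[0].update(vec)   # fold the message into the centroid
            best[1].append(msg)
        else:
            clusters.append((vec, [msg]))
    return clusters
```

Near-duplicate reports about the same incident land in one cluster, which can then be ranked and summarized as a single event.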

    Modeling, Control and Information Technologies

    Aniksuhyn A., Zhyvolovych O. Generalized solvability and optimal control for an integro-differential equation of a hyperbolic type (p. 8)
    Babudzhan R., Isaienkov K., Krasii D., Melkonian R., Vodka O., Zadorozhniy I. Collection and processing of bearing vibration data for their technical condition classification by machine learning methods (p. 10)
    Bardan A., Bihun Y. Computer modeling of differential games (p. 16)
    Beridze Z., Shavadze Ju., Imnaishvili G., Geladze M. Concept and functions of building a private network (VPN) (p. 19)
    Bomba A., Klymiuk Y. Computer prediction of technological modes of rapid cone-shaped adsorption filters with automated discharge of part of heat from separation surfaces in filtering model (p. 21)
    Boyko N., Dypko O. Analysis of machine learning methods using spam filtering (p. 25)
    Boyko N., Kulchytska O. Analysis of tumor classification algorithms for breast cancer prediction by machine learning methods (p. 29)
    Denysov S., Semenov V., Vedel Ya. A novel adaptive method for operator inclusions (p. 33)
    Didmanidze M., Chachanidze G., Didmanidze T. Modern trends in unemployment (p. 36)
    Bagrationi I., Zaslavski V., Didmanidze I., Yamkova O. Ethics of information technology in the context of a global worldview (p. 38)
    Didmanidze D., Zoidze K., Akhvlediani N., Tsitskishvili G., Samnidze N., Diasamidze M. Use of computer teaching systems in the learning process (p. 42)
    Dobrydnyk Yu., Khrystyuk A. Analysis of the elevator as an object of automation (p. 44)
    Gamzayev R., Shkoda B. Development and investigation of adaptive micro-service architecture for messaging software systems (p. 46)
    Gayev Ye. Students' own discoveries in information theory curriculum (p. 50)
    Didmanidze I., Geladze D., Motskobili Ia, Akhvlediani D., Koridze L. Follow digitally by using a blog (p. 52)
    Kirpichnikov A., Khrystyuk A. Automatic apiary care system (p. 54)
    Kunytskyi S., Ivanchuk N. Mathematical modeling of water purification in a bioplato filter (p. 56)
    Kyrylych V., Milchenko O. Optimal control of a hyperbolic system that describes Slutsky demand (p. 58)
    Makaradze N., Nakashidze-Makharadze T., Zaslavski V., Gurgenidze M., Samnidze N., Diasamidze M. Challenges of using computer-based educational technologies in higher education (p. 60)
    Mamenko P., Zinchenko S., Nosov P., Kyrychenko K., Popovych I., Nahrybelnyi Ya., Kobets V. Research of divergence trajectory with a given risk of ships collisions (p. 64)
    Mateichuk V., Zinchenko S., Tovstokoryi O., Nosov P., Nahrybelnyi Ya., Popovych I., Kobets V. Automatic vessel control in stormy conditions (p. 68)
    Petrivskyi Ya., Petrivskyi V., Bychkov O., Pyzh O. Some features of creating a computer vision system (p. 72)
    Poliakov V. Calculation of organic substrate decomposition in biofilm and bioreactor-filter taking into account its limitation and inhibition (p. 75)
    Poliakov V. Mathematical modeling of suspension filtration on a rapid filter at an unregulated rate (p. 78)
    Prokip V. On the semi-scalar equivalence of polynomial matrices (p. 80)
    Pysarchuk O., Mironov Y. A proposal of algorithm for automated chromosomal abnormality detection (p. 83)
    Rybak O., Tarasenko S. Sperner's Theorem (p. 87)
    Sandrakov G., Hulianytskyi A., Semenov V. Modeling of filtration processes in periodic porous media (p. 90)
    Stepanets O., Mariiash Yu. Optimal control of the blowing mode parameters during basic oxygen furnace steelmaking process (p. 94)
    Stepanchenko O., Shostak L., Kozhushko O., Moshynskyi V., Martyniuk P. Modelling soil organic carbon turnover with assimilation of satellite soil moisture data (p. 97)
    Vinnychenko D., Nazarova N., Vinnychenko I. The dependence of the deviation of the output stabilized current of the resonant power supply during frequency control in the systems of materials pulse processing (p. 100)
    Voloshchuk V., Nekrashevych O., Gikalo P. Exergy analysis of a reversible chiller (p. 105)
    Шарко О., Петрушенко Н., Мосін М., Шарко М., Василенко Н., Белоусов А. Information-control systems and technologies for assessing the degree of enterprise readiness for innovation activity using Markov chains (p. 107)
    Барановський С., Бомба А., Прищепа О. Modification of an infectious disease model to account for diffusion perturbations under logistic dynamics (p. 110)
    Бомба А., Бойчура М., Мічута О. Identification of structural parameters of curvilinear soil massifs by numerical methods of quasiconformal mappings (p. 112)
    Василець К. A method for estimating the uncertainty of electricity measurement by a commercial metering unit (p. 114)
    Волощук В., Некрашевич О., Гікало П. Feasibility of applying exergy analysis criteria to evaluate the efficiency of heat power facilities (p. 117)
    Гудь В. Mathematical modeling of the energy efficiency of permanent magnets in cylindrical magnetic systems (p. 120)
    Демидюк М. Parametric optimization of cyclic transport operations of manipulators with active and passive drives (p. 122)
    Клепач М., Клепач М. Wavelet analysis of temperature trends of a glass furnace bottom (p. 125)
    Козирєв С. Control of a high-voltage pulse discharge in an exothermic medium (p. 127)
    Очко О., Аврука І. Secure storage of confidential information on servers (p. 131)
    Реут Д., Древецький В., Матус С. Application of computer vision for automatic measurement of the velocity of liquids with fine-dispersed impurities (p. 133)
    Сафоник А., Грицюк І. Development of an information system for spectrophotometric analysis (p. 135)
    Ткачук В. A quantum genetic algorithm and its implementation on a quantum computer (p. 137)
    Цвєткова Т. Computer visualization of a hydrodynamic field in a domain with curvilinear boundaries (p. 140)
    Шпортько О., Бомба А., Шпортько Л. Adapting dictionary compression methods to progressive hierarchical lossless image compression (p. 142)
    Сафоник А., Таргоній І. Development of a magnetic field strength control system for the deironing of process waters (p. 14)

    Resilience Strategies for Network Challenge Detection, Identification and Remediation

    The enormous growth of the Internet and its use in everyday life make it an attractive target for malicious users. As the network becomes more complex and sophisticated, it becomes more vulnerable to attack. There is a pressing need for the future Internet to be resilient, manageable and secure. Our research is on distributed challenge detection and is part of the EU ResumeNet project (Resilience and Survivability for Future Networking: Framework, Mechanisms and Experimental Evaluation). It aims to make networks more resilient to a wide range of challenges, including malicious attacks, misconfiguration, faults, and operational overloads. Resilience means the ability of the network to provide an acceptable level of service in the face of significant challenges; it is a superset of commonly used definitions of survivability, dependability, and fault tolerance. Our proposed resilience strategy detects a challenge situation by identifying its occurrence and impact in real time, and then initiates appropriate remedial action. Action is taken autonomously to continue operations as far as possible, to mitigate the damage, and to maintain an acceptable level of service. The contribution of our work is the ability to mitigate a challenge as early as possible and to rapidly detect its root cause. Our proposed multi-stage, policy-based challenge detection system also identifies both known and unforeseen challenges; this has been studied and demonstrated with an unknown worm attack. The multi-stage approach reduces computational complexity compared to the traditional single-stage approach, in which one managed object is responsible for all functions. The approach we propose in this thesis has the flexibility, scalability, adaptability, reproducibility and extensibility needed to assist in the identification and remediation of many future network challenges.
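The multi-stage idea (cheap screening first, more expensive identification only on flagged traffic) can be sketched as below. The flow fields, policy thresholds, and challenge labels are all hypothetical placeholders, not the thesis's actual policy set:

```python
def stage1_screen(flow, rate_limit=1000):
    """Cheap first stage: flag flows whose packet rate looks anomalous.
    (rate_limit is an illustrative policy parameter.)"""
    return flow["pps"] > rate_limit

def stage2_identify(flow):
    """More expensive second stage, run only on flagged flows:
    a toy rule base standing in for the thesis's policy engine."""
    if flow.get("unique_dests", 0) > 100:
        return "worm-like scanning"
    if flow.get("syn_ratio", 0.0) > 0.9:
        return "SYN flood"
    return "unknown challenge"

def detect(flows):
    """Multi-stage pipeline: the stage-2 cost is paid only for the
    (typically small) fraction of traffic that stage 1 flags."""
    return {f["id"]: stage2_identify(f) for f in flows if stage1_screen(f)}
```

Because stage 2 only sees flows that stage 1 passes on, adding a heavier identification step does not multiply the cost of inspecting every flow.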

    Combating Attacks and Abuse in Large Online Communities

    Internet users today are connected more widely and ubiquitously than ever before. As a result, various online communities have formed, ranging from online social networks (Facebook, Twitter), to mobile communities (Foursquare, Waze), to content- and interest-based networks (Wikipedia, Yelp, Quora). While users benefit from the ease of access to information and social interactions, there is a growing concern for users' security and privacy against various attacks such as spam, phishing, malware infection and identity theft. Combating attacks and abuse in online communities is challenging. First, today's online communities are increasingly dependent on users and user-generated content. Securing online systems demands a deep understanding of complex and often unpredictable human behaviors. Second, online communities can easily have millions or even billions of users, which requires the corresponding security mechanisms to be highly scalable. Finally, cybercriminals are constantly evolving to launch new types of attacks, which further demands high robustness of security defenses. In this thesis, we take concrete steps towards measuring, understanding, and defending against attacks and abuse in online communities. We begin with a series of empirical measurements to understand user behaviors in different online services and the unique security and privacy challenges that users are facing. This effort covers a broad set of popular online services, including social networks for question answering (Quora), anonymous social networks (Whisper), and crowdsourced mobile communities (Waze). Despite the differences among specific online communities, our study provides a first look at their user activity patterns based on empirical data, and reveals the need for reliable mechanisms to curate user content, protect privacy, and defend against emerging attacks. Next, we turn our attention to attacks targeting online communities, with a focus on spam campaigns.
While traditional spam is mostly generated by automated software, attackers today have started to introduce "human intelligence" to implement attacks. This is malicious crowdsourcing (or crowdturfing), where a large group of real users is organized to carry out malicious campaigns, such as writing fake reviews or spreading rumors on social media. Using collective human effort, attackers can easily bypass many existing defenses (e.g., CAPTCHA). To understand the ecosystem of crowdturfing, we first use measurements to examine its campaign organization, workers and revenue in detail. Based on insights from empirical data, we develop effective machine learning classifiers to detect crowdturfing activities. In the meantime, considering the adversarial nature of crowdturfing, we also build practical adversarial models to simulate how attackers can evade or disrupt machine learning based defenses. To aid in this effort, we next explore using user behavior models to detect a wider range of attacks. Instead of making assumptions about attacker behavior, our idea is to model normal user behaviors and capture (malicious) behaviors that deviate from the norm. In this way, we can detect previously unknown attacks. Our behavior model is based on detailed clickstream data, which are sequences of click events generated by users when using the service. We build a similarity graph in which each user is a node and the edges are weighted by clickstream similarity. By partitioning this graph, we obtain "clusters" of users with similar behaviors. We then use a small set of known good users to "color" these clusters and differentiate the malicious ones. This technique has been adopted by real-world social networks (Renren and LinkedIn), and has already detected unexpected attacks. Finally, we extend the clickstream model to understand finer-grained behaviors of attackers (and real users), and to track how user behavior changes over time.
In summary, this thesis illustrates a data-driven approach to understanding and defending against attacks and abuse in online communities. Our measurements have revealed new insights about how attackers are evolving to bypass existing security defenses. In addition, our data-driven systems provide new solutions for online services to gain a deep understanding of their users and defend them from emerging attacks and abuse.
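The clickstream pipeline described above (similarity graph, partitioning, then coloring with known good users) can be sketched as follows. Bigram Jaccard similarity, union-find connected components, and the 0.3 threshold are illustrative stand-ins for the thesis's actual similarity metric and graph-partitioning algorithm:

```python
from itertools import combinations

def ngrams(seq, n=2):
    """Set of n-grams (consecutive event tuples) in a clickstream."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_users(clickstreams, sim_threshold=0.3):
    """Connected components of the similarity graph: users are nodes,
    edges join pairs whose clickstream bigram sets are similar enough."""
    users = list(clickstreams)
    grams = {u: ngrams(s) for u, s in clickstreams.items()}
    parent = {u: u for u in users}
    def find(u):                      # union-find with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    for u, v in combinations(users, 2):
        if jaccard(grams[u], grams[v]) >= sim_threshold:
            parent[find(u)] = find(v)
    clusters = {}
    for u in users:
        clusters.setdefault(find(u), set()).add(u)
    return list(clusters.values())

def flag_suspicious(clusters, known_good):
    """'Color' clusters with seed users: clusters containing no known
    good user are flagged for review."""
    return [c for c in clusters if not (c & known_good)]
```

Note that nothing here encodes what an attack looks like; clusters are flagged only because no trusted user shares their behavior, which is what allows previously unknown attacks to surface.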

    Modeling the detection of textual cyber-bullying

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 91-96). The scourge of cyber-bullying has received widespread attention at all levels of society, including parents, educators, adolescents, social scientists, psychiatrists and policy makers at the highest echelons of power. Cyber-bullying and its complex intermingling with traditional bullying have been shown to have a deeply negative impact on both the bully and the victim. We hypothesize that tackling cyber-bullying entails two parts: detection, and user-interaction strategies for effective mitigation. In this thesis, we investigate the problem of detecting textual cyber-bullying; a companion thesis by Birago Jones investigates user-interaction strategies. We explore mechanisms to tackle the problem of textual cyber-bullying using computational empathy: a combination of detection and intervention techniques informed by scoping the social parameters that underlie the problem, as well as a socio-linguistic treatment of the underlying socially mediated communication on the web. We begin by presenting a qualitative analysis of textual cyber-bullying based on data gathered from two major social networking websites, and decompose the problem of detection into sub-problems. We then present Ruminati, a society of models involving supervised learning, commonsense reasoning and probabilistic topic modeling, to tackle each sub-problem. by Karthik Dinakar. S.M.

    Detection and Evaluation of Clusters within Sequential Data

    Motivated by theoretical advancements in dimensionality reduction techniques, we use a recent model, called Block Markov Chains, to conduct a practical study of clustering in real-world sequential data. Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees and can be deployed in sparse data regimes. Despite these favorable theoretical properties, a thorough evaluation of these algorithms in realistic settings has been lacking. We address this issue and investigate the suitability of these clustering algorithms for exploratory data analysis of real-world sequential data. In particular, our sequential data is derived from human DNA, written text, animal movement data and financial markets. In order to evaluate the determined clusters, and the associated Block Markov Chain model, we further develop a set of evaluation tools. These tools include benchmarking, spectral noise analysis and statistical model selection tools. An efficient implementation of the clustering algorithm and the new evaluation tools is made available together with this paper. Practical challenges associated with real-world data are encountered and discussed. It is ultimately found that the Block Markov Chain model assumption, together with the tools developed here, can indeed produce meaningful insights in exploratory data analysis despite the complexity and sparsity of real-world data. Comment: 37 pages, 12 figures
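A toy version of state clustering under a Block Markov Chain assumption: states whose empirical outgoing transition distributions are close get grouped into one block. The greedy L1 merging below is a crude, illustrative stand-in for the spectral clustering algorithms with optimality guarantees that the paper evaluates:

```python
from collections import defaultdict

def transition_rows(sequence, states):
    """Empirical outgoing transition distribution for each state."""
    counts = {s: defaultdict(int) for s in states}
    for a, b in zip(sequence, sequence[1:]):
        counts[a][b] += 1
    rows = {}
    for s in states:
        total = sum(counts[s].values()) or 1
        rows[s] = {t: counts[s][t] / total for t in states}
    return rows

def cluster_states(sequence, tol=0.5):
    """Greedily merge states whose empirical transition rows are within
    L1 distance tol of a cluster representative. Under a block structure,
    states in the same block have (near-)identical rows."""
    states = sorted(set(sequence))
    rows = transition_rows(sequence, states)
    clusters = []
    for s in states:
        for c in clusters:
            rep = c[0]
            if sum(abs(rows[s][t] - rows[rep][t]) for t in states) <= tol:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters
```

On a sequence generated by a two-block chain (states A, B jump to C, D and vice versa), the sketch recovers the blocks {A, B} and {C, D} from the transition statistics alone.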