
    A Dynamic and Adaptable Service Composition Architecture in the Cloud Based on Multi-Agent Systems

    Nowadays, service composition is one of the major problems in the Cloud due to the exceptional growth in the number of services deployed by providers. Atomic services alone are often unable to satisfy all client requirements, and traditional service composition delivers a composite service without regard for non-functional parameters. A composition approach that addresses both functional and non-functional parameters is therefore needed. Since web services cannot communicate with each other or react dynamically to changes in service parameters during composition, a dynamic entity is required: an agent operating within a dynamic architecture. This work proposes an agent-based architecture with a new cooperation protocol that offers automatic and adaptable service composition by providing a composite service with the maximum quality of service. An implementation of this model is provided in order to evaluate the system, and the obtained results demonstrate its effectiveness.
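    As a hedged sketch of the QoS-maximization step described above (not the authors' agent cooperation protocol), the following Python fragment greedily selects the best-scoring concrete service for each abstract task; the service attributes and scoring weights are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class Service:
        name: str
        response_time: float  # seconds; lower is better
        availability: float   # fraction in [0, 1]; higher is better

    def qos_score(s: Service, w_time: float = 0.5, w_avail: float = 0.5) -> float:
        # Hypothetical utility: weighted availability plus an inverted response-time term.
        return w_avail * s.availability + w_time / (1.0 + s.response_time)

    def compose(candidates_per_task: dict) -> dict:
        # Independently pick the highest-utility candidate for each abstract task.
        return {task: max(cands, key=qos_score)
                for task, cands in candidates_per_task.items()}

    plan = compose({
        "storage": [Service("s3like", 0.12, 0.999), Service("blobx", 0.30, 0.95)],
        "notify": [Service("mailer", 0.50, 0.98), Service("smsgw", 0.20, 0.97)],
    })
    print({task: svc.name for task, svc in plan.items()})

    A dynamic, agent-based composition would additionally let agents renegotiate selections when a service's parameters change at runtime; the greedy selection above covers only the static QoS-maximization step.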

    Mining Twitter for crisis management: realtime floods detection in the Arabian Peninsula

    A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy. In recent years, large amounts of data have been made available on microblog platforms such as Twitter; however, it is difficult to filter and extract information and knowledge from such data because of its high volume and noisiness. On Twitter, the general public are able to report real-world events such as floods in real time, acting as social sensors. Consequently, it is beneficial to have a method that can detect flood events automatically in real time, helping governmental bodies such as crisis management authorities to detect the event and make decisions during its early stages. This thesis proposes a real-time flood detection system that mines Arabic tweets using machine learning and data mining techniques. The proposed system comprises six main components: data collection, pre-processing, flooding event extraction, location inference, location named entity linking, and flooding event visualisation. An effective method of flood detection from Arabic tweets is presented and evaluated using supervised learning techniques. Furthermore, this work presents a location named entity inference method based on the Learning to Search approach; the results show that the proposed method outperforms existing systems, with significantly higher accuracy in inferring flood locations from tweets written in colloquial Arabic. For location named entity linking, a method has been designed that utilises Google API services as a knowledge base to extract accurate geocode coordinates associated with the location named entities mentioned in tweets. The results show that the proposed linking method locates 56.8% of tweets within 0-10 km of the actual location. Further analysis shows that the accuracy of locating tweets at the level of the actual city and region is 78.9% and 84.2%, respectively.
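    As one plausible (and deliberately simplified) realization of the flooding event extraction component, the sketch below trains a TF-IDF plus linear-SVM classifier with scikit-learn; the toy English examples stand in for the labelled Arabic tweets used in the thesis.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Toy training data; the real system is trained on annotated Arabic tweets.
    tweets = [
        "heavy rain flooding the streets downtown",
        "the valley road is under water after the storm",
        "great football match tonight",
        "new phone model released today",
    ]
    labels = [1, 1, 0, 0]  # 1 = flood-related, 0 = unrelated

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    model.fit(tweets, labels)
    print(model.predict(["roads under water near the bridge"]))  # expected: [1]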

    Linguistic Threat Assessment: Understanding Targeted Violence through Computational Linguistics

    Language alluding to possible violence is widespread online, and security professionals are increasingly faced with the issue of understanding and mitigating this phenomenon. The volume of extremist and violent online data presents a workload that is unmanageable for traditional, manual threat assessment. Computational linguistics may be of particular relevance to understanding threats of grievance-fuelled targeted violence on a large scale. This thesis seeks to advance knowledge on the possibilities and pitfalls of threat assessment through automated linguistic analysis. Based on in-depth interviews with expert threat assessment practitioners, three areas of language are identified which can be leveraged for automation of threat assessment, namely linguistic content, style, and trajectories. Implementations of each area are demonstrated in three subsequent quantitative chapters. First, linguistic content is utilised to develop the Grievance Dictionary, a psycholinguistic dictionary aimed at measuring concepts related to grievance-fuelled violence in text. Thereafter, linguistic content is supplemented with measures of linguistic style in order to examine the feasibility of author profiling (determining gender, age, and personality) in abusive texts. Lastly, linguistic trajectories are measured over time in order to assess the effect of an external event on an extremist movement. Collectively, the chapters in this thesis demonstrate that linguistic automation of threat assessment is indeed possible. The concluding chapter describes the limitations of the proposed approaches and illustrates where future potential lies for improving automated linguistic threat assessment. Ideally, developers of computational implementations for threat assessment should strive for explainability and transparency. Furthermore, it is argued that computational linguistics holds particular promise for large-scale measurement of grievance-fuelled language, but is perhaps less suited to predicting actual violent behaviour. Lastly, researchers and practitioners involved in threat assessment are urged to collaboratively and critically evaluate novel computational tools as they emerge.
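    To illustrate how a psycholinguistic dictionary such as the Grievance Dictionary can score text, the sketch below computes, per category, the proportion of tokens matching a word list; the categories and word lists shown are invented stand-ins, not the published resource.

    import re
    from collections import Counter

    # Illustrative word lists only; the real Grievance Dictionary is far larger.
    DICTIONARY = {
        "grievance": {"unfair", "betrayed", "injustice", "wronged"},
        "threat": {"destroy", "attack", "revenge"},
    }

    def score(text: str) -> dict:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts = Counter(tokens)
        total = max(len(tokens), 1)
        # Proportion of tokens matching each category's word list.
        return {cat: sum(counts[w] for w in words) / total
                for cat, words in DICTIONARY.items()}

    print(score("They betrayed me, and this injustice will not stand."))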

    Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

    Ensuring alignment, which refers to making models behave in accordance with human intentions [1,2], has become a critical task before deploying large language models (LLMs) in real-world applications. For instance, OpenAI devoted six months to iteratively aligning GPT-4 before its release [3]. However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations. This obstacle hinders systematic iteration and deployment of LLMs. To address this issue, this paper presents a comprehensive survey of the key dimensions that are crucial to consider when assessing LLM trustworthiness. The survey covers seven major categories of LLM trustworthiness: reliability, safety, fairness, resistance to misuse, explainability and reasoning, adherence to social norms, and robustness. Each major category is further divided into several sub-categories, for a total of 29 sub-categories. Additionally, a subset of 8 sub-categories is selected for further investigation, for which corresponding measurement studies are designed and conducted on several widely used LLMs. The measurement results indicate that, in general, more aligned models tend to perform better in terms of overall trustworthiness. However, the effectiveness of alignment varies across the trustworthiness categories considered. This highlights the importance of more fine-grained analysis, testing, and continuous improvement of LLM alignment. By shedding light on these key dimensions of LLM trustworthiness, this paper aims to provide valuable insights and guidance to practitioners in the field. Understanding and addressing these concerns will be crucial to the reliable and ethically sound deployment of LLMs in various applications.
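    A minimal sketch of how per-sub-category measurements might be rolled up into category-level scores is given below; the category names follow the survey, but the numbers and the unweighted macro-averaging are hypothetical choices.

    from statistics import mean

    # Hypothetical measurement results per sub-category (higher is better).
    results = {
        "reliability": {"factuality": 0.81, "consistency": 0.77},
        "safety": {"toxicity_avoidance": 0.90, "refusal_of_harm": 0.95},
        "fairness": {"stereotype_bias": 0.72},
    }

    category_scores = {cat: mean(subs.values()) for cat, subs in results.items()}
    overall = mean(category_scores.values())  # unweighted macro-average
    for cat, s in sorted(category_scores.items()):
        print(f"{cat}: {s:.2f}")
    print(f"overall trustworthiness: {overall:.2f}")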

    Plataforma para descoberta de eventos de interesse noticioso no Twitter (Platform for the discovery of newsworthy events on Twitter)

    The new communication paradigm established by social media, along with its growing popularity in recent years, has attracted increasing interest from several research fields. One such field is event detection in social media, whose relevance stems from its potential applicability to many diverse applications, among them the detection of newsworthy events. The purpose of this work is therefore to implement a system to detect newsworthy events on Twitter, using a similar system proposed in the literature as the basis of the implementation. To this end, a segmentation algorithm was implemented using a dynamic programming approach in order to split tweets into segments. A weighting scheme that takes into account the burstiness, user support, and newsworthiness of the segments was then used to rank them, with Wikipedia leveraged to derive this newsworthiness. The top K segments in this ranking were further processed and clustered into candidate events according to their similarity. These candidate events were then filtered by an SVM model trained on manually annotated data in order to retain only those related to real-world newsworthy events. The supporting infrastructure required by the system, namely the precomputed values necessary for its operation, was also implemented. The implemented system was tested on three months of data, totalling 4,770,636 tweets created in Portugal and mostly written in Portuguese. The precision obtained by the system was 76.9%, with a recall of 41.6%. Master's in Informatics Engineering.
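    The dynamic-programming segmentation step can be sketched as follows: split a token sequence into segments so as to maximize the sum of per-segment scores. The scoring function here is a toy stand-in for the burstiness- and Wikipedia-based scoring used by the actual system.

    # Hypothetical phrase scores; the real system derives these from Wikipedia
    # and segment statistics.
    PHRASES = {"world cup": 2.5, "new york": 2.5}

    def seg_score(words):
        phrase = " ".join(words)
        return PHRASES.get(phrase, 1.0 if len(words) == 1 else 0.1)

    def segment(tokens, max_len=3):
        n = len(tokens)
        best = [0.0] + [float("-inf")] * n  # best[i]: best score of tokens[:i]
        back = [0] * (n + 1)                # start index of the last segment
        for i in range(1, n + 1):
            for j in range(max(0, i - max_len), i):
                s = best[j] + seg_score(tokens[j:i])
                if s > best[i]:
                    best[i], back[i] = s, j
        segs, i = [], n
        while i > 0:                        # recover segments via backpointers
            segs.append(" ".join(tokens[back[i]:i]))
            i = back[i]
        return segs[::-1]

    print(segment("crowds celebrate world cup win".split()))
    # -> ['crowds', 'celebrate', 'world cup', 'win']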

    Spotting Icebergs by the Tips: Rumor and Persuasion Campaign Detection in Social Media

    Identifying different types of events in social media, i.e., collective online activities or posts, is critical for researchers who study data mining and online communication. However, the online activities of more than one billion social media users from around the world constitute an ocean of data that is hard to study and understand. In this dissertation, we study the problem of event detection with a focus on two important applications: rumor and persuasion campaign detection. Detecting events such as rumors and persuasion campaigns is particularly important for social media users and researchers. Events in social media spread and influence people much more quickly than traditional news media reporting. Viral spreading of specific events, such as rumors and persuasion campaigns, can cause substantial damage in online communities, and automatic detection of such events can benefit analysts in many different research domains. In this thesis, we extend existing research on detecting online events such as rumors and persuasion campaigns. We conducted content analysis and found that the emergence and spreading of certain types of online events often result in similar user reactions. For example, some users will react to the spreading of a rumor by questioning its truth, even though most posts will not explicitly question it. These explicit questions serve as signals for detecting the underlying events. Our approach to detecting a given type of event first identifies the signals from the myriad of posts in the data corpus. We then use these signals to find the rest of the targeted events. Different types of events have different signals. As case studies, we analyze and identify the signals for rumors and persuasion campaigns, and we apply our proposed framework to detect these two types of events. We began by analyzing large-scale online activities in order to understand the relation between events and their signals. We focused on detecting and analyzing users' question-asking activities. We found that many social media users react to popular and fast-emerging memes by explicitly asking questions. Compared to other user activities, these questions are more likely to be correlated with bursty events and emergent information needs. We use some of our findings to detect trending rumors. We find that in the case of rumors, a common reaction regardless of the content of the rumor is to question the truth of the statement. We use these questioning activities as signals for detecting rumors. Our experimental results show that our rumor detector can effectively and efficiently detect social media rumors at an early stage. As in the case of rumors, the emergence and spreading of persuasion campaigns can result in similar reactions from the online audience. However, the explicit signals for detecting persuasion campaigns are not clearly understood and are difficult to label. We propose an algorithm that automatically learns these signals from data by maximizing an objective that considers their key properties. We then use the learned signals in our proposed framework for detecting persuasion campaigns in social media. In our evaluation, we find that the learned signals improve the performance of persuasion campaign detection compared to frameworks that use signals generated by alternative methods, as well as those that do not use signals at all.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/138726/1/zhezhao_1.pd
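    The questioning-signal idea can be sketched concretely: flag posts whose text matches verification-question patterns, then treat clusters with a high enough fraction of such signal posts as rumor candidates. The patterns and threshold below are illustrative assumptions, not the dissertation's trained detector.

    import re

    # Illustrative verification-question patterns (the real detector identifies
    # its signals from data rather than from a fixed pattern list).
    SIGNAL_RE = re.compile(
        r"\bis (this|that|it) (true|real)\b|\breally\b.*\?|\bunconfirmed\b",
        re.IGNORECASE,
    )

    def is_signal(text):
        return bool(SIGNAL_RE.search(text))

    def rumor_candidates(clusters, threshold=0.2):
        # A cluster qualifies if enough of its posts question its truth.
        out = []
        for topic, posts in clusters.items():
            frac = sum(map(is_signal, posts)) / len(posts)
            if frac >= threshold:
                out.append((topic, round(frac, 2)))
        return out

    clusters = {
        "celebrity hoax": ["is this true??", "really? any source?", "wow"],
        "match result": ["what a game", "great goal tonight"],
    }
    print(rumor_candidates(clusters))  # -> [('celebrity hoax', 0.67)]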

    Localized Events in Social Media Streams: Detection, Tracking, and Recommendation

    With the recent proliferation of social media channels and the immense amount of user-generated content, interest in social media mining is increasing. Messages continuously posted via these channels report a broad range of topics, from daily life to global and local events. This has opened new opportunities for mining event information, which is crucial in many application domains, especially for increasing situational awareness in critical scenarios. Interestingly, many of these messages are enriched with location information, owing to the widespread use of mobile devices and recent advancements in location acquisition techniques. This enables location-aware event mining, i.e., the detection and tracking of localized events. In this thesis, we propose novel frameworks and models that digest social media content for localized event detection, tracking, and recommendation. We first develop KeyPicker, a framework to extract and score event-related keywords in an online fashion, accounting for high levels of noise, temporal heterogeneity, and outliers in the data. Then, LocEvent is proposed to incrementally detect and track events using a 4-stage procedure: LocEvent receives the keywords extracted by KeyPicker, identifies local keywords, spatially clusters them, and finally scores the generated clusters. For each detected event, a set of descriptive keywords, a location, and a time interval are estimated at a fine-grained resolution. Beyond the sparsity of geo-tagged messages, people sometimes post about events far away from the event's actual location. Such spatial problems are handled by novel spatial regularization techniques, namely graph- and gazetteer-based regularization. To ensure scalability, we utilize a hierarchical spatial index in addition to a multi-stage filtering procedure that gradually suppresses noisy words and admits only event-related ones to the complex spatial computations. For recommendation, we propose an event recommender system built upon model-based collaborative filtering. Our model is able to suggest events to users, taking into account a number of contextual features, including the social links between users, the topical similarities of events, and the spatio-temporal proximity between users and events. To realize this model, we employ and adapt matrix factorization, which allows for uncovering latent user-event patterns. Our proposed features help direct the learning process towards recommendations that better suit users' tastes, in particular when new users have very sparse (or even no) event attendance history. To evaluate the effectiveness and efficiency of our proposed approaches, extensive comparative experiments were conducted using datasets collected from social media channels. Our analysis of the experimental results reveals the advantages of our frameworks over existing methods in terms of the relevancy and precision of the obtained results.
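    As a bare-bones sketch of the matrix-factorization backbone of the recommender (omitting the thesis's contextual features such as social links, topical similarity, and spatio-temporal proximity), the fragment below learns latent user and event factors from toy attendance data via stochastic gradient descent.

    import numpy as np

    rng = np.random.default_rng(0)
    n_users, n_events, k = 5, 4, 3
    # Toy (user, event, attended) observations; 1.0 = attended, 0.0 = skipped.
    data = [(0, 1, 1.0), (0, 2, 0.0), (1, 1, 1.0), (2, 3, 1.0), (3, 0, 1.0)]

    U = rng.normal(scale=0.1, size=(n_users, k))   # latent user factors
    V = rng.normal(scale=0.1, size=(n_events, k))  # latent event factors

    lr, reg = 0.05, 0.01
    for _ in range(200):  # SGD epochs over the observed triples
        for u, e, y in data:
            err = y - U[u] @ V[e]
            U[u] += lr * (err * V[e] - reg * U[u])
            V[e] += lr * (err * U[u] - reg * V[e])

    # Predicted affinity of a user for an unseen event.
    print("user 0, event 3:", round(float(U[0] @ V[3]), 3))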