32 research outputs found

    Predicting Updates on Social Media Platforms

    Get PDF
    Social media platforms such as Facebook, Twitter and YouTube have been highly popular for years, not only among end users but also among companies. Companies use these platforms in particular for marketing purposes, so that traditional marketing instruments are increasingly pushed into the background. Besides companies, political parties, universities, research institutions and many other organizations also use the possibilities of social media for their purposes. The strong interest of end users and institutions in social media makes it attractive for many applications in industry and research. Conducting market observation and research on social media requires data, which is usually collected and analyzed with dedicated tools, subject to the restrictions of the platforms' technical interfaces. For selected research questions, aspects such as the volume and timeliness of the data are of particular importance. With the means available today, updates can only be retrieved from social media platforms via polling, and statistical models are frequently used to compute the polling intervals. The goal of this thesis is to determine suitable points in time for retrieving given feeds on social media platforms, so that new posts can be fetched and processed promptly. Computing suitable update times serves to optimize resource usage and to reduce processing delay; many applications can benefit from this. This thesis makes several contributions towards this goal. First, work on social media and related data sources in the World Wide Web that addresses the estimation of change rates or the prediction of updates was transferred to the problem at hand.
Furthermore, the suitability of the prediction algorithms from existing approaches was determined through quantitative measurements. For this purpose, the approaches were applied to real data from Facebook, Twitter and YouTube and evaluated with suitable metrics. The findings show that the quality of the predictions depends substantially on the choice of algorithm. A research gap was identified regarding the selection of suitable algorithms, which so far has typically been made only manually or by static rules. A novel prediction approach forms the core of the thesis: it draws on the individual update patterns of existing social media feeds to select suitable prediction algorithms, with appropriate parametrization, for new feeds. According to the evaluation results, this achieves higher prediction quality than the state of the art while reducing the selection effort
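The thesis's statistical models are not given in the abstract; a minimal illustrative sketch of the underlying idea — scheduling the next poll of a feed from the moving average of its observed inter-post gaps — might look like this (function name and clamp bounds are hypothetical):

```python
from statistics import mean

def next_poll_delay(post_timestamps, min_delay=60.0, max_delay=3600.0):
    """Estimate the delay (seconds) until the next poll of a feed from
    its observed inter-post gaps (a simple moving-average heuristic)."""
    if len(post_timestamps) < 2:
        return max_delay  # no history yet: poll conservatively
    gaps = [b - a for a, b in zip(post_timestamps, post_timestamps[1:])]
    # Clamp the mean gap into the allowed polling window.
    return min(max(mean(gaps), min_delay), max_delay)

# A feed that posts roughly every ten minutes (timestamps in seconds):
history = [0, 590, 1210, 1800, 2405]
print(next_poll_delay(history))  # 601.25
```

A real predictor would also adapt to daily and weekly patterns per feed, which is exactly the per-feed selection problem the thesis addresses.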

    Timeliness Evaluation of Intermittent Mobile Connectivity over Pub/Sub Systems

    Get PDF
    Systems deployed in mobile environments are typically characterized by intermittent connectivity and asynchronous sending/reception of data. To create effective mobile systems for such environments, it is essential to guarantee acceptable levels of timeliness between sending and receiving mobile users. In order to provide QoS guarantees in different application scenarios and contexts, it is necessary to model the system performance by incorporating the intermittent connectivity. Queueing Network Models (QNMs) offer a simple modeling environment, which can be used to represent various application scenarios, and provide accurate analytical solutions for performance metrics, such as system response time. In this paper, we provide an analytical solution regarding the end-to-end response time between users sending and receiving data by modeling the intermittent connectivity of mobile users with QNMs. We utilize the publish/subscribe (pub/sub) middleware as the underlying communication infrastructure for the mobile users. To represent the user's connections/disconnections, we model and solve analytically an ON/OFF queueing system by applying a mean value approach. Finally, we validate our model using simulations with real-world workload traces. The deviations between the performance results foreseen by the analytical model and the ones provided by the simulator are shown to be less than 5% for a variety of scenarios
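The paper solves its ON/OFF model analytically; the waiting effect it captures can be sketched with a toy Monte Carlo simulation (all names and the fixed-cycle assumption are illustrative, not the paper's model):

```python
import random

def mean_onoff_wait(t_on, t_off, n_msgs=200_000, seed=42):
    """Toy ON/OFF receiver: a message arriving while the user is
    disconnected (OFF) waits until the next reconnection; messages
    arriving during ON periods are delivered immediately."""
    random.seed(seed)
    cycle = t_on + t_off
    total = 0.0
    for _ in range(n_msgs):
        t = random.uniform(0.0, cycle)  # arrival point within a cycle
        if t > t_on:                    # arrived during the OFF window
            total += cycle - t          # wait until the user reconnects
    return total / n_msgs

# Closed form for this toy model: (t_off / cycle) * (t_off / 2)
print(mean_onoff_wait(t_on=1.0, t_off=1.0))  # ≈ 0.25
```

The mean-value analysis in the paper derives such expectations in closed form and composes them with queueing delays into an end-to-end response time.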

    Load Management for Publish/Subscribe Message Oriented Middleware

    Get PDF
    To provide time-critical early warnings in a Tsunami Warning System (TWS), the time delay for both the sensor data exchange process (upstream) and the warning message dissemination process (downstream) should be minimal, maximising the time available for accurately analysing the situation and giving people in the affected region more time to react to the warnings. Publish/Subscribe Message-oriented Middleware (PSMOM), combined with a novel use of a federated broker overlay, can be deployed in a TWS to support both time-critical and resilient communication. PSMOM can better manage message bursts caused by a sudden increase in sensor data exchange frequency or by additional sensors coming online, and can better manage the decrease in available system resources (bottlenecks) caused by a disruption in the underlying network infrastructure (the limited resource case). Otherwise, these bursts and bottlenecks can cause some brokers to become overloaded, which may in turn degrade overall system performance and delay decision-making. Existing PSMOM load management solutions have two key limitations when applied to a TWS. First, existing work does not consider the message delay requirements for the redistribution and offloading phases of load management. Here, some data is only useful or valid for a short time-span (from tens of seconds to tens of minutes); hence, it needs to be exchanged within this maximum allowed end-to-end transmission delay. Time-critical subscribers need to be de-prioritised from being offloaded, as the offloading process takes some time to complete, introducing unexpected delays to message exchange. Second, existing solutions assume that there are surplus system resources for offloading, i.e., less loaded brokers can accept loads from overloaded brokers. However, in a TWS, the underlying network infrastructure may be disrupted, which in turn reduces the system capacity.
It always takes time to recover from the limited resource situation, and during that time brokers may not have enough system resources to accept loads from overloaded brokers, which may result in total failure of the overloaded brokers. A novel load management framework called ePEER is proposed that extends an existing messaging system, Publish/Subscribe Efficient Event Routing (PEER), with the following main contributions. First, for the surplus resource case, the message delay requirements of different subscription services are considered in the load analysis process when offloading load to different brokers. Second, for the limited resource case, a feedback-driven congestion control mechanism can be used when the underlay network infrastructure is damaged, reducing the available bandwidth of the PSMOM. This mechanism limits the publication rate of messages with less value, to better maintain the quality of experience (QoE) of subscribers for the more important messages. ePEER is validated with emulation-based experiments. The results show that ePEER outperforms the state-of-the-art load management solution used by PEER: by preventing unnecessary delays being introduced to time-critical services, and by ensuring important messages can be exchanged more efficiently to improve the QoE of subscribers
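The first contribution — keeping time-critical subscribers out of the offloading set because migration itself adds delay — can be illustrated with a small sketch (data shapes, field names, and the greedy selection are hypothetical, not ePEER's actual algorithm):

```python
def select_offload_candidates(subscriptions, load_to_shed, migration_delay):
    """Pick subscriptions to offload from an overloaded broker, skipping
    time-critical ones whose delay budget could not absorb the pause
    introduced by migration."""
    # Prefer shedding the least delay-sensitive subscriptions first.
    candidates = sorted(subscriptions, key=lambda s: -s["delay_budget"])
    chosen, shed = [], 0.0
    for sub in candidates:
        if shed >= load_to_shed:
            break
        if sub["delay_budget"] <= migration_delay:
            continue  # offloading would violate its timeliness requirement
        chosen.append(sub["id"])
        shed += sub["load"]
    return chosen

subs = [
    {"id": "tide-gauge", "load": 30, "delay_budget": 5},     # time-critical
    {"id": "status-log", "load": 40, "delay_budget": 600},
    {"id": "archive",    "load": 50, "delay_budget": 3600},
]
print(select_offload_candidates(subs, load_to_shed=60, migration_delay=10))
```

The time-critical "tide-gauge" subscription stays on its current broker even though shedding it would free the most urgent load.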

    Management of Temporally and Spatially Correlated Failures in Federated Message Oriented Middleware for Resilient and QoS-Aware Messaging Services.

    Get PDF
    Message Oriented Middleware (MOM) is widely recognized as a promising solution for communication between heterogeneous distributed systems. Because the resilience and quality of service of the messaging substrate play a critical role in overall system performance, the evolution of these distributed systems has introduced new requirements for MOM, such as inter-domain federation, resilience and QoS support. This thesis focuses on a management framework, called RQMOM, that enhances the resilience and QoS-awareness of MOM for federated enterprise systems. A common hierarchical MOM architecture for the federated messaging service is assumed. Each bottom-level local domain comprises a cluster of neighbouring brokers that carry a local messaging service, and inter-domain messages are routed through the gateway brokers of the different local domains over the top-level federated overlay. Challenges and solutions for both intra- and inter-domain messaging are researched. In local domain messaging, the common cause of performance degradation is the fluctuation of workloads, which might produce a surge in the total workload on a broker and overload its processing capacity, since a local domain usually sits within a well-connected network. Against performance degradation, a novel proactive risk-aware workload allocation, which exploits the co-variation between workloads, is designed and evaluated in combination with existing reactive load balancing. In federated inter-domain messaging, an overlay network of federated gateway brokers distributed across separate geographical locations, on top of a heterogeneous physical network, is considered. Geographically correlated failures threaten to cause major interruptions and damage to such systems. To mitigate this rarely addressed challenge, a novel geographical-location-aware route selection algorithm to support uninterrupted messaging is introduced.
It is used with existing overlay routing mechanisms to maintain routes and hence provide messaging that is more resilient against geographically correlated failures
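The thesis does not reproduce its route selection algorithm in the abstract; one plausible sketch of the idea — preferring the overlay route whose closest broker stays farthest from a correlated-failure region — is the following (coordinates, names, and the max-min criterion are illustrative assumptions):

```python
import math

def geo_diverse_route(routes, hazard_center, hazard_radius):
    """Among candidate overlay routes (lists of broker coordinates),
    choose the one whose closest broker stays farthest from a
    geographically correlated failure region."""
    def min_clearance(route):
        return min(math.dist(node, hazard_center) for node in route)
    # Discard routes that pass through the hazard zone, when possible.
    safe = [r for r in routes if min_clearance(r) > hazard_radius] or routes
    return max(safe, key=min_clearance)

routes = [
    [(0, 0), (1, 1), (2, 0)],          # passes close to the hazard
    [(0, 0), (0, 3), (2, 3), (2, 0)],  # detours around it
]
best = geo_diverse_route(routes, hazard_center=(1, 0.5), hazard_radius=1.0)
print(best)
```

A real algorithm would also weigh latency and load against geographic diversity rather than optimizing clearance alone.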

    RSS v2.0: Spamming, User Experience and Formalization

    Get PDF
    RSS, once the most popular publish/subscribe system, is believed to have come to an end for reasons as yet unexplored. The aim of this thesis is to examine one such reason: spamming. The scope of this thesis is limited to spamming related to RSS v2.0. The study discusses RSS as a publish/subscribe system and investigates the possible reasons for the decline in the use of such a system, as well as possible solutions to address RSS spamming. The thesis introduces RSS (and its dependence on feed readers) and examines its relationship with spamming. In addition, the thesis investigates possible socio-technical influences on spamming in RSS. The author presents the idea of applying formalization (formal specification techniques) to open standards, RSS v2.0 in particular. Formal specifications are concise, consistent, unambiguous and highly reusable in many cases. The merging of formal specification methods and open standards allows for i) a more concrete standard design, ii) an improved understanding of the environment under design, iii) an enforced level of precision in the specification, and iv) extended property checking/verification capabilities for software engineers. The author supports and proposes the use of formalization in RSS. Based on the inferences gathered from the user experiment conducted during the course of this study, an analysis of the downfall of RSS is presented. The user experiment also opens up directions for future work on the evolution of RSS v3.0, which could be supported by formalization. The thesis concludes that RSS is on the verge of discontinuation due to the adverse effects of spamming and the lack of further development, which is evident from the limited amount of available research literature. RSS feeds are a telling example of what happens to software that fails to evolve with time

    Leveraging CDR datasets for Context-Rich Performance Modeling of Large-Scale Mobile Pub/Sub Systems

    Get PDF
    Large-scale mobile environments are characterized by, among others, a large number of mobile users, intermittent connectivity and non-homogeneous arrival rate of data to the users, depending on the region's context. Multiple application scenarios in major cities need to address the above situation for the creation of robust mobile systems. Towards this, it is fundamental to enable system designers to tune a communication infrastructure using various parameters depending on the specific context. In this paper, we take a first step towards enabling an application platform for large-scale information management relying on mobile social crowd-sourcing. To inform the stakeholders of expected loads and costs, we model a large-scale mobile pub/sub system as a queueing network. We introduce additional timing constraints such as i) mobile user's intermittent connectivity period; and ii) data validity lifetime period (e.g. that of sensor data). Using our MobileJINQS simulator, we parameterize our model with realistic input loads derived from the D4D dataset (CDR) and varied lifetime periods in order to analyze the effect on response time. This work provides system designers with coarse grain design time information when setting realistic loads and time constraints

    Engineering an Open Web Syndication Interchange with Discovery and Recommender Capabilities

    Get PDF
    Web syndication has become a popular means of delivering relevant information to people online, but the complexity of standards, algorithms and applications poses considerable challenges to engineers. This paper describes the design and development of a novel Web-based syndication intermediary called InterSynd, together with a simple Web client as a proof of concept. We developed format-neutral middleware that sits between content sources and the user. Additional objectives were to add feed discovery and recommendation components to the intermediary: a search-based feed discovery module helps users find relevant feed sources, and implicit collaborative recommendations of new feeds are also made to the user. The syndication software uses open-standard XML technologies and free open-source libraries. Extensibility and re-configurability were explicit goals. The experience shows that a modular architecture can combine open-source modules to build state-of-the-art syndication middleware and applications, and software metrics indicate the high degree of modularity retained

    High-performance and fault-tolerant techniques for massive data distribution in online communities

    Get PDF
    The amount of digital information produced and consumed is increasing each day. This rapid growth is driven by advances in computing power and hardware technologies, and by the popularization of user-generated content networks. New hardware is able to process larger quantities of data, which permits finer results to be obtained, and as a consequence more data is generated. In this respect, scientific applications have evolved to benefit from the new hardware capabilities. This type of application is characterized by requiring large amounts of information as input and by generating significant amounts of intermediate data, resulting in large files. Since this increase appears not only in volume but also in file size, we need methods that provide efficient and reliable data access. Producing such a method is a challenging task due to the number of aspects involved. However, we can leverage the knowledge found in social networks to improve the distribution process. In this respect, the advent of Web 2.0 has popularized the concept of the social network, which provides valuable knowledge about the relationships among users, and between users and the data. However, extracting that knowledge and defining ways to actively use it to increase the performance of a system remains an open research direction. Additionally, we must take other existing limitations into account. In particular, the interconnection between different elements of the system is one of the key aspects. The availability of new technologies such as mass-produced multicore chips, large storage media, better sensors, etc. has contributed to the increase in data being produced. However, the underlying interconnection technologies have not improved at the same speed.
This leads to a situation where vast amounts of data can be produced and need to be consumed by a large number of geographically distributed users, but the interconnection between the two ends does not match the required needs. In this thesis, we address the problem of efficient and reliable data distribution in geographically distributed systems. In this respect, we focus on providing a solution that 1) optimizes the use of existing resources, 2) does not require changes in the underlying interconnection, and 3) provides fault-tolerance capabilities. To achieve these objectives, we define a generic data distribution architecture composed of three main components: a community detection module, a transfer scheduling module, and a distribution controller. The community detection module leverages the information found in the social network formed by the users requesting files and produces a set of virtual communities grouping entities with similar interests. The transfer scheduling module produces a plan to efficiently distribute all requested files, improving resource utilization. For this purpose, we model the distribution problem using linear programming and offer a method that permits solving the problem in a distributed fashion. Finally, the distribution controller manages the distribution process using the aforementioned schedule, controls the available server infrastructure, and launches new on-demand resources when necessary
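The community detection module is only named in the abstract; a minimal stand-in for the idea — grouping users into virtual communities when their requested file sets overlap enough — could look like this (the threshold rule and all names are illustrative, not the thesis's algorithm):

```python
def detect_communities(requests, min_overlap=2):
    """Group users into virtual communities when they share at least
    `min_overlap` requested files with some existing member."""
    communities = []
    for user in requests:
        placed = False
        for com in communities:
            # Join a community if the user shares enough files with a member.
            if any(len(requests[user] & requests[m]) >= min_overlap
                   for m in com):
                com.append(user)
                placed = True
                break
        if not placed:
            communities.append([user])
    return communities

requests = {
    "alice": {"f1", "f2", "f3"},
    "bob":   {"f1", "f2"},
    "carol": {"f7", "f8"},
    "dave":  {"f8", "f7", "f9"},
}
print(detect_communities(requests))  # [['alice', 'bob'], ['carol', 'dave']]
```

The scheduling module would then plan transfers per community, so that popular files are fetched once per community and shared locally rather than re-downloaded from the origin servers.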