26 research outputs found

    A novel Fireworks Algorithm with wind inertia dynamics and its application to traffic forecasting

    Fireworks Algorithm (FWA) is a recently contributed heuristic optimization method that has shown promising performance in applications stemming from different domains. Improvements to the original algorithm have been designed and tested in the related literature. Nonetheless, in most of these previous works FWA has been tested on standard test functions, so its performance when applied to real application cases has scarcely been assessed. In this manuscript a mechanism for accelerating the convergence of this meta-heuristic is proposed, based on wind inertia dynamics (WID) observed among fireworks in practice. The resulting enhanced algorithm is described algorithmically and evaluated in terms of convergence speed by means of test functions. As an additional novel contribution of this work, FWA and FWA-WID are used in a practical application where these heuristics serve as wrappers for optimizing the parameters of a short-term road traffic predictive model. The exhaustive performance analysis of FWA and FWA-WID in this practical setup reveals that the relatively high computational complexity of this solver with respect to other heuristics makes it critical to speed up its convergence (especially in cases with a costly fitness evaluation such as the one tackled in this work), an observation that buttresses the utility of the proposed modifications to the naive FWA solver.
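
    As a minimal illustration of the idea, the sketch below (Python) shows a fireworks-style search loop in which each firework's sparks are biased by a drift term accumulated from its recent displacement, one plausible way to realise a wind-inertia effect. The operators, parameters and toy fitness function are assumptions for demonstration only, not the FWA-WID operators defined in the manuscript.

        # Hedged sketch: fireworks-style search with a wind-inertia-like drift term.
        # The drift update below is an illustrative assumption, not the paper's operator.
        import numpy as np

        def sphere(x):
            """Toy fitness function (minimization)."""
            return float(np.sum(x ** 2))

        def fwa_wid_sketch(fitness, dim=5, n_fireworks=5, n_sparks=10,
                           amplitude=1.0, inertia=0.5, iters=100, seed=0):
            rng = np.random.default_rng(seed)
            fireworks = rng.uniform(-5, 5, size=(n_fireworks, dim))
            drift = np.zeros((n_fireworks, dim))      # per-firework "wind" memory
            for _ in range(iters):
                candidates = [fireworks]
                for i, fw in enumerate(fireworks):
                    # Sparks scatter around the firework, biased by its recent drift.
                    sparks = fw + inertia * drift[i] + amplitude * rng.normal(size=(n_sparks, dim))
                    candidates.append(sparks)
                pool = np.vstack(candidates)
                scores = np.array([fitness(x) for x in pool])
                new_fireworks = pool[np.argsort(scores)[:n_fireworks]]   # keep the best locations
                drift = new_fireworks - fireworks                        # update wind memory
                fireworks = new_fireworks
            best = min(fireworks, key=fitness)
            return best, fitness(best)

        print(fwa_wid_sketch(sphere)[1])   # should approach 0 on the sphere function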

    Local News And Event Detection In Twitter

    Twitter, one of the most popular micro-blogging services, allows users to publish short messages, called tweets, on a wide variety of subjects such as news, events, stories, ideas, and opinions. The popularity of Twitter, to some extent, arises from its capability of letting users promptly and conveniently contribute tweets to convey diverse information. Specifically, with people discussing what is happening in the real world by posting tweets, Twitter captures invaluable information about real-world news and events, spanning a wide scale from large national or international stories like a presidential election to small local stories such as a local farmers market. Detecting and extracting small news and events for a local place is a challenging problem and is the focus of this thesis. In particular, we explore several directions to detect and extract local news and events using tweets in Twitter: a) how to identify locally influential people on Twitter as potential news seeders; b) how to recognize unusualness in tweet volume as a signal of potential local events; c) how to overcome the sparsity of local tweets to detect more, and smaller, ongoing local news and events. Additionally, we also try to uncover implicit correlations between location, time, and text in tweets by learning embeddings for them using a universal representation in the same semantic space. In the first part, we investigate how to measure the spatial influence of Twitter users by their interactions and thereby identify locally influential users, who we found are usually good news and event seeders in practice. To do this, we built a large-scale directed interaction graph of Twitter users. Such a graph allows us to exploit PageRank-based ranking procedures to select top locally influential people after incorporating geographical distance into the transition matrix used for the random walk. In the second part, we study how to recognize unusualness in tweet volume at a local place as a signal of potential ongoing local events. The intuition is that if there is suddenly an abnormal change in the number of tweets at a location (e.g., a significant increase), it may imply a potential local event. We therefore present DeLLe, a methodology for automatically Detecting Latest Local Events from geotagged tweet streams (i.e., tweets that contain GPS points). With the help of novel spatiotemporal tweet count prediction models, DeLLe first finds unusual locations which have aggregated an unexpected number of tweets in the latest time period and then calculates, for each such unusual location, a ranking score to identify the ones most likely to have ongoing local events, by addressing temporal burstiness, spatial business, and topical coherence. In the third part, we explore how to overcome the sparsity of local tweets when trying to discover more and smaller local news or events. Local tweets are those whose locations fall inside a local place. They are very sparse on Twitter, which hinders the detection of small local news or events that have only a handful of tweets. A system, called Firefly, is proposed to enhance the local live tweet stream by tracking the tweets of a large body of local people, and to further perform locality-aware, keyword-based clustering for event detection. The intuition is that local tweets are published by local people, and tracking their tweets naturally yields a source of local tweets.
However, in practice, only 20% of Twitter users provide information about where they come from. Thus, a social network-based geotagging procedure is subsequently proposed to estimate locations for Twitter users whose locations are missing. Finally, in order to discover correlations between location, time and text in geotagged tweets, e.g., "find which locations are mostly related to the given topics" and "find which locations are similar to a given location", we present LeGo, a methodology for Learning embeddings of Geotagged tweets with respect to entities such as locations, time units (hour-of-day and day-of-week) and textual words in tweets. The resulting compact vector representations of these entities make it easy to measure the relatedness between locations, time and words in tweets. LeGo comprises two working modes, cross-modal search (LeGo-CM) and location-similarity search (LeGo-LS), to answer these two types of queries respectively. In LeGo-CM, we first build a graph of entities extracted from tweets in which each edge carries the weight of co-occurrences between the two entities it connects. The embeddings of graph nodes are then learned in the same latent space under the guidance of approximating the stationary residing probabilities between nodes, which are computed using personalized random walk procedures. In comparison, LeGo-LS supplements edges between locations to capture their underlying spatial proximity and topical likeness, supporting location-similarity search queries.
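
As an illustration of the first part, the sketch below shows one plausible way to fold geographical distance into a PageRank transition matrix before running the usual power iteration. The exponential distance decay, the toy interaction graph and the coordinates are illustrative assumptions and are not taken from the thesis.

    # Hedged sketch of a PageRank-style ranking with geographical distance folded
    # into the transition matrix. The exponential decay is an illustrative choice.
    import numpy as np

    def geo_pagerank(adj, coords, decay=0.01, damping=0.85, iters=100):
        """adj[i, j] = 1 if user j interacts with user i; coords = one (x, y) pair per user."""
        n = adj.shape[0]
        # Pairwise Euclidean distance as a stand-in for geographic distance.
        dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        weights = adj * np.exp(-decay * dist)          # nearby interactions count more
        col_sums = weights.sum(axis=0)
        col_sums[col_sums == 0] = 1.0                  # guard against sink columns
        transition = weights / col_sums                # column-stochastic transition matrix
        rank = np.full(n, 1.0 / n)
        for _ in range(iters):
            rank = (1 - damping) / n + damping * transition @ rank
        return rank

    # Toy usage: four users ranked by distance-weighted interactions.
    adj = np.array([[0, 1, 1, 0],
                    [1, 0, 0, 1],
                    [1, 1, 0, 0],
                    [0, 0, 1, 0]], dtype=float)
    coords = np.array([[0, 0], [0, 1], [5, 5], [0.5, 0.5]], dtype=float)
    print(geo_pagerank(adj, coords))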

    Efficient Learning Machines

    Computer science

    Engineering Innovation (TRIZ based Computer Aided Innovation)

    This thesis describes the approach and results of research to create TRIZ-based computer aided innovation tools (AEGIS and Design for Wow). The research has mainly been based around two tools created during this work: AEGIS (Accelerated Evolutionary Graphics Interface System) and Design for Wow. Both of these tools are discussed in detail in this thesis, along with the test data, design methodology, test cases, and research. Design for Wow (http://www.designforwow.com) is an attempt to summarize successful inventions/designs from all over the world on a web portal with multiple capabilities. These designs/innovations are then linked to the TRIZ Principles in order to determine whether the innovative aspects of these successful innovations are fully covered by the forty TRIZ Principles. In Design for Wow, a framework is created and implemented through a review tool. The Design for Wow website includes this tool, which has been used by the researcher, the users of the site, and reviewers to analyse the uploaded data in terms of the strength of the TRIZ Principles linked to it. AEGIS (Accelerated Evolutionary Graphics Interface System) is a software tool developed in this research aimed at helping graphic designers create innovative graphic designs. Again it uses the forty TRIZ Principles as a set of guiding rules in the software. AEGIS creates graphic design prototypes according to the user input and uses the TRIZ Principles framework as a guide to generate innovative graphic design samples. The AEGIS tool is based on a subset of the TRIZ Principles discussed in Chapter 3. In AEGIS, the TRIZ Principles are used to create innovative graphic design effects. The literature review on innovative graphic design (in Chapter 3) was analysed for links with the TRIZ Principles, and the DNA of AEGIS was built on the basis of this study. Results from various surveys/questionnaires were used to collect the innovative graphic design samples, to which TRIZ was then mapped (see section 3.2). The TRIZ effects were mapped to the basic graphic design elements, and the anatomy of graphic design letters was studied to analyse the TRIZ effects in the collected samples. This study was used to build the TRIZ-based AEGIS tool. Hence, the AEGIS tool applies innovative effects using TRIZ to basic graphic design elements (as described in section 3.3). The working of AEGIS is based on Genetic Algorithms coded specifically to implement TRIZ Principles specialized for graphic design; Chapter 4 discusses the process followed to apply TRIZ Principles to graphic design and to code them using Genetic Algorithms, resulting in the AEGIS tool. Similarly, in Design for Wow, the uploaded content has been analysed for its links with the TRIZ Principles (see section 3.1 for the TRIZ Principles). The tool created in Design for Wow is based on the framework of analysing the TRIZ links in the uploaded content. The 'Wow' concept discussed in sections 5.1 and 5.2 is the basis of the Design for Wow website, whereby users upload content they classify as 'Wow'. This content is then further analysed for the 'Wow factor' and mapped to the TRIZ Principles following the TRIZ tagging methodology framed in section 5.5. From the results of the research, it appears that the TRIZ Principles are a comprehensive set of basic innovation building blocks.
Some surveys suggest that, amongst other tools, the TRIZ Principles were the first choice and the most used. They thus have the potential to be used in other innovation domains, to help in their analysis, understanding, and potential development. Great Western Research and Systematic Innovation Ltd U
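
For illustration, the sketch below gives a minimal genetic-algorithm loop of the general kind described for AEGIS, with TRIZ Principles represented abstractly as a pool of mutation operators. The genome, the operators and the fitness function are hypothetical placeholders, not the actual AEGIS encoding of graphic design elements.

    # Hedged sketch: genetic algorithm where mutation operators stand in for TRIZ Principles.
    # All names and objectives below are illustrative assumptions.
    import random

    # Hypothetical stand-ins for TRIZ-inspired design transformations.
    TRIZ_OPERATORS = [
        lambda g: [x * 1.1 for x in g],                        # "scaling"-style change
        lambda g: list(reversed(g)),                           # "inversion"-style change
        lambda g: [x + random.uniform(-0.5, 0.5) for x in g],  # local variation
    ]

    def fitness(genome):
        """Toy objective: prefer genomes whose values are close to 1.0."""
        return -sum((x - 1.0) ** 2 for x in genome)

    def evolve(pop_size=20, genome_len=6, generations=50):
        pop = [[random.uniform(0, 2) for _ in range(genome_len)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            parents = pop[: pop_size // 2]
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, genome_len)
                child = a[:cut] + b[cut:]               # one-point crossover
                op = random.choice(TRIZ_OPERATORS)      # mutate via a "principle"
                children.append(op(child))
            pop = parents + children
        return max(pop, key=fitness)

    print(evolve())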

    Scanning the Science-Society Horizon

    Science communication approaches have evolved over time, gradually placing more importance on understanding the context of the communication and the audience. The increase in people participating in social media on the Internet offers a new resource for monitoring what people are discussing. People self-publish their views on social media, which provides a rich source of everyday, every-person thinking. This introduces the possibility of using passive monitoring of this public discussion to find information useful to science communicators, to allow them to better target their communications about different topics. This research study is focussed on understanding what open source intelligence, in the form of public tweets on Twitter, reveals about the contexts in which the word 'science' is used by the English-speaking public. By conducting a series of studies based on simpler questions, I gradually build up a view of who is contributing on Twitter, how often, and what topics are being discussed that include the keyword 'science'. An open-source data-gathering tool for Twitter was developed and used to collect a dataset of tweets containing the keyword 'science' during 2011. After collection was completed, the data was prepared for analysis by removing unwanted tweets. The size of the dataset (12.2 million tweets by 3.6 million users (authors)) required the use of mainly quantitative approaches, even though this represents only a very small proportion, about 0.02%, of the total tweets per day on Twitter. Fourier analysis was used to create a model of the underlying temporal pattern of tweets per day and revealed a weekly pattern. The number of users per day followed a similar pattern, and most of these users did not use the word 'science' often on Twitter. An investigation of types of tweets suggests that people using the word 'science' were engaged in more sharing of both links and other people's tweets than is usual on Twitter. Consideration of word frequency and bigrams in the text of the tweets found that, while word frequencies were not particularly effective for understanding such a large dataset, bigrams were able to give insight into the contexts in which 'science' was being used in up to 19.19% of the tweets. The final study used Latent Dirichlet Allocation (LDA) topic modelling to identify the contexts in which 'science' was being used and gave a much richer view of the whole corpus than the bigram analysis. Although the thesis has focused on the single keyword 'science', the techniques developed should be applicable to other keywords and so be able to provide science communicators with a near real-time source of information about what issues the public is concerned about, what they are saying about those issues, and how that is changing over time.
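
As a small illustration of the Fourier step described above, the sketch below recovers a weekly periodicity from a daily tweet-count series. The synthetic series is an assumption standing in for the real 2011 dataset; only the analysis pattern is what the thesis describes.

    # Hedged sketch: detect a weekly period in daily tweet counts via the FFT.
    # The synthetic counts are an illustrative assumption, not the collected data.
    import numpy as np

    days = 364
    t = np.arange(days)
    rng = np.random.default_rng(1)
    counts = 33000 + 4000 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1500, days)

    spectrum = np.fft.rfft(counts - counts.mean())
    freqs = np.fft.rfftfreq(days, d=1.0)               # cycles per day
    idx = np.argmax(np.abs(spectrum[1:])) + 1          # skip the zero-frequency bin
    print(f"dominant period: {1 / freqs[idx]:.1f} days")  # about 7 for a weekly pattern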

    Evaluating Privacy-Friendly Mobility Analytics on Aggregate Location Data

    Information about people's movements and the locations they visit enables a wide range of mobility analytics applications, e.g., real-time traffic maps or urban planning, aiming to improve quality of life in modern smart cities. Alas, the availability of users' fine-grained location data reveals sensitive information about them, such as home and work places, lifestyle, and political or religious inclinations. In an attempt to mitigate this, aggregation is often employed as a strategy that allows analytics and machine learning tasks while protecting the privacy of individual users' location traces. In this thesis, we perform an end-to-end evaluation of crowdsourced privacy-friendly location aggregation, aiming to understand its usefulness for analytics as well as its privacy implications for the users who contribute their data. First, we present a time-series methodology which, along with privacy-friendly crowdsourcing of aggregate locations, supports mobility analytics such as traffic forecasting and mobility anomaly detection. Next, we design quantification frameworks and methodologies that let us reason about the privacy loss stemming from the collection or release of aggregate location information against knowledgeable adversaries that aim to infer users' profiles, locations, or membership. We then utilize these frameworks to evaluate defenses ranging from generalization and hiding to differential privacy, which can be employed to prevent inferences on aggregate location statistics, in terms of both privacy protection and utility loss for analytics tasks. Our results highlight that, while location aggregation is useful for mobility analytics, it is a weak privacy protection mechanism in this setting, and that additional defenses can only protect privacy if some statistical utility is sacrificed. Overall, the tools presented in this thesis can be used by providers who wish to assess the quality of privacy protection before data release, and its results have several implications for current location data practices and applications.
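
As an illustration of one of the defenses evaluated, the sketch below perturbs aggregate location counts with Laplace noise in the style of differential privacy. The sensitivity assumption (each user contributes at most one visit per location and time slot) and the toy counts are illustrative, not taken from the thesis.

    # Hedged sketch: Laplace perturbation of aggregate location counts.
    # Sensitivity of 1 assumes one contribution per user per cell (illustrative).
    import numpy as np

    def laplace_perturb(counts, epsilon=1.0, sensitivity=1.0, seed=0):
        """counts: array of aggregate visits per (location, time-slot) cell."""
        rng = np.random.default_rng(seed)
        noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=counts.shape)
        return np.clip(np.round(counts + noise), 0, None)   # round and clamp to non-negative

    # Toy aggregate: 3 locations x 4 time slots.
    raw = np.array([[12, 30, 25, 8],
                    [ 4,  6,  9, 3],
                    [40, 55, 60, 47]], dtype=float)
    print(laplace_perturb(raw, epsilon=0.5))
    # Smaller epsilon means more noise: stronger privacy but lower utility
    # for analytics tasks such as traffic forecasting.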

    Term-driven E-Commerce

    This thesis addresses the textual dimension of e-commerce. Its fundamental hypothesis is that information and transactions in electronic commerce are bound to text. Wherever products and services are offered, sought, perceived, and evaluated, natural-language expressions are used. From this it follows, on the one hand, that it is important to capture the variance of textual descriptions in e-commerce; on the other hand, the extensive textual resources generated in e-commerce interactions can be drawn upon to gain a better understanding of natural language.

    Understanding Quantum Technologies 2022

    Understanding Quantum Technologies 2022 is a creative-commons ebook that provides a unique 360-degree overview of quantum technologies, from science and technology to geopolitical and societal issues. It covers quantum physics history, quantum physics 101, gate-based quantum computing, quantum computing engineering (including quantum error correction and quantum computing energetics), quantum computing hardware (all qubit types, including quantum annealing and quantum simulation paradigms, history, science, research, implementation and vendors), quantum enabling technologies (cryogenics, control electronics, photonics, components fabs, raw materials), quantum computing algorithms, software development tools and use cases, unconventional computing (potential alternatives to quantum and classical computing), quantum telecommunications and cryptography, quantum sensing, quantum technologies around the world, the societal impact of quantum technologies, and even quantum fake sciences. The main audiences are computer science engineers, developers and IT specialists, as well as quantum scientists and students who want to acquire a global view of how quantum technologies work, particularly quantum computing. This version is an extensive update to the 2021 edition published in October 2021. Comment: 1132 pages, 920 figures, Letter format.