10,874 research outputs found

    Forecasting of commercial sales with large scale Gaussian Processes

    This paper argues that applications of Gaussian Processes in the fast-moving consumer goods industry have not been discussed enough in the field. Yet the technique can be important: for example, it can provide automatic feature relevance determination, and the posterior mean can unlock insights into the data. Significant challenges are the large size and high dimensionality of commercial data at a point of sale. The study reviews approaches to Gaussian Process modeling for large data sets, evaluates their performance on commercial sales, and shows the value of this type of model as a decision-making tool for management. Comment: 10 pages, 5 figures
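As a rough illustration of the two ideas named in the abstract above (automatic feature relevance determination and the posterior mean), the following is a minimal NumPy sketch, not the paper's method. The function names, the toy data, the fixed length-scales, and the noise level are all assumptions made for this example; in practice the length-scales would be learned from data, and large-scale approximations of the kind the paper reviews would replace the exact solve.

```python
import numpy as np

def ard_rbf_kernel(A, B, length_scales, variance=1.0):
    """Squared-exponential kernel with one length-scale per input dimension (ARD)."""
    A_scaled = A / length_scales
    B_scaled = B / length_scales
    sq_dist = (
        np.sum(A_scaled**2, axis=1)[:, None]
        + np.sum(B_scaled**2, axis=1)[None, :]
        - 2.0 * A_scaled @ B_scaled.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0))

def gp_posterior_mean(X_train, y_train, X_test, length_scales, noise=1e-2):
    """Exact GP posterior mean: K_* (K + noise * I)^{-1} y, via a Cholesky solve."""
    K = ard_rbf_kernel(X_train, X_train, length_scales)
    K += noise * np.eye(len(X_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_star = ard_rbf_kernel(X_test, X_train, length_scales)
    return K_star @ alpha

# Synthetic toy data (an assumption, not sales data): only feature 1 drives y.
x1 = np.linspace(-3.0, 3.0, 50)
x2 = 3.0 * np.cos(5.0 * x1)  # an arbitrary, irrelevant second feature
X = np.column_stack([x1, x2])
y = np.sin(x1)

# Under ARD, a very large length-scale marks a feature as irrelevant.
length_scales = np.array([1.0, 100.0])

# Two test points that differ only in the irrelevant feature.
X_new = np.array([[1.0, -2.0], [1.0, 2.0]])
mean = gp_posterior_mean(X, y, X_new, length_scales)
print(mean)
```

Because the second length-scale is large, the two predictions should be nearly identical and close to sin(1), which is how a learned ARD length-scale would signal that a feature carries little information.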

    State-of-the-art in data stream mining

    Features for Detecting Aggression in Social Media: An Exploratory Study

    Cyberbullying and cyberaggression are serious and widespread issues increasingly affecting Internet users. With the “help” of the spread of social media networks, bullying, once limited to particular places or times of day, can now occur anytime and anywhere. Cyberaggression refers to aggressive online behaviour intended to cause harm to another person, involving rude, insulting, offensive, teasing or demoralising comments through online social media. Considering the gravity of the consequences that cyberaggression has on its victims and its rapid spread amongst Internet users (especially kids and teens), there is a pressing need for research aimed at understanding how cyberbullying occurs, in order to prevent it from escalating. Given the massive information overload on the Web, it is crucial to develop intelligent techniques to automatically detect harmful content, which would allow large-scale social media monitoring and early detection of undesired situations. Considering the challenges posed by the characteristics of social media content and the cyberaggression task, this paper focuses on the detection of aggressive content in the context of multiple social media sites by exploring diverse types of features. Experimental evaluation conducted on two real-world social media datasets showed the difficulty of the task, confirming the limitations of traditionally used features. Sociedad Argentina de Informática e Investigación Operativa

    RT-MOVICAB-IDS: Addressing real-time intrusion detection

    This study presents a novel Hybrid Intelligent Intrusion Detection System (IDS) known as RT-MOVICAB-IDS that incorporates temporal control. One of its main goals is to facilitate real-time intrusion detection, as accurate and swift responses are crucial in this field, especially if automatic abortion mechanisms are running. The formulation of this hybrid IDS combines Artificial Neural Networks (ANN) and Case-Based Reasoning (CBR) within a Multi-Agent System (MAS) to detect intrusions in dynamic computer networks. Temporal restrictions are imposed on this IDS in order to perform real/execution time processing and assure system response predictability. Therefore, a dynamic real-time multi-agent architecture for IDS is proposed in this study, allowing the addition of predictable agents (both reactive and deliberative). In particular, two of the deliberative agents deployed in this system incorporate temporal-bounded CBR. This upgraded CBR is based on an anytime approximation, which allows the adaptation of this Artificial Intelligence paradigm to real-time requirements. Experimental results using real data sets are presented which validate the performance of this novel hybrid IDS. Ministerio de Economía y Competitividad (TIN2010-21272-C02-01, TIN2009-13839-C03-01), Ministerio de Ciencia e Innovación (CIT-020000-2008-2, CIT-020000-2009-12)
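The "anytime approximation" mentioned in the abstract above can be pictured with a small sketch: a case-retrieval loop that always holds a usable answer and improves it for as long as its budget lasts. This is only an illustrative assumption, not the RT-MOVICAB-IDS implementation. The function names, the toy case base, the labels, and the use of a comparison-count budget in place of a wall-clock deadline are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# A case pairs a feature vector with a stored solution label (hypothetical format).
Case = Tuple[Sequence[float], str]

def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def anytime_retrieve(
    query: Sequence[float], case_base: List[Case], max_steps: int
) -> Tuple[Optional[Case], int]:
    """Scan the case base, stopping after max_steps comparisons.

    Returns the best case found so far and the number of comparisons used.
    The step budget stands in for a real-time deadline (an assumption).
    """
    best_case: Optional[Case] = None
    best_dist = float("inf")
    steps_used = 0
    for case in case_base:
        if steps_used >= max_steps:
            break  # budget exhausted: answer with what we have
        dist = squared_distance(query, case[0])
        if dist < best_dist:
            best_case, best_dist = case, dist
        steps_used += 1
    return best_case, steps_used

# Toy case base with made-up traffic features and labels.
case_base: List[Case] = [
    ([0.0, 0.0], "normal"),
    ([5.0, 5.0], "port scan"),
    ([1.0, 1.2], "normal"),
    ([4.8, 5.1], "port scan"),
]
query = [4.9, 5.0]

quick_answer, quick_steps = anytime_retrieve(query, case_base, max_steps=1)
full_answer, full_steps = anytime_retrieve(query, case_base, max_steps=len(case_base))
print(quick_answer, quick_steps)
print(full_answer, full_steps)
```

With a budget of one comparison the loop returns the first case it saw, while the full budget finds a closer match. The trade-off is bounded response time against answer quality.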

    A Data-driven, High-performance and Intelligent CyberInfrastructure to Advance Spatial Sciences

    In the field of Geographic Information Science (GIScience), we have witnessed the unprecedented data deluge brought about by the rapid advancement of high-resolution data observing technologies. For example, with the advancement of Earth Observation (EO) technologies, a massive amount of EO data including remote sensing data and other sensor observation data about earthquake, climate, ocean, hydrology, volcano, glacier, etc., are being collected on a daily basis by a wide range of organizations. In addition to the observation data, human-generated data including microblogs, photos, consumption records, evaluations, unstructured webpages and other Volunteered Geographical Information (VGI) are incessantly generated and shared on the Internet. Meanwhile, the emerging cyberinfrastructure rapidly increases our capacity for handling such massive data with regard to data collection and management, data integration and interoperability, data transmission and visualization, high-performance computing, etc. Cyberinfrastructure (CI) consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high-performance networks to improve research productivity and enable breakthroughs that are not otherwise possible. The Geospatial CI (GCI, or CyberGIS), as the synthesis of CI and GIScience, has inherent advantages in enabling computationally intensive spatial analysis and modeling (SAM) and collaborative geospatial problem solving and decision making. This dissertation is dedicated to addressing several critical issues and improving the performance of existing methodologies and systems in the field of CyberGIS.
    My dissertation will include three parts. The first part is focused on developing methodologies to help public researchers efficiently and effectively find appropriate open geospatial datasets among millions of records provided by thousands of organizations scattered around the world. Machine learning and semantic search methods will be utilized in this research. The second part develops an interoperable and replicable geoprocessing service by synthesizing the high-performance computing (HPC) environment, the core spatial statistic/analysis algorithms from the widely adopted open-source Python package, the Python Spatial Analysis Library (PySAL), and rich datasets acquired in the first part. The third part is dedicated to studying optimization strategies for feature data transmission and visualization. This study is intended to solve the performance issue in large feature data transmission through the Internet and visualization on the client (browser) side. Taken together, the three parts constitute an endeavor towards the methodological improvement and implementation practice of the data-driven, high-performance and intelligent CI to advance spatial sciences. Dissertation/Thesis, Doctoral Dissertation, Geography 201

    Perspective chapter: MOOC: a decade later! What Is the current situation in teacher education?

    The growth of distance education, even if in emergency modalities such as those compelled by the pandemic context in which we have lived for the last 3 years, seems to have driven a second massification of the use of MOOCs. In this sense, it seems important to us to research the current status of these courses, namely regarding their adoption in the continuing education of teachers. The study was based on a survey of the MOOCs carried out in the context of in-service teacher training in Portugal. The exploratory research focused only on the last 3 years, which included the pandemic. The information was collected on the NAU platform, but also by searching for keywords in the search engines of MOOCs carried out in the field of teacher training. This way it will be possible to understand, in a more concrete and deeper way, the impact of this technology on teacher training in Portugal and the contribution it can make to improving the quality of teacher training.