
    Information System Articulation Development - Managing Veracity Attributes and Quantifying Relationship with Readability of Textual Data

    Textual data are often disorganized or misinterpreted because unstructured Big Data spans multiple dimensions, and managing readable alphanumeric text and its analytics is challenging. In spatial dimensions the facts can be ambiguous and inconsistent, posing challenges for interpretation and the discovery of new knowledge, and the information can be wordy, erratic, and noisy. The research aims to assimilate data characteristics through Information System (IS) artefacts appropriate to data analytics, especially in application domains that involve big data sources. Data heterogeneity and multidimensionality can both enable and preclude IS-guided veracity models in the data integration process, including customer analytics services. The veracity of big data can thus qualitatively affect visualization and value, including knowledge enhancement across vast amounts of textual data. The manner in which veracity features are construed along the schematic, semantic and syntactic attribute dimensions of IS artefacts and relevant documents can robustly enhance the readability of textual data.
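    The abstract does not specify how readability is quantified, but a standard way to score textual records is the Flesch Reading Ease formula. Below is a minimal, dependency-free sketch; the naive vowel-run syllable counter and the sample records are illustrative assumptions, not the paper's method.

```python
# Hedged sketch: scoring readability of textual records with the standard
# Flesch Reading Ease formula. The syllable counter is a crude vowel-run
# approximation; sample records are hypothetical.
import re

def count_syllables(word: str) -> int:
    # Approximate syllables as runs of vowels (crude but dependency-free).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

records = [
    "The invoice was paid on time.",
    "Heterogeneous multidimensional data preclude straightforward integration.",
]
for r in records:
    print(f"{flesch_reading_ease(r):6.1f}  {r}")
```

Higher scores indicate easier text, so the metric gives a simple quantitative handle for relating veracity attributes to readability.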

    DTRM: A new reputation mechanism to enhance data trustworthiness for high-performance cloud computing

    Cloud computing and the mobile Internet have been the two most influential information technology revolutions, and they intersect in mobile cloud computing (MCC). The burgeoning MCC enables large-scale collection and processing of big data, which demands trusted, authentic, and accurate data to ensure an important but often overlooked aspect of big data: data veracity. Internal attacks launched by malicious insiders are one key problem that reduces data veracity and remains difficult to handle. To enhance data veracity, and thus improve the performance of big data computing in MCC, this paper proposes a Data Trustworthiness enhanced Reputation Mechanism (DTRM) that can be used to defend against internal attacks. In the DTRM, sensitivity-level-based data categories, Metagraph-theory-based user group division, and reputation transferring methods are integrated into the reputation query and evaluation process. Extensive simulation results based on real datasets show that the DTRM outperforms existing classic reputation mechanisms under bad-mouthing attacks and mobile attacks. This work was supported by the National Natural Science Foundation of China (61602360, 61772008, 61472121), the Pilot Project of Fujian Province (formal industry key project) (2016Y0031), the Foundation of Science and Technology on Information Assurance Laboratory (KJ-14-109), and the Fujian Provincial Key Lab of Network Security and Cryptology Research Fund (15012).
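    To make the idea of reputation querying with transfer concrete, here is a minimal sketch of a generic reputation mechanism in the same spirit: direct ratings are averaged per user, and a user with no history inherits a discounted "transferred" reputation from its group. The names, the 0-1 rating scale, and the transfer weight are assumptions for illustration; this is not the paper's DTRM algorithm.

```python
# Hedged sketch of a reputation mechanism with group-based transfer.
# All weights and the fallback value are illustrative assumptions.
from collections import defaultdict
from statistics import mean

ratings = defaultdict(list)  # user -> list of ratings in [0, 1]

def rate(user: str, score: float) -> None:
    ratings[user].append(min(1.0, max(0.0, score)))

def reputation(user: str, group: list[str], transfer_weight: float = 0.5) -> float:
    if ratings[user]:                            # direct evidence available
        return mean(ratings[user])
    peers = [mean(ratings[p]) for p in group if ratings[p]]
    # No history: transfer a discounted group reputation (neutral 0.5 fallback).
    return transfer_weight * mean(peers) if peers else 0.5

rate("alice", 0.9); rate("alice", 0.8); rate("bob", 0.2)
print(reputation("alice", ["bob"]))              # direct reputation
print(reputation("carol", ["alice", "bob"]))     # transferred from group
```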

    An Online Social Network model through Twitter to build a social perception variable to measure the violence in Mexico

    This paper describes the methodology and the model used on Twitter to create an indicator of the social perception of violence, a topic of high impact in Mexico. We investigated and validated the keywords Mexicans use in relation to this topic, within a specific time lapse defined by the researchers. We implemented two levels of analysis: the first based on the raw sum of tweets, and the second on a rate of total tweets per 100,000 inhabitants.
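    The second analysis level is a straightforward normalization; a minimal sketch follows, with made-up counts and populations standing in for the study's data.

```python
# Hedged sketch of the two analysis levels: raw tweet sums per region and
# the rate per 100,000 inhabitants. Values are invented for illustration.
tweets_by_state = {"CDMX": 12500, "Jalisco": 4300}
population = {"CDMX": 9_200_000, "Jalisco": 8_300_000}

for state, count in tweets_by_state.items():
    rate = count / population[state] * 100_000
    print(f"{state}: {count} tweets, {rate:.1f} per 100k inhabitants")
```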

    FACTS-ON: Fighting Against Counterfeit Truths in Online Social Networks: fake news, misinformation and disinformation

    The rapid evolution of online social networks (OSNs) presents a significant challenge in identifying and mitigating false information, which includes fake news, disinformation, and misinformation. This complexity is amplified in digital environments where information is quickly disseminated, requiring sophisticated strategies to differentiate between genuine and false content. One of the primary challenges in automatically detecting false information is its realistic presentation, which often closely resembles verifiable facts. This poses considerable challenges for artificial intelligence (AI) systems, necessitating additional data from external sources, such as third-party verifications, to effectively discern the truth. Consequently, there is a continuous technological evolution to counter the growing sophistication of false information, challenging and advancing the capabilities of AI.
    In response to these challenges, my dissertation introduces the FACTS-ON framework (Fighting Against Counterfeit Truths in Online Social Networks), a comprehensive and systematic approach to combating false information in OSNs. FACTS-ON integrates a series of advanced systems, each building upon the capabilities of its predecessor to enhance the overall strategy for detecting and mitigating false information. I begin by introducing the FACTS-ON framework, which sets the foundation for my solution, and then detail each system within it. EXMULF (Explainable Multimodal Content-based Fake News Detection) focuses on analyzing both text and images in online content using advanced multimodal techniques, coupled with explainable AI to provide transparent and understandable assessments of false information. Building upon EXMULF's foundation, MythXpose (Multimodal Content and Social Context-based System for Explainable False Information Detection with Personality Prediction) adds a layer of social context analysis by predicting the personality traits of OSN users, enhancing detection and early-intervention strategies against false information. ExFake (Explainable False Information Detection Based on Content, Context, and External Evidence) further expands the framework, combining content analysis with insights from social context and external evidence. It leverages data from reputable fact-checking organizations and official social accounts, ensuring a more comprehensive and reliable approach to the detection of false information. ExFake's methodology not only evaluates the content of online posts but also considers the broader context and corroborates information with external, credible sources, thereby offering a well-rounded and robust solution for combating false information in online social networks. Completing the framework, AFCC (Automated Fact-checkers Consensus and Credibility) addresses the heterogeneity of ratings from various fact-checking organizations: it standardizes these ratings and assesses the credibility of the sources, providing a unified and trustworthy assessment of information.
    Each system within the FACTS-ON framework is rigorously evaluated to demonstrate its effectiveness in combating false information on OSNs. This dissertation details the development, implementation, and comprehensive evaluation of these systems, highlighting their collective contribution to the field of false information detection. The research not only showcases current capabilities in addressing false information but also sets the stage for future advancements in this critical area of study.
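    The rating-standardization step that AFCC addresses can be illustrated with a small sketch: each fact-checker's heterogeneous verdict labels are mapped onto a common [0, 1] truth scale and then combined, weighted by source credibility. The label maps, credibility scores, and weighting scheme below are assumptions for illustration, not AFCC's published method.

```python
# Hedged sketch of fact-checker rating standardization and consensus.
# Label scales and credibility weights are invented placeholders.
LABEL_SCALE = {
    "checker_a": {"true": 1.0, "half-true": 0.5, "false": 0.0},
    "checker_b": {"accurate": 1.0, "misleading": 0.3, "fabricated": 0.0},
}
CREDIBILITY = {"checker_a": 0.9, "checker_b": 0.6}

def consensus(verdicts: dict[str, str]) -> float:
    # Credibility-weighted mean of the standardized scores.
    scored = [(LABEL_SCALE[c][v], CREDIBILITY[c]) for c, v in verdicts.items()]
    total = sum(w for _, w in scored)
    return sum(s * w for s, w in scored) / total

print(consensus({"checker_a": "half-true", "checker_b": "misleading"}))
```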

    Framework for a semantic data transformation in solving data quality issues in big data

    Purpose - Today, organizations and companies are generating tremendous amounts of data. At the same time, enormous amounts of data are being received, acquired from various resources, and stored, which brings us to the era of Big Data (BD). BD is a term used to describe massive datasets of diverse formats, created at very high speed, whose management is near impossible using traditional database management systems (Kanchi et al., 2015). With the dawn of BD, Data Quality (DQ) has become imperative. Volume, velocity and variety, the initial 3V characteristics of BD, are usually used to describe its main properties. But to extract value (another V property) and make BD effective and efficient for organizational decision making, the significance of a further V of BD, veracity, is gradually coming to light. Veracity directly denotes inconsistency and DQ issues. Today, veracity is the biggest challenge in data analysis when compared with other aspects such as volume and velocity. Trusting the acquired data goes a long way toward implementing decisions from an automated decision-making system, and veracity helps to validate the data acquired (Agarwal, Ravikumar, & Saha, 2016).
    DQ represents an important issue in every business. To be successful, companies need high-quality data on inventory, supplies, customers, vendors and other vital enterprise information in order to run their data analysis applications efficiently (e.g. decision support systems, data mining, customer relationship management) and produce accurate results (McAfee & Brynjolfsson, 2012). During the transformation of huge volumes of data, there may be data mismatches, miscalculations and/or loss of useful data, leading to an unsuccessful data transformation (Tesfagiorgish & JunYi, 2015), which in turn leads to poor data quality. In addition, external data, particularly RDF data, raises further challenges for data transformation compared with the traditional transformation process. For example, a drawback of using BD in the business analysis process is that the data is almost schema-less, while RDF data contains poor or complex schemas. Traditional data transformation tools are not able to process such inconsistent and heterogeneous data because they do not support semantic-aware data, they are entirely schema-dependent, and they do not focus on expressive semantic relationships to integrate data from different sources. Thus, BD requires more powerful tools to transform data semantically. While research in this area has so far offered different frameworks, to the best of the researchers' knowledge, little research has addressed the transformation of DQ in BD, and what has been done has not gone beyond generally cleansing incoming data (Merino et al., 2016). The proposed framework presents a method for analysing DQ using BD from various domains and applying semantic technologies in the ETL transformation stage to create a semantic model that enables quality in the data.
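    To illustrate what a semantic-aware transformation step can look like in practice, here is a minimal sketch using rdflib: RDF records are loaded, and a SPARQL query acts as a declarative quality rule that keeps only records with a well-formed email. The ex: vocabulary and sample data are assumptions; the paper's actual framework is not reproduced here.

```python
# Hedged sketch: a declarative data-quality rule applied to RDF data in an
# ETL transformation step, using rdflib. Vocabulary and data are invented.
from rdflib import Graph, Namespace

TTL = """
@prefix ex: <http://example.org/> .
ex:c1 ex:name "Ana" ;  ex:email "ana@example.org" .
ex:c2 ex:name "Luis" ; ex:email "not-an-email" .
"""

g = Graph()
g.parse(data=TTL, format="turtle")

# DQ rule expressed in SPARQL: valid records must have a plausible email.
rule = """
SELECT ?name ?email WHERE {
    ?c ex:name ?name ; ex:email ?email .
    FILTER regex(str(?email), "^[^@]+@[^@]+$")
}"""
for row in g.query(rule, initNs={"ex": Namespace("http://example.org/")}):
    print(row.name, row.email)   # only the schema-conforming record survives
```

Because the rule lives in a query rather than in schema-bound transformation code, it tolerates the heterogeneous, weakly-schematized data the abstract describes.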

    Spatial modelling of air pollution for open smart cities

    A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Geographic Information Systems.
    Half of the world's population already lives in cities, and by 2050 two-thirds of the world's population is expected to move into urban areas. This urban growth leads to various environmental, social and economic challenges in cities, hampering Quality of Life (QoL). Although recent trends in technology equip us with various tools and techniques that can help improve quality of life, air pollution has remained the 'biggest environmental health risk' for decades, impacting individuals' quality of life and well-being according to the World Health Organization (WHO). Many efforts have been made to measure air quality, but the sparse arrangement of monitoring stations and the lack of data currently make it challenging to develop systems that can capture within-city variations in air pollution. To solve this, flexible methods that allow air quality monitoring using easily accessible data sources at the city level are desirable. The present thesis seeks to widen current knowledge of detailed air quality monitoring by developing approaches that tackle existing gaps in the literature. The thesis presents five contributions which address the issues mentioned above. The first contribution is the choice of a statistical method that helps utilise existing open data and overcome the challenges posed by the bigness of data for detailed air pollution monitoring. The second contribution concerns the development of an optimisation method for identifying optimal locations for robust air pollution modelling in cities. The third contribution is another optimisation method, which helps initiate systematic volunteered geographic information (VGI) campaigns for detailed air pollution monitoring by addressing the sparsity and scarcity of air pollution data in cities. The fourth contribution is a study proposing the involvement of housing companies as stakeholders in a participatory framework for air pollution data collection, which helps overcome certain gaps in VGI-based approaches. Finally, the fifth contribution is an open-hardware system that aids in collecting vehicular traffic data using WiFi signal strength; the developed hardware can help overcome traffic data scarcity in cities, which limits detailed air pollution monitoring. All contributions are illustrated through case studies in Muenster and Stuttgart. Overall, the thesis demonstrates the applicability of the developed approaches for enabling city-scale air pollution monitoring under the broader framework of the open smart city and for urban health research.
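    The abstract does not name the statistical method used, but the core problem of estimating pollution at unmonitored locations from sparse stations can be illustrated with inverse-distance weighting, one simple spatial baseline. The coordinates and NO2 values below are invented; real work would use projected coordinates and cross-validation.

```python
# Hedged sketch: inverse-distance-weighted estimation of air pollution at an
# unmonitored point from sparse stations. Data values are hypothetical.
import math

# ((lat, lon), NO2 in µg/m³) for three monitoring stations.
stations = [((51.95, 7.60), 38.0), ((51.97, 7.65), 22.5), ((51.93, 7.63), 45.1)]

def idw(point, obs, power=2.0):
    num = den = 0.0
    for (y, x), value in obs:
        d = math.hypot(point[0] - y, point[1] - x)
        if d < 1e-12:
            return value          # query point coincides with a station
        w = d ** -power           # closer stations weigh more
        num += w * value
        den += w
    return num / den

print(f"Estimated NO2: {idw((51.96, 7.62), stations):.1f} µg/m³")
```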