
    Explainable automatic industrial carbon footprint estimation from bank transaction classification using natural language processing

    Concerns about the effect of greenhouse gases have motivated the development of certification protocols to quantify the industrial carbon footprint (CF). These protocols are manual, work-intensive, and expensive, which has driven a shift towards automatic data-driven approaches to CF estimation, including Machine Learning (ML) solutions. Unfortunately, as in other sectors of interest, the decision-making processes of these solutions lack transparency from the end user's point of view, who must blindly trust their outcomes compared with intelligible traditional manual approaches. In this research, manual and automatic methodologies for CF estimation were reviewed, taking into account their transparency limitations. This analysis led to the proposal of a new explainable ML solution for automatic CF calculation through bank transaction classification. To the best of our knowledge, no previous research has considered the explainability of bank transaction classification for this purpose. For classification, different ML models were employed based on their promising performance on similar problems in the literature, such as Support Vector Machines, Random Forests, and Recursive Neural Networks. The results obtained were in the 90% range for the accuracy, precision, and recall evaluation metrics. From their decision paths, the proposed solution estimates the CO2 emissions associated with bank transactions. The explainability methodology is based on a model-agnostic evaluation of the influence of the input terms extracted from the transaction descriptions using locally interpretable models. The explainability terms were automatically validated using a similarity metric over the descriptions of the target categories. In conclusion, the explanation performance is satisfactory in terms of the proximity of the explanations to the associated activity sector descriptions, endorsing the trustworthiness of the process for a human operator and end users.
    Funding: Xunta de Galicia, Spain | Ref. ED481B-2021-118; Xunta de Galicia, Spain | Ref. ED481B-2022-093; Centro para el Desarrollo Tecnológico Industrial | Ref. EXP00146826/IDI-2022029
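
    The pipeline summarised above (classify each transaction by its text description, map the predicted sector to an emission estimate, and explain the prediction with a locally interpretable surrogate) can be sketched with off-the-shelf libraries. The snippet below is a minimal illustration of that idea, not the authors' implementation: it assumes TF-IDF features, a Random Forest classifier, LIME as the local surrogate, and a hypothetical per-sector emission-factor table; the training examples are placeholders.

```python
# Minimal, illustrative sketch (not the paper's code): classify bank-transaction
# descriptions into activity sectors, map the sector to a hypothetical CO2 factor,
# and explain the prediction with a locally interpretable surrogate (LIME).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Placeholder training data: transaction descriptions and their activity sectors.
descriptions = ["fuel station purchase diesel", "monthly electricity bill payment",
                "office paper and toner supplies"]
sectors = ["transport", "energy", "goods"]
kg_co2_per_eur = {"transport": 0.45, "energy": 0.30, "goods": 0.12}  # illustrative factors only

clf = make_pipeline(TfidfVectorizer(), RandomForestClassifier(n_estimators=200, random_state=0))
clf.fit(descriptions, sectors)

def estimate_co2(description: str, amount_eur: float) -> float:
    """Classify the transaction and convert the spent amount into estimated emissions."""
    sector = clf.predict([description])[0]
    return amount_eur * kg_co2_per_eur[sector]

# Model-agnostic local explanation: which terms of the description drove the prediction.
explainer = LimeTextExplainer(class_names=list(clf.classes_))
explanation = explainer.explain_instance("fuel station purchase diesel",
                                         clf.predict_proba, num_features=3, top_labels=1)
print(estimate_co2("fuel station purchase diesel", 60.0))
print(explanation.as_list(label=explanation.top_labels[0]))
```

    In the paper's terms, the weighted terms returned by the local surrogate play the role of the "explainability terms" that are then compared against the activity sector descriptions.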

    24th International Conference on Information Modelling and Knowledge Bases

    In the last three decades, information modelling and knowledge bases have become essential subjects not only in academic communities related to information systems and computer science but also in the business area where information technology is applied. The series of European-Japanese Conferences on Information Modelling and Knowledge Bases (EJC) originally started as a co-operation initiative between Japan and Finland in 1982. The practical operations were then organised by Professor Ohsuga in Japan and Professors Hannu Kangassalo and Hannu Jaakkola in Finland (Nordic countries). The geographical scope has since expanded to cover Europe and other countries as well. The conference retains its workshop character: ample discussion, sufficient time for presentations, and a limited number of participants (50) and papers (30). Suggested topics include, but are not limited to:
    1. Conceptual modelling: modelling and specification languages; domain-specific conceptual modelling; concepts, concept theories and ontologies; conceptual modelling of large and heterogeneous systems; conceptual modelling of spatial, temporal and biological data; methods for developing, validating and communicating conceptual models.
    2. Knowledge and information modelling and discovery: knowledge discovery, knowledge representation and knowledge management; advanced data mining and analysis methods; conceptions of knowledge and information; modelling information requirements; intelligent information systems; information recognition and information modelling.
    3. Linguistic modelling: models of HCI; information delivery to users; intelligent informal querying; linguistic foundations of information and knowledge; fuzzy linguistic models; philosophical and linguistic foundations of conceptual models.
    4. Cross-cultural communication and social computing: cross-cultural support systems; integration, evolution and migration of systems; collaborative societies; multicultural web-based software systems; intercultural collaboration and support systems; social computing, behavioural modelling and prediction.
    5. Environmental modelling and engineering: environmental information systems (architecture); spatial, temporal and observational information systems; large-scale environmental systems; collaborative knowledge base systems; agent concepts and conceptualisation; hazard prediction, prevention and steering systems.
    6. Multimedia data modelling and systems: modelling multimedia information and knowledge; content-based multimedia data management; content-based multimedia retrieval; privacy and context enhancing technologies; semantics and pragmatics of multimedia data; metadata for multimedia information systems.
    Overall, we received 56 submissions. After careful evaluation, 16 papers were selected as long papers, 17 as short papers, 5 as position papers, and 3 as presentations of perspective challenges. We thank all colleagues for their support of this issue of the EJC conference, especially the program committee, the organising committee, and the programme coordination team. The long and short papers presented at the conference are revised after the conference and published in the series "Frontiers in Artificial Intelligence and Applications" by IOS Press (Amsterdam). The books "Information Modelling and Knowledge Bases" are edited by the Editing Committee of the conference.
We believe that the conference will be productive and fruitful in advancing the research and application of information modelling and knowledge bases. Bernhard Thalheim, Hannu Jaakkola, Yasushi Kiyoki

    AIRM: a new AI Recruiting Model for the Saudi Arabian labour market

    One of the goals of Saudi Vision 2030 is to keep the unemployment rate at the lowest level to empower the economy. Prior research has shown that an increase in unemployment has a negative effect on a country's Gross Domestic Product. This research aims to utilise cutting-edge technology such as Data Lakes (DL), Machine Learning (ML) and Artificial Intelligence (AI) to assist the Saudi labour market by matching job seekers with vacant positions. Currently, human experts carry out this process; however, it is time-consuming and labour-intensive. Moreover, in the Saudi labour market, this process does not use a cohesive data centre to monitor, integrate, or analyse labour market data, resulting in inefficiencies such as bias and latency. These inefficiencies arise from a lack of technologies and, more importantly, from having an open labour market without a national labour market data centre. This research proposes a new AI Recruiting Model (AIRM) architecture that exploits DLs, ML and AI to rapidly and efficiently match job seekers to vacant positions in the Saudi labour market. A Minimum Viable Product (MVP) is employed to test the proposed AIRM architecture using a simulated labour market corpus for training purposes; the architecture is further evaluated with three Human Resources (HR) professionals collaborating with the research. As this research is data-driven in nature, it requires collaboration from domain experts. The first layer of the AIRM architecture uses Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) as the clustering algorithm for initial screening. The mapping layer uses sentence transformers with a Robustly Optimised BERT Pre-training Approach (RoBERTa) as the base model, and ranking is carried out using Facebook AI Similarity Search (FAISS). Finally, the preferences layer takes the user's preferences as a list and sorts the results using a pre-trained cross-encoder model, considering the weight of the more important words. This new AIRM has yielded favourable outcomes. To account for the subjective character of the selection process when handled exclusively by human HR experts, an AIRM selection was accepted if it was ratified by at least one HR expert. The research evaluated the AIRM using two metrics: accuracy and time. The AIRM had an overall matching accuracy of 84%, with at least one expert agreeing with the system's output, and it completed the task in 2.4 minutes, whereas human experts took more than six days on average. Overall, the AIRM outperforms humans in task execution, making it useful for pre-selecting a group of applicants and positions. The AIRM is not limited to government services; it can also help any commercial business that uses Big Data.
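
    The three-layer architecture summarised above (BIRCH screening, sentence-transformer embeddings ranked with FAISS, cross-encoder re-ranking) can be approximated with public libraries. The sketch below is an assumption-laden illustration rather than the AIRM implementation: the encoder and cross-encoder model names, the job texts, and the cluster count are placeholders chosen for the example.

```python
# Illustrative screen -> map -> re-rank matching pipeline in the spirit of AIRM
# (not the authors' implementation). Model names and data are placeholders.
import faiss
from sklearn.cluster import Birch
from sentence_transformers import SentenceTransformer, CrossEncoder

jobs = ["Data engineer building ETL pipelines", "HR specialist for recruitment",
        "Backend developer, Python and SQL", "Nurse for a public hospital"]
seeker = "Software developer experienced in Python, SQL and data pipelines"

# Screening layer: cluster job embeddings with BIRCH and keep the seeker's cluster.
encoder = SentenceTransformer("all-roberta-large-v1")   # assumed RoBERTa-based encoder
job_vecs = encoder.encode(jobs, normalize_embeddings=True)
seeker_vec = encoder.encode([seeker], normalize_embeddings=True)
clusterer = Birch(n_clusters=2).fit(job_vecs)
keep = clusterer.predict(seeker_vec)[0] == clusterer.predict(job_vecs)
candidates = [job for job, kept in zip(jobs, keep) if kept]

# Mapping layer: rank the surviving jobs by cosine similarity using FAISS.
cand_vecs = encoder.encode(candidates, normalize_embeddings=True)
index = faiss.IndexFlatIP(cand_vecs.shape[1])
index.add(cand_vecs)
_, order = index.search(seeker_vec, len(candidates))
ranked = [candidates[i] for i in order[0]]

# Preferences layer: re-score pairs with a pre-trained cross-encoder (placeholder model).
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(seeker, job) for job in ranked])
print(sorted(zip(ranked, scores), key=lambda pair: -pair[1]))
```

    The ordering of the stages reflects a common design choice: cheap clustering and bi-encoder retrieval first prune the candidate set, and the comparatively expensive cross-encoder is applied only to the shortlist.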

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of these studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences were observed among the tested methods.
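
    The underlying idea, shared by the traditional autoregressive formulation and the classification-based variants studied here, is that X Granger-causes Y when adding X's history to Y's own history improves the prediction of Y. The toy sketch below illustrates this predictive comparison on synthetic data; the Ridge regressor is a stand-in for whichever forecasting or classification model is used, and all quantities are made up for the example.

```python
# Granger-style causal inference as a predictive comparison (illustrative only,
# not the study's pipeline): X is said to Granger-cause Y if adding lagged X to
# lagged Y improves the forecast of Y. Synthetic data, arbitrary coefficients.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n, lags = 500, 3
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):                      # y depends on past x, so x "causes" y here
    y[t] = 0.6 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

def lagged(series, lags):
    """Matrix whose i-th row holds the previous `lags` values of the series."""
    return np.column_stack([series[lags - k - 1:-(k + 1)] for k in range(lags)])

target = y[lags:]
own_past = lagged(y, lags)                            # restricted model: only Y's history
joint_past = np.hstack([own_past, lagged(x, lags)])   # unrestricted: Y's and X's history

def holdout_error(features):
    """Fit on the first 70% of the series (time-ordered) and score on the rest."""
    split = int(0.7 * len(target))
    model = Ridge().fit(features[:split], target[:split])
    return mean_squared_error(target[split:], model.predict(features[split:]))

print("restricted MSE:  ", holdout_error(own_past))
print("unrestricted MSE:", holdout_error(joint_past))  # clearly lower => evidence for causality
```

    In the classical autoregressive setting this comparison is formalised as an F-test on the residuals of the restricted and unrestricted models; replacing the linear predictor with a time series classifier keeps the same logic while changing the model class.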

    14th Conference on DATA ANALYSIS METHODS for Software Systems

    DAMSS-2023 is the 14th International Conference on Data Analysis Methods for Software Systems, held annually in Druskininkai, Lithuania, at the same venue and time of year. The exception was 2020, when the world was gripped by the Covid-19 pandemic and the movement of people was severely restricted. After a year's break, the conference was back on track, and the following edition succeeded in its primary goal of lively scientific communication. The conference focuses on live interaction among participants; for more effective communication, most presentations are poster presentations, a format that has proven highly effective, although several oral sessions are also held. The history of the conference dates back to 2009, when 16 papers were presented. It began as a workshop and has evolved into a well-known conference. The idea of such a workshop originated at the Institute of Mathematics and Informatics, now the Institute of Data Science and Digital Technologies of Vilnius University. The Lithuanian Academy of Sciences and the Lithuanian Computer Society supported this idea, which gained enthusiastic acceptance from both the Lithuanian and international scientific communities. This year's conference features 84 presentations, with 137 registered participants from 11 countries. The conference serves as a gathering point for researchers from six Lithuanian universities, making it the main annual meeting for Lithuanian computer scientists. The primary aim of the conference is to showcase research conducted at Lithuanian and foreign universities in the fields of data science and software engineering. The annual organization of the conference facilitates the rapid exchange of new ideas within the scientific community. Seven IT companies supported the conference this year, indicating the relevance of the conference topics to the business sector. In addition, the conference is supported by the Lithuanian Research Council and the National Science and Technology Council (Taiwan, R.O.C.). The conference covers a wide range of topics, including Applied Mathematics, Artificial Intelligence, Big Data, Bioinformatics, Blockchain Technologies, Business Rules, Software Engineering, Cybersecurity, Data Science, Deep Learning, High-Performance Computing, Data Visualization, Machine Learning, Medical Informatics, Modelling Educational Data, Ontological Engineering, Optimization, Quantum Computing, and Signal Processing. This book provides an overview of all presentations from the DAMSS-2023 conference.

    The Challenge of Spoken Language Systems: Research Directions for the Nineties

    A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem-solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for the development of shared corpora and related resources, for computational support, and for rapid communication among researchers. The successful development of this technology will increase the accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area.
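
    The recognize-interpret-respond loop described in the opening sentences can be made concrete as a small interface sketch. Everything below is a hypothetical stub intended only to show the shape of such a system; real components (a speech recognizer, a natural-language parser, a response generator or synthesiser) would be plugged in behind these interfaces.

```python
# Hypothetical sketch of the spoken-language-system loop: recognize the words,
# interpret them into an application-level meaning, and produce a response.
from dataclasses import dataclass
from typing import Protocol

class Recognizer(Protocol):
    def transcribe(self, audio: bytes) -> str: ...      # speech -> word sequence

class Interpreter(Protocol):
    def parse(self, words: str) -> dict: ...            # word sequence -> application meaning

class Responder(Protocol):
    def respond(self, meaning: dict) -> str: ...        # meaning -> reply (text or speech)

@dataclass
class SpokenLanguageSystem:
    recognizer: Recognizer
    interpreter: Interpreter
    responder: Responder

    def turn(self, audio: bytes) -> str:
        """One dialogue turn: recognize the words, interpret them, produce a response."""
        words = self.recognizer.transcribe(audio)
        meaning = self.interpreter.parse(words)
        return self.responder.respond(meaning)
```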