    Recent Trends in Computational Intelligence

    Get PDF
    Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically inspired techniques such as swarm intelligence, a part of evolutionary computation, and extends into wider areas such as image processing, data collection, and natural language processing. This book discusses the use of CI for optimally solving various applications, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies makes for a strong and reliable prediction tool for handling real-life applications.

    Trusta: Reasoning about Assurance Cases with Formal Methods and Large Language Models

    Full text link
    Assurance cases can be used to argue for the safety of products in safety engineering. In safety-critical areas, the construction of assurance cases is indispensable. Trustworthiness Derivation Trees (TDTs) enhance assurance cases by incorporating formal methods, making automatic reasoning about assurance cases possible. We present the Trustworthiness Derivation Tree Analyzer (Trusta), a desktop application designed to automatically construct and verify TDTs. The tool has a built-in Prolog interpreter in its backend and is supported by the constraint solvers Z3 and MONA, so it can solve constraints over logical formulas involving arithmetic, sets, Horn clauses, etc. Trusta also utilizes large language models to make the creation and evaluation of assurance cases more convenient, allowing for interactive human examination and modification. We evaluated top language models, including ChatGPT-3.5, ChatGPT-4, and PaLM 2, for generating assurance cases. Our tests showed a 50%-80% similarity between machine-generated and human-created cases. In addition, Trusta can extract formal constraints from natural-language text, facilitating an easier interpretation and validation process. This extraction is subject to human review and correction, blending automated efficiency with human insight. To our knowledge, this marks the first integration of large language models into automatically creating and reasoning about assurance cases, bringing a novel approach to a traditional challenge. Through several industrial case studies, Trusta has proven able to quickly find subtle issues that are typically missed in manual inspection, demonstrating its practical value in enhancing the assurance case development process. Comment: 38 pages.
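
    As an illustration of the constraint-solving step described above, here is a minimal sketch using Z3's Python bindings. The TDT node claim, variable names, and numbers are invented for this example; the abstract confirms only that Trusta uses Z3 (and MONA) to check such arithmetic and set constraints.

    ```python
    # Minimal sketch of a Trusta-style arithmetic constraint check with Z3.
    # The claim and all numbers below are hypothetical, for illustration only.
    from z3 import Int, Solver, sat

    # Hypothetical TDT leaf claim: "at no more than 50 requests/s, with a
    # cost of 1.5 ms per request, total work stays under 100 ms per second".
    load = Int("load")                 # requests per second
    s = Solver()
    s.add(load >= 0, load <= 50)       # the claim's assumptions
    s.add(15 * load >= 1000)           # negated claim, in tenths of a ms

    if s.check() == sat:
        print("counterexample:", s.model())       # the claim is refuted
    else:
        print("claim holds for every load in [0, 50]")
    ```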

    Text Summarization System with Bayesian Theorem on Oil & Gas Drilling Topic

    Get PDF
    Text summarization is the process of identifying the important sentences or words in an article, which are later combined to generate the summary. Numerous algorithms address text summarization, including Support Vector Machines, k-nearest neighbor classifiers, and decision trees. In this project, the Bayes theorem algorithm is studied and tested through the implementation of a textual summarizer. The algorithm extracts the important points from a lengthy document by classifying each word according to the probability of its inclusion in the summary, using a corpus of expert-written summaries to supply the initial (prior) probabilities. As the application is used, it learns and tracks the probability of each keyword so that it can predict the chance of certain keywords being included in future summaries. The objectives of this project are to survey the current state of text summarization research, to study the statistical approach to automatic summary generation, and to create a simple prototype summarization tool that takes the existing research into account. Since the application area is specific, oil and gas drilling, a ready-made corpus for it is not easy to find; the articles collected come from journals, news, and other information sources related to the topic. The application is evaluated against another system-generated summarizer already on the market. Human-made summaries are used as the ideal or reference summaries in evaluating the performance of both the Text Summarization system and the Word Auto Summarizer. Current results show that the Text Summarization system outperforms the Word Auto Summarizer at compression rates of 60% and 70% (2/3 of the articles' length) by 11.31% and 10.80%, respectively. The optimum value for overall performance is 85.82%.
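
    To make the word-probability scoring concrete, below is a minimal sketch of a Bayes-style extractive summarizer of the kind the abstract describes. The smoothing scheme, scoring function, and naive sentence splitting are illustrative assumptions, not the project's actual implementation; it also assumes the expert summaries are extracts of the source articles.

    ```python
    import math
    from collections import Counter

    def train_word_probs(articles, expert_summaries, alpha=1.0):
        """Estimate P(word appears in a summary) from an expert corpus,
        with Laplace smoothing. Assumes summaries are extracts of the
        articles, so summary counts do not exceed article counts."""
        in_summary = Counter(w for d in expert_summaries for w in d.lower().split())
        in_article = Counter(w for d in articles for w in d.lower().split())
        return {w: (in_summary[w] + alpha) / (in_article[w] + 2 * alpha)
                for w in in_article}

    def sentence_score(sentence, probs, unseen=0.1):
        """Mean log-probability that the sentence's words belong in a summary."""
        words = sentence.lower().split()
        if not words:
            return float("-inf")
        return sum(math.log(probs.get(w, unseen)) for w in words) / len(words)

    def summarize(text, probs, ratio=0.3):
        """Keep the top-scoring sentences at the given compression ratio.
        Splitting sentences on '.' is a deliberate simplification."""
        sents = [s.strip() for s in text.split(".") if s.strip()]
        k = max(1, int(len(sents) * ratio))
        keep = set(sorted(sents, key=lambda s: sentence_score(s, probs),
                          reverse=True)[:k])
        return ". ".join(s for s in sents if s in keep) + "."
    ```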

    Big Data and the Internet of Things

    Full text link
    Advances in sensing and computing capabilities are making it possible to embed increasing computing power in small devices. This has enabled sensing devices not just to passively capture data at very high resolution but also to take sophisticated actions in response. Combined with advances in communication, this is resulting in an ecosystem of highly interconnected devices referred to as the Internet of Things (IoT). In conjunction, advances in machine learning have allowed building models on these ever-increasing amounts of data. Consequently, devices all the way from heavy assets such as aircraft engines to wearables such as health monitors can now not only generate massive amounts of data but also draw on aggregate analytics to "improve" their performance over time. Big data analytics has been identified as a key enabler for the IoT. In this chapter, we discuss various avenues of the IoT where big data analytics either is already making a significant impact or is on the cusp of doing so. We also discuss social implications and areas of concern. Comment: 33 pages; draft of an upcoming book chapter in Japkowicz and Stefanowski (eds.), Big Data Analysis: New Algorithms for a New Society, Springer Series on Studies in Big Data, to appear.

    Exploratory Content Analysis Using Text Data Mining: Corporate Citizenship Reports of Seven US Companies from 2004 to 2012

    Get PDF
    This study demonstrates the use of Text Data Mining (TDM) for exploring the content of a collection of Corporate Citizenship (CC) reports. The collection analyzed comprises CC reports produced by seven Dow Jones companies (Citi, Coca-Cola, ExxonMobil, General Motors, Intel, McDonald's, and Microsoft) in 2004, 2008, and 2012. Exploratory content analysis using TDM enables insights for CC professionals and analysts, in less time and using fewer resources, which in turn could help them explore collaboration opportunities around supply chains, re-training programs, and alternative risk-mitigation strategies in terms of governance and compliance. In addition, TDM, using supervised machine learning on the whole collection (or corpus) as well as unsupervised machine learning on document collections by year, suggests the integration of CC considerations related to environmental sustainability into CC report components discussing the core business of some firms. This method has been used in many contexts in which a collection of documents needs to be categorized and/or analyzed to uncover new patterns and relationships.
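
    As a rough illustration of the unsupervised side of such an analysis, the sketch below clusters one year's reports by TF-IDF similarity and prints the top terms per cluster. The placeholder corpus, cluster count, and feature settings are assumptions for illustration, not the study's actual configuration.

    ```python
    # Cluster one year's CC reports by TF-IDF similarity and show the
    # highest-weighted terms per cluster. Corpus and settings are placeholders.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    reports_2012 = [
        "ExxonMobil environmental sustainability report text ...",   # placeholder
        "Intel supply chain and governance report text ...",         # placeholder
    ]

    tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
    X = tfidf.fit_transform(reports_2012)

    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    terms = tfidf.get_feature_names_out()
    for c in range(km.n_clusters):
        top = km.cluster_centers_[c].argsort()[::-1][:10]
        print(f"cluster {c}:", ", ".join(terms[i] for i in top))
    ```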

    The research on the current status, development impetus and technical barriers of MASS

    Get PDF

    Evaluating Information Retrieval and Access Tasks

    Get PDF
    This open access book summarizes the first two decades of the NII Testbeds and Community for Information access Research (NTCIR). NTCIR is a series of evaluation forums run by a global team of researchers and hosted by the National Institute of Informatics (NII), Japan. The book is unique in that it discusses not just what was done at NTCIR, but also how it was done and the impact it has achieved. For example, in some chapters the reader sees the early seeds of what eventually grew to be the search engines that provide access to content on the World Wide Web, today’s smartphones that can tailor what they show to the needs of their owners, and the smart speakers that enrich our lives at home and on the move. We also get glimpses into how new search engines can be built for mathematical formulae, or for the digital record of a lived human life. Key to the success of the NTCIR endeavor was early recognition that information access research is an empirical discipline and that evaluation therefore lay at the core of the enterprise. Evaluation is thus at the heart of each chapter in this book. The chapters show, for example, how the recognition that some documents are more important than others has shaped thinking about evaluation design. The thirty-three contributors to this volume speak for the many hundreds of researchers from dozens of countries around the world who together shaped NTCIR as organizers and participants. This book is suitable for researchers, practitioners, and students: anyone who wants to learn about past and present evaluation efforts in information retrieval, information access, and natural language processing, as well as those who want to participate in an evaluation task or even to design and organize one.
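
    One concrete way the idea that some documents are more important than others shows up in evaluation design is graded-relevance measures such as nDCG. The sketch below is a standard textbook computation, not code from the book, and the gain values are invented for illustration.

    ```python
    import math

    def dcg(gains):
        """Discounted cumulative gain over graded relevance values, by rank."""
        return sum(g / math.log2(r + 2) for r, g in enumerate(gains))

    def ndcg(ranked_gains):
        """DCG normalized by the DCG of the ideal (descending) ordering."""
        ideal = dcg(sorted(ranked_gains, reverse=True))
        return dcg(ranked_gains) / ideal if ideal > 0 else 0.0

    # A run that buries a highly relevant document (gain 3) beneath two
    # partially relevant ones (gain 1) scores below the ideal ordering.
    print(round(ndcg([1, 1, 3, 0]), 3))   # ~0.758
    ```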

    Crawling, Collecting, and Condensing News Comments

    Get PDF
    Traditionally, public opinion has been gauged, and policy informed, by issuing surveys and performing censuses designed to measure what the public thinks about a certain topic. Within the past five years, social networks such as Facebook and Twitter have gained traction for collecting public opinion about current events. Academic research on Facebook data proves difficult since the platform is generally closed. Twitter, on the other hand, restricts its users' conversations, making it difficult to extract large-scale concepts from the microblogging infrastructure. News comments provide a rich source of discourse from individuals who are passionate about an issue. Due to the overhead of commenting, the population of commenters is necessarily biased towards individuals who have either strong opinions on a topic or in-depth knowledge of the given issue, and their comments are often a collection of insight derived from reading multiple articles on the topic. Unfortunately, the commenting systems employed by news companies are not implemented by a single entity and are often stored and generated using AJAX, which causes traditional crawlers to ignore them. To make matters worse, the comments are often noisy, containing spam, poor grammar, and excessive typos, and due to the anonymity of comment systems, conversations can be derailed by malicious users or the inherent biases of the commenters. In this thesis we discuss the design and creation of a crawler for extracting comments from domains across the internet. For practical purposes we create a semi-automatic parser generator and describe how our system employs user feedback to predict which remote procedure calls are used to load comments. By reducing comment systems to remote procedure calls, we simplify the internet into a much simpler space, where we can focus on the data almost independently of its presentation, and are thus able to quickly create high-fidelity parsers to extract comments from a web page. We then show the system's usefulness by extracting meaningful opinions from the large collections we gather. Doing so in real time foils traditional summarization systems, which are designed to handle dozens of well-formed documents. To solve this problem we create a new algorithm, KLSum+, which outperforms all its competitors in efficiency while generally scoring well on the ROUGE SU4 metric. This algorithm factors in background models to boost accuracy yet performs over 50 times faster than alternatives. Using the resulting summaries, we see that the collected data can provide useful insight into public opinion and even surface the key points of discourse.
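
    For context, KLSum+ builds on the greedy KLSum objective: repeatedly add the sentence that keeps the summary's unigram distribution closest, in KL divergence, to the source distribution. The following is a minimal sketch of that base algorithm only; the smoothing constant and whitespace tokenization are assumptions, and it omits the background models and speed optimizations that distinguish KLSum+.

    ```python
    import math
    from collections import Counter

    def smoothed_dist(words, vocab, beta=0.01):
        """Unigram distribution over a fixed vocabulary, with smoothing."""
        counts = Counter(words)
        total = len(words) + beta * len(vocab)
        return {w: (counts[w] + beta) / total for w in vocab}

    def kl(p, q):
        """KL divergence between two distributions on the same vocabulary."""
        return sum(p[w] * math.log(p[w] / q[w]) for w in p)

    def klsum(sentences, budget=3):
        """Greedily pick sentences minimizing KL(source || summary)."""
        toks = [s.lower().split() for s in sentences]
        vocab = {w for t in toks for w in t}
        source = smoothed_dist([w for t in toks for w in t], vocab)
        chosen, words = [], []
        while len(chosen) < min(budget, len(sentences)):
            best = min((i for i in range(len(toks)) if i not in chosen),
                       key=lambda i: kl(source, smoothed_dist(words + toks[i], vocab)))
            chosen.append(best)
            words += toks[best]
        return [sentences[i] for i in sorted(chosen)]
    ```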