Improving diagnostic procedures for epilepsy through automated recording and analysis of patients’ history
Transient loss of consciousness (TLOC) is a time-limited state of profound cognitive impairment characterised by amnesia, abnormal motor control, loss of responsiveness, a short duration, and complete recovery. Most instances of TLOC are caused by one of three health conditions: epilepsy, functional (dissociative) seizures (FDS), or syncope. There is often a delay before the correct diagnosis is made, and 10-20% of individuals initially receive an incorrect diagnosis. Clinical decision tools based on the endorsement of TLOC symptom lists have been limited to distinguishing between two causes of TLOC. The Initial Paroxysmal Event Profile (iPEP) has shown promise but was demonstrated to have greater accuracy in distinguishing between syncope and epilepsy or FDS than between epilepsy and FDS. The objective of this thesis was to investigate whether interactional, linguistic, and communicative differences in how people with epilepsy and people with FDS describe their experiences of TLOC can improve the predictive performance of the iPEP. An online web application was designed that collected information about TLOC symptoms and medical history from patients and witnesses using a binary questionnaire and verbal interaction with a virtual agent (VA). We explored potential methods of automatically detecting these communicative differences, whether the differences were present during an interaction with a VA, to what extent these automatically detectable communicative differences improve the performance of the iPEP, and the acceptability of the application from the perspective of patients and witnesses. The two feature sets that were applied to previous doctor-patient interactions, features designed to measure formulation effort or to detect semantic differences between the two groups, were able to predict the diagnosis with accuracies of 71% and 81%, respectively.
Individuals with epilepsy or FDS provided descriptions of TLOC to the VA that were qualitatively similar to those observed in previous research. Both feature sets were effective predictors of the diagnosis when applied to the web application recordings (85.7% and 85.7%). Overall, the accuracy of machine learning models trained for the three-way classification between epilepsy, FDS, and syncope using the iPEP responses collected from patients through the web application was worse than the performance observed in previous research (65.8% vs 78.3%), but performance improved with the inclusion of features extracted from the spoken descriptions of TLOC (85.5%). Finally, most participants who provided feedback reported that the online application was acceptable. These findings suggest that it is feasible to differentiate between people with epilepsy and people with FDS using an automated analysis of spoken seizure descriptions. Furthermore, incorporating these features into a clinical decision tool for TLOC can improve its predictive performance by improving the differential diagnosis between these two health conditions. Future research should use the feedback to improve the design of the application and increase the perceived acceptability of the approach.
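The pipeline described above combines binary questionnaire responses with features extracted from spoken descriptions before classification. The toy sketch below illustrates only that general idea (feature concatenation followed by a simple nearest-centroid rule); the feature names, values, and classifier are invented for illustration and are not the thesis's actual method or data.

```python
# Illustrative sketch: concatenate binary questionnaire (iPEP-style) answers
# with speech-derived features, then classify with a nearest-centroid rule.
# All feature values here are toy numbers, not real patient data.

def concat_features(questionnaire, speech):
    """Concatenate binary questionnaire answers with speech-derived features."""
    return questionnaire + speech

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def nearest_centroid(x, centroids):
    """Return the label whose centroid is closest (Euclidean) to x."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    return min(centroids, key=lambda label: dist(x, centroids[label]))

# Toy training data: (questionnaire bits + speech features) per diagnosis.
train = {
    "epilepsy": [concat_features([1, 0, 1], [0.9, 0.2]),
                 concat_features([1, 1, 1], [0.8, 0.3])],
    "FDS":      [concat_features([0, 1, 0], [0.2, 0.8]),
                 concat_features([0, 0, 0], [0.3, 0.9])],
}
centroids = {label: centroid(vecs) for label, vecs in train.items()}

prediction = nearest_centroid(concat_features([1, 0, 1], [0.85, 0.25]), centroids)
print(prediction)
```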
Convolutional Neural Networks for Sentiment Analysis on Weibo Data: A Natural Language Processing Approach
This study addressed the complex task of sentiment analysis on a dataset of
119,988 original tweets from Weibo using a Convolutional Neural Network (CNN),
offering a new approach to Natural Language Processing (NLP). The data, sourced
from Baidu's PaddlePaddle AI platform, were meticulously preprocessed,
tokenized, and categorized based on sentiment labels. A CNN-based model was
utilized, leveraging word embeddings for feature extraction, and trained to
perform sentiment classification. The model achieved a macro-average F1-score
of approximately 0.73 on the test set, showing balanced performance across
positive, neutral, and negative sentiments. The findings underscore the
effectiveness of CNNs for sentiment analysis tasks, with implications for
practical applications in social media analysis, market research, and policy
studies. The complete experimental content and code have been made publicly
available on the Kaggle data platform for further research and development.
Future work may involve exploring different architectures, such as Recurrent
Neural Networks (RNN) or transformers, or using more complex pre-trained models
like BERT, to further improve the model's ability to understand linguistic
nuances and context.
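The CNN-for-text approach the abstract describes follows a standard shape: map tokens to word embeddings, slide a 1-D convolution over the sequence, max-pool into a fixed-size feature, and score the sentiment classes with a linear layer. The framework-free sketch below shows only that shape with a toy vocabulary and random weights; the study's actual trained model on the 119,988 Weibo posts is not reproduced here.

```python
# Minimal, framework-free sketch of a CNN text classifier: embeddings ->
# 1-D convolution -> max-pooling -> linear scoring. Vocabulary, weights,
# and labels are toy values for illustration only.

import random

random.seed(0)
EMB_DIM, WINDOW, CLASSES = 4, 2, 3  # classes: negative, neutral, positive

vocab = {"good": 0, "bad": 1, "ok": 2}
embeddings = [[random.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in vocab]
conv_filter = [random.uniform(-1, 1) for _ in range(WINDOW * EMB_DIM)]
linear = [[random.uniform(-1, 1)] for _ in range(CLASSES)]  # pooled feature -> 3 scores

def conv_maxpool(tokens):
    """Apply one 1-D convolution filter over the token sequence, then max-pool."""
    vecs = [embeddings[vocab[t]] for t in tokens]
    scores = []
    for i in range(len(vecs) - WINDOW + 1):
        window = [x for v in vecs[i:i + WINDOW] for x in v]  # flatten the window
        scores.append(sum(w * x for w, x in zip(conv_filter, window)))
    return max(scores)

def classify(tokens):
    pooled = conv_maxpool(tokens)
    logits = [row[0] * pooled for row in linear]
    return logits.index(max(logits))  # argmax over the 3 sentiment classes

print(classify(["good", "ok", "bad"]))
```

A real model would learn these weights by backpropagation and use many filters of several window sizes; this sketch keeps one filter so the data flow stays visible.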
Improving The Usability of Software Systems Using Group Discussions: A Case Study on Galaxy
Usability problems in software systems cause performance degradation, user dissatisfaction, and financial loss. There is a growing need for software systems to become more accessible, retrievable, and usable. Usability testing is conducted by gathering opinions directly from users, with the goal of identifying problems, uncovering opportunities, and learning about target users' preferences. However, accessing real users is very difficult for certain software systems. At the same time, there are many popular user forums, such as Stack Overflow, Quora, Stack Exchange, and the Eclipse Community Forum, where people from different domains of knowledge ask about their problems and post their concerns. Exploring these forums should therefore provide significant information about a system's usability issues. Previous studies that investigated such group discussion forums, using approaches such as topic categorization, automatic tag prediction, and identifying reproducible code, discovered several usability issues that the system developers were unaware of. There are also many Scientific Workflow Management Systems (SWfMSs), such as Galaxy, Taverna, Kepler, iPlant, and VizSciFlow, and although these SWfMSs are emerging and important for data-intensive research, no earlier study has examined the usability problems of these systems. Therefore, in this thesis, we take Galaxy, a well-known SWfMS, as our use case. We explore the user forum that Galaxy offers, where users ask for help from experts and other Galaxy users. We search for the issues users are discussing in the forum and identify several usability problems in different categories. In our first study, we group the usability problems so that they can be easily identified and the Galaxy community can be informed of the system's existing usability problems. While exploring the posts, we find that a significant percentage of them (up to 28%) lack tags.
Even when tags are present, they often do not reflect the context of the posts properly. This is a major usability problem for discussion forums, as users are unable to identify suitable posts without proper tags. Moreover, users face difficulty exploring the answers to untagged questions. So in our second study, we propose a method for automatically suggesting tags based on the context of a post. In our extensive investigation we find many usability issues, but among them, finding and searching for appropriate workflows emerges as a major usability problem of the system. Users, especially novice users, ask the experts for workflow design recommendations, but because of the domain-specific nature of SWfMSs, it is difficult for them to design or implement a workflow according to their new requirements. Any software system's usability is called into question if users have trouble specifying or carrying out certain tasks and are not given the necessary resources. Therefore, to increase the usability of Galaxy, in our third study, we introduce an NLP-based workflow recommendation system in which anyone can write their queries in natural language. Our system recommends the most relevant workflows in response. We develop a tool on the Galaxy platform based on the proposed method. Lastly, we believe our findings can guide the Galaxy community to improve and extend its services according to users' requirements. We are confident that our proposed methods can be applied to any software system to improve its usability by exploring its user forums.
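One common way to implement a natural-language workflow recommender like the one described is to represent workflow descriptions and the user's query as TF-IDF vectors and rank by cosine similarity. The sketch below shows that baseline; the workflow names and descriptions are invented, and the thesis's actual method and Galaxy tool may use a different model.

```python
# Hedged sketch of query-to-workflow matching: TF-IDF vectors plus cosine
# similarity. Workflow descriptions below are made up for illustration.

import math
from collections import Counter

workflows = {
    "rna-seq": "align rna reads and count gene expression",
    "variant-calling": "call variants from aligned dna reads",
    "assembly": "assemble genome contigs from raw reads",
}

def tf_idf_vectors(docs):
    """Build TF-IDF weight dicts for a {name: text} corpus."""
    tokenized = {name: text.split() for name, text in docs.items()}
    df = Counter(word for toks in tokenized.values() for word in set(toks))
    n = len(docs)
    return {
        name: {w: c / len(toks) * math.log((1 + n) / (1 + df[w]))
               for w, c in Counter(toks).items()}
        for name, toks in tokenized.items()
    }

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(query):
    """Return the workflow whose description is most similar to the query."""
    corpus = dict(workflows)
    corpus["__query__"] = query  # include the query so it shares IDF weights
    vecs = tf_idf_vectors(corpus)
    q = vecs.pop("__query__")
    return max(vecs, key=lambda name: cosine(q, vecs[name]))

print(recommend("how do I count gene expression from rna reads"))
```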
Journal of Applied Hydrography
Focus topic: International Issue: Joint publication by AFHy and DHyG for HYDRO 2
Intelligent computing: the latest advances, challenges and future
Computing is a critical driving force in the development of human civilization. In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting the digital revolution in the era of big data, artificial intelligence, and the internet of things, with new computing theories, architectures, methods, systems, and applications. Intelligent computing has greatly broadened the scope of computing, extending it from traditional computing on data to increasingly diverse computing paradigms such as perceptual intelligence, cognitive intelligence, autonomous intelligence, and human-computer fusion intelligence. Intelligence and computing have long followed different paths of evolution and development but have become increasingly intertwined in recent years: intelligent computing is not only intelligence-oriented but also intelligence-driven. Such cross-fertilization has prompted the emergence and rapid advancement of intelligent computing.
A large-scale dataset for end-to-end table recognition in the wild
Table recognition (TR) is one of the research hotspots in pattern
recognition, which aims to extract information from tables in an image. Common
table recognition tasks include table detection (TD), table structure
recognition (TSR) and table content recognition (TCR). TD is to locate tables
in the image, TCR recognizes text content, and TSR recognizes the spatial and logical
structure. Currently, the end-to-end TR in real scenarios, accomplishing the
three sub-tasks simultaneously, is yet an unexplored research area. One major
factor that inhibits researchers is the lack of a benchmark dataset. To this
end, we propose a new large-scale dataset named Table Recognition Set
(TabRecSet) with diverse table forms sourcing from multiple scenarios in the
wild, providing complete annotation dedicated to end-to-end TR research. It is
the first and largest bi-lingual dataset for end-to-end TR, with 38.1K tables
of which 20.4K are in English and 17.7K are in Chinese. The samples take
diverse forms, such as border-complete and border-incomplete tables, and
regular and irregular tables (rotated, distorted, etc.). The in-the-wild
scenarios are varied, ranging from scanned to camera-taken images, documents
to Excel tables, and educational test papers to financial invoices. The
annotations are complete, consisting of table body spatial annotation, cell
spatial and logical annotation, and text content for TD, TSR, and TCR,
respectively. The spatial
annotation utilizes the polygon instead of the bounding box or quadrilateral
adopted by most datasets. The polygon spatial annotation is more suitable for
irregular tables that are common in wild scenarios. Additionally, we propose a
visualized and interactive annotation tool named TableMe to improve the
efficiency and quality of table annotation.
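The annotation scheme described above pairs spatial polygons with logical coordinates and cell text. The sketch below shows one plausible shape for such a record, plus the shoelace formula that makes polygon annotation strictly more informative than a bounding box for rotated tables. The field names and helper are assumptions for illustration, not TabRecSet's actual schema.

```python
# Illustrative record combining the three annotation kinds the dataset covers:
# a table-body polygon (TD), per-cell polygons with logical row/column
# coordinates (TSR), and cell text (TCR). Field names are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Cell:
    polygon: list        # [(x, y), ...] spatial annotation
    row: int             # logical coordinates (TSR)
    col: int
    text: str            # content annotation (TCR)

@dataclass
class TableAnnotation:
    body_polygon: list
    cells: list = field(default_factory=list)

def polygon_area(points):
    """Shoelace formula: area of a simple polygon. For rotated or distorted
    tables, the polygon covers only the table, where a bounding box over-covers."""
    n = len(points)
    s = sum(points[i][0] * points[(i + 1) % n][1]
            - points[(i + 1) % n][0] * points[i][1] for i in range(n))
    return abs(s) / 2

# A square table rotated 45 degrees: its polygon area is 2.0, while its
# axis-aligned bounding box (2 x 2) would cover area 4.0.
body = [(1, 0), (2, 1), (1, 2), (0, 1)]
table = TableAnnotation(body_polygon=body,
                        cells=[Cell(polygon=body, row=0, col=0, text="42")])
print(polygon_area(table.body_polygon))  # 2.0
```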
Improved Naive Bayes with Mislabeled Data
Labeling mistakes are frequently encountered in real-world applications. If
not handled properly, labeling mistakes can seriously degrade a model's
classification performance. To address this issue, we propose an
improved Naive Bayes method for text classification. It is analytically simple
and free of subjective judgements on the correct and incorrect labels. By
specifying the generating mechanism of incorrect labels, we optimize the
corresponding log-likelihood function iteratively by using an EM algorithm. Our
simulation and experimental results show that the improved Naive Bayes method
performs much better than the standard Naive Bayes method on mislabeled
data.
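The EM idea sketched in the abstract can be illustrated under strong simplifying assumptions: a two-class Bernoulli Naive Bayes where each observed label is flipped with unknown probability eps. The E-step computes the posterior of the true label given the features and the (possibly wrong) observed label; the M-step re-estimates the class prior, the feature probabilities, and eps. This toy sketch is not the paper's actual model or likelihood.

```python
# Toy EM for Naive Bayes with label noise: observed labels may be flipped
# with probability eps. All data and initial values are illustrative.

def likelihood(x, theta):
    """P(x | true class) under Bernoulli Naive Bayes feature probs theta."""
    p = 1.0
    for xi, ti in zip(x, theta):
        p *= ti if xi else (1 - ti)
    return p

def em_naive_bayes(X, y_obs, n_iter=30):
    d = len(X[0])
    prior, eps = 0.5, 0.2                      # P(true=1), flip probability
    theta = [[0.3] * d, [0.7] * d]             # feature probs per true class
    for _ in range(n_iter):
        # E-step: posterior r_i = P(true_i = 1 | x_i, observed label)
        r = []
        for x, yo in zip(X, y_obs):
            p1 = prior * likelihood(x, theta[1]) * ((1 - eps) if yo == 1 else eps)
            p0 = (1 - prior) * likelihood(x, theta[0]) * ((1 - eps) if yo == 0 else eps)
            r.append(p1 / (p1 + p0))
        # M-step: update prior, per-class feature probabilities, and eps
        n = len(X)
        prior = sum(r) / n
        for c in (0, 1):
            w = [ri if c == 1 else 1 - ri for ri in r]
            for j in range(d):
                num = sum(wi * x[j] for wi, x in zip(w, X))
                theta[c][j] = (num + 1) / (sum(w) + 2)   # Laplace smoothing
        eps = sum(ri if yo == 0 else 1 - ri for ri, yo in zip(r, y_obs)) / n
        eps = min(max(eps, 1e-6), 0.49)        # keep the flip rate identifiable
    return prior, theta, eps

# Class 1 tends to have features on, class 0 off; one label is flipped.
X = [[1, 1, 0], [1, 1, 1], [0, 0, 0], [0, 1, 0], [1, 1, 1], [0, 0, 1]]
y_obs = [1, 1, 0, 0, 0, 0]                     # fifth sample mislabeled as 0
prior, theta, eps = em_naive_bayes(X, y_obs)

def predict(x):
    p1 = prior * likelihood(x, theta[1])
    p0 = (1 - prior) * likelihood(x, theta[0])
    return 1 if p1 > p0 else 0

print([predict(x) for x in X])
```

Because eps is estimated jointly, the E-step can discount an observed label that conflicts strongly with the features, which is the mechanism that lets the model cope with mislabeled samples.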
A Data-driven Approach to Large Knowledge Graph Matching
In the last decade, a remarkable number of open Knowledge Graphs (KGs) have been developed, such as DBpedia, NELL, and YAGO. While some of these KGs are curated via crowdsourcing platforms, others are semi-automatically constructed. This has resulted in a significant degree of semantic heterogeneity and overlapping facts. KGs are highly complementary; thus, mapping them can benefit intelligent applications that require integrating different KGs, such as recommendation systems, query answering, and semantic web navigation.
Although the problem of ontology matching has been investigated and a significant number of systems have been developed, the challenges of mapping large-scale KGs remain significant. KG matching has been a topic of interest in the Semantic Web community since it was introduced to the Ontology Alignment Evaluation Initiative (OAEI) in 2018. Nonetheless, a major limitation of the current benchmarks is their lack of representation of real-world KGs. This work also highlights a number of limitations of current matching methods: (i) they are highly dependent on string-based similarity measures, and (ii) they are primarily built to handle well-formed ontologies. These characteristics make them unsuitable for large, (semi/fully) automatically constructed KGs with hundreds of classes and millions of instances.
This work addresses the limitation of the current datasets by first introducing two gold standard datasets for matching the schema of large, automatically constructed, less-well-structured KGs based on common KGs such as NELL, DBpedia, and Wikidata. We believe that the datasets we make public in this work constitute the largest domain-independent benchmarks for matching KG classes. As many state-of-the-art methods are not suitable for matching large-scale and cross-domain KGs that often suffer from highly imbalanced class distributions, recent studies have revisited instance-based matching techniques for this task. This is because such large KGs often lack a well-defined structure and descriptive metadata about their classes, but contain numerous class instances. Therefore, inspired by the role of instances in KGs, we propose a hybrid matching approach. Our method combines an instance-based matcher, which casts the schema-matching process as a text classification task by exploiting instances of KG classes, with a string-based matcher. Our method is domain-independent and able to handle KG classes with imbalanced populations. Further, we show that combining an instance-based approach with an appropriate data balancing strategy yields significant improvements in matching large and common KG classes.
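The hybrid idea of casting schema matching as text classification over instances, blended with a string-based matcher on class names, can be sketched as below. The class names, instances, and blending weight are invented for illustration and do not reproduce the work's actual matchers.

```python
# Hedged sketch of a hybrid KG class matcher: an instance-based matcher
# (classify KG2 instances into KG1 classes by token overlap) combined with
# a string-based matcher on class names. All data here is illustrative.

from collections import Counter
from difflib import SequenceMatcher

kg1 = {  # KG1 class -> instance descriptions
    "City": ["london capital city england", "paris capital city france"],
    "Company": ["acme software company", "globex trading company"],
}
kg2 = {  # KG2 class -> instance descriptions
    "Municipality": ["berlin capital city germany", "madrid capital city spain"],
    "Corporation": ["initech software company"],
}

def classify_instance(text, class_profiles):
    """Assign an instance to the KG1 class sharing the most tokens with it."""
    tokens = set(text.split())
    return max(class_profiles,
               key=lambda c: len(tokens & class_profiles[c]))

def instance_score(c2):
    """Classify every instance of KG2 class c2; return the majority KG1
    class and the fraction of instances that voted for it."""
    profiles = {c: set(" ".join(ins).split()) for c, ins in kg1.items()}
    votes = Counter(classify_instance(t, profiles) for t in kg2[c2])
    best, n = votes.most_common(1)[0]
    return best, n / len(kg2[c2])

def string_score(c1, c2):
    return SequenceMatcher(None, c1.lower(), c2.lower()).ratio()

def match(c2, w=0.7):
    """Blend instance-based and string-based evidence (weight w is a guess)."""
    best_inst, inst_conf = instance_score(c2)
    scores = {c1: w * (inst_conf if c1 == best_inst else 0.0)
              + (1 - w) * string_score(c1, c2) for c1 in kg1}
    return max(scores, key=scores.get)

print(match("Municipality"), match("Corporation"))
```

Note how the instance evidence carries the match even when the class names share no surface similarity ("City" vs "Municipality"), which is the motivation the abstract gives for moving beyond string-based measures.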
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is a
complex task that needs proper guidance and direction. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our presented study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive to date, and it
allows users, scholars, and entrepreneurs to gain an in-depth understanding of
the Metaverse ecosystem and identify their opportunities for contribution.