Review of Current Student-Monitoring Techniques Used in eLearning-Focused Recommender Systems and Learning Analytics: The Experience API & LIME Model Case Study
Recommender systems require input information in order to operate properly and deliver content or behaviour suggestions to end users, and eLearning scenarios are no exception. Users are current students, and recommendations can be built upon learning paths (both formal and informal), relationships, behaviours, friends, followers, actions, grades, tutor interaction, etc. A recommender system must somehow retrieve, categorize and work with all these details. There are several ways to do so: from raw and inelegant database access to more curated web APIs, or even HTML scraping. New server-centric user-action logging and monitoring standard technologies have been presented in recent years by several groups, organizations and standards bodies. The Experience API (xAPI), detailed in this article, is one of them. In the first part of this paper we analyse current learner-monitoring techniques as an initialization phase for eLearning recommender systems. We then review standardization efforts in this area; finally, we focus on xAPI and its potential interaction with the LIME model, which is also summarized below.
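The xAPI tracks learner actions as actor-verb-object statements sent to a Learning Record Store. A minimal sketch of building such a statement in Python (the student email, course and activity URLs are hypothetical placeholders; real deployments would POST the JSON to an LRS endpoint):

```python
import json

def make_statement(actor_email, verb_id, verb_name, activity_id):
    """Build a minimal xAPI statement: an actor-verb-object triple."""
    return {
        "actor": {"mbox": f"mailto:{actor_email}", "objectType": "Agent"},
        "verb": {"id": verb_id, "display": {"en-US": verb_name}},
        "object": {"id": activity_id, "objectType": "Activity"},
    }

# hypothetical example: a student completing a quiz
stmt = make_statement(
    "student@example.org",
    "http://adlnet.gov/expapi/verbs/completed",
    "completed",
    "http://example.org/courses/recsys-101/quiz-1",
)
print(json.dumps(stmt, indent=2))
```

A recommender could then query the LRS for such statements to initialize its user model, instead of scraping HTML or reading the database directly.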
Negative Statements Considered Useful
Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs store only positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inference, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities.
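The peer-based idea can be sketched in a few lines: properties that are frequent among an entity's highly related peers but absent from the entity itself become ranked negative candidates, with peer frequency as a simple unsupervised ranking signal (the entities and properties below are a hypothetical toy example, not the paper's data):

```python
from collections import Counter

def candidate_negations(target_props, peer_props_list):
    """Peer-based statistical inference (simplified): properties common
    among peers but absent from the target become negative candidates,
    ranked by the fraction of peers that have them."""
    counts = Counter(p for peers in peer_props_list for p in peers)
    n = len(peer_props_list)
    return sorted(
        ((p, c / n) for p, c in counts.items() if p not in target_props),
        key=lambda x: -x[1],
    )

# toy data: properties held by peers of a scientist entity
target = {"award:Turing"}
peers = [{"award:Turing", "award:Nobel"},
         {"award:Nobel"},
         {"award:Nobel", "member:NAS"}]
print(candidate_negations(target, peers))
# top-ranked candidate negation: ("award:Nobel", 1.0)
```

The paper's full method additionally ranks candidates with supervised features; this sketch shows only the peer-comparison core.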
OBOME - Ontology-Based Opinion Mining in UBIPOL
Ontologies have a special role in the UBIPOL system: they help to structure the policy-related context, provide a conceptualization of the policy domain, and are used in the opinion mining process. In this work we present a system called Ontology Based Opinion Mining Engine (OBOME) for analyzing a domain-specific opinion corpus by first assisting the user with the creation of a domain ontology from the corpus, and then determining the polarity of opinion on the various domain aspects. In the former step, the policy domain aspects are identified (namely, which policy category is represented by each concept). This identification is supported by the policy modelling ontology, which describes the most important policy-related classes and structures. Then the most informative documents are extracted from the corpus, and the user is asked to create a set of aspects and related keywords using these documents. In the latter step, we use the corpus-specific ontology to model the domain and extract aspect-polarity associations using grammatical dependencies between words. The summarized results are then shown to the user to analyze and store. Finally, the policy modelling ontology is updated in an offline process.
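A minimal sketch of aspect-polarity association, assuming a keyword-based aspect set and a small sentiment lexicon; it associates sentiment words with aspect keywords in the same sentence, a crude stand-in for the grammatical-dependency association OBOME actually uses (all aspect names and lexicon entries below are hypothetical):

```python
def aspect_polarity(text, aspect_keywords, lexicon):
    """Toy aspect-polarity extraction: sum the sentiment of words in each
    sentence and attribute it to the aspect keywords found there."""
    scores = {}
    for sentence in text.lower().split("."):
        tokens = sentence.split()
        aspects = [t for t in tokens if t in aspect_keywords]
        if not aspects:
            continue
        s = sum(lexicon.get(t, 0) for t in tokens)
        for a in aspects:
            scores[a] = scores.get(a, 0) + s
    return scores

lex = {"good": 1, "excellent": 2, "poor": -1, "slow": -1}
result = aspect_polarity(
    "The transport policy is excellent. Healthcare funding is poor",
    {"transport", "healthcare"}, lex)
print(result)
# → {'transport': 2, 'healthcare': -1}
```

Dependency parsing, as used in OBOME, would attach sentiment to the syntactically governed aspect rather than to everything in the sentence, avoiding misattribution when several aspects co-occur.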
Supporting the Maintenance of Identifier Names: A Holistic Approach to High-Quality Automated Identifier Naming
A considerable part of source code consists of identifier names: unique lexical tokens that provide information about entities, and entity interactions, within the code. Identifier names provide human-readable descriptions of classes, functions, variables, etc. Poor or ambiguous identifier names (i.e., names that do not correctly describe the code behavior they are associated with) lead developers to spend more time working towards understanding the code's behavior. Bad naming can also have detrimental effects on tools that rely on natural language clues, degrading the quality of their output and making them unreliable. Additionally, misinterpretations of the code, caused by poor names, can result in the injection of quality issues into the system under maintenance. Thus, improved identifier naming leads to greater developer effectiveness, higher-quality software, and higher-quality software analysis tools. In this dissertation, I establish several novel concepts that help measure and improve the quality of identifiers. The output of this dissertation work is a set of identifier name appraisal and quality tools that integrate into the developer workflow. Through a sequence of empirical studies, I have formulated a series of heuristics and linguistic patterns to evaluate the quality of identifier names in code and provide naming structure recommendations. I envision, and am working towards, supporting developers in integrating the contributions discussed in this dissertation into their development workflow, to significantly improve the process of crafting and maintaining high-quality identifier names in source code.
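To make the idea of name-quality heuristics concrete, here is an illustrative sketch of a few simple checks; these particular rules are invented for illustration and are not the dissertation's actual linguistic patterns:

```python
import re

def name_quality_flags(identifier, is_boolean=False):
    """Flag possible identifier-name problems using toy heuristics
    (illustrative only): length, boolean prefixes, fragment words."""
    # split camelCase/snake_case into word fragments
    words = re.findall(r"[A-Za-z][a-z]*", identifier)
    flags = []
    if len(identifier) <= 2:
        flags.append("too short to be descriptive")
    if is_boolean and words and words[0].lower() not in {"is", "has", "can", "should"}:
        flags.append("boolean lacks is/has/can/should prefix")
    if any(len(w) == 1 for w in words):
        flags.append("single-letter word fragment")
    return flags

print(name_quality_flags("x"))
print(name_quality_flags("done", is_boolean=True))
print(name_quality_flags("isReady", is_boolean=True))  # no flags
```

Real appraisal tools combine part-of-speech tagging of the name's words with the declared type and surrounding code context, which simple string rules like these cannot capture.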
Prioritization of Software and System Requirements through Natural Language Processing for Testing Software
Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2021. Safety-critical systems have been a constant and increasing presence in industrial production, such as railways and vehicles. These systems are highly configurable and must be intensively tested by system engineers before being delivered to customers. This process is highly time-consuming and might require associations between product features and the requirements demanded by customers. Requirement prioritization seeks to recognize the most relevant requirements of a system, aiming to reduce the cost and time of the testing process. Machine learning has been shown to be useful in helping engineers with this task, automating associations between features and requirements. However, its application is more difficult when requirements are written in natural language and no ground-truth dataset exists for them. In our work, we present ARRINA, a Natural Language Processing-based recommendation system able to extract and associate components from safety-critical systems with their specifications written in natural language, and to process customer requirements and map them to components. The system integrates a Weighted Association Rule Mining framework to extract the components and their associations, and generates visualizations that can help engineers understand which components are generally introduced in project requirements. The system also includes a recommendation framework that can associate input requirements with existing subsystems, reducing engineers' effort in requirement analysis and prioritization. We performed several experiments to evaluate the different components of ARRINA over four railway subsystems and input requirements. The system achieved 90% accuracy, which demonstrates its value in reducing the time engineers spend discovering the correct subsystem links and prioritizing requirements for the testing process.
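The requirement-to-subsystem recommendation step can be sketched with simple keyword-subsystem association weights mined from past labeled requirements; this is a simplified stand-in for ARRINA's weighted association rule mining, and all requirement texts and subsystem names below are hypothetical:

```python
from collections import defaultdict

def train_associations(labeled_reqs):
    """Mine keyword -> subsystem association weights from past
    (requirement text, subsystem) pairs."""
    weights = defaultdict(lambda: defaultdict(int))
    for text, subsystem in labeled_reqs:
        for token in set(text.lower().split()):
            weights[token][subsystem] += 1
    return weights

def recommend(weights, requirement):
    """Score subsystems by summing the weights of the requirement's tokens."""
    scores = defaultdict(int)
    for token in set(requirement.lower().split()):
        for subsystem, w in weights[token].items():
            scores[subsystem] += w
    return max(scores, key=scores.get) if scores else None

# hypothetical training pairs (requirement text, owning subsystem)
data = [("brake release shall be signalled", "Braking"),
        ("door open command requires zero speed", "Doors"),
        ("emergency brake shall apply within 2 s", "Braking")]
w = train_associations(data)
print(recommend(w, "the brake controller shall log release events"))
# → Braking
```

Association rule mining would additionally filter pairs by support and confidence thresholds; the raw co-occurrence counts above keep the sketch short.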
Using data-driven sublanguage pattern mining to induce knowledge models: application in medical image reports knowledge representation
Background: The use of knowledge models facilitates information retrieval and knowledge base development, and therefore supports new knowledge discovery that ultimately enables decision support applications. Most existing works have employed machine learning techniques to construct a knowledge base. However, they often suffer from low precision in extracting entities and relationships. In this paper, we describe a data-driven sublanguage pattern mining method that can be used to create a knowledge model. We combined natural language processing (NLP) and semantic network analysis in our model generation pipeline.
Methods: As a use case of our pipeline, we utilized data from an open source imaging case repository, Radiopaedia.org, to generate a knowledge model that represents the contents of medical imaging reports. We extracted entities and relationships using the Stanford part-of-speech parser and the "Subject:Relationship:Object" syntactic data schema. The identified noun phrases were tagged with Unified Medical Language System (UMLS) semantic types. An evaluation was done on a dataset comprising 83 image notes from four data sources.
Results: A semantic type network was built based on the co-occurrence of 135 UMLS semantic types in 23,410 medical image reports. By regrouping the semantic types and generalizing the semantic network, we created a knowledge model that contains 14 semantic categories. Our knowledge model was able to cover 98% of the content in the evaluation corpus and revealed 97% of the relationships. Machine annotation achieved a precision of 87%, recall of 79%, and F-score of 82%.
Conclusion: The results indicated that our pipeline was able to produce a comprehensive content-based knowledge model that could represent context from various sources in the same domain.
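A toy sketch of the "Subject:Relationship:Object" extraction step, using a small verb list and a regular expression in place of the Stanford parser the paper actually uses (the verb list and report sentence are invented for illustration):

```python
import re

def extract_sro(sentence):
    """Very rough Subject:Relationship:Object extraction: split a sentence
    around a known relation verb (a stand-in for full POS parsing)."""
    verbs = r"(shows|demonstrates|reveals|contains|involves)"
    m = re.search(rf"^(.*?)\s{verbs}\s(.*)$", sentence.strip().rstrip("."))
    if not m:
        return None
    return {"subject": m.group(1),
            "relationship": m.group(2),
            "object": m.group(3)}

print(extract_sro("Chest radiograph shows right lower lobe consolidation."))
# → {'subject': 'Chest radiograph', 'relationship': 'shows',
#    'object': 'right lower lobe consolidation'}
```

In the full pipeline, the extracted subject and object noun phrases would then be mapped to UMLS semantic types, and the co-occurrences aggregated into the semantic network described above.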
Theory and Applications for Advanced Text Mining
Due to the growth of computer and web technologies, we can easily collect and store large amounts of text data, and we can expect that such data include useful knowledge. Text mining techniques have been studied aggressively since the late 1990s in order to extract this knowledge from the data. Even though many important techniques have been developed, the text mining research field continues to expand to meet the needs arising from various application fields. This book is composed of 9 chapters introducing advanced text mining techniques, ranging from relation extraction to the treatment of under-resourced languages. I believe that this book will provide new knowledge in the text mining field and help many readers open up new research fields.