304 research outputs found

    Complexity modelling for case knowledge maintenance in case-based reasoning.

    Case-based reasoning solves new problems by re-using the solutions of previously solved similar problems and is popular because many of the knowledge engineering demands of conventional knowledge-based systems are removed. The content of the case knowledge container is critical to the performance of case-based classification systems. However, the knowledge engineer is given little support in the selection of suitable techniques to maintain and monitor the case base. This research investigates the coverage, competence and problem-solving capacity of case knowledge with the aim of developing techniques to model and maintain the case base. We present a novel technique that creates a model of the case base by measuring the uncertainty in local areas of the problem space based on the local mix of solutions present. The model provides an insight into the structure of a case base by means of a complexity profile that can assist maintenance decision-making and provide a benchmark to assess future changes to the case base. The distribution of cases in the case base is critical to the performance of a case-based reasoning system. We argue that classification boundaries represent important regions of the problem space and develop two complexity-guided algorithms which use boundary identification techniques to actively discover cases close to boundaries. We introduce a complexity-guided redundancy reduction algorithm which uses a case complexity threshold to retain cases close to boundaries and delete cases that form single-class clusters. The algorithm offers control over the balance between maintaining competence and reducing case base size. The performance of a case-based reasoning system relies on the integrity of its case base, but in real-life applications the available data invariably contains erroneous, noisy cases. Automated removal of these noisy cases can improve system accuracy. In addition, error rates can often be reduced by removing cases to give smoother decision boundaries between classes. We show that the optimal level of boundary smoothing is domain-dependent and, therefore, our approach to error reduction reacts to the characteristics of the domain by setting an appropriate level of smoothing. We introduce a novel algorithm which identifies and removes both noisy and boundary cases with the aid of a local distance ratio. A prototype interface has been developed that shows how the modelling and maintenance approaches can be used in practice in an interactive manner. The interface allows the knowledge engineer to make informed maintenance choices without the need for extensive evaluation effort while, at the same time, retaining control over the process. One of the strengths of our approach is in applying a consistent, integrated method to case base maintenance to provide a transparent process that gives a degree of explanation.
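
    As a concrete illustration of the kind of complexity-guided reduction described above, the following is a minimal sketch reconstructed from the abstract alone, not the thesis's exact algorithm: each case is scored by the fraction of its k nearest neighbours that carry a different class label, and low-complexity cases (those inside single-class clusters) are deleted. The parameters `k` and `threshold` are illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def complexity_profile(X, y, k=5):
    """Local complexity per case: the fraction of its k nearest
    neighbours that carry a different class label."""
    y = np.asarray(y)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)            # idx[:, 0] is the case itself
    neighbour_labels = y[idx[:, 1:]]     # labels of the k true neighbours
    return (neighbour_labels != y[:, None]).mean(axis=1)

def reduce_redundancy(X, y, k=5, threshold=0.2):
    """Keep cases near class boundaries (high local complexity);
    delete cases sitting inside single-class clusters."""
    keep = complexity_profile(X, y, k) >= threshold
    return X[keep], np.asarray(y)[keep]
```

    Raising the threshold deletes more interior cases, trading case base size against competence, which mirrors the balance the abstract describes.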

    An Analysis of the Structure of the Fante Verb With Special Reference to Tone and Glottalisation.

    The tonal phonemes which occur in utterances containing only one sentence are (i) high tone, (ii) downstep between successive high tones, and (iii) a slight rise towards the end of a prepausal high tone. The phonemic status of the second and third of these is very largely accounted for by low tones becoming high in agreement with adjacent high tones; downstep is basically an automatic feature of the second of two high tones which are separated by one or more low tones, but if a low tone between two high tones becomes high in tonal agreement with the preceding or following high, the downstep remains, occurring between the agreeing high and the high with which it is not in agreement. The slight rise towards the end of a prepausal high tone is basically an automatic feature of a high tone which is in pause and is borne by a tone-bearing unit without a final glottal stop, but if a low tone becomes high in pause in agreement with the preceding high it does not have the slight rise. The remaining occurrences of downstep and non-occurrences of the slight rise can be accounted for by the postulation of zero tone-bearing units with low or high tone (which mostly turn out to correspond to non-zero tone-bearing units in other dialects or languages). The glottal stop is an accentual rather than a consonantal phoneme. It sometimes represents a separate morpheme which might reasonably be looked upon as a morpheme of intonation, but apart from that it is basically an automatic feature of a tone-bearing unit of the pattern consonant-vowel-consonant which is in pause.

    Content type profiling of data-to-text generation datasets.

    Data-to-Text Generation (D2T) problems can be considered as a stream of time-stamped events with a text summary being produced for each. The problem becomes more challenging when event summaries contain complex insights derived from multiple records, either within an event or across several events from the event stream. It is important to understand the different types of content present in the summary to help us better define the system requirements so that we can build better systems. In this paper, we propose a novel typology of content types that we use to classify the contents of event summaries. Using the typology, a profile of a dataset is generated as the distribution of the aggregated content types, which captures the specific characteristics of the dataset and gives a measure of the complexity of the problem. Through extensive experiments on different D2T datasets, we demonstrate that neural generative systems particularly struggle to generate content of complex types, highlighting the need for improved D2T techniques.
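
    As a hedged sketch only: a dataset profile of the kind described could be computed as the distribution of content types over all summary sentences. The `CONTENT_TYPES` labels and the `classify_sentence` function below are illustrative placeholders, not the paper's actual typology or classifier.

```python
from collections import Counter

# Illustrative labels only; the paper's actual typology may differ.
CONTENT_TYPES = ["basic_record", "within_event_insight", "across_event_insight"]

def profile_dataset(summaries, classify_sentence):
    """Profile a D2T dataset as the distribution of content types
    over all sentences of all event summaries."""
    counts = Counter()
    for summary in summaries:
        for sentence in summary.split(". "):
            counts[classify_sentence(sentence)] += 1  # yields a CONTENT_TYPES label
    total = sum(counts.values()) or 1
    return {t: counts[t] / total for t in CONTENT_TYPES}
```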

    Improving e-learning recommendation by using background knowledge.

    A large number of e-Learning resources are currently available to learners on the Web. However, learners often have difficulty finding and retrieving relevant materials to support their learning goals because they lack the domain knowledge to craft effective queries that convey what they wish to learn. In addition, the unfamiliar vocabulary often used by domain experts makes it difficult to map a learner's query to a relevant learning material. We address these challenges by introducing an innovative method that automatically builds background knowledge for a learning domain. In creating our method, we exploit a structured collection of teaching materials as a guide for identifying the important domain concepts. We enrich the identified concepts with text discovered from an encyclopedia, thereby increasing the richness of our acquired knowledge. We employ the developed background knowledge to influence the representation and retrieval of learning resources and so improve e-Learning recommendation. The effectiveness of our method is evaluated using a collection of Machine Learning and Data Mining papers. Our method outperforms the benchmark, demonstrating the advantage of using background knowledge for improving the representation and recommendation of e-Learning materials.
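
    One way such background knowledge could be assembled, sketched under stated assumptions: concept names are taken from a structured collection of teaching materials (e.g. section titles) and each is enriched with encyclopedia text. The third-party `wikipedia` package is an illustrative choice here, not necessarily the source used in the paper.

```python
import wikipedia  # third-party package: pip install wikipedia

def build_background_knowledge(concept_names):
    """Map each domain concept (e.g. a section title from a structured
    collection of teaching materials) to enriching encyclopedia text."""
    knowledge = {}
    for concept in concept_names:
        try:
            knowledge[concept] = wikipedia.summary(concept, sentences=5)
        except wikipedia.exceptions.WikipediaException:
            knowledge[concept] = ""  # no usable article; keep concept unenriched
    return knowledge
```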

    Angles of vision: digital storytelling on the cosmic tide?

    In this report, a collaboration between Robert Gordon University and the University of the Highlands and Islands Institute for Northern Studies, the authors bring together findings from four workshops hosted as part of the My Orkney Story project. The report aims to address the opportunities and challenges of developing digital storytelling platforms through the lens of Orkney as a case study. However, the findings are also intended to have wider relevance to the development and implementation of digital storytelling platforms at both a local and an international level.

    Harnessing background knowledge for e-learning recommendation.

    The growing availability of good-quality, learning-focused content on the Web makes it an excellent source of resources for e-learning systems. However, learners can find it hard to retrieve material well aligned with their learning goals because of the difficulty in assembling effective keyword searches, due both to an inherent lack of domain knowledge and to the unfamiliar vocabulary often employed by domain experts. We take a step towards bridging this semantic gap by introducing a novel method that automatically creates custom background knowledge in the form of a set of rich concepts related to the selected learning domain. Further, we develop a hybrid approach that allows the background knowledge to influence retrieval in the recommendation of new learning materials by leveraging the vocabulary associated with our discovered concepts in the representation process. We evaluate the effectiveness of our approach on a dataset of Machine Learning and Data Mining papers and show it to outperform the benchmark methods. This paper won the Donald Michie Memorial Award for Best Technical Paper at AI-2016.
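
    A minimal sketch of how discovered concept vocabulary might influence retrieval in a hybrid recommender: the learner's query is expanded with the enriched text of any background concept it mentions before TF-IDF ranking. The substring matching and the TF-IDF/cosine pipeline are illustrative choices, not the paper's exact method.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def recommend(query, documents, background, top_n=5):
    """Rank documents against a query expanded with background-knowledge
    text for every concept the query mentions."""
    expansion = " ".join(text for concept, text in background.items()
                         if concept.lower() in query.lower())
    vec = TfidfVectorizer(stop_words="english")
    doc_matrix = vec.fit_transform(documents)          # learn vocabulary from corpus
    q_vec = vec.transform([query + " " + expansion])   # enriched query representation
    scores = cosine_similarity(q_vec, doc_matrix).ravel()
    return scores.argsort()[::-1][:top_n]              # indices of best matches
```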

    Case-based approach to automated natural language generation for obituaries.

    Automated generation of human-readable text from structured information is challenging because grammatical rules are complex, making good-quality output difficult to achieve. Textual Case-Based Reasoning provides one approach, in which the text from previously solved examples with similar inputs is reused as a template solution to generate text for the current problem. Evaluating the quality of generated text also poses a challenge in Natural Language Generation, due to the high cost of human labelling and the variety of potentially good solutions. In this paper, we propose two case-based approaches for reusing text to automatically generate an obituary from a set of input attribute-value pairs. The case base is acquired by crawling and then tagging existing solutions published on the web to create cases as problem-solution pairs. We evaluate the quality of the text generation system with a novel unsupervised case alignment metric based on normalised discounted cumulative gain, which we compare to a supervised approach and to human evaluation. Initial results show that our proposed evaluation measure is effective and correlates well with average attribute error, a crude surrogate for human feedback. The system is being deployed in a real-world application with a startup company in Aberdeen to produce automated obituaries.
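
    The sketch below gives one plausible reading of an NDCG-based case alignment score, assuming nothing beyond the abstract: neighbours retrieved by problem-side similarity should also rank highly by solution-side similarity, and NDCG rewards rankings where they do. The exact formulation in the paper may differ, and the inputs here are illustrative.

```python
import numpy as np

def ndcg(relevances, k=None):
    """Normalised discounted cumulative gain for a ranked relevance list."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = (rel * discounts).sum()
    ideal = (np.sort(rel)[::-1] * discounts).sum()   # best possible ordering
    return dcg / ideal if ideal > 0 else 0.0

def case_alignment(problem_rank, solution_sims):
    """Alignment of retrieval: `problem_rank` lists neighbour indices ordered
    by problem-side similarity; `solution_sims` maps a neighbour index to its
    solution-side similarity. Perfect alignment scores 1.0."""
    return ndcg([solution_sims[i] for i in problem_rank])
```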

    Music recommendation: audio neighbourhoods to discover music in the long tail.

    Millions of people use online music services every day and recommender systems are essential for browsing these music collections. Users are looking for high-quality recommendations, but also want to discover tracks and artists that they do not already know, newly released tracks, and the more niche music found in the 'long tail' of online music. Tag-based recommenders are not effective in this long tail because relatively few people are listening to these tracks and so tagging tends to be sparse. However, similarity neighbourhoods in audio space can provide additional tag knowledge that is useful for augmenting sparse tagging. A new recommender exploits the combined knowledge from audio and tagging, using a hybrid representation that extends the track's tag-based representation by adding semantic knowledge extracted from the tags of similar music tracks. A user evaluation and a larger experiment using Last.fm user data both show that the new hybrid recommender provides better-quality recommendations than using only tags, together with a higher level of discovery of unknown and niche music. This approach of augmenting the representation for items that have missing information, with corresponding information from similar items in a complementary space, offers opportunities beyond content-based music recommendation.
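
    As a hedged sketch of the hybrid representation idea, assuming audio-space neighbours have been precomputed: a track's sparse tag set is augmented with the most frequent tags of its audio neighbours. The function name and the `top_n` cut-off are illustrative, not the paper's exact formulation.

```python
from collections import Counter

def hybrid_representation(track_tags, audio_neighbours, neighbour_tags, top_n=10):
    """Extend a track's sparse tag set with the most frequent tags of its
    audio-space nearest neighbours. `audio_neighbours` is a list of neighbour
    track ids; `neighbour_tags` maps a track id to its list of tags."""
    borrowed = Counter(tag for n in audio_neighbours for tag in neighbour_tags[n])
    augmented = set(track_tags)
    augmented.update(tag for tag, _ in borrowed.most_common(top_n))
    return augmented
```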

    Fall prediction using behavioural modelling from sensor data in smart homes.

    The number of methods for identifying potential fall risk is growing as the rate of elderly fallers continues to rise in the UK. Assessments for identifying risk of falling are usually performed in hospitals and other laboratory environments; however, these are costly and cause inconvenience for the subject and health services. Replacing these intrusive testing methods with a passive in-home monitoring solution would provide a less time-consuming and cheaper alternative. As sensors become more readily available, machine learning models can be applied to the large amount of data they produce. This can support activity recognition, fall detection, prediction and risk determination. In this review, the growing complexity of sensor data, the required analysis, and the machine learning techniques used to determine risk of falling are explored. The current research on using passive monitoring in the home is discussed, while the viability of active monitoring using vision-based and wearable sensors is considered. Methods of fall detection, prediction and risk determination are then compared.

    Wifi-based human activity recognition using Raspberry Pi.

    Ambient, non-intrusive approaches to smart home health monitoring, while limited in capability, are preferred by residents. More intrusive methods of sensing, such as video and wearables, can offer richer data but at the cost of lower resident uptake, in part due to privacy concerns. A radio frequency-based approach to sensing, Channel State Information (CSI), can make use of low-cost off-the-shelf WiFi hardware. We have implemented an activity recognition system on the Raspberry Pi 4, one of the world’s most popular embedded boards, to demonstrate its capability for activity recognition. This involves performing data collection, interpretation and windowing, before supplying the data to a classification model. In this paper, the capabilities of the Raspberry Pi 4 at performing activity recognition on CSI data are investigated. We have developed and publicly released a data interaction framework, capable of interpreting, processing and visualising data from a range of CSI-capable hardware. Furthermore, the CSI data captured for these experiments during various activity performances have also been made publicly available. We then train a Deep Convolutional LSTM model to classify the activities. Our experiments, performed in a small apartment, achieve 92% average accuracy on 11 activity classes.
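
    A hedged sketch of the kind of pipeline the abstract describes: a CSI stream is sliced into overlapping windows and fed to a convolutional-plus-LSTM classifier. The layer sizes, window parameters and subcarrier count below are illustrative assumptions, not the authors' published architecture.

```python
import numpy as np
import tensorflow as tf

def window(csi, size=256, step=128):
    """Slice a (time, subcarriers) CSI stream into overlapping windows."""
    return np.stack([csi[i:i + size]
                     for i in range(0, len(csi) - size + 1, step)])

def build_model(window_size=256, subcarriers=64, n_classes=11):
    """Convolutions learn local patterns across subcarriers; the LSTM
    models temporal structure within each window."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv1D(64, 5, activation="relu",
                               input_shape=(window_size, subcarriers)),
        tf.keras.layers.MaxPooling1D(2),
        tf.keras.layers.Conv1D(64, 5, activation="relu"),
        tf.keras.layers.LSTM(128),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

model = build_model()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```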