22 research outputs found

    The interplay of information retrieval and query by singing with words

    Speech recognition can be used in music retrieval systems to identify the words in users' sung queries. Our aim was to determine which of several techniques is most suitable for retrieving songs given a sung query with words. We used Sphinx for speech recognition, and tested several retrieval techniques on the output of the recognition system. The most effective retrieval technique was a combination of Edit Distance and Okapi, which consistently ranked the correct song first, provided that the queries were at least 50% correct. However, techniques performed differently when the queries were split into four buckets with varying levels of correctness in the range of 0 to 73%.
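
To illustrate the edit-distance half of that combination, here is a minimal sketch of ranking songs by word-level Levenshtein distance between a recognised query and each song's lyrics. It is not the authors' implementation: the function names, the whitespace tokenisation, and the `lyrics` dictionary are assumptions made for illustration, and the Okapi component is left out.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here, lists of words)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i items of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_by_edit_distance(query, lyrics):
    """Return song titles ordered from smallest to largest distance.

    `lyrics` is a hypothetical mapping of title -> lyric text.
    """
    q = query.lower().split()
    scored = [(edit_distance(q, text.lower().split()), title)
              for title, text in lyrics.items()]
    return [title for _, title in sorted(scored)]
```

A recognised query with a missing or misheard word still lands close to the right lyric line under this measure, which is one plausible reason such a measure tolerates partially correct queries.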

    Temporally robust software features for authorship attribution

    Authorship attribution is used to determine the creator of works among many candidates, playing a vital role in software forensics, authorship disputes and academic integrity investigations. The evolving coding style of individuals may degrade the performance of systems that attribute authorship of source code, an effect that has not previously been studied. This paper uses a collection of six programming assignments with guaranteed relative timestamps from 272 students to examine the evolution of coding style.

    Comparing techniques for authorship attribution of source code

    Attributing authorship of documents with unknown creators has been studied extensively for natural language text such as essays and literature, but less so for non-natural languages such as computer source code. Previous attempts at attributing authorship of source code can be categorised by two attributes: the software features used for the classification, either strings of n tokens/bytes (n-grams) or software metrics; and the classification technique that exploits those features, either information retrieval ranking or machine learning. The results of existing studies, however, are not directly comparable as all use different test beds and evaluation methodologies, making it difficult to assess which approach is superior. This paper summarises all previous approaches to source code authorship attribution, implements feature sets that are motivated by the literature, and applies information retrieval ranking methods or machine classifiers for each approach. Importantly, all approaches are tested on identical collections from varying programming languages and author types. Our conclusions are as follows: (i) ranking and machine classifier approaches are around 90% and 85% accurate, respectively, for a one-in-10 classification problem; (ii) the byte-level n-gram approach is best used with different parameters to those previously published; (iii) neural networks and support vector machines were found to be the most accurate machine classifiers of the eight evaluated; (iv) use of n-gram features in combination with machine classifiers shows promise, but there are scalability problems that still must be overcome; and (v) approaches based on information retrieval techniques are currently more accurate than approaches based on machine learning.
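
As a rough illustration of the byte-level n-gram features mentioned above, the sketch below counts overlapping byte n-grams in a source file and compares two such profiles by their shared counts. The function names and the overlap score are assumptions for illustration only; the abstract does not specify the paper's parameters or similarity measure, so `n` is left as an argument.

```python
from collections import Counter


def byte_ngrams(source: bytes, n: int) -> Counter:
    """Count every overlapping run of n bytes in the source."""
    return Counter(source[i:i + n] for i in range(len(source) - n + 1))


def overlap(profile_a: Counter, profile_b: Counter) -> int:
    """Number of n-gram occurrences shared by two profiles.

    Counter intersection keeps the minimum count of each n-gram.
    This is an illustrative score, not the measure used in the paper.
    """
    return sum((profile_a & profile_b).values())
```

In a ranking setting, a query file's profile would be compared against each candidate author's profile and the candidates ordered by score.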

    Evaluating Cross-Device Transitioning Experience in Seated and Moving Contexts

    Cross-platform services allow access to information across different devices in different locations and situational contexts. We observed forty-five participants completing tasks while transitioning between a laptop and a mobile phone across different contexts (seated-moving and seated-seated). Findings showed that in each test setting, users were sensitive to the same cross-platform user experience (UX) elements. However, the seated-moving settings generated more issues, for example, more consistency problems. Two moving-related factors (attentiveness and manageability) also affected cross-platform UX. In addition, we found design issues associated with using mobile user interfaces (UIs) while walking. We analyzed the issues and proposed a set of UX design principles for mobile UIs in moving situations, such as reduction and aesthetic simplicity. This suggests designing context-aware cross-platform services that take transitioning into account for enhanced mobility

    Defining a Unified Model of Vocabulary Acquisition via Extensive Reading

    Post-secondary students with English as a second language (L2) often struggle with reading their textbooks and with other language-related tasks associated with their coursework, despite achieving the minimum IELTS entry requirement. Much of the difficulty is due to the large difference in the known English vocabulary range of L2 speakers compared to native speakers. Extensive reading, or reading for pleasure, is one method of naturally acquiring vocabulary. However, while past studies into vocabulary acquisition agree that vocabulary is acquired, they disagree greatly about the rate of acquisition, with some saying the rate is too low to be useful, especially for academic environments. Many conclusions of these studies were inconsistent because they did not apply knowledge of the statistical properties of text. In this project we will test our model based on a meta-study of past research with learners of English so that extensive reading can be appropriately used in future language learning courses.