
    Designing Systems that Support the Blogosphere for Deliberative Discourse

    Web 2.0 has great potential to serve as a public sphere (Habermas, 1974; Habermas, 1989) – a distributed arena of voices where all who want to participate can do so. A well-functioning public sphere is important for pluralistic decision-making at many levels, ranging from small organizations to society at large. In this paper, we analyze the capability of the blogosphere in its current form to support such a role. This analysis leads to the identification of the principal issues that prevent the blogosphere from realizing its full potential as a public sphere. Most significantly, we propose that the sheer volume of content overwhelms blog readers, forcing them to restrict themselves to only a small subset of valuable content. This ultimately reduces their level of informedness. Based on past research on managing discourse, we propose four design artifacts that would alleviate these issues: a communal repository, textual clustering, visual cues, and a participation facility for blog users. We present a prototype system, called FeedWiz, which implements several of these design artifacts. Based on this initial design, we formulate a research agenda for the creation of new tools that effectively harness the potential of the growing body of user-generated content in the blogosphere and beyond.

    Extensions to the ant-miner classification rule discovery algorithm

    Ant-Miner is an application of ant colony optimization (ACO) in data mining, introduced by Parpinelli et al. in 2002 as an ant-based algorithm for the discovery of classification rules. Ant-Miner has proved to be a very promising technique for classification rule discovery: it generates fewer rules, with fewer terms per rule, and performs competitively in terms of efficiency compared to the C4.5 algorithm (see experimental results in [20]). Hence, it has been a focus of research, and many modifications have been made to it to improve its classification accuracy and the comprehensibility of its output rules (by reducing the size of the rule set). This thesis proposes five extensions to Ant-Miner. 1) The use of a logical negation operator in the antecedents of constructed rules, so that antecedent terms may be negated; this tends to generate rules with higher coverage and to reduce the size of the generated rule set. 2) The use of stubborn ants, an ACO variation in which an ant is allowed to take its own past history into consideration; stubborn ants tend to generate rules with higher classification accuracy in fewer trials per iteration. 3) The use of multiple types of pheromone, one for each permitted rule class: an ant first selects the rule class and then deposits the corresponding type of pheromone. The multi-pheromone system improves the quality of the output in terms of both classification accuracy and comprehensibility. 4) Alongside the multi-pheromone system, a new pheromone update strategy called the quality contrast intensifier, which rewards rules with high confidence by depositing more pheromone and penalizes rules with low confidence by removing pheromone.
5) Giving each ant its own values of the α and β parameters, which in a sense means that each ant has its own individual personality. To verify the efficiency of these modifications, several cross-validation experiments were run on each of eight datasets. Average results were recorded, and a test of statistical significance was applied to assess the significance of the improvements. Empirical results show improvements in the algorithm's performance in terms of the simplicity of the generated rule set, the number of trials, and the predictive accuracy.
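The quality contrast intensifier described above can be sketched as a pheromone update that rewards high-confidence rules and penalizes low-confidence ones. This is a minimal illustration under assumed names and an assumed update formula, not the thesis's actual code:

```python
# Hypothetical sketch of a "quality contrast intensifier"-style update:
# pheromone is added for rules above a confidence threshold and removed
# for rules below it. The threshold and evaporation rate are assumptions.

def update_pheromone(pheromone, rule_terms, confidence,
                     threshold=0.7, evaporation=0.1):
    """Update a per-term pheromone dict for one constructed rule.

    pheromone  : dict mapping term -> pheromone level
    rule_terms : terms used in the rule's antecedent
    confidence : rule confidence in [0, 1]
    """
    # Evaporation on every term keeps pheromone levels bounded.
    for term in pheromone:
        pheromone[term] *= (1.0 - evaporation)

    # Contrast: reward rules above the threshold, penalise rules below it.
    delta = confidence - threshold
    for term in rule_terms:
        pheromone[term] = max(0.0, pheromone[term] + delta)
    return pheromone

ph = {"a": 1.0, "b": 1.0, "c": 1.0}
update_pheromone(ph, ["a"], confidence=0.9)   # "a" gains pheromone
update_pheromone(ph, ["b"], confidence=0.3)   # "b" loses pheromone
```

After these two updates, terms used in high-confidence rules end up with more pheromone than unused terms, which in turn have more than terms used in low-confidence rules.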

    Trace Clustering for User Behavior Mining

    Business information systems support a large variety of business processes and tasks, yet organizations rarely understand how users interact with these systems. User Behavior Mining aims to address this by applying process mining techniques to UI logs, i.e., detailed records of interactions with a system's user interface. Insights gained from this type of data hold great potential for usability engineering and task automation, but the complexity of UI logs can make them challenging to analyze. In this paper, we explore trace clustering as a means to structure UI logs and reduce this complexity. In particular, we apply different trace clustering approaches to a real-life UI log and show that the cluster-level process models reveal useful information about user behavior. At the same time, we find configurations in which trace clustering fails to generate satisfactory partitions. Our results also demonstrate that recently proposed representation learning techniques for process traces can be effectively employed in a realistic setting.
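The basic idea of trace clustering on a UI log can be illustrated by embedding each trace (a sequence of UI actions) as an action-frequency vector and grouping similar vectors. The sketch below uses invented action names and a toy nearest-seed assignment rather than the representation-learning techniques the paper actually evaluates:

```python
# Toy trace clustering: embed UI traces as action-count vectors, then
# assign each trace to the nearest of two seed traces. Data and clustering
# scheme are illustrative assumptions, not the paper's method.
import math
from collections import Counter

def embed(trace, vocab):
    counts = Counter(trace)
    return [counts[a] for a in vocab]

def euclid(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

traces = [
    ["open_form", "type", "type", "click_save"],      # form filling
    ["open_form", "type", "click_save"],              # form filling
    ["open_report", "scroll", "export"],              # reporting
    ["open_report", "scroll", "scroll", "export"],    # reporting
]

vocab = sorted({a for t in traces for a in t})
vectors = [embed(t, vocab) for t in traces]

# Seed one cluster from each behaviour; assign traces to the nearest seed.
seeds = [vectors[0], vectors[2]]
labels = [min(range(2), key=lambda k: euclid(v, seeds[k])) for v in vectors]
# labels -> [0, 0, 1, 1]: form-filling and reporting traces separate cleanly
```

A cluster-level process model would then be discovered per group, as in the paper's evaluation.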

    Utilizing graph-based representation of text in a hybrid approach to multiple documents summarization

    The aim of automatic text summarization is to process text so as to identify and present the most important information it contains. In this research, we investigate automatic multiple-document summarization using a hybrid of extractive and "shallow" abstractive methods. We utilize the graph-based representation of text proposed in [1] and [2] as part of our method, aiming to produce concise, informative, and coherent summaries. We start by scoring sentences for significance and extracting the top-scoring ones from each document in the set being summarized. In this step, we examine several sentence-scoring criteria: the presence of highly frequent words of the document, the presence of highly frequent words of the document set, the presence of words found in the first and last sentences of the document, and combinations of these features. Our experiments showed that the best combination uses the presence of highly frequent words of the document together with the presence of words found in its first and last sentences; the f-scores of this combination averaged 7.9% higher than those of the other features. Secondly, we address redundancy of information by clustering sentences with the same or similar content; each cluster is then compressed into a single sentence, avoiding redundant information as far as possible. We investigated clustering the extracted sentences under two similarity criteria, the first using word-frequency vectors and the second using word semantic similarity. We found that the word-vector features yield much better clusters in terms of sentence similarity:
the word-vector features produced 20% more clusters labeled as containing similar sentences than the word-semantic features did. We then adopted the graph-based representation of text from [1] and [2] to represent each sentence in a cluster, and used the k-shortest-paths algorithm to find the shortest path representing the final compressed sentence, which is used as a sentence in the summary. Human evaluators scored sentences for grammatical correctness, and almost 74% of the 51 evaluated sentences received a perfect score of 2, denoting a perfect or near-perfect sentence. Finally, we propose a method for ordering the compressed sentences in the final summary. We used the Document Understanding Conference dataset for the year 2014 to evaluate our final system, using the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) system, which compares automatic summaries to "ideal" human references. We also compared our summaries' ROUGE scores to those of summaries generated by the MEAD summarization tool. Our system achieved better precision and f-scores as well as comparable recall: on average, a 2% increase in precision and a 1.6% increase in f-score over MEAD, while MEAD had a 0.8% higher recall. In addition, our system produced a more compressed summary than MEAD's. Finally, we ran an experiment to evaluate the ordering of sentences in the final summary and its comprehensibility, showing that our ordering method produces comprehensible summaries: on average, summaries with a perfect comprehensibility score constituted 72% of those evaluated. Evaluators were also asked to count the ungrammatical and incomprehensible sentences in the evaluated summaries; on average these were only 10.9% of the summaries' sentences.
We believe our system provides a "shallow" abstractive summary of multiple documents that does not require intensive Natural Language Processing.
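The extractive scoring step described above can be sketched as follows: each sentence is scored by how many of the document's most frequent words it contains, plus a bonus for words shared with the first and last sentences. The weights and top-k cutoff are illustrative assumptions, not the paper's tuned values:

```python
# Hedged sketch of sentence scoring by (a) presence of highly frequent
# document words and (b) presence of words from the first/last sentences.
from collections import Counter

def score_sentences(sentences, top_k=5, boundary_weight=1.0):
    words = [w.lower() for s in sentences for w in s.split()]
    frequent = {w for w, _ in Counter(words).most_common(top_k)}
    boundary = set(sentences[0].lower().split()) | set(sentences[-1].lower().split())
    scores = []
    for s in sentences:
        toks = set(s.lower().split())
        scores.append(len(toks & frequent) + boundary_weight * len(toks & boundary))
    return scores

doc = [
    "ants build pheromone trails",
    "the weather was nice",
    "pheromone trails guide the ants",
]
scores = score_sentences(doc)
# The off-topic middle sentence scores lowest and would not be extracted.
```

The top-scoring sentences per document would then feed the clustering and k-shortest-paths compression steps.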

    Data Mining for Studying the Impact of Reflection on Learning

    Keywords: educational data mining, Reflect, learning behaviour, impact. Online Web-based learning systems generate a large amount of student log data and profiles that could be useful for educators and students. Hence, data mining techniques that enable the extraction of hidden and potentially useful information from educational databases have been employed to explore educational data, and a promising new area of research called educational data mining (EDM) has emerged. Reflect is a Web-based learning system that supports learning by reflection. Reflection is a process in which individuals explore their experiences in order to gain new understanding and appreciation, and research suggests that reflection improves learning. The Reflect system has been used at the University of Sydney's School of Information Technology for several years as a source of learning and practice alongside classroom teaching. Using data from a system that promotes reflection for learning (such as Reflect), this thesis investigates how reflection helps students in their learning. The main objective is to study the learning behaviours associated with positive and negative exam outcomes by using data mining techniques to search the database for previously unknown, potentially useful hidden information. Our approach was first to explore the data by means of statistical analyses. Then, popular data mining algorithms such as K-means and J48 were used to cluster and classify students according to their learning behaviour in Reflect. The Apriori algorithm was also employed to find associations among the data attributes that lead to success.
We were able to group and classify students according to their activities in the Reflect system, and identified activities associated with student performance and learning outcomes (high, moderate, or low exam marks). We conclude that this approach identified learning behaviours that have an important impact on student performance.
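The clustering step above, grouping students by activity counts with K-means, can be sketched with a tiny pure-Python implementation. The feature names and data are invented for illustration; the study used the standard K-means algorithm on real Reflect logs:

```python
# Minimal K-means sketch for grouping students by activity in a learning
# system. Features [reflections_written, exercises_attempted] and all
# values are hypothetical.
import math

def kmeans(points, k, iters=20):
    # Deterministic seeding: spread initial centroids across the data.
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # keep the old centroid if a cluster empties out
                centroids[i] = [sum(col) / len(cl) for col in zip(*cl)]
    return centroids, clusters

# [reflections_written, exercises_attempted] per student (toy data)
students = [[12, 30], [10, 28], [11, 33], [1, 4], [2, 5], [0, 3]]
centroids, clusters = kmeans(students, k=2)
# Highly active and barely active students fall into separate clusters.
```

Each cluster could then be cross-referenced with exam marks, as in the thesis's analysis.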

    Discriminatory Expressions to Produce Interpretable Models in Short Documents

    Social Networking Sites (SNS) are one of the most important channels of communication. Microblogging sites in particular are used as analysis avenues because of their peculiarities (promptness, short texts, etc.). Countless studies use SNS in novel ways, but machine learning research has focused mainly on classification performance rather than on interpretability or other goodness metrics. As a result, state-of-the-art models are black boxes that should not be used to solve problems that may have a social impact. When a problem requires transparency, interpretable pipelines must be built; yet even when the classifier itself is interpretable, the resulting models are often too complex to be considered comprehensible, making it impossible for humans to understand the actual decisions. This paper presents a feature selection mechanism that improves comprehensibility by using fewer but more meaningful features while achieving good performance in microblogging contexts where interpretability is mandatory. Moreover, we present a ranking method to evaluate features in terms of statistical relevance and bias. We conducted exhaustive tests on five different datasets to evaluate classification performance, generalisation capacity, and model complexity. The results show that our proposal is the best and the most stable in terms of accuracy, generalisation, and comprehensibility.
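The idea of keeping fewer but more meaningful features can be sketched by ranking each word by how unevenly it is used across classes and retaining only the top-ranked words. The scoring function below is a simple assumption for illustration, not the paper's actual ranking method:

```python
# Hypothetical feature selection for interpretable short-text classification:
# score each word by the spread of its class-conditional usage rates and
# keep the n most discriminative words. Data and scoring are assumptions.
from collections import Counter

def discriminative_features(docs, labels, n=3):
    per_class = {}
    for doc, y in zip(docs, labels):
        per_class.setdefault(y, Counter()).update(doc.lower().split())
    vocab = set().union(*per_class.values())
    total = {y: sum(c.values()) for y, c in per_class.items()}

    def score(w):
        rates = [per_class[y][w] / total[y] for y in per_class]
        return max(rates) - min(rates)   # uneven usage => discriminative

    return sorted(vocab, key=score, reverse=True)[:n]

docs = ["great match today", "great goal scored",
        "stocks fell today", "markets and stocks rally"]
labels = ["sport", "sport", "finance", "finance"]
top = discriminative_features(docs, labels)
# Class-specific words rank high; a word used evenly ("today") is dropped.
```

A classifier restricted to such a vocabulary stays small enough for a human to inspect each decision.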

    Effectiveness of Corporate Social Media Activities to Increase Relational Outcomes

    This study applies social media analytics to investigate the impact of different corporate social media activities on user word of mouth and attitudinal loyalty. We conduct a multilevel analysis of approximately 5 million tweets regarding the main Twitter accounts of 28 large global companies. We empirically identify different social media activities in terms of social media management strategies (using social media management tools or the web-frontend client), account types (broadcasting or receiving information), and communicative approaches (conversational or disseminative). We find positive effects of social media management tools, broadcasting accounts, and conversational communication on public perception.

    Toward an Effective Automated Tracing Process

    Traceability is defined as the ability to establish, record, and maintain dependency relations among the various software artifacts in a software system, in both the forward and backward directions, throughout the multiple phases of the project's life cycle. The availability of traceability information has proven vital to several software engineering activities, such as program comprehension, impact analysis, feature location, software reuse, and verification and validation (V&V). Research on automated software traceability has advanced noticeably in the past few years: various methodologies and tools have been proposed in the literature to provide automatic support for establishing and maintaining traceability information in software systems. This movement is motivated by the increasing attention traceability has been receiving as a critical element of any rigorous software development process. Despite these major advances, however, traceability implementation and use are still not pervasive in industry. In particular, traceability tools are still far from achieving performance levels adequate for practical applications. Such low accuracy forces software engineers working with traceability tools to spend a considerable amount of time verifying the generated traceability information, a process often described as tedious, exhausting, and error-prone. Motivated by these observations, and building upon a growing body of work in this area, in this dissertation we explore several research directions related to enhancing the performance of automated tracing tools and techniques. In particular, our work addresses several issues related to the various aspects of the IR-based automated tracing process, including trace link retrieval, performance enhancement, and the role of the human in the process.
Our main objective is to achieve performance levels, in terms of accuracy, efficiency, and usability, that are adequate for practical applications, and ultimately to accomplish a successful technology transfer from research to industry.
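IR-based trace link retrieval, as mentioned above, typically ranks candidate artifacts against a query artifact by textual similarity. A generic sketch using TF-IDF vectors and cosine similarity follows; the artifact texts are invented, and this is the standard technique rather than the dissertation's specific implementation:

```python
# Generic IR-based candidate trace-link retrieval: build TF-IDF vectors for
# a requirement and several code artifacts, then rank artifacts by cosine
# similarity to the requirement. Example texts are hypothetical.
import math
from collections import Counter

def tfidf_vectors(docs):
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(w for toks in tokenized for w in set(toks))
    n = len(docs)
    return [
        {w: tf * math.log(n / df[w]) for w, tf in Counter(toks).items()}
        for toks in tokenized
    ]

def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

requirement = "user login shall require a password"
code_docs = ["validate password on user login",
             "render report export page",
             "reset forgotten password by email"]
vecs = tfidf_vectors([requirement] + code_docs)
ranking = sorted(range(len(code_docs)),
                 key=lambda i: cosine(vecs[0], vecs[i + 1]), reverse=True)
# The login/password artifact ranks first as the best candidate link.
```

In practice, an engineer would vet this ranked list, which is exactly the tedious verification step the dissertation seeks to reduce.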

    The effectiveness of M-health technologies for improving health and health services: a systematic review protocol

    BACKGROUND: The application of mobile computing and communication technology is rapidly expanding in the fields of health care and public health. This systematic review will summarise the evidence for the effectiveness of mobile technology interventions for improving health and health service outcomes (M-health) around the world. FINDINGS: To be included in the review, interventions must aim to improve or promote health or health service use and quality, employing any mobile computing and communication technology. This includes: (1) interventions designed to improve the diagnosis, investigation, treatment, monitoring, and management of disease; (2) interventions to deliver treatment or disease management programmes to patients, health promotion interventions, and interventions designed to improve treatment compliance; and (3) interventions to improve health care processes, e.g. appointment attendance, result notification, and vaccination reminders. A comprehensive electronic search strategy will be used to identify controlled studies, published since 1990 and indexed in MEDLINE, EMBASE, PsycINFO, Global Health, Web of Science, the Cochrane Library, or the UK NHS Health Technology Assessment database. The search strategy will include terms (and synonyms) for the following mobile electronic devices (MEDs) and a range of compatible media: mobile phone; personal digital assistant (PDA); handheld computer (e.g. tablet PC); PDA phone (e.g. BlackBerry, Palm Pilot); smartphone; enterprise digital assistant; portable media player (i.e. MP3 or MP4 player); and handheld video game console. No terms for health or health service outcomes will be included, to ensure that all applications of mobile technology in public health and health services are identified. Bibliographies of primary studies and review articles meeting the inclusion criteria will be searched manually to identify further eligible studies.
Data on objective and self-reported outcomes and study quality will be independently extracted by two review authors. Where there are sufficient numbers of similar interventions, we will calculate and report pooled risk ratios or standardised mean differences using meta-analysis. DISCUSSION: This systematic review will provide recommendations on the use of mobile computing and communication technology in health care and public health and will guide future work on intervention development and primary research in this field.
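The pooling step mentioned above can be illustrated with a fixed-effect inverse-variance meta-analysis of log risk ratios. The trial counts below are invented for the example, and a real review would also assess heterogeneity between studies:

```python
# Sketch of fixed-effect inverse-variance pooling of risk ratios across
# trials. Each trial is (events_tx, n_tx, events_ctrl, n_ctrl); the numbers
# here are hypothetical.
import math

def pooled_risk_ratio(trials):
    weights, log_rrs = [], []
    for a, n1, c, n2 in trials:
        log_rr = math.log((a / n1) / (c / n2))
        # Standard variance estimate for the log risk ratio.
        var = 1 / a - 1 / n1 + 1 / c - 1 / n2
        weights.append(1 / var)
        log_rrs.append(log_rr)
    pooled = sum(w * x for w, x in zip(weights, log_rrs)) / sum(weights)
    return math.exp(pooled)

trials = [(30, 100, 40, 100), (25, 120, 35, 115)]
rr = pooled_risk_ratio(trials)
# A pooled RR below 1 would favour the intervention in this toy example.
```

Standardised mean differences for continuous outcomes would be pooled analogously, with study-specific weights proportional to inverse variances.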