3 research outputs found

    Computer-Aided Analysis of Video Comments for Requirements Analysis

    In this thesis, requirements suitable for requirements engineering are extracted from comments posted below vision videos on YouTube. The process of creating and preparing a dataset is described, and the performance of different automated approaches is evaluated. The YouTube API is used to extract the comments, which are then classified into the categories Spam / Ham according to their content and sentiment. Manual classification is necessary to evaluate the results of the automated classification. Word clouds are used to gain insight into the content of the relevant comments and to decide on more specific categories for classifying them. The categories found are Feature Request, Flaw Report, Safety Related, Efficiency Related, and sometimes Questions. For the automated classification into the categories Spam / Ham, the algorithms Random Forest, Support Vector Machine, Linear Regression Classifier, Naive Bayes, and a Voting Classifier that combines the first three are used. For the classification into specific categories, the Voting Classifier is used again. TextBlob and SentiStrength are used to classify comments according to their sentiment, and the SumBasic algorithm is used to summarize the relevant comments.
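The summarization step named in the abstract can be illustrated with a minimal sketch of the SumBasic heuristic. This is not the thesis's implementation: the function names `tokenize` and `sumbasic`, the sample comments, and the simplifications (each comment is treated as one sentence, no stop-word removal, ties broken alphabetically) are assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sumbasic(comments, max_sentences=2):
    """Select up to max_sentences comments with the SumBasic heuristic.

    Simplifying assumptions: one comment = one sentence, no stop-word
    filtering, and alphabetical tie-breaking between equally likely words.
    """
    tokenized = [tokenize(c) for c in comments]
    counts = Counter(w for words in tokenized for w in words)
    total = sum(counts.values())
    if total == 0:
        return []
    # Initial word probabilities: relative frequency over all comments.
    prob = {w: n / total for w, n in counts.items()}

    chosen = []
    candidates = [i for i, words in enumerate(tokenized) if words]
    while candidates and len(chosen) < max_sentences:
        # Find the currently most probable word among unselected comments.
        remaining = {w for i in candidates for w in tokenized[i]}
        top_word = max(remaining, key=lambda w: (prob[w], w))
        # Among comments containing that word, pick the one with the
        # highest average word probability.
        pool = [i for i in candidates if top_word in tokenized[i]]
        best = max(
            pool,
            key=lambda i: sum(prob[w] for w in tokenized[i]) / len(tokenized[i]),
        )
        chosen.append(best)
        candidates.remove(best)
        # Square the probability of every word just covered, so later
        # picks favour content that has not been summarized yet.
        for w in set(tokenized[best]):
            prob[w] = prob[w] ** 2
    return [comments[i] for i in chosen]


# Hypothetical sample comments, invented for this example.
comments = [
    "battery dies fast",
    "please fix battery soon",
    "camera is blurry",
    "love this new camera",
    "battery dies",
]
print(sumbasic(comments, max_sentences=2))
```

With these sample comments the sketch first selects a battery-related comment and then, because the probabilities of the covered words are squared, moves on to a camera-related one instead of repeating the same topic.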

    A multi-disciplinary co-design approach to social media sensemaking with text mining

    This thesis presents the development of a bespoke social media analytics platform called Sentinel using an event driven co-design approach. The performance and outputs of this system, along with its integration into the routine research methodology of its users, were used to evaluate how the application of an event driven co-design approach to system design improves the degree to which Social Web data can be converted into actionable intelligence, with respect to robustness, agility, and usability. The thesis includes a systematic review into the state-of-the-art technology that can support real-time text analysis of social media data, used to position the text analysis elements of the Sentinel Pipeline. This is followed by research chapters that focus on combinations of robustness, agility, and usability as themes, covering the iterative developments of the system through the event driven co-design lifecycle. Robustness and agility are covered during initial infrastructure design and early prototyping of bottom-up and top-down semantic enrichment. Robustness and usability are then considered during the development of the Semantic Search component of the Sentinel Platform, which exploits the semantic enrichment developed in the prototype, alpha, and beta systems. Finally, agility and usability are used whilst building upon the Semantic Search functionality to produce a data download functionality for rapidly collecting corpora for further qualitative research. These iterations are evaluated using a number of case studies that were undertaken in conjunction with a wider research programme, within the field of crime and security, that the Sentinel platform was designed to support. The findings from these case studies are used in the co-design process to inform how developments should evolve. 
As part of this research programme, the Sentinel platform has supported the production of a number of research papers authored by stakeholders, highlighting the impact the system has had in the field of crime and security research.