6,228 research outputs found

Vision of a Visipedia

Author: Perona Pietro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2010

The web is not perfect: while text is easily searched and organized, pictures (the vast majority of the bits that one can find online) are not. In order to see how one could improve the web and make pictures first-class citizens of the web, I explore the idea of Visipedia, a visual interface for Wikipedia that is able to answer visual queries and enables experts to contribute and organize visual knowledge. Five distinct groups of humans would interact through Visipedia: users, experts, editors, visual workers, and machine vision scientists. The latter would gradually build automata able to interpret images. I explore some of the technical challenges involved in making Visipedia happen. I argue that Visipedia will likely grow organically, combining state-of-the-art machine vision with human labor.

Caltech Authors

Automatization of incident categorization

Author: Silva Sara Alexandra Teixeira da
Publication date: 04/12/2018

To keep up with the growing number of incidents created in an organization, resources have had to be increased to ensure that all incidents are managed. Incident Management comprises several activities, one of which is Incident Categorization. By combining Natural Language and Text Mining techniques with Machine Learning algorithms, we propose to improve this activity and, with it, the Incident Management Process. To that end, we propose replacing the manual Categorization sub-process of the Incident Management Process with an automatic sub-process that requires no human interaction. The goal of this dissertation is to propose a solution that categorizes incidents correctly and automatically. We use real data provided by a company which, for privacy reasons, is not named in the dissertation. The datasets consist of correctly categorized incidents, which allows us to apply supervised learning algorithms. The expected output is a method that combines the Natural Language Processing techniques and classification algorithms that perform best on the data. Finally, the proposed method is compared with the categorization currently in use to determine whether our proposal really improves the Incident Management Process and what advantages automation brings.

Repositório Institucional do ISCTE-IUL

Analyzing collaborative learning processes automatically

Author: A. C. Graesser
A. King
A. M. O'Donnell
A. Stolcke
A. Weinberger
A. Yeh
Armin Weinberger
B. Goodman
B. Weiner
B. Wever De
C. P. Rosé
C. Rosé
Carolyn Rosé
D. Kuhn
D. Lewis
D. Litman
E. B. Page
E. Schegloff
F. Fischer
F. Henri
Frank Fischer
G. Erkens
G. Gweon
G. Salomon
I. H. Witten
I. Kollar
J. F. Voss
J. Fuernkranz
J. L. Fleiss
J. Piaget
J. Pol van der
J. W. Pennebaker
J. Wiebe
Jaime Arguello
K. Krippendorf
K. Krippendorff
K. VanLehn
Karsten Stegmann
M. Berkowitz
M. Evens
M. T. H. Chi
N. M. Webb
P. Dillenbourg
P. Dönmez
P. Foltz
R. Kumar
R. Luckin
R. Wegerif
S. D. Teasley
S. Leitão
T. Landauer
V. Aleven
V. Carvalho
V. Vapnik
Yi-Chia Wang
Yue Cui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

In this article we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis, both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners’ interactions is a time-consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction, both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context-sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project, which has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multidimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important piece of the work towards making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in.

Crossref

Open Access LMU

A HAPA Inspired, Agent-Based Model and Simulation of Activity in an Online Community

Author: Reid James
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2016

This thesis is an examination of the Health Action Process Approach (HAPA), developed originally by R. Schwarzer for use in understanding and effecting health behaviour adoption. Although HAPA provides an integral aspect of formulating health treatment strategies by human practitioners for human patients, at the present time no simulation models suited to computer implementation and usage exist for the study of and support for health behaviour adoption within a HAPA framework. This thesis examines the relevant research with respect to HAPA and the components necessary to build a simulation model and platform for an online, self-managing SCI community. We design an architecture for the platform that satisfies the primary requirements suggested by HAPA and SCI patients, particularly directed at gathering relevant data consisting of health indicators. We also develop several algorithms for the analysis of HAPA-related health states and the transitions between states. Since this research did not involve any human subjects, the intention was to simulate certain critical behaviours and changes using an agent-based modeling approach. Although agents can provide only approximations to real human behaviour, they are still useful and informative. As part of our results, we show that an automated HAPA classification can reduce the risk of agents dropping a health behaviour or program due to misclassification. Further, findings revealed that 6% of the agents are in danger of dropping the adoption of an individual health behaviour within two weeks and that 14% of the agents are at risk of dropping out of the community without continual HAPA reclassification.

Scholarship at UWindsor

Automatization of incident resolution

Author: Costa Jorge Tafarel Morais
Publication date: 10/12/2019

Incident management is a key IT Service Management sub-process in every organization, a way to deal with the volume of tickets created every year. Currently, the resolution process is still extremely labor-intensive. Many incidents do not stem from new, never-before-seen problems: they have already been solved in the past, and their resolutions are stored in an Incident Ticket System. Automation of repeatable tasks in IT is an important element of service management and can have a considerable impact on an organization. Using a large real-world database of incident tickets, this dissertation explores a method to automatically propose a suitable resolution for a new ticket using the resolution texts of previous tickets. At its core, the method uses machine learning, natural language parsing, information retrieval and mining. The proposed method explores machine learning models such as SVM, Logistic Regression and several neural network architectures, among others, to predict an incident resolution category for a new ticket, together with a module that automatically retrieves resolution action phrases from tickets using part-of-speech pattern matching. In the experiments performed, 31% to 41% of the tickets in a test set were considered solved by the proposed method, which, given the yearly volume of tickets, represents a significant amount of manpower and resources that could be saved.

Repositório Institucional do ISCTE-IUL

Optimizing E-Commerce Product Classification Using Transfer Learning

Author: Khanuja Rashmeet Kaur
Publication venue: SJSU ScholarWorks
Publication date: 20/05/2019

The global e-commerce market is snowballing at a rate of 23% per year. In 2017, there were 1.66 billion retail e-commerce users and worldwide sales amounted to 2.3 trillion US dollars; e-retail revenues are projected to grow to 4.88 trillion USD in 2021. With the immense popularity that e-commerce has gained over the past few years comes the responsibility to deliver relevant results and provide a rich user experience. To do this, it is essential that the products on an e-commerce website be organized correctly into their respective categories. Misclassification of products leads to irrelevant results for users, which not only reflects badly on the website but could also lead to lost customers. With e-commerce sites nowadays also offering their portal as a platform for third-party merchants to sell their products, maintaining consistency in product categorization becomes difficult. Automating this process could therefore be very useful. Automation based on text could lead to discrepancies, since the website itself, its various merchants, and its users could all use different terminology for a product and its category. Using images thus becomes a plausible solution to this problem. Images are best handled using deep learning in the form of convolutional neural networks. This is a computationally expensive task, and in order to keep the accuracy of a traditional convolutional neural network while reducing the hours it takes for the model to train, this project aims to use a technique called transfer learning. Transfer learning refers to reusing the knowledge gained from one task for another, so that the new model does not need to be trained from scratch, which reduces training time.
This project aims to use various product images belonging to five categories from an e-commerce platform and to develop an algorithm that can accurately classify products into their respective categories while taking as little time as possible. The goal is first to test the performance of transfer learning against traditional convolutional networks. The next step is to apply transfer learning to the downloaded dataset and assess its performance in terms of accuracy and the time taken to classify test data that the model has never seen before.

SJSU ScholarWorks

Interactive information retrieval

Author: Allan
Barry
Bates
Beaulieu
Belkin
Bhavnani
Blair
Borgman
Brajnik
Broder
Buyukkokten
Byström
Campbell
Case
Chen
Cove
Crestani
Crouch
Downie
Dumais
Eastman
Efthimiadis
Ellis
Fidel
Ford
Foster
Fox
Hansen
Harper
Hearst
Heinström
Hill
Ingwersen
Jansen
Jones
Kang
Kelly
Kim
Konstan
Kruschwitz
Kuhlthau
Legg
Lin
Lorigo
Lynch
López-Ostenero
Maña-López
Niemi
Norman
Over
Pirkola
Pu
Radev
Reid
Riedl
Rieh
Robertson
Rosenfeld
Roussinov
Ruthven
Savolainen
Shipman
Shneiderman
Sihvonen
Slone
Smeaton
Spink
Spärck Jones
Sweeney
Tombros
Toms
Topi
Vakkari
van der Eijk
Vechtomova
Voorhees
White
Wiesman
Wu
Xie
Publication venue: 'Wiley'
Publication date: 01/11/2008

Crossref

University of Strathclyde Institutional Repository

GPT Models in Construction Industry: Opportunities, Limitations, and a Use Case Validation

Author: Ajayi Saheed
Akande Kabiru
Kazemi Hadi
Saka Abdullahi
Saka Nurudeen
Salami Babatunde
Taiwo Ridwan
Publication date: 30/05/2023

Large Language Models (LLMs) trained on large data sets came into prominence in 2018 after Google introduced BERT. Subsequently, different LLMs such as GPT models from OpenAI have been released. These models perform well on diverse tasks and have been gaining widespread applications in fields such as business and education. However, little is known about the opportunities and challenges of using LLMs in the construction industry. Thus, this study aims to assess GPT models in the construction industry. A critical review, expert discussion and case study validation are employed to achieve the study objectives. The findings revealed opportunities for GPT models throughout the project lifecycle. The challenges of leveraging GPT models are highlighted and a use case prototype is developed for materials selection and optimization. The findings of the study would be of benefit to researchers, practitioners and stakeholders, as it presents research vistas for LLMs in the construction industry. Comment: 58 pages, 20 figures

arXiv.org e-Print Archive

The Impact of Resource Allocation on the Machine Learning Lifecycle

Author: Duda Sebastian
Hofmann Peter
Urbach Nils
Völter Fabiane
Zwickel Amelie
Publication date: 01/01/2024

EPub Bayreuth