13 research outputs found
Mining the Characteristics of Jupyter Notebooks in Data Science Projects
Nowadays, numerous industries have exceptional demand for skills in data
science, such as data analysis, data mining, and machine learning. The
computational notebook (e.g., Jupyter Notebook) is a well-known data science
tool adopted in practice. Kaggle and GitHub are two platforms that data
science communities use for knowledge sharing, skill practice, and
collaboration. While tutorials and guidelines for novice data scientists are
available on both platforms, few Jupyter Notebooks receive a high number of
votes from the community. A high-voted notebook is considered well documented,
easy to understand, and an example of best data science and software
engineering practices. In this research, we aim to
understand the characteristics of high-voted Jupyter Notebooks on Kaggle and
the popular Jupyter Notebooks for data science projects on GitHub. We plan to
mine and analyse the Jupyter Notebooks on both platforms. We will perform
exploratory analysis, data visualization, and feature-importance analysis to
understand the overall structure of these notebooks and to identify common
patterns and best-practice features that separate low-voted from high-voted
notebooks. Upon completion of this research, the discovered insights can be
applied as training guidelines for aspiring data scientists and machine
learning practitioners looking to progress from a novice-ranked Jupyter
Notebook on Kaggle to a deployable project on GitHub.
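As an illustration of the kind of structural feature such mining could extract, the following is a minimal sketch, not the authors' pipeline. It assumes notebooks in the standard nbformat 4 JSON layout (a top-level "cells" list whose entries carry "cell_type" and "source"). The function names and the chosen features are hypothetical.

```python
import json
from typing import Any, Dict


def cell_source(cell: Dict[str, Any]) -> str:
    """Return a cell's source as one string (nbformat allows str or list of str)."""
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def notebook_features(nb: Dict[str, Any]) -> Dict[str, float]:
    """Compute simple structural features of a parsed notebook (hypothetical feature set)."""
    cells = nb.get("cells", [])
    code = [c for c in cells if c.get("cell_type") == "code"]
    markdown = [c for c in cells if c.get("cell_type") == "markdown"]
    return {
        "n_code_cells": len(code),
        "n_markdown_cells": len(markdown),
        # Lines of code summed over all code cells.
        "code_lines": sum(len(cell_source(c).splitlines()) for c in code),
        # Whitespace-separated tokens in markdown cells, a rough documentation measure.
        "markdown_words": sum(len(cell_source(c).split()) for c in markdown),
        # Share of all cells that are markdown; 0.0 for an empty notebook.
        "markdown_ratio": len(markdown) / len(cells) if cells else 0.0,
    }


def load_notebook(path: str) -> Dict[str, Any]:
    """Read an .ipynb file from disk as a dictionary."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Features of this kind could then be tabulated per notebook and compared between low-voted and high-voted groups.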
A CMMI-Based Automated Risk Assessment Framework
Risk assessment is crucial to improving the success of software development projects. Current risk assessment approaches provide only a rough guide, and they require both risk assessment experts and domain experts to conduct assessments in software projects. Traditional risk assessment approaches therefore require extra activities beyond development tasks, possibly leading to extra costs. We believe that an effective risk assessment approach should be transparently embedded in the software development process. This paper presents an automated risk assessment framework that uses CMMI and a risk taxonomy as guidance for developing a risk assessment model. A pragmatic approach will be applied as the basis for building the proposed risk prediction model and for the case studies from our practice. These studies serve as our proof of concept.
Studying the association between Gitcoin's issues and resolving outcomes
The development of open-source software (OSS) projects is usually
driven by collaboration among contributors and relies strongly on
volunteering. Thus, allocating software practitioners (e.g., contributors) to a
particular task is non-trivial and draws attention away from the development.
Therefore, a number of bug bounty platforms have emerged to address this
problem through bounty rewards. In particular, Gitcoin, a new bounty platform,
introduces a bounty reward mechanism that allows individual issue owners
(backers) to define a reward value using cryptocurrencies rather than using
crowdfunding mechanisms. Although a number of studies have investigated the
phenomenon on bounty platforms, those platforms rely on different bounty reward systems.
Our study thus investigates the association between the Gitcoin bounties and
their outcomes (i.e., success and non-success). We empirically study over 4,000
issues with Gitcoin bounties using statistical analysis and machine learning
techniques. We also conduct a comparative study with the Bountysource
platform to gain insights into the usage of both platforms. Our study
highlights the importance of factors such as the length of the project, issue
description, type of bounty issue, and the bounty value, which are found to be
highly correlated with the outcome of bounty issues. These findings can provide
useful guidance to practitioners.
Abstract
The use of computers in orthodontics is not entirely new. The history of computer applications in orthodontics can be traced back decades, yet their many advantages may still not be fully realized, especially in orthodontics in Thailand. In the authors' view, a key reason is the lack of integration between existing computing know-how and orthodontic expertise. This brief survey presents a personal view that brings together present trends in computer technology and the views of orthodontists who have experienced the need for the technology, as a step towards understanding and reflecting on the use of computers in orthodontics, particularly in Thailand, and the associated research challenges.
An XML-Based Multi-Agents Model for Information Retrieval on WWW
Abstract: In this paper, we present a multi-agents model, namely IR agents, for information retrieval on the WWW. An IR agent consists of three types of agent: Managing agents, which extract the semantics of information and manage the details of co-ordinate agents; Interface agents, which handle interaction between the system and users; and Search agents, which discover information on the WWW. This work focuses on the use of XML technology for information retrieval on the WWW. In our model, agents communicate with each other to perform IR tasks by using XML as an agent communication language. They also express their knowledge bases and the semantics of their search results in XML format. As a result, users are able not only to access information more precisely from the semantically encoded search results returned by our model, but also to use the content of the results without proprietary tags or customized scripts that scrape web pages to extract the content. Key words: Information retrieval, Multi-agents model, XML
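To make the idea of XML as an agent communication language concrete, the following is a minimal sketch using Python's standard xml.etree.ElementTree module. The element and attribute names ("message", "result", "sender", and so on) and both function names are assumptions made for illustration. The abstract does not specify the actual message schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_result_message(sender: str, receiver: str,
                         results: List[Dict[str, str]]) -> str:
    """Serialize search results as an XML message (hypothetical schema)."""
    msg = ET.Element(
        "message",
        {"sender": sender, "receiver": receiver, "type": "search-result"},
    )
    for r in results:
        item = ET.SubElement(msg, "result")
        ET.SubElement(item, "title").text = r["title"]
        ET.SubElement(item, "url").text = r["url"]
    return ET.tostring(msg, encoding="unicode")


def read_result_message(xml_text: str) -> List[Dict[str, str]]:
    """Parse the message back into a list of dictionaries, one per result."""
    root = ET.fromstring(xml_text)
    return [
        {"title": r.findtext("title"), "url": r.findtext("url")}
        for r in root.findall("result")
    ]
```

In this sketch a Search agent would call build_result_message and an Interface agent would call read_result_message, so the receiver works with named fields instead of scraping page markup.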