13 research outputs found

    Mining the Characteristics of Jupyter Notebooks in Data Science Projects

    Numerous industries now have exceptional demand for data science skills such as data analysis, data mining, and machine learning. The computational notebook (e.g., Jupyter Notebook) is a well-known data science tool adopted in practice. Kaggle and GitHub are two platforms that data science communities use for knowledge sharing, skill practice, and collaboration. While tutorials and guidelines for novice data scientists are available on both platforms, few Jupyter Notebooks have received a high number of votes from the community. A high-voted notebook is considered well documented, easy to understand, and an application of best practices in data science and software engineering. In this research, we aim to understand the characteristics of high-voted Jupyter Notebooks on Kaggle and of popular Jupyter Notebooks for data science projects on GitHub. We plan to mine and analyse the Jupyter Notebooks on both platforms. We will perform exploratory analytics, data visualization, and feature importance analysis to understand the overall structure of these notebooks and to identify common patterns and best-practice features separating low-voted from high-voted notebooks. Upon completion of this research, the discovered insights can be applied as training guidelines for aspiring data scientists and machine learning practitioners looking to improve their performance, from a novice-ranked Jupyter Notebook on Kaggle to a deployable project on GitHub.

    A CMMI-Based Automated Risk Assessment Framework

    Risk assessment is crucial to increasing the success of software development projects. Current risk assessment approaches provide only a rough guide. Risk assessment experts and domain experts are required to conduct risk assessments in software projects. Traditional risk assessment approaches therefore require extra activities besides development tasks, possibly leading to extra costs. We believe that an effective risk assessment approach should be transparently embedded in the software development process. This paper presents an automated risk assessment framework that uses CMMI and a risk taxonomy as guidance for developing a risk assessment model. A pragmatic approach will be applied as a basis for building the suggested risk prediction model and for the case studies of our practice. These studies are considered our proof of concept.

    Studying the association between Gitcoin's issues and resolving outcomes

    The development of open-source software (OSS) projects has usually been driven by collaboration among contributors and relies strongly on volunteering. Thus, allocating software practitioners (e.g., contributors) to a particular task is non-trivial and draws attention away from the development. A number of bug bounty platforms have therefore emerged to address this problem through bounty rewards. In particular, Gitcoin, a new bounty platform, introduces a bounty reward mechanism that allows individual issue owners (backers) to define a reward value using cryptocurrencies rather than crowdfunding mechanisms. Although a number of studies have investigated the phenomenon on bounty platforms, those studies rely on different bounty reward systems. Our study thus investigates the association between Gitcoin bounties and their outcomes (i.e., success and non-success). We empirically study over 4,000 issues with Gitcoin bounties using statistical analysis and machine learning techniques. We also conducted a comparative study with the Bountysource platform to gain insights into the usage of both platforms. Our study highlights the importance of factors such as the length of the project, the issue description, the type of bounty issue, and the bounty value, which are found to be highly correlated with the outcome of bounty issues. These findings can provide useful guidance to practitioners.

    Abstract

    The use of computers in orthodontics is not entirely new. The history of computer applications for orthodontics can be traced back decades, yet their many advantages may still not be fully realized, especially for orthodontics in Thailand. In the authors' view, a key reason is the lack of fusion between existing computer know-how and expertise in orthodontics. This brief survey presents a personal view that brings together present trends in computer technology and the views of orthodontists who have experienced the need for the technology, as a step towards understanding computer use in orthodontics, particularly in Thailand, and the associated research challenges.

    An XML-Based Multi-Agents Model for Information Retrieval on WWW

    In this paper, we present a multi-agents model for IR, namely IR agents, for information retrieval on WWW. An IR agent consists of three types of agent: Managing agents for extracting the semantics of information and managing the details of co-ordinate agents, Interface agents for interaction between the system and users, and Search agents for discovering information on WWW. This work focuses on the use of XML technology for information retrieval on WWW. In our model, agents communicate with each other to perform IR tasks by using XML as an agent communication language. They also express their knowledge bases and the semantics of their search results in XML format. As a result, users are able not only to access information more precisely from the semantically encoded search results returned by our model, but also to use the content of the results without proprietary tags or customized scripts that scrape web pages to extract the content. Key words: Information retrieval, Multi-agents model, XML