7 research outputs found

    Retail Demand Forecasting

    Get PDF
    Sales and demand forecasting is one the most critical tasks of enterprises. It lays the foundation for many other essential business assumptions, such as cash flows, profit margins, turnover, capacity planning, and capital expenditure. This report presents a solution for the case study of forecasting monthly sales of one of the largest retail stores in Europe, Rossman chain stores. There are three steps in which this problem will be handled. First, a complete and comprehensive exploratory data analysis will be done to understand the data and perform feature engineering. Secondly, a time series will be modeled by autoregressive models using machine learning and neural networks. Thirdly, these models will be evaluated with standard time series evaluation metrics. Some of the commonly used approaches to achieving the prediction value or models include the ARIMA and classification-based modeling techniques for forecasting. The literature review indicates that these aspects are hard to choose due to the need for matching the supply and demand of consumers being critical as more consumers prefer more reliable companies and companies that can consistently deliver on what they need and when they need. People have underappreciated machine learning’s ability to make data-driven predictions. Still, given the analysis results, companies are now starting to realize its potential and are investing more in this technology

    Software Supply Chain Development and Application

    Get PDF
    Motivation: Free Libre Open Source Software (FLOSS) has become a critical componentin numerous devices and applications. Despite its importance, it is not clear why FLOSS ecosystem works so well or if it may cease to function. Majority of existing research is focusedon studying a specific software project or a portion of an ecosystem, but FLOSS has not been investigated in its entirety. Such view is necessary because of the deep and complex technical and social dependencies that go beyond the core of an individual ecosystem and tight inter-dependencies among ecosystems within FLOSS.Aim: We, therefore, aim to discover underlying relations within and across FLOSS projects and developers in open source community, mitigate potential risks induced by the lack of such knowledge and enable systematic analysis over entire open source community through the lens of supply chain (SC).Method: We utilize concepts from an area of supply chains to model risks of FLOSS ecosystem. FLOSS, due to the distributed decision making of software developers, technical dependencies, and copying of the code, has similarities to traditional supply chain. Unlike in traditional supply chain, where data is proprietary and distributed among players, we aim to measure open-source software supply chain (OSSC) by operationalizing supply chain concept in software domain using traces reconstructed from version control data.Results: We create a very large and frequently updated collection of version control data in the entire FLOSS ecosystems named World of Code (WoC), that can completely cross-reference authors, projects, commits, blobs, dependencies, and history of the FLOSS ecosystems, and provide capabilities to efficiently correct, augment, query, and analyze that data. Various researches and applications (e.g., software technology adoption investigation) have been successfully implemented by leveraging the combination of WoC and OSSC.Implications: With a SC perspective in FLOSS development and the increased visibility and transparency in OSSC, our work provides potential opportunities for researchers to conduct wider and deeper studies on OSS over entire FLOSS community, for developers to build more robust software and for students to learn technologies more efficiently and improve programming skills

    Modeling User-Affected Software Properties for Open Source Software Supply Chains

    Get PDF
    Background: Open Source Software development community relies heavily on users of the software and contributors outside of the core developers to produce top-quality software and provide long-term support. However, the relationship between a software and its contributors in terms of exactly how they are related through dependencies and how the users of a software affect many of its properties are not very well understood. Aim: My research covers a number of aspects related to answering the overarching question of modeling the software properties affected by users and the supply chain structure of software ecosystems, viz. 1) Understanding how software usage affect its perceived quality; 2) Estimating the effects of indirect usage (e.g. dependent packages) on software popularity; 3) Investigating the patch submission and issue creation patterns of external contributors; 4) Examining how the patch acceptance probability is related to the contributors\u27 characteristics. 5) A related topic, the identification of bots that commit code, aimed at improving the accuracy of these and other similar studies was also investigated. Methodology: Most of the Research Questions are addressed by studying the NPM ecosystem, with data from various sources like the World of Code, GHTorrent, and the GiHub API. Different supervised and unsupervised machine learning models, including Regression, Random Forest, Bayesian Networks, and clustering, were used to answer appropriate questions. Results: 1) Software usage affects its perceived quality even after accounting for code complexity measures. 2) The number of dependents and dependencies of a software were observed to be able to predict the change in its popularity with good accuracy. 3) Users interact (contribute issues or patches) primarily with their direct dependencies, and rarely with transitive dependencies. 4) A user\u27s earlier interaction with the repository to which they are contributing a patch, and their familiarity with related topics were important predictors impacting the chance of a pull request getting accepted. 5) Developed BIMAN, a systematic methodology for identifying bots. Conclusion: Different aspects of how users and their characteristics affect different software properties were analyzed, which should lead to a better understanding of the complex interaction between software developers and users/ contributors
    corecore