2,185 research outputs found

    Data mining framework for batching orders in real-time warehouse operations

    Warehouse activities play a key role in the final customer service level. Among warehouse processes, order picking is the major contributor to overall expenses in this category. Order batching is commonly employed to improve resource efficiency. Several heuristics have been proposed for the order batching problem, most of them developed for static batching, while little research has focused on dynamic batching via stochastic modeling. We present a novel approach to the problem, developing a framework that applies machine learning directly to historical order-batch data, gaining valuable knowledge about how batches are formed and which attributes are most meaningful in this process. This knowledge is then translated into simple batching decision rules capable of batching orders in a real-time (dynamic) scenario. The framework was compared to FCFS heuristics and single-order picking; the results indicate higher performance.
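    As a rough illustration of the contrast the abstract draws, the sketch below (with hypothetical order attributes, batch capacity, and overlap threshold, not the paper's learned rules) batches orders first-come-first-served and then with a simple overlap-based decision rule that can be applied as orders arrive.

```python
# Minimal sketch: FCFS batching vs. a simple rule-based batching policy.
# Order attributes, CAPACITY, and the 0.5 overlap threshold are hypothetical.
from dataclasses import dataclass, field

CAPACITY = 10  # max items per picking batch (assumed)

@dataclass
class Order:
    order_id: int
    items: int
    aisles: set = field(default_factory=set)

def fcfs_batching(orders):
    """First-come-first-served: fill batches in arrival order until capacity."""
    batches, current, load = [], [], 0
    for o in orders:
        if load + o.items > CAPACITY and current:
            batches.append(current)
            current, load = [], 0
        current.append(o)
        load += o.items
    if current:
        batches.append(current)
    return batches

def rule_based_batching(orders, min_overlap=0.5):
    """Illustrative decision rule: add an order to an open batch only if its
    aisle overlap with that batch is high enough; otherwise open a new batch."""
    batches = []  # each batch: {"orders": [...], "aisles": set, "load": int}
    for o in orders:
        placed = False
        for b in batches:
            overlap = len(o.aisles & b["aisles"]) / max(len(o.aisles), 1)
            if b["load"] + o.items <= CAPACITY and overlap >= min_overlap:
                b["orders"].append(o)
                b["aisles"] |= o.aisles
                b["load"] += o.items
                placed = True
                break
        if not placed:
            batches.append({"orders": [o], "aisles": set(o.aisles), "load": o.items})
    return batches

if __name__ == "__main__":
    orders = [Order(1, 4, {"A", "B"}), Order(2, 3, {"A"}),
              Order(3, 5, {"C"}), Order(4, 2, {"B", "C"})]
    print(len(fcfs_batching(orders)), "FCFS batches")
    print(len(rule_based_batching(orders)), "rule-based batches")
```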

    Spatial and Temporal Characteristics of Freight Tours: A Data-Driven Exploratory Analysis

    This paper presents a modeling approach to infer scheduling and routing patterns from digital freight transport activity data for different freight markets. We provide a complete modeling framework, including a new discrete-continuous decision tree approach for extracting rules from the freight transport data. We apply these models to tour data collected for the Netherlands to understand departure time patterns and tour strategies, which also allows us to evaluate the effectiveness of the proposed algorithm. We find that spatial and temporal characteristics are important for capturing the types of tours and time-of-day patterns of freight activities. The empirical evidence also indicates that carriers in most transport markets are sensitive to the level of congestion: many of them adjust the type of tour, departure time, and the number of stops per tour when facing a congested zone. The results can be used by practitioners to gain a better grip on transport markets and to develop freight and traffic management measures.
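    For readers unfamiliar with rule extraction from tree models, the following sketch trains a plain scikit-learn decision tree on synthetic tour records and prints its rules. The features, labels, and data are hypothetical, and the paper's discrete-continuous decision tree is more elaborate than this.

```python
# Minimal sketch of tree-based rule extraction from (synthetic) tour records.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 500
# Hypothetical tour features: stops per tour, tour distance (km), congestion index (0-1)
X = np.column_stack([
    rng.integers(1, 15, n),
    rng.uniform(5, 200, n),
    rng.uniform(0, 1, n),
])
# Synthetic departure-time label: 0 = off-peak, 1 = peak (purely illustrative rule)
y = ((X[:, 2] < 0.4) & (X[:, 0] > 5)).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
# Print the learned splits as readable if-then rules
print(export_text(tree, feature_names=["stops", "distance_km", "congestion"]))
```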

    The Effects of a Simulated Self-Evaluative Routine on Teachers' Grades, Intraclass Correlations, and Feedback Characteristics

    English language arts teachers committed to the teaching of writing must allocate substantial time and energy to the evaluation of student essays. In doing so, these teachers wrestle with at least two star-crossed expectations. First, they must fulfill the institutional obligation of making reliable holistic judgments of the papers they receive, stratifying papers according to their success against a set of stipulated criteria. Second--and more importantly for the sake of teaching and learning--they must also provide insightful, inviting feedback that promotes rather than hinders students' progress toward robust literacies. The qualities of such feedback, studied by Kluger and DeNisi (1996), Hattie and Timperley (2007), and others, have recently been made available to classroom practitioners in Brookhart's How to Give Effective Feedback to Your Students (2008). The current study leverages Brookhart's transmission of previous research to investigate how teachers might improve their feedback characteristics by way of a self-evaluation routine administered to students prior to the submission of so-called final-draft essays. Specifically, the study tested teachers' scoring and feedback practices on stronger and weaker essays across control and experimental conditions defined by the absence or presence of simulated self-evaluative comments by student authors. Scoring practices were considered by way of group means, distributions, and intraclass correlations of participating teachers' evaluative scores; similarly, these teachers' feedback was coded according to criteria suggested by Brookhart and then compared by way of a 2x2 ANOVA of feedback variances across stronger and weaker papers under control and experimental conditions. The analyses of these data demonstrated a medium-sized positive effect for the desirable feedback trait of focus on self-regulation (partial η2 = 0.079), as well as a small-sized positive effect for the desirable trait of comparisons to an imaginable previous or successive draft (partial η2 = 0.032). These desirable improvements in feedback were achieved while maintaining comparative stability in the grades imposed by teachers, limiting the concern that a "friendlier" approach derived from principles in interpersonal psychology (Heider, 1958) might somehow weaken the integrity of rigor in scoring.
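    The effect sizes reported above can be reproduced in form (not in value) with a standard 2x2 ANOVA; the sketch below uses synthetic feedback scores and computes partial eta squared as SS_effect / (SS_effect + SS_residual). Column names and data are placeholders, not the study's.

```python
# Minimal sketch: 2x2 ANOVA with partial eta squared on synthetic feedback scores.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
n_per_cell = 30
rows = []
for strength in ("stronger", "weaker"):
    for condition in ("control", "self_eval"):
        base = 1.0 if condition == "self_eval" else 0.6   # illustrative effect
        rows.extend({"strength": strength,
                     "condition": condition,
                     "feedback_score": rng.normal(base, 0.5)}
                    for _ in range(n_per_cell))
df = pd.DataFrame(rows)

# Two-way ANOVA with interaction, then partial eta squared per effect
model = smf.ols("feedback_score ~ C(strength) * C(condition)", data=df).fit()
table = anova_lm(model, typ=2)
ss_resid = table.loc["Residual", "sum_sq"]
table["partial_eta_sq"] = table["sum_sq"] / (table["sum_sq"] + ss_resid)
print(table[["sum_sq", "F", "PR(>F)", "partial_eta_sq"]])
```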

    A Simulation-based Methodology to Compare Reverse Logistics System Configuration Considering Uncertainty

    With increasing environmental concerns, recovery of used products through various options has gained significant attention. In order to collect, categorize, and reprocess used products in a cost- and time-efficient manner, a pre-evaluated network infrastructure is needed, in most cases, in addition to existing traditional forward logistics networks. However, such networks, referred to as reverse logistics (RL) networks, involve inherent uncertainty in returned-product supply and pose challenges beyond those of forward networks. Incorporating uncertainty into long-term network planning decisions is especially significant in RL networks, since such decisions are difficult and costly to adjust later on. The uncertainty in product returns and the dynamic, complex behavior of the system can be modeled as a queueing system using a discrete event simulation methodology. In this work, a simulation-based tool is developed that can be used as a platform for evaluating and comparing reverse logistics network configurations. In addition to defining system parameters, the tool provides experimentation with the number of collection, sorting, and processing centers, as well as the standard deviation of the return rate distribution. Various types of experiments illustrate the use and goal of the tool, addressing the trade-offs within and across scenarios. Experiments are divided into three main parts: verification, pairwise detailed comparisons, and a final, more holistic scenario that illustrates the usage of the tool. A user interface is developed in Microsoft Excel for convenient specification of operational system parameters and scenario values. Upon running the simulation with the specified experimental factors, the tool automatically computes and displays the total weighted score of each scenario, which is an indicator of scenario quality.
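    The queueing view described above maps naturally onto a discrete event simulation; the following SimPy sketch sends stochastic product returns through collection, sorting, and processing centers. Arrival and service rates, capacities, and the horizon are assumptions, not the tool's actual parameters.

```python
# Minimal SimPy sketch: returned products queue at collection, sorting, processing.
import random
import simpy

SIM_TIME = 1_000          # simulation horizon (time units, assumed)
RETURN_MEAN_IAT = 2.0     # mean inter-arrival time of returned products (assumed)
SERVICE_MEANS = {"collection": 1.0, "sorting": 1.5, "processing": 3.0}
CAPACITIES = {"collection": 2, "sorting": 1, "processing": 2}

def returned_product(env, centers, completed):
    """A returned product passes through each center in sequence."""
    for stage, resource in centers.items():
        with resource.request() as req:
            yield req
            yield env.timeout(random.expovariate(1.0 / SERVICE_MEANS[stage]))
    completed.append(env.now)

def return_generator(env, centers, completed):
    """Generate stochastic product returns."""
    while True:
        yield env.timeout(random.expovariate(1.0 / RETURN_MEAN_IAT))
        env.process(returned_product(env, centers, completed))

random.seed(42)
env = simpy.Environment()
centers = {stage: simpy.Resource(env, capacity=c) for stage, c in CAPACITIES.items()}
completed = []
env.process(return_generator(env, centers, completed))
env.run(until=SIM_TIME)
print(f"{len(completed)} returned products fully reprocessed in {SIM_TIME} time units")
```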

    Computational acquisition of knowledge in small-data environments: a case study in the field of energetics

    The UK’s defence industry is accelerating its implementation of artificial intelligence, including expert systems and natural language processing (NLP) tools designed to supplement human analysis. This thesis examines the limitations of NLP tools in small-data environments (common in defence), focusing on the defence-related energetic-materials domain. A literature review identifies the domain-specific challenges of developing an expert system (specifically an ontology). The absence of domain resources such as labelled datasets and, most significantly, the preprocessing of text resources are identified as challenges. To address the latter, a novel general-purpose preprocessing pipeline tailored to the energetic-materials domain is developed and its effectiveness evaluated. The interface between using NLP tools in data-limited environments to supplement human analysis and using them to replace it completely is examined in a study of the subjective concept of importance. A methodology for directly comparing the ability of NLP tools and experts to identify important points in a text is presented. Results show that the participants of the study exhibit little agreement, even on which points in the text are important. The NLP tools, the expert (the author of the text being examined), and the participants agree only on general statements; as a group, however, the participants agreed with the expert. In data-limited environments, the extractive-summarisation tools examined cannot effectively identify the important points of a technical document in the way an expert can. A methodology for classifying journal articles by the technology readiness level (TRL) of the described technologies in a data-limited environment is proposed. Techniques to overcome challenges of using real-world data, such as class imbalance, are investigated. A methodology to evaluate the reliability of human annotations is presented; the analysis identifies a lack of agreement and consistency in the expert evaluation of document TRL.
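    As one way to picture the TRL-classification task under class imbalance, the sketch below fits a TF-IDF plus logistic-regression pipeline with balanced class weights on placeholder abstracts; the texts, coarse TRL bands, and the choice of class_weight="balanced" are assumptions, not necessarily the thesis's method.

```python
# Minimal sketch: text classification into coarse TRL bands with class weighting.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = [
    "Basic principles of the new energetic formulation were observed in the laboratory.",
    "A prototype charge was demonstrated in a relevant environment during field trials.",
    "The propellant system is qualified and deployed in an operational setting.",
    "Initial proof-of-concept experiments characterised the binder chemistry.",
]
trl_labels = ["low", "mid", "high", "low"]   # hypothetical coarse TRL bands

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), stop_words="english"),
    LogisticRegression(class_weight="balanced", max_iter=1000),  # counter imbalance
)
clf.fit(docs, trl_labels)
print(clf.predict(["The device was tested in a simulated operational environment."]))
```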

    Company2Vec -- German Company Embeddings based on Corporate Websites

    With Company2Vec, the paper proposes a novel application of representation learning. The model analyzes business activities from unstructured company website data using Word2Vec and dimensionality reduction. Company2Vec preserves semantic language structures and thus creates efficient company embeddings for fine-granular industries. These semantic embeddings can be used for various applications in banking. Direct relations between companies and words allow semantic business analytics (e.g. the top-n words for a company). Furthermore, industry prediction is presented as a supervised learning application and evaluation method. The vectorized structure of the embeddings allows similarities between companies to be measured with the cosine distance. Company2Vec hence offers a more fine-grained comparison of companies than standard industry labels (NACE). This property is relevant for unsupervised learning tasks such as clustering; an alternative industry segmentation is shown with k-means clustering on the company embeddings. Finally, the paper proposes three algorithms for (1) firm-centric, (2) industry-centric, and (3) portfolio-centric peer-firm identification. (Comment: accepted for publication in the International Journal of Information Technology & Decision Making, 2023.)
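    A stripped-down version of the embedding, similarity, and clustering chain mentioned above might look like the sketch below: word vectors from a company's website text are averaged into a company embedding, companies are compared with cosine similarity, and then clustered with k-means. The toy texts, vector size, and cluster count are assumptions; the paper's pipeline (dimensionality reduction, NACE evaluation, peer-firm algorithms) is considerably richer.

```python
# Minimal sketch: averaged Word2Vec company embeddings, cosine similarity, k-means.
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

company_texts = {
    "alpha_bank": "retail banking loans deposits payment accounts",
    "beta_motors": "automotive engines vehicle manufacturing assembly",
    "gamma_credit": "consumer credit loans financing payment services",
}
tokenized = {name: text.split() for name, text in company_texts.items()}

w2v = Word2Vec(sentences=list(tokenized.values()), vector_size=50, min_count=1, seed=0)

def company_embedding(tokens):
    """Company vector = mean of its word vectors (simplest pooling choice)."""
    return np.mean([w2v.wv[t] for t in tokens], axis=0)

names = list(tokenized)
embeddings = np.vstack([company_embedding(tokenized[n]) for n in names])

# Pairwise company similarities (row for alpha_bank shown)
sim = cosine_similarity(embeddings)
print(dict(zip(names, sim[names.index("alpha_bank")].round(3))))

# Alternative segmentation via k-means on the embeddings
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
print(dict(zip(names, clusters)))
```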

    Can bank interaction during rating measurement of micro and very small enterprises ipso facto Determine the collapse of PD status?

    This paper begins with an analysis of trends over the period 2012-2018 for total bank loans, non-performing loans, and the number of active, working enterprises. A review survey was conducted on national data from Italy, with a comparison developed on a local subset from the Sardinia Region. Empirical evidence appears to support the hypothesis of the paper: can the rating class assigned by banks - using current IRB and A-IRB systems - to micro and very small enterprises, whose ability to replace financial resources by endogenous means is structurally impaired, ipso facto orient their performance toward the very PD assigned by the algorithm, thereby upending the principle of cause and effect? The thesis is developed through mathematical modeling that demonstrates the effect of the measurement tool (the rating algorithm applied by banks) on the collapse of the loan status (default, performing, or some intermediate point) of the assessed micro-entity. In conclusion, emphasis is given to the intrinsically mutualistic link between the two populations of banks and (micro) enterprises, as captured by a system of differential equations.
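    The abstract does not state the form of its differential equations; purely as an illustration of a mutualistic two-population system, the sketch below integrates a standard Lotka-Volterra mutualism model with hypothetical coefficients, which is not the paper's actual model.

```python
# Illustrative only: a standard Lotka-Volterra mutualism system for two populations
# (banks B and micro-enterprises E). All coefficients are hypothetical.
from scipy.integrate import solve_ivp

r_b, r_e = 0.03, 0.05          # intrinsic growth rates
K_b, K_e = 100.0, 1000.0       # carrying capacities
a_be, a_eb = 0.0005, 0.002     # mutualistic benefit coefficients

def mutualism(t, y):
    B, E = y
    dB = r_b * B * (1 - B / K_b + a_be * E)
    dE = r_e * E * (1 - E / K_e + a_eb * B)
    return [dB, dE]

sol = solve_ivp(mutualism, (0, 500), y0=[50.0, 400.0])
B_final, E_final = sol.y[:, -1]
print(f"banks ~ {B_final:.1f}, micro-enterprises ~ {E_final:.1f} at t = {sol.t[-1]:.0f}")
```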

    A survey on pre-processing techniques: relevant issues in the context of environmental data mining

    One of the important issues in all types of data analysis, whether statistical data analysis, machine learning, data mining, data science, or any other form of data-driven modeling, is data quality. The more complex the reality to be analyzed, the higher the risk of getting low-quality data. Unfortunately, real data often contain noise, uncertainty, errors, redundancies, or even irrelevant information. Useless models will be obtained when they are built over incorrect or incomplete data; as a consequence, the quality of decisions made with these models also depends on data quality. This is why pre-processing is one of the most critical steps of data analysis in any of its forms. However, pre-processing has not been properly systematized yet, and little research is focused on it. This paper presents a survey of the most popular pre-processing steps required in environmental data analysis, together with a proposal to systematize them. Rather than providing technical details on specific pre-processing techniques, the paper focuses on providing general ideas to non-expert users who, after reading them, can decide which technique is the most suitable for solving their problem.
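    To make the survey's subject concrete, the sketch below runs a few of the pre-processing steps it covers (simple outlier screening, missing-value imputation, standardisation) on a small synthetic environmental-sensor table; the variable names and thresholds are illustrative only, and the survey itself covers many more techniques.

```python
# Minimal sketch: outlier screening, imputation, and scaling on synthetic sensor data.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

raw = pd.DataFrame({
    "temperature_c": [12.1, 11.8, np.nan, 13.0, 55.0],   # 55.0 is a suspicious spike
    "dissolved_oxygen_mg_l": [8.2, np.nan, 7.9, 8.4, 8.1],
    "ph": [7.1, 7.0, 7.3, np.nan, 7.2],
})

def mask_outliers(df, k=5.0):
    """Mask values far from the column median (rule-of-thumb MAD screen)."""
    cleaned = df.copy()
    for col in cleaned.columns:
        med = cleaned[col].median()
        mad = (cleaned[col] - med).abs().median()
        cleaned.loc[(cleaned[col] - med).abs() > k * (mad + 1e-9), col] = np.nan
    return cleaned

screened = mask_outliers(raw)

# Impute the remaining gaps and standardise the variables
prep = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
processed = prep.fit_transform(screened)
print(pd.DataFrame(processed, columns=raw.columns).round(2))
```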