
    Adaptive Order Dispatching based on Reinforcement Learning: Application in a Complex Job Shop in the Semiconductor Industry

    Driven by market demands, today's production systems tend toward ever smaller batch sizes, higher product variety and greater complexity of material flow systems. These developments call existing production control methods into question. In the course of digitalization, data-based machine learning algorithms offer an alternative approach to optimizing production processes. Current research shows the high performance of Reinforcement Learning (RL) methods across a broad range of applications. In the field of production control, however, only a few authors have addressed them so far. A comprehensive investigation of different RL approaches as well as a practical application have not yet been carried out. Among the tasks of production planning and control, order dispatching ensures high performance and flexibility of production processes in order to achieve high capacity utilization and short lead times. Motivated by complex job shop systems such as those found in the semiconductor industry, this work closes this research gap and addresses the application of RL for adaptive order dispatching. Incorporating real system data allows system behavior to be captured more accurately than with static heuristics or mathematical optimization methods. In addition, manual effort is reduced by relying on the inference capabilities of RL. The presented methodology focuses on the modeling and implementation of RL agents as the dispatching decision unit. Known challenges of RL modeling with respect to state, action and reward function are investigated. The modeling alternatives are analyzed on the basis of two real production scenarios of a semiconductor manufacturer. The results show that RL agents can learn adaptive control strategies and outperform existing rule-based benchmark heuristics. Extending the state representation improves performance significantly when it is related to the reward objectives. The reward can be designed to enable the optimization of multiple objectives. Finally, specific RL agent configurations not only achieve high performance in one scenario but also prove robust to changing system properties. This research thus makes a substantial contribution towards self-optimizing and autonomous production systems. Production engineers have to assess the potential of data-based, learning methods in order to remain competitive in terms of flexibility while keeping the effort for designing, operating and monitoring production control systems in a reasonable balance
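
    A minimal sketch of how state, action and reward could be modeled for such a dispatching agent follows, assuming a simple queue-based job shop abstraction; the Order fields, state features and reward terms are illustrative assumptions, not the modeling used in the thesis.

    # Illustrative sketch (assumed abstraction, not the thesis implementation):
    # one dispatching decision expressed as an RL step with state, action, reward.
    import random
    from dataclasses import dataclass

    @dataclass
    class Order:
        processing_time: float   # expected time on the requested machine
        waiting_time: float      # time already spent in the queue
        due_in: float            # remaining time until the due date

    def state_vector(queue, machine_utilization):
        # State: aggregate queue features plus machine status (illustrative choice).
        if not queue:
            return [0.0, 0.0, 0.0, machine_utilization]
        return [
            float(len(queue)),
            max(o.waiting_time for o in queue),
            min(o.due_in for o in queue),
            machine_utilization,
        ]

    def reward(chosen):
        # Reward: favor short waiting times and penalize lateness (two objectives).
        lateness_penalty = max(0.0, -chosen.due_in)
        return -chosen.waiting_time - lateness_penalty

    def dispatch(queue, machine_utilization, policy=random.choice):
        # Action: pick one order from the queue; a trained agent would map the
        # state vector to an order index instead of choosing randomly.
        state = state_vector(queue, machine_utilization)
        chosen = policy(queue)
        return chosen, reward(chosen), state

    if __name__ == "__main__":
        queue = [Order(2.0, 5.0, 1.0), Order(1.0, 0.5, 10.0)]
        print(dispatch(queue, machine_utilization=0.8))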

    Autonomous order dispatching in the semiconductor industry using reinforcement learning

    Cyber-Physical Production Systems (CPPS) provide huge amounts of data. Simultaneously, operational decisions are becoming ever more complex due to smaller batch sizes, a larger product variety and complex processes in production systems. Because of this rising level of complexity, production engineers struggle to utilize the recorded data to optimize production processes effectively. This paper shows the successful implementation of an autonomous order dispatching system based on a Reinforcement Learning (RL) algorithm. The real-world use case in the semiconductor industry is a highly suitable example of a cyber-physical and digitized production system

    Potentials of Traceability Systems - A Cross-Industry Perspective

    Recently, traceability systems have become more common, but their prevalence and design vary significantly depending on the industry. Differing legal and customer-based requirements for traceability systems have led to diverse standards. This contribution offers a framework to compare the state of traceability systems in different industries. A comparison of industry characteristics, motivations for traceability system implementation, common data management, and identification systems is offered. Based on that analysis, the potential of cross-industry traceability systems and approaches is identified. This extended usage of traceability systems supports quality assurance, process management and counterfeit protection and thus expands customer value

    Design, Implementation and Evaluation of Reinforcement Learning for an Adaptive Order Dispatching in Job Shop Manufacturing Systems

    Modern production systems tend to have smaller batch sizes, a larger product variety and more complex material flow systems. Since a human decision maker can often no longer act adequately under these circumstances, the demand for efficient and adaptive control systems is rising. This paper introduces a methodical approach as well as a guideline for the design, implementation and evaluation of Reinforcement Learning (RL) algorithms for adaptive order dispatching. It thereby addresses production engineers willing to apply RL. Moreover, a real-world use case shows the successful application of the method, with remarkable results supporting real-time decision-making. These findings comprehensively illustrate and extend the knowledge on RL

    Data Analytics for Manufacturing Systems – A Data-Driven Approach for Process Optimization

    In the course of digitalization, many small and medium-sized companies face the challenge of using their existing database for process optimization in manufacturing. Furthermore, the demand-oriented expansion of that database is a major challenge. A lack of competencies, limited financial resources and historically grown data structures, which show strong heterogeneity and a lack of transparency, are the central obstacles. A specific approach for carrying out data analytics projects for process optimization in manufacturing is presented. In particular, the question of which sensors should be implemented to expand the database is answered. The approach is applied exemplarily to a manufacturing line
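
    One data-driven step in that direction is sketched below: rank the already recorded process signals by their explanatory value for a quality KPI, which can inform where additional sensors would add the most information. The signal names and the synthetic data are assumptions for illustration; this is not the paper's procedure.

    # Illustrative sketch (assumed data, not the paper's method): rank process
    # signals by their importance for a quality target using a random forest.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    # Hypothetical process data: columns = recorded signals, y = quality KPI.
    X = rng.normal(size=(500, 4))
    y = 2.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=500)

    model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
    signal_names = ["spindle_speed", "feed_rate", "coolant_temp", "vibration_rms"]
    ranking = sorted(zip(signal_names, model.feature_importances_),
                     key=lambda kv: kv[1], reverse=True)
    for name, importance in ranking:
        print(f"{name}: {importance:.3f}")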

    Reinforcement learning for an intelligent and autonomous production control of complex job-shops under time constraints

    Reinforcement learning (RL) offers promising opportunities to handle the ever-increasing complexity in managing modern production systems. We apply a Q-learning algorithm in combination with a process-based discrete-event simulation in order to train a self-learning, intelligent, and autonomous agent for the decision problem of order dispatching in a complex job shop with strict time constraints. For the first time, we combine RL in production control with strict time constraints. The simulation represents the characteristics of complex job shops typically found in semiconductor manufacturing. A real-world use case from a wafer fab is addressed with a developed and implemented framework. The performance of the RL approach is compared against benchmark heuristics. It is shown that RL can be successfully applied to manage order dispatching in a complex environment including time constraints. An RL agent with a gain function rewarding the selection of the least critical order with respect to time constraints beats heuristic rules that strictly pick the most critical lot first. Hence, this work demonstrates that a self-learning agent can successfully manage time constraints, with the agent performing better than the traditional benchmark, a time-constraint heuristic combining due date deviations with a classical first-in-first-out approach
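
    A minimal tabular Q-learning sketch for a toy version of such a time-constrained dispatching decision follows; the two-lot situation, the discretized criticality state and the reward are assumptions for illustration and do not reproduce the paper's gain function or simulation.

    # Illustrative sketch (toy problem, not the paper's framework): Q-learning
    # over a discretized criticality state, choosing which of two lots to start.
    import random
    from collections import defaultdict

    ACTIONS = [0, 1]                 # which of two waiting lots to dispatch first
    ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
    Q = defaultdict(float)           # Q[(state, action)]

    def criticality(slack):
        # Discretize the remaining slack before a time constraint is violated.
        return "critical" if slack < 5 else "uncritical"

    def new_situation():
        slacks = [random.uniform(0, 20), random.uniform(0, 20)]
        return slacks, (criticality(slacks[0]), criticality(slacks[1]))

    slacks, state = new_situation()
    for _ in range(20000):
        # epsilon-greedy action selection
        if random.random() < EPS:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        # Toy reward: penalize postponing the lot whose time constraint is tighter.
        reward = 1.0 if slacks[action] <= slacks[1 - action] else -10.0
        next_slacks, next_state = new_situation()
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        slacks, state = next_slacks, next_state

    # Learned preference when lot 0 is critical and lot 1 is not.
    print({a: round(Q[(("critical", "uncritical"), a)], 2) for a in ACTIONS})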

    Reinforcement Learning Based Production Control of Semi-automated Manufacturing Systems

    In an environment marked by an increasing speed of change, industrial companies have to be able to adapt quickly to new market demands and innovative technologies. This leads to a need for continuous adaptation of existing production systems and the optimization of their production control. To tackle this problem, the digitalization of production systems has become essential for new and existing systems. Digital twins based on simulations of real production systems simplify analysis processes and thus provide a better understanding of the systems, which opens up broad optimization possibilities. In parallel, machine learning methods can be integrated to process the numerical data and discover new production control strategies. In this work, these two methods are combined to derive a production control logic for a semi-automated production system based on the chaku-chaku principle. A reinforcement learning method is integrated into the digital twin to autonomously learn a superior production control logic for the distribution of tasks between the different workers on a production line. An analysis of the influence of different reward shaping and hyper-parameter optimization choices on the quality and stability of the results shows that a well-configured policy-based algorithm enables efficient management of the workers and the derivation of an optimal production control logic for the production system. The algorithm defines a control logic that increases productivity while keeping task assignment stable, so that a transfer to daily business is possible. The approach is validated in the digital twin of a real assembly line of an automotive supplier. The results suggest a new approach to optimizing production control in production lines: production control is centered directly on the workers' routines and controlled by artificial intelligence with a global overview of the entire production system
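
    The sketch below illustrates the general idea of a policy-based method with a shaped reward on a toy two-worker task assignment problem; the linear policy, the imbalance-penalizing reward and the training loop are assumptions for illustration, not the configuration used in the paper.

    # Illustrative sketch (toy setup, not the paper's): REINFORCE with a logistic
    # policy that assigns each incoming task to one of two workers, rewarded for
    # keeping their workloads balanced.
    import numpy as np

    rng = np.random.default_rng(0)
    w = 0.0                                   # single policy weight on load difference
    LR = 0.01

    def probs(w, load_diff):
        # P(assign to worker 0) via a logistic function of the current load difference.
        p0 = 1.0 / (1.0 + np.exp(w * load_diff))
        return np.array([p0, 1.0 - p0])

    for episode in range(3000):
        loads = np.array([0.0, 0.0])
        grad, ret = 0.0, 0.0
        for _ in range(10):                   # ten tasks per episode
            diff = loads[0] - loads[1]
            p = probs(w, diff)
            worker = rng.choice(2, p=p)
            # Shaped reward: penalize workload imbalance after the assignment.
            ret += -abs((loads[0] + (worker == 0)) - (loads[1] + (worker == 1)))
            # d/dw log pi(worker | diff) for the logistic policy.
            grad += (-diff) * ((worker == 0) - p[0])
            loads[worker] += 1.0
        w += LR * ret * grad                  # REINFORCE update (no baseline)

    print("learned weight:", round(float(w), 3))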

    Decentralized Multi-Agent Production Control through Economic Model Bidding for Matrix Production Systems

    Due to the increasing demand for unique products, the large variety in product portfolios and the associated rise in individualization, the efficiency of resource use in traditional line production dwindles. One answer to these new challenges is the application of matrix-shaped layouts with multiple production cells, called Matrix Production Systems. The cycle time independence and the redundancy of production cell capabilities within a Matrix Production System enable individual production paths per job for Flexible Mass Customisation. However, the increased degrees of freedom strengthen the need for reliable production control compared to traditional production systems such as line production. Beyond reliability, there is a need for intelligent production within a smart factory to ensure goal-oriented production control under ever-changing manufacturing conditions. Learning-based methods can leverage condition-based reactions for goal-oriented production control. While centralized control performs well in single-objective situations, it struggles to achieve conflicting targets for individual products or resources. Hence, in order to master these challenges, a production control concept based on a decentralized multi-agent bidding system is presented. In this price-based model, individual production agents - jobs, production cells and the transport system - interact based on an economic model and attempt to maximize monetary revenues. Evaluating the application of learning and priority-based control policies shows that decentralized multi-agent production control can outperform traditional approaches for certain control objectives. The introduction of decentralized multi-agent reinforcement learning systems is a starting point for further research in this area of intelligent production control within smart manufacturing
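
    A minimal sketch of one bidding round in such an economic model follows, assuming that production cell agents price a job's next operation by their congestion and that the job awards the operation to the cheapest capable cell; the cost terms and agent names are illustrative, not the paper's model.

    # Illustrative sketch (assumed pricing, not the paper's model): a single
    # auction round between a job and several production cell agents.
    from dataclasses import dataclass

    @dataclass
    class CellAgent:
        name: str
        capabilities: set
        queue_length: int = 0
        base_price: float = 10.0

        def bid(self, operation):
            # Return a monetary bid, or None if the cell cannot process the operation.
            if operation not in self.capabilities:
                return None
            # Price rises with congestion so heavily loaded cells win fewer jobs.
            return self.base_price + 2.0 * self.queue_length

    def auction(operation, cells):
        bids = {c.name: c.bid(operation) for c in cells}
        valid = {name: b for name, b in bids.items() if b is not None}
        winner = min(valid, key=valid.get)
        return winner, valid

    cells = [
        CellAgent("cell_A", {"drill", "mill"}, queue_length=3),
        CellAgent("cell_B", {"mill"}, queue_length=0),
        CellAgent("cell_C", {"drill"}, queue_length=1),
    ]
    print(auction("mill", cells))   # cell_B wins with the lowest bid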

    Foresighted digital twin for situational agent selection in production control

    As intelligent Data Acquisition and Analysis in Manufacturing nears its apex, a new era of Digital Twins is dawning. Foresighted Digital Twins enable short- to medium-term predictions of system behavior to infer optimal production operation strategies. Creating up-to-the-minute Digital Twins requires both the availability of real-time data and its incorporation, and serves as a stepping stone towards developing unprecedented forms of production control. Consequently, we consider a new concept of Digital Twins that includes foresight, thereby enabling the situational selection of production control agents. One critical element for adequate system predictions is human behavior, as it is neither rule-based nor deterministic, which we therefore model using Reinforcement Learning. Under such ever-changing circumstances, rigid operation strategies severely restrict possible reactions, whereas circumstantial control strategies can outperform traditional approaches. Building on this enhanced foresight, we show the superiority of this approach and present strategies for improved situational agent selection
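
    The following sketch illustrates the situational-selection idea under strong simplifications: each candidate control agent is rolled forward in a stub that stands in for the foresighted Digital Twin, and the agent with the best predicted KPI is chosen. The agents, the KPI and the stub dynamics are assumptions, not the paper's system.

    # Illustrative sketch (assumed stub, not the paper's system): pick the control
    # agent whose simulated rollout yields the best predicted throughput.
    import random

    def simulate_throughput(agent, horizon=100, seed=0):
        # Stand-in for a foresighted Digital Twin run; returns a predicted KPI.
        rng = random.Random(seed)
        done = 0
        for _ in range(horizon):
            done += 1 if rng.random() < agent["service_level"] else 0
        return done

    candidate_agents = [
        {"name": "fifo_heuristic", "service_level": 0.70},
        {"name": "rl_agent_v1",    "service_level": 0.82},
        {"name": "rl_agent_v2",    "service_level": 0.78},
    ]

    def select_agent(agents, situation_seed):
        predictions = {a["name"]: simulate_throughput(a, seed=situation_seed)
                       for a in agents}
        best = max(predictions, key=predictions.get)
        return best, predictions

    print(select_agent(candidate_agents, situation_seed=42))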

    Data analytics for time constraint adherence prediction in a semiconductor manufacturing use-case

    Semiconductor manufacturing represents a challenging industrial environment in which products require several hundred operations, each representing the technical state of the art. Products vary greatly in volume, design and required production processes, and, additionally, product portfolios and technologies change rapidly. Thus, rapid yet technologically constrained product development, stringent quality-related clean room requirements and the use of high-precision manufacturing equipment demand operational excellence, in particular adherence to time constraints. Product-specific time constraints between two or more successive process operations are an industry-specific challenge, as violations lead to additional scrapping or reworking costs. Time constraint adherence is linked to dispatching and is currently assessed manually. To overcome this error-prone manual task, this article presents a data-based decision process to predict time constraint adherence in semiconductor manufacturing. Real-world historical data are analyzed, and appropriate statistical models and scoring functions are derived. Compared to other relevant literature on time constraint violations, the central contribution of this article is the design, generation and validation of a model for product quality-related time constraint adherence based on data from a real-world semiconductor plant
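
    A minimal sketch of such a prediction step follows, assuming a logistic model that scores the violation risk of a time constraint from a few lot- and equipment-level features; the feature names and the synthetic data are illustrative and do not correspond to the article's models or scoring functions.

    # Illustrative sketch (assumed features and data, not the article's model):
    # score the probability of violating a time constraint between two operations.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 1000
    # Hypothetical features: queue length at the successor step, recent tool
    # downtime (h), remaining slack of the time constraint (h).
    X = np.column_stack([
        rng.integers(0, 30, n),
        rng.uniform(0, 8, n),
        rng.uniform(0, 48, n),
    ])
    # Synthetic ground truth: tighter slack and longer queues make violations likelier.
    p_violation = 1 / (1 + np.exp(-(0.15 * X[:, 0] + 0.3 * X[:, 1] - 0.2 * X[:, 2])))
    y = (rng.uniform(size=n) < p_violation).astype(int)   # 1 = constraint violated

    model = LogisticRegression().fit(X, y)
    lot = np.array([[25, 4.0, 6.0]])     # a critical lot: long queue, little slack
    print("predicted violation probability:", model.predict_proba(lot)[0, 1].round(3))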