    Process modeling using stacked neural networks

    Get PDF
    Typically, neural network modelers in chemical engineering focus on identifying and using a single, hopefully optimal neural network model. Using a single optimal model implicitly assumes that one neural network model can extract all the information available in a given data set and that the other candidate models are redundant. In general, there is no assurance that any individual model has extracted all relevant information from the data set. In this work, the stacked neural network approach is introduced. Stacked neural networks (SNNs) allow multiple neural networks to be selected and used to model a given process. The idea is that improved predictions can be obtained by using multiple networks instead of simply selecting a single, hopefully optimal network, as is usually done. A methodology for stacking neural networks for plant process modeling has been developed. This method is inspired by the technique of stacked generalization proposed by Wolpert (1992). The feasibility of the stacked neural network approach is first demonstrated using linear combinations. A general technique, the information theoretic stacking (ITS) algorithm, is then developed and evaluated. The ITS algorithm is able to identify and combine informative neural network models regardless of how their outputs relate to the process output. The power of the ITS algorithm is demonstrated through three examples, including application to a dynamic process modeling problem. The results demonstrate that SNNs developed using the ITS algorithm can achieve substantially improved performance compared to selecting and using a single, hopefully optimal network or using SNNs based on a linear combination of neural networks.
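    As a rough illustration of the linear-combination stacking described in the abstract (not the authors' ITS algorithm), the sketch below trains a few candidate networks and fits combination weights on held-out validation data. It assumes NumPy and scikit-learn are available; the data arrays X_train, y_train, X_val, y_val, and X_test are placeholders.

    # Minimal sketch: stacking neural networks via a linear combination of
    # their predictions. Assumes NumPy and scikit-learn; data arrays are
    # placeholders supplied by the reader.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def fit_stacked_networks(X_train, y_train, X_val, y_val, hidden_sizes=(4, 8, 16)):
        """Train several candidate networks, then fit linear stacking weights."""
        # Level-0 models: candidate networks of different capacities.
        networks = [
            MLPRegressor(hidden_layer_sizes=(h,), max_iter=2000, random_state=0).fit(X_train, y_train)
            for h in hidden_sizes
        ]
        # Collect each network's predictions on held-out validation data.
        P = np.column_stack([net.predict(X_val) for net in networks])
        # Level-1 combiner: least-squares weights over the validation predictions.
        w, *_ = np.linalg.lstsq(P, y_val, rcond=None)
        return networks, w

    def stacked_predict(networks, w, X):
        """Combine the individual network predictions with the stacking weights."""
        P = np.column_stack([net.predict(X) for net in networks])
        return P @ w

    # Usage sketch: y_hat = stacked_predict(*fit_stacked_networks(X_train, y_train, X_val, y_val), X_test)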

    Zero-Shot Question Answering over Financial Documents using Large Language Models

    Full text link
    We introduce a large language model (LLM) based approach to answer complex questions requiring multi-hop numerical reasoning over financial reports. While LLMs have exhibited remarkable performance on various natural language and reasoning tasks, complex reasoning problems often rely on few-shot prompts that require carefully crafted examples. In contrast, our approach uses novel zero-shot prompts that guide the LLM to encode the required reasoning into a Python program or a domain-specific language. The generated program is then executed by a program interpreter, thus mitigating the limitations of LLMs in performing accurate arithmetic calculations. We evaluate the proposed approach on three financial datasets using some of the recently developed generative pretrained transformer (GPT) models and perform comparisons with various zero-shot baselines. The experimental results demonstrate that our approach significantly improves the accuracy for all the LLMs over their respective baselines. We provide a detailed analysis of the results, generating insights to support our findings. The success of our approach demonstrates the enormous potential of designing zero-shot prompts that exploit the knowledge embedded in LLMs for complex domain-specific numerical reasoning.
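    As a rough sketch of the zero-shot program-generation idea described in the abstract (not the paper's actual prompts or evaluation code), the snippet below asks an LLM to emit a Python program that computes the answer and then runs that program locally. Here call_llm is a hypothetical stand-in for whichever GPT-style completion API is used.

    # Minimal sketch: zero-shot prompting an LLM to write a program, then
    # executing it with the Python interpreter instead of trusting the LLM's
    # arithmetic. `call_llm` is a hypothetical function, not a specific API.
    import re

    ZERO_SHOT_PROMPT = (
        "You are given a financial report excerpt and a question.\n"
        "Write a Python program that computes the numeric answer step by step\n"
        "and stores the final result in a variable named `answer`.\n"
        "Output only the code.\n\n"
        "Report:\n{context}\n\n"
        "Question:\n{question}\n"
    )

    def answer_with_program(context, question, call_llm):
        """Generate a Python program with the LLM, then execute it locally."""
        prompt = ZERO_SHOT_PROMPT.format(context=context, question=question)
        code = call_llm(prompt)  # hypothetical LLM call returning generated code
        # Strip an optional markdown code fence from the model output.
        code = re.sub(r"^```(?:python)?|```$", "", code.strip(), flags=re.MULTILINE)
        # Offload the arithmetic to the interpreter; in practice, run
        # model-generated code only in a sandboxed environment.
        namespace = {}
        exec(code, namespace)
        return namespace["answer"]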
