4 research outputs found
Process modeling using stacked neural networks
Typically, neural network modelers in chemical engineering focus on identifying and using a single, hopefully optimal neural network model. Using a single optimal model implicitly assumes that one neural network model can extract all the information available in a given data set and that the other candidate models are redundant. In general, there is no assurance that any individual model has extracted all relevant information from the data set. In this work, the stacked neural network approach is introduced. Stacked neural networks (SNNs) allow multiple neural networks to be selected and used to model a given process. The idea is that improved predictions can be obtained using multiple networks, instead of simply selecting a single, hopefully optimal network as is usually done. A methodology for stacking neural networks for plant process modeling has been developed. This method is inspired by the technique of stacked generalization proposed by Wolpert (1992). The feasibility of the stacked neural network approach is first demonstrated using linear combinations. A general technique known as the information theoretic stacking (ITS) algorithm is then developed and evaluated. The ITS algorithm is able to identify and combine informative neural network models regardless of how their outputs are related to the process output. The power of the ITS algorithm is demonstrated through three examples, including application to a dynamic process modeling problem. Results obtained demonstrate that the SNNs developed using the ITS algorithm can achieve significantly improved performance as compared to selecting and using a single, hopefully optimal network or using SNNs based on a linear combination of neural networks.
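The linear-combination form of stacking that the abstract uses as a starting point can be sketched as follows. This is a minimal illustration, not the paper's method: the three "candidate models" are synthetic stand-ins for trained networks, and the stacking weights are fit by ordinary least squares on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = np.sin(3 * x)  # the "true" process output for this toy example

# Three imperfect candidate models standing in for trained networks.
# Each is informative, but none extracts all the information alone.
preds = np.column_stack([
    y + 0.5 * x,        # model biased upward with x
    y - 0.5 * x,        # model biased downward with x
    np.tanh(2.5 * x),   # saturating approximation
])

# Fit stacking weights on the first half, evaluate on the second half.
train, test = slice(0, 100), slice(100, 200)
w, *_ = np.linalg.lstsq(preds[train], y[train], rcond=None)
stacked = preds[test] @ w

mse_single = ((preds[test] - y[test, None]) ** 2).mean(axis=0)
mse_stacked = ((stacked - y[test]) ** 2).mean()
print(mse_stacked < mse_single.min())  # → True
```

Because the combination can weight the complementary models so that their biases cancel, the stacked prediction beats the best single candidate, which is the intuition the abstract appeals to.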
Zero-Shot Question Answering over Financial Documents using Large Language Models
We introduce a large language model (LLM) based approach to answer complex
questions requiring multi-hop numerical reasoning over financial reports. While
LLMs have exhibited remarkable performance on various natural language and
reasoning tasks, complex reasoning problems often rely on few-shot prompts that
require carefully crafted examples. In contrast, our approach uses novel
zero-shot prompts that guide the LLM to encode the required reasoning into a
Python program or a domain specific language. The generated program is then
executed by a program interpreter, thus mitigating the limitations of LLM in
performing accurate arithmetic calculations.
We evaluate the proposed approach on three financial datasets using some of
the recently developed generative pretrained transformer (GPT) models and
perform comparisons with various zero-shot baselines. The experimental results
demonstrate that our approach significantly improves the accuracy for all the
LLMs over their respective baselines. We provide a detailed analysis of the
results, generating insights to support our findings. The success of our
approach demonstrates the enormous potential to extract complex domain specific
numerical reasoning by designing zero-shot prompts to effectively exploit the
knowledge embedded in LLMs.
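The execute-the-generated-program step can be sketched as follows. The LLM call itself is mocked here: `generated` stands in for code the model might emit when a zero-shot prompt asks it to express the reasoning as a Python program, and the figures are invented for illustration, not drawn from the paper's datasets.

```python
# Code an LLM might generate for a question such as
# "By what percentage did revenue grow from 2020 to 2021?"
generated = """
revenue_2021 = 1_130.0   # figure the model would read from the report
revenue_2020 = 980.0
answer = (revenue_2021 - revenue_2020) / revenue_2020 * 100
"""

# Run the program with the interpreter and read back `answer`, so the
# arithmetic is computed exactly rather than approximated by the LLM.
scope: dict = {}
exec(generated, scope)
answer = round(scope["answer"], 2)
print(answer)  # → 15.31
```

Delegating the final calculation to an interpreter is what mitigates the LLMs' known weakness at accurate arithmetic; in a real system the generated code would additionally be sandboxed before execution.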