2 research outputs found

    Developing an innovative entity extraction method for unstructured data

    The main goal of this study is to build high-precision extractors for entities such as Person and Organization. These extractors serve as a good initial seed for training machine-learning systems on the same categories, on other categories, and across domains, languages, and applications. Improving entity extraction precision also improves relationship extraction precision, which is particularly important in domains such as intelligence systems, social networking, genetic studies, and healthcare. These gains improve the end users’ experience with the extraction system because users spend less time training the system and correcting outputs and can focus more on analyzing the extracted information to make better data-driven decisions.

    Virtual Assistant Design for Water Systems Operation

    Water management systems such as wastewater treatment plants and water distribution systems are large systems that include a multitude of variables and performance indicators, which drive the decision-making process for controlling the plant. To help water operators make the right decisions, we provide them with a platform for getting quick answers, in natural language, about the different components of the system they are controlling. In our research, we explore the architecture for building a virtual assistant in the domain of water systems. Our design focused on developing better semantic inference across the different stages of the process. We developed a named entity recognizer that is able to infer the semantics of the water field by leveraging state-of-the-art methods for word embeddings. Our model achieved significant improvements over the baseline Term Frequency - Inverse Document Frequency (TF-IDF) cosine similarity model. Additionally, we explore the design of intent classifiers, which is more challenging than designing a traditional classifier because the texts are short relative to the number of classes. In our design, we incorporate the results of entity recognition, produced by earlier layers of the Chatbot pipeline, to boost intent classification performance. Our baseline bidirectional Long Short-Term Memory (LSTM) network showed significant improvements, amounting to a 7-10% accuracy boost on augmented input data, and we contrasted its performance with a modified bidirectional LSTM architecture that embeds information about recognized entities. In each stage of our architecture, we explored state-of-the-art solutions and how to customize them to our problem domain in order to build a production-level application. We additionally leveraged the architecture of Chatbot frameworks to provide a context-aware virtual assistance experience that is able to infer implicit references from the conversation flow.
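
    For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of TF-IDF cosine similarity matching, written with the Python standard library only. The abstract does not specify the tokenization, the IDF weighting, or the data used, so the whitespace tokenizer, the smoothed IDF formula, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `best_match`), and the example component names are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract does not say.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in term_freq.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, candidates):
    """Return the candidate most similar to the query, with its score."""
    vectors, idf = tfidf_vectors(candidates)
    query_tf = Counter(tokenize(query))
    # Query terms that never occur in the candidates carry no weight.
    query_vec = {t: c * idf[t] for t, c in query_tf.items() if t in idf}
    scores = [cosine(query_vec, v) for v in vectors]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


# Hypothetical component names, invented for this example.
components = ["aeration tank", "sludge pump", "chlorine dosing"]
match, score = best_match("pump for sludge", components)
```

    Because this kind of baseline matches only on shared surface tokens, a query that uses a synonym of a component name scores zero against it, which is the sort of gap that embedding-based entity recognition is meant to close.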