5 research outputs found

    Text Analytic System: Document Similarity

    Knowledge discovery is a critical function of infrastructure protection in the U.S. By analyzing key text documents, we can gain insight into the interwoven and interdependent infrastructure system of the U.S. and better understand the security of the system as a whole. Massive amounts of relevant data reside in text documents, which must be gathered and parsed before they can be analyzed on a large scale. Our algorithm collects web-based text embedded in HTML pages and analyzes it in several ways to measure similarity. It is a needed component of the larger system being developed by the Idaho National Laboratory to carry out this kind of analysis. By measuring the similarity of these HTML documents, we help the Idaho National Laboratory keep redundant data out of the database. Without proper detection of similar data, repetitive entries may clog the system with unneeded information. We attack this problem by providing a series of interfaces, each feeding into the same comparison algorithm. The interfaces accept a raw String, a text file, or a web URL. The BoilerPipe library extracts useful text from an HTML document by stripping its tags and applying a series of filters to acquire the desired text. A simple Java scanner parses the text file. The text is then lemmatized, stripped of punctuation, converted to lowercase, stemmed, and placed into a term-document matrix. Finally, we use cosine similarity to produce a percentage representing how similar the two provided text documents are.
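
    The final comparison step described in this abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' Java implementation. The function names `tokenize` and `cosine_similarity` are hypothetical, and the sketch assumes plain term counts; the HTML extraction, lemmatization, and stemming steps are omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits (drops punctuation)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Return the cosine similarity (0.0 to 1.0) of two texts' term-count vectors."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))

    # Only terms present in both documents contribute to the dot product.
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[term] * counts_b[term] for term in shared_terms)

    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty document has no direction, so treat it as fully dissimilar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    score = cosine_similarity("Power grid outage report", "Report on the power grid")
    print(f"{score * 100:.1f}% similar")
```

    Multiplying the score by 100 gives the percentage-style figure the abstract mentions. A score near 100% would flag a document as a likely duplicate, although the threshold to use is a design choice the abstract does not state.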

    Improving Inventory for a Large Food Service Supplier

    Golden State Foods (GSF) is an international food service supplier. Its Georgia manufacturing plant currently produces thousands of condiments for restaurants across the United States. This project analyzed the inventory policies for seasonal ingredients in order to decrease GSF’s working capital and inventory holding costs. By feeding product recipes, ingredient usage, and weekly inventory data into a dynamic lot-sizing model, we predict optimal order quantities and reorder points for GSF’s seasonal and expensive ingredients. This could decrease the facility’s $1.2 million holding cost and increase its long-term profits compared to the status quo.
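
    The abstract does not say which dynamic lot-sizing formulation was used. As one possible illustration, the sketch below applies the classic Wagner-Whitin dynamic program to a single ingredient. The function name `wagner_whitin`, the fixed per-order setup cost, the per-unit per-period holding cost, and the demand figures are all hypothetical and are not GSF data. Reorder points are not covered.

```python
def wagner_whitin(demand, setup_cost, holding_cost):
    """Return (minimum total cost, order quantity per period) for known demand.

    demand: units required in each period (no backorders allowed).
    setup_cost: fixed cost charged each time an order is placed.
    holding_cost: cost to carry one unit for one period.
    """
    n = len(demand)
    # best[t] is the cheapest way to satisfy demand for periods 0..t-1.
    best = [0.0] + [float("inf")] * n
    last_order = [0] * (n + 1)

    for t in range(1, n + 1):
        for j in range(t):
            # An order placed in period j covers demand for periods j..t-1;
            # units needed in period k are held for (k - j) periods.
            carrying = sum(holding_cost * (k - j) * demand[k] for k in range(j, t))
            cost = best[j] + setup_cost + carrying
            if cost < best[t]:
                best[t] = cost
                last_order[t] = j

    # Walk backwards through the recorded decisions to recover the order plan.
    orders = [0] * n
    t = n
    while t > 0:
        j = last_order[t]
        orders[j] = sum(demand[j:t])
        t = j
    return best[n], orders


if __name__ == "__main__":
    total_cost, plan = wagner_whitin([10, 20, 30], setup_cost=50, holding_cost=1)
    print(total_cost, plan)
```

    With these illustrative inputs the plan batches the first two periods into one order and places a separate order for the third, because carrying 30 units for two periods would cost more than a second setup.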

    The neurobiology and control of anxious states
