6 research outputs found

    Recommending Recommendations to Support the Defense Acquisition Workforce

    Excerpt from the Proceedings of the Nineteenth Annual Acquisition Research Symposium. This paper presents the preliminary results of a research study to support the Defense Acquisition Workforce with a Natural Language Processing (NLP)/Machine Learning (ML) prototype of a system that determines the most relevant recommendations that stakeholders are providing to the Defense Acquisition community. The problem addressed by the research study is in the realm of NLP and ML, and it belongs to the popular category of “recommendation systems.” Unlike the majority of cases in this category, though, this task does not focus on numerical data representing behaviors (as in shopping recommendations), but on extracting user-specific relevance from text and “recommending” a document or part of it. Identifying the important pieces of these texts requires subjective text analysis. The method used for the analysis is the “room theory framework” by Lipizzi et al. (2021), which applies the Framework Theory by Marvin Minsky (1974) through the use of text vectorization. This framework has three main components: a vectorized corpus representing the knowledge base of the specific domain (the “room”), a set of keywords or phrases defining the specific points of interest for the recommendation (the “benchmarks”), and the documents to be analyzed. The documents are vectorized using the “room” and compared to the “benchmarks.” The sentences/paragraphs within a given document that are most similar to the benchmarks, and thus presumably the most important parts of the document, are highlighted. This enables the DAU reviewers to submit a document, run the program, and clearly see which recommendations will be the most useful. Approved for public release; distribution is unlimited.
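
The comparison step described in this abstract (vectorize each sentence, then rank it by similarity to the benchmarks) can be sketched roughly as follows. This is not the authors' implementation: the abstract describes a vectorized domain corpus as the "room", whereas this sketch swaps in plain term-count vectors and cosine similarity as an assumption. The function names, benchmark phrases, and sentences are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(sentences, benchmarks):
    """Return (score, sentence) pairs, most similar to the benchmarks first."""
    # Assumption: all benchmark phrases are pooled into a single vector.
    benchmark_vec = Counter(t for phrase in benchmarks for t in tokenize(phrase))
    scored = [(cosine(Counter(tokenize(s)), benchmark_vec), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical benchmarks and document sentences, for illustration only.
benchmarks = ["contract award", "schedule delay"]
sentences = [
    "The committee met on Tuesday.",
    "We recommend shortening the contract award timeline to limit schedule delay.",
    "Lunch was provided.",
]
ranked = rank_sentences(sentences, benchmarks)
top_score, top_sentence = ranked[0]
```

Sentences that share no terms with the benchmarks score zero here. An embedding-based "room" would presumably also credit related but non-identical words, which this count-based stand-in cannot do.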

    Text Summarizing and Clustering Using Data Mining Technique

    Text summarization is an important research topic in the field of information technology because of the large volume of text and data found on the Internet and social media. The task has gained importance because extracting knowledge in various fields requires highly efficient methods; hence the need for methods that summarize a single document or multiple documents. Summarization methods aim to capture the main content of a set of documents while reducing redundant information. In this paper, an efficient method for summarizing texts is proposed that relies on a word association algorithm to separate and merge sentences after summarizing them. It also uses data mining to redistribute information according to the K-means algorithm, and Term Frequency-Inverse Document Frequency (TF-IDF) to measure the properties of the summarized texts. The experimental results show that deleting unimportant words yields good summarization ratios. In addition, the feature extraction method was useful for grouping similar texts into clusters, which makes it possible to combine this method with other artificial intelligence methods, such as fuzzy logic or evolutionary algorithms, to increase summarization rates and accelerate clustering operations.
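
The clustering stage mentioned in this abstract (TF-IDF features grouped with K-means) can be illustrated with a minimal, dependency-free sketch. It does not reproduce the paper's pipeline: the L2 normalization, the fixed seed documents, and the example texts are assumptions made here for illustration, and the helper names are hypothetical.

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs):
    """Build L2-normalized TF-IDF vectors over a shared, sorted vocabulary."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vec = [counts[t] * math.log(len(docs) / doc_freq[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by 0
        vectors.append([x / norm for x in vec])
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, seed_indices, iterations=20):
    """Plain k-means; centroids start at the vectors named by seed_indices."""
    centroids = [list(vectors[i]) for i in seed_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical documents, for illustration only.
docs = [
    "stock market prices fell sharply",
    "market prices and stock trading rose",
    "the football team won the match",
    "football match ended in a team victory",
]
labels = kmeans(tfidf_vectors(docs), seed_indices=[0, 2])
```

Seeding the centroids from fixed documents keeps the sketch deterministic. Practical implementations usually pick the initial centroids at random or with a k-means++ style scheme and rerun several times.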

    MULTI-DOCUMENT SUMMARIZATION USING A COMBINATION OF FEATURES BASED ON CENTROID AND KEYWORD

    Summarizing text across multiple documents requires choosing important sentences, which is more complex than in a single document because the documents contain differing information, leading to contradictions and redundancy. Important sentences can be selected by scoring sentences according to how well they capture the main information. Several features are combined in the scoring process so that sentences with high scores become candidates for the summary. The centroid approach provides an advantage in obtaining key information. However, the centroid approach is still limited to information close to the center point. Adding positional features provides more information on the importance of a sentence, but positional features only focus on the main position. Therefore, the researchers use the keyword feature as a research contribution that provides additional information on important words, in the form of N-grams, in a document. In this study, the centroid, position, and keyword features were combined in a scoring process that improves performance on multi-document news data and reviews. The test results show that adding keyword features produces the highest values: on the DUC2004 news data, ROUGE-1 of 35.44, ROUGE-2 of 7.64, ROUGE-L of 37.02, and BERTScore of 84.22; on the Amazon review data, ROUGE-1 of 32.24, ROUGE-2 of 6.14, ROUGE-L of 34.77, and BERTScore of 85.75. The ROUGE and BERTScore values outperform the other unsupervised models.
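
For readers unfamiliar with the ROUGE-1 metric reported in this abstract, the following is a simplified sketch of how unigram overlap between a candidate summary and a reference summary is scored. It assumes whitespace tokenization with no stemming or stopword removal, so it will not reproduce the figures of an official ROUGE toolkit. The function name and example sentences are hypothetical.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word,
    # so a word repeated in the candidate is credited at most as many
    # times as it appears in the reference.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical candidate and reference summaries, for illustration only.
precision, recall, f1 = rouge_1(
    "the cat sat on the mat",
    "the cat lay on the mat",
)
```

In this example five of the six candidate unigrams also occur in the reference, so precision, recall, and F1 all equal 5/6. ROUGE-2 applies the same idea to bigrams, and ROUGE-L uses the longest common subsequence instead.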

    Automatic Extraction of Useful Information from Food-Health Articles related to Diabetes, Cardiovascular Disease and Cancer

    Food-health articles (FHA) contain invaluable information for health promotion. However, extracting this information manually is challenging because of the length and number of articles published yearly. Automatic text summarization efficiently identifies useful information across large bodies of text, which in turn speeds up the delivery of useful information from FHA. This research work investigates the performance of statistical-based summarization and graphical-based unsupervised learning summarization in extracting useful information from FHA related to diabetes, cardiovascular disease, and cancer. Various combinations of the introduction, result, and conclusion sections of three hundred articles were collected, preprocessed, and used to evaluate the performance of the two types of summarization technique. Generated summaries are compared to the original abstracts using two measures. The first quantifies the similarity of the generated summary to the abstract. The second gauges how well the generated summary and the article abstract each cover the article sections. Overall, this experiment showed that the automatically generated summaries are not comparable to the human-written abstracts found in FHA and that there is room for improvement: the highest similarity between the generated and the written abstracts was 52-57%, and the sentence scoring used for summarization could be optimized for various domains.

    Volume II Acquisition Research Creating Synergy for Informed Change, Thursday 19th Annual Acquisition Research Proceedings

    Proceedings. Approved for public release; distribution is unlimited.