76 research outputs found

    Using NMF for analyzing war logs

    Get PDF
    We investigate a semi-automated identification of technical problems that occur in armed forces' weapon systems during war missions. The proposed methodology is based on a semantic analysis of textual information in reports from soldiers (war logs). Latent semantic indexing (LSI) with non-negative matrix factorization (NMF), a technique from multivariate analysis and linear algebra, is used to extract hidden semantic textual patterns from the reports. NMF factorizes the term-by-war-log matrix, which consists of weighted term frequencies, into two non-negative matrices. This enables a natural parts-based representation of the report information and leads to an easy evaluation by human experts, because the human brain also uses parts-based representations. The identified technical problems are a valuable source of information for improved research and technology planning. A case study extracts technical problems from military logs of the Afghanistan war. Results are compared to a manual analysis written by journalists of 'Der Spiegel'.
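
    To make the factorization step concrete, here is a minimal Python sketch assuming scikit-learn; the log snippets and variable names are invented for illustration and are not from the study. Note that scikit-learn works on a log-by-term matrix, i.e. the transpose of the term-by-war-log matrix described above:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import NMF

        logs = [  # hypothetical war-log snippets, illustration only
            "radio failed during patrol, battery drained",
            "vehicle engine overheated on convoy",
            "radio battery empty after two hours",
        ]
        vec = TfidfVectorizer(stop_words="english")
        X = vec.fit_transform(logs)              # weighted term frequencies (log-by-term)
        nmf = NMF(n_components=2, init="nndsvd", random_state=0)
        W = nmf.fit_transform(X)                 # log-by-pattern weights
        H = nmf.components_                      # pattern-by-term weights (parts-based)
        terms = vec.get_feature_names_out()
        for i, row in enumerate(H):              # top terms of each semantic pattern
            top = [terms[j] for j in row.argsort()[-3:][::-1]]
            print(f"pattern {i}: {top}")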

    Weak signal identification with semantic web mining

    Get PDF
    We investigate an automated identification of weak signals according to Ansoff to improve strategic planning and technological forecasting. The literature shows that weak signals can be found in an organization's environment and that they appear in different contexts. We use internet information to represent the organization's environment, and we select those websites that are related to a given hypothesis. In contrast to related research, a methodology is provided that uses latent semantic indexing (LSI) for the identification of weak signals. This improves existing knowledge-based approaches because LSI considers aspects of meaning and is thus able to identify similar textual patterns in different contexts. A new weak signal maximization approach is introduced that replaces the commonly used prediction modeling approach in LSI. It enables the calculation of the largest number of relevant weak signals, each represented by a singular value decomposition (SVD) dimension. A case study identifies and analyses weak signals to predict trends in the field of on-site medical oxygen production. This supports the planning of research and development (R&D) for a medical oxygen supplier. As a result, it is shown that the proposed methodology enables organizations to identify weak signals from the internet for a given hypothesis. This helps strategic planners to react ahead of time.
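
    A minimal sketch of the LSI step, assuming scikit-learn; the website snippets are invented, and inspecting the strongest terms per SVD dimension stands in here for the paper's weak signal maximization:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import TruncatedSVD

        pages = [  # hypothetical website texts, illustration only
            "portable oxygen concentrator for home care",
            "new membrane materials separate oxygen on site",
            "hospital logistics for bottled gas deliveries",
            "startup tests electrochemical oxygen generation",
        ]
        vec = TfidfVectorizer(stop_words="english")
        X = vec.fit_transform(pages)
        svd = TruncatedSVD(n_components=2, random_state=0)
        D = svd.fit_transform(X)                 # pages projected into latent space
        terms = vec.get_feature_names_out()
        for i, comp in enumerate(svd.components_):
            top = [terms[j] for j in comp.argsort()[-3:][::-1]]
            print(f"SVD dimension {i}: {top}")   # candidate weak-signal patterns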

    Using web crawling of publicly available websites to assess e-commerce relationships

    Get PDF
    We investigate e-commerce success factors concerning their impact on the success of commerce transactions between companies. In the scientific literature, many e-commerce success factors are introduced. Most of them focus on the quality of companies' websites. They are evaluated with respect to companies' success in the business-to-consumer (B2C) environment, where consumers choose their preferred e-commerce websites based on success factors such as website content quality, website interaction, and website customization. In contrast to previous work, this research focuses on using existing e-commerce success factors to predict the success of business-to-business (B2B) e-commerce. The introduced methodology is based on the identification of semantic textual patterns representing success factors on the websites of B2B companies. The impact of the identified success factors on B2B e-commerce success is evaluated by regression modeling. As a result, it is shown that some B2C e-commerce success factors also enable the prediction of B2B e-commerce success while others do not. This contributes to the existing literature on e-commerce success factors. Further, these findings are valuable for the creation of B2B e-commerce websites.
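
    A minimal sketch of the regression step, with invented factor scores and success labels; logistic regression is one plausible model choice here, not necessarily the one used in the paper:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # rows: companies; columns: strength of three success factors
        # (content quality, interaction, customization) -- all values invented
        X = np.array([[0.9, 0.7, 0.2],
                      [0.3, 0.4, 0.8],
                      [0.8, 0.9, 0.6],
                      [0.2, 0.1, 0.3]])
        y = np.array([1, 0, 1, 0])               # 1 = successful B2B transactions
        model = LogisticRegression().fit(X, y)
        print(model.coef_)                       # sign/size hints at each factor's impact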

    Essays on text mining for improved decision making

    Get PDF

    Extracting consumers' needs for new products: a web mining approach

    Get PDF
    We introduce a web mining approach for automatically identifying new product ideas in web logs. A web log, also known as a blog, is a website that provides commentary, news, and further information on a subject, written by individual persons. A large number of web logs can be found for nearly every topic, and in them consumers present their needs for new products. These new product ideas are potentially valuable for producers as well as for researchers and developers, because they can lead to a new product development process. Finding such new product ideas is a well-known task in marketing. With this automatic approach, we therefore support marketing activities by extracting new and useful product ideas from textual information in web logs. The approach is implemented as a web-based application named Product Idea Web Log Miner, in which users from the marketing department provide descriptions of existing products. As a result, new product ideas are extracted from the web logs and presented to the users.
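
    One plausible building block, sketched below, matches an existing product description against blog texts by TF-IDF cosine similarity to surface candidate need statements; the texts are invented and this is not the Product Idea Web Log Miner implementation:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        product = "portable espresso maker for travel"
        posts = [  # hypothetical blog sentences
            "i wish my travel espresso maker had a usb-rechargeable heater",
            "great hiking trip last weekend in the alps",
            "someone should build a collapsible milk frother for camping",
        ]
        vec = TfidfVectorizer(stop_words="english")
        M = vec.fit_transform([product] + posts)
        scores = cosine_similarity(M[0], M[1:]).ravel()
        for s, p in sorted(zip(scores, posts), reverse=True):
            print(f"{s:.2f}  {p}")               # high-scoring posts = candidate ideas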

    Technology classification with latent semantic indexing

    Get PDF
    Many national and international governments establish organizations for funding applied science research. For this, several organizations have defined procedures for identifying relevant projects based on prioritized technologies. Even for applied science research projects that combine several technologies, it is difficult to identify all corresponding technologies of all research-funding organizations. In this paper, we present an approach to support researchers and research-funding planners by classifying applied science research projects according to the corresponding technologies of research-funding organizations. In contrast to related work, this problem is solved by considering results from the literature concerning application-based technological relationships and by creating a new approach based on latent semantic indexing (LSI) as a semantic text classification algorithm. Technologies that occur together in the process of creating an application are grouped into classes, semantic textual patterns are identified as representative of each class, and projects are assigned to one of these classes. This enables the assignment of each project to all technologies semantically grouped by use of LSI. The approach is evaluated using the example of defense- and security-based technological research, because the growing importance of this application field leads to an increasing number of research projects and to the appearance of many new technologies.
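
    A hedged sketch of LSI-based text classification, assuming scikit-learn; the project descriptions, class names, and the nearest-neighbor decision step are invented for illustration:

        from sklearn.pipeline import make_pipeline
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import TruncatedSVD
        from sklearn.neighbors import KNeighborsClassifier

        train_texts = [  # hypothetical project descriptions
            "radar signal processing for airborne surveillance",
            "antenna array design for radar systems",
            "encryption protocols for secure field communication",
            "key exchange for tactical radio networks",
        ]
        train_labels = ["sensing", "sensing", "crypto", "crypto"]
        clf = make_pipeline(TfidfVectorizer(),
                            TruncatedSVD(n_components=2, random_state=0),
                            KNeighborsClassifier(n_neighbors=1))
        clf.fit(train_texts, train_labels)
        # classify a new project by its nearest neighbor in LSI space
        print(clf.predict(["secure key distribution for radios"]))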

    A Sentence Meaning Based Alignment Method for Parallel Text Corpora Preparation

    Full text link
    Text alignment is crucial to the accuracy of Machine Translation (MT) systems, some NLP tools, and any other text processing task requiring bilingual data. This research proposes a language-independent sentence alignment approach based on experiments with Polish (a language that is not position-sensitive) and English. The alignment approach was developed on the TED Talks corpus, but can be used for any text domain or language pair. The proposed approach implements various heuristics for sentence recognition; some of them exploit synonyms and semantic text structure analysis as additional information. Minimization of data loss was ensured. The solution is compared to other sentence alignment implementations. An improvement in MT system score is also shown for text processed with the described tool.
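
    The paper's full heuristic set is not reproduced here, but a toy sketch of one common ingredient, a length-ratio filter over candidate 1-to-1 sentence pairs, illustrates the general idea; the sentence pairs and the threshold are invented:

        def align(src, tgt, max_ratio=1.5):
            """Greedy 1-1 alignment keeping pairs with a plausible length ratio."""
            pairs = []
            for s, t in zip(src, tgt):
                ratio = max(len(s), len(t)) / max(1, min(len(s), len(t)))
                if ratio <= max_ratio:           # plausible translation pair
                    pairs.append((s, t))         # keep; drop the rest to limit noise
            return pairs

        pl = ["To jest krótkie zdanie.",
              "Bardzo długie zdanie o wielu rzeczach naraz."]
        en = ["This is a short sentence.",
              "A very long sentence about many things at once."]
        print(align(pl, en))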

    Multi-agent knowledge integration mechanism using particle swarm optimization

    Get PDF
    Unstructured group decision-making is burdened with several central difficulties: unifying the knowledge of multiple experts in an unbiased manner, and computational inefficiencies. In addition, a proper means of storing such unified knowledge for later use has not yet been established. Storage difficulties stem from the integration of the logic underlying multiple experts' decision-making processes and the structured quantification of the impact of each opinion on the final product. To address these difficulties, this paper proposes a novel approach called the multiple agent-based knowledge integration mechanism (MAKIM), in which a fuzzy cognitive map (FCM) is used as a knowledge representation and storage vehicle. In this approach, we use particle swarm optimization (PSO) to adjust causal relationships and causality coefficients from the perspective of global optimization. Once an optimized FCM is constructed, an agent-based model (ABM) is applied to the inference of the FCM to solve a real-world problem. The final aggregate knowledge is stored in FCM form and is used to produce proper inference results for other target problems. To test the validity of our approach, we applied MAKIM to a real-world group decision-making problem, an IT project risk assessment, and found MAKIM to be statistically robust.
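
    A minimal sketch of PSO tuning an FCM's causality weights, with an invented three-concept map, invented target activations, and a generic PSO update; it illustrates the technique, not the paper's actual MAKIM implementation:

        import numpy as np

        rng = np.random.default_rng(0)
        n = 3                                    # concepts in the toy FCM

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def fcm_step(state, W):                  # one FCM inference step
            return sigmoid(W @ state)

        init = np.array([0.9, 0.1, 0.5])         # invented initial activations
        target = np.array([0.8, 0.6, 0.3])       # invented expert consensus

        def fitness(w_flat):                     # squared error to the target
            return np.sum((fcm_step(init, w_flat.reshape(n, n)) - target) ** 2)

        # minimal PSO over the flattened causality-weight matrix
        pos = rng.uniform(-1, 1, (20, n * n))
        vel = np.zeros_like(pos)
        pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
        gbest = pbest[pbest_f.argmin()].copy()
        for _ in range(200):
            r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
            vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
            pos = np.clip(pos + vel, -1.0, 1.0)
            f = np.array([fitness(p) for p in pos])
            better = f < pbest_f
            pbest[better], pbest_f[better] = pos[better], f[better]
            gbest = pbest[pbest_f.argmin()].copy()
        print(pbest_f.min())                     # near 0: weights reproduce consensus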

    Improved emergency management by a loosely coupled logistic system

    Get PDF
    We investigate a robust and intelligent logistic system for emergency management in which existing commercial logistic systems are loosely coupled with the logistic systems of emergency management organizations and armed forces. This system is used to supply the population in case of a disaster, where environmental conditions have a high impact on logistics. Two properties are particularly important: robustness, the ability of a logistic system to remain effective under these conditions, and intelligent behavior, i.e. automated ad-hoc decisions in the face of unforeseen events. Scenario technique, roadmapping, and surveys are used as qualitative methodologies to identify current weaknesses in emergency management logistics and to forecast the future development of loosely coupled logistic systems. Text mining and web mining analyses are used as quantitative methodologies to improve the forecasting. As a result, options are proposed for governmental organizations and companies to enable such a loosely coupled logistic system within the next 20 years.
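
    As a small, invented illustration of how text mining counts could feed such a quantitative forecast, one could fit a growth trend to yearly term-mention counts; all numbers and the term below are made up:

        import numpy as np

        years = np.array([2019, 2020, 2021, 2022])
        mentions = np.array([3, 7, 12, 21])       # e.g. web hits for "loosely coupled logistics"
        slope, intercept = np.polyfit(years, np.log(mentions), 1)
        forecast_2025 = np.exp(intercept + slope * 2025)
        print(round(forecast_2025))               # extrapolated mention count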

    Using text summarizing to support planning of research and development

    No full text
    Some governmental organizations process a large number of research and development (R&D) projects simultaneously in their R&D program. For the planning of such an R&D program, decision makers can be supported by an overview that contains summaries of all currently running projects, because they are normally not experts in all of the R&D areas concerned. Creating such an overview manually is time consuming because the description of each project has to be summarized in a homogeneous form. Further, each project summary has to be updated frequently to reflect changes within the project. Based on results of comprehensibility research, we identify a specific structure for the project summaries to ensure comprehensibility for a decision maker and usefulness for R&D program planning. We introduce a new approach that enables a semi-automatic summarization of descriptions of R&D projects and creates a summary in accordance with the proposed structure. A case study shows that the time taken by the introduced approach is less than that of creating a summary manually. As a result, the proposed methodology supports decision makers in planning an R&D program.
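
    As a hedged sketch of what an extractive summarization step could look like, the following scores sentences by average word frequency and keeps the top ones in document order; this is a generic technique and not necessarily the paper's algorithm, and the project text is invented:

        import re
        from collections import Counter

        def summarize(text, k=2):
            """Score sentences by mean word frequency; keep the top k in order."""
            sents = re.split(r"(?<=[.!?])\s+", text.strip())
            freq = Counter(re.findall(r"\w+", text.lower()))
            def score(s):
                toks = re.findall(r"\w+", s.lower())
                return sum(freq[t] for t in toks) / max(1, len(toks))
            top = sorted(sorted(sents, key=score, reverse=True)[:k], key=sents.index)
            return " ".join(top)

        desc = ("The project develops a sensor. The sensor measures oxygen levels. "
                "Field tests are planned for next year. Oxygen data supports medical use.")
        print(summarize(desc))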