10 research outputs found

    Understanding Aviation English: Challenges and Opportunities in NLP Applications for Indian Languages

    English is understood, spoken and used by citizens of a diverse array of countries, and its speakers include both native and non-native speakers. Natural Language Processing (NLP) is a branch of computer science that deals with one of the most challenging tasks a machine can perform: processing natural languages. Natural languages, which have evolved over centuries, are complete, diverse and highly complex, and are therefore challenging for a computer system to understand and process. Machine Translation (MT) is a more specific part of NLP that translates one natural language into another, with English among the most researched and sought-after languages. Although research in NLP and MT has come a long way and many efficient translators are available, translation and other NLP applications in specialized domains such as aeronautics remain a challenge for NLP researchers and developers. NLP applications are often used in English language education, which is a continuous process for non-native speakers of English. Non-native English speakers use NLP tools such as e-dictionaries and MT applications to understand the English language better and learn it faster. Aviation English poses a challenge to MT systems, and understanding it as a whole requires specialized handling because it has its own phonetic pronunciations, terminology and out-of-vocabulary words. Dealing with Aviation English calls for collaboration among experts from Applied Linguistics, NLP and AI. As a result, it becomes a cross-disciplinary research area that covers situations demanding real-time use of proper language, e.g. ATC communications. This paper discusses the most recent research methodologies that deal with Aviation English and reviews the problems it poses. Because it is a specialized and structured form of English, these problems are faced by both native and non-native speakers of English. The discussion covers relevant and recent advances in methods for dealing with Aviation English challenges from both the human (ICAO/DGCA/AAI) and the NLP angle. Lastly, we look at how these challenges are linked to the scope for development of applied technologies. Research in experiential Aviation English situations deals with both English for Specific Purposes (ESP; aeronautics in our case) and English as a Foreign Language (EFL; the English-Indian language pair).

    Semantic analysis for improved multi-document summarization of text

    An excess of unstructured data is easily accessible in digital format. This information overload places too heavy a burden on society for its analysis and execution needs. Focused (i.e. topic, query, question, category, etc.) multi-document summarization is an information reduction solution whose state of the art has reached a point that demands exploring other techniques to model human summarization activity. Such techniques have been mainly extractive and rely on distribution and complex machine learning on corpora in order to perform closely to human summaries. Overall, these techniques are still being used, and the field now needs to move toward more abstractive approaches to model the human way of summarizing. A simple, inexpensive and domain-independent system architecture is created for adding semantic analysis to the summarization process. The proposed system is novel in its use of a new semantic analysis metric to better score sentences for selection into a summary. It also simplifies semantic processing of sentences to better capture more likely semantic-related information, reduce redundancy and reduce complexity. The system is evaluated against participants in the Document Understanding Conference and the later Text Analysis Conference using the ROUGE performance measures of n-gram recall between automated systems, human summaries and gold standard baseline summaries. The goal was to show that semantic analysis used for summarization can perform well, while remaining simple and inexpensive without significant loss of recall as compared to the foundational baseline system. Current results show improvement over the gold standard baseline when all factors of this work's semantic analysis technique are used in combination. These factors are the semantic cue words feature and semantic class weighting to determine sentences with important information. The semantic triples clustering, used to decompose natural language sentences to their most basic meaning and select the most important sentences, also added to this improvement. In competition against the gold standard baseline system on the standardized summarization evaluation metric ROUGE, this work outperforms the baseline system by more than ten position rankings. This work shows that semantic analysis and light-weight, open-domain techniques have potential. Ph.D., Information Studies -- Drexel University, 201
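
The ROUGE n-gram recall measure mentioned in this abstract can be illustrated with a minimal sketch. This is not the thesis's evaluation code or the official ROUGE toolkit; the function names `ngrams` and `rouge_n_recall`, the whitespace tokenization and the lowercasing are all assumptions made for illustration. It computes the clipped count of reference n-grams that also appear in the candidate summary, divided by the total number of reference n-grams.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """Illustrative ROUGE-N recall: matched reference n-grams / total reference n-grams.

    Tokenization is a plain lowercase whitespace split (an assumption;
    real evaluations typically add stemming and other preprocessing).
    """
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        # Clip each match by how often the n-gram occurs in the candidate.
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
    return matched / total if total else 0.0


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"
    human_summaries = ["the cat lay on the mat"]
    print(rouge_n_recall(system_summary, human_summaries, n=1))
    print(rouge_n_recall(system_summary, human_summaries, n=2))
```

Because the measure is recall-oriented, a longer candidate summary can only keep or raise its score, which is why such evaluations usually fix a summary length limit.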

    Acquiring information extraction patterns from unannotated corpora

    Information Extraction (IE) can be defined as the task of automatically extracting prespecified kinds of information from a text document. The extracted information is encoded in the required format and can then be used, for example, for text summarization or as an accurate index to retrieve new documents. The main issue when building IE systems is how to obtain the knowledge needed to identify relevant information in a document. Today, IE systems are commonly based on extraction rules or IE patterns to represent the kind of information to be extracted. Most approaches to IE pattern acquisition require expert human intervention in many steps of the acquisition process. This dissertation presents a novel method for acquiring IE patterns, Essence, that significantly reduces the need for human intervention. The method is based on ELA, a specifically designed learning algorithm for acquiring IE patterns from unannotated corpora. The distinctive features of Essence and ELA are that 1) they permit the automatic acquisition of IE patterns from unrestricted and untagged text representative of the domain, due to 2) their ability to identify regularities around semantically relevant concept-words for the IE task by 3) using non-domain-specific lexical knowledge tools such as WordNet and 4) restricting the human intervention to defining the task, and validating and typifying the set of IE patterns obtained. Since Essence does not require a corpus annotated with the type of information to be extracted and makes use of a general-purpose ontology and widely applied syntactic tools, it reduces the expert effort required to build an IE system and therefore also reduces the effort of porting the method to any domain. To validate Essence, we conducted a set of experiments to test the performance of the method. We used Essence to generate IE patterns for a MUC-like task. Nevertheless, the evaluation procedure for MUC competitions does not provide a sound evaluation of IE systems, especially of learning systems. For this reason, we conducted an exhaustive set of experiments to further test the abilities of Essence. The results of these experiments indicate that the proposed method is able to learn effective IE patterns.

    TRIZ Future Conference 2004

    TRIZ, the Theory of Inventive Problem Solving, is a living science and a practical methodology: millions of patents have been examined to look for principles of innovation and patterns of excellence. Large and small companies are using TRIZ to solve problems and to develop strategies for future technologies. The TRIZ Future Conference is the annual meeting of the European TRIZ Association, with contributions from everywhere in the world. The aims of the 2004 edition are the integration of TRIZ with other methodologies and the dissemination of systematic innovation practices even through SMEs: a broad spectrum of subjects in several fields debated with experts, practitioners and TRIZ newcomers.

    A Content Based Approach for Investigating the Role and Use of Email in Engineering Design Projects

    The use of email as a communication and information sharing medium in large, complex, globally distributed engineering projects is widespread; yet there exists little understanding of the content of the emails exchanged and the implications of this content on the design project, design records and contracts. The importance of these issues is underlined by the fact that email records can now be required as evidence in legal disputes. It follows that the overall aim of this research is to assess the role and use of email in engineering design projects. A state-of-the-art review of literature pertaining to email is reported, along with a review of information and communication processes in engineering design projects. The primary contribution of this thesis is the creation of a content based approach for analysing the role and use of email in engineering design projects. This centres on the development and application of a coding scheme to email text, identifying what subject matter an email relates to, why it was sent, and how its content is expressed. Results are then analysed with respect to the frequencies of each code and other variables, including how coding varies between different senders and throughout the project duration. The second key contribution of this thesis is the analysis of emails and content in an engineering setting by applying the aforementioned approach to two case studies. The major case study concerned a large, complex, globally distributed, multimillion pound systems engineering project, from which 16 000 emails were obtained. It was found that emails are mainly used to transfer information but also to support management functions. Emails facilitate design work but little of this takes place explicitly in the email content. Characteristics of a project affect the subject matter of emails but have little effect on why they are sent. User roles and personal preferences also influence email use. It was found that the purposes for sending emails varied over the duration of a project; it was further determined that these changes could be used to identify project progress and design activity. Implications of the findings are identified in relation to: information management, knowledge management, project management, collaboration and email practice. Significantly, emails do contain potentially important design information and, because these often support decisions made elsewhere, emails should be integrated with wider records. More consideration and training should be given to the use of project standards for email use and guidelines for composition. Changes in email use over the project duration could be a potential tool for project managers to identify design progress and possible issues in a project. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Research reports: 1985 NASA/ASEE Summer Faculty Fellowship Program

    A compilation of 40 technical reports on research conducted by participants in the 1985 NASA/ASEE Summer Faculty Fellowship Program at Marshall Space Flight Center (MSFC) is given. Weibull density functions, reliability analysis, directional solidification, space stations, jet stream, fracture mechanics, composite materials, orbital maneuvering vehicles, stellar winds and gamma ray bursts are among the topics discussed

    Aeronautical engineering: A continuing bibliography with indexes (supplement 322)

    This bibliography lists 719 reports, articles, and other documents introduced into the NASA scientific and technical information system in Oct. 1995. Subject coverage includes: design, construction and testing of aircraft and aircraft engines; aircraft components, equipment, and systems; ground support systems; and theoretical and applied aspects of aerodynamics and general fluid dynamics