58 research outputs found
SimpleNLG : a realisation engine for practical applications
This paper describes SimpleNLG, a realisation engine for English which aims to provide simple and robust interfaces to generate syntactic structures and linearise them. The library is also flexible in allowing the use of mixed (canned and non-canned) representations. Peer-reviewed
Linguistic Database of a System for Automatic Generation of English-Language Advertising Text (ЛИНГВИСТИЧЕСКАЯ БАЗА ДАННЫХ СИСТЕМЫ АВТОМАТИЧЕСКОГО ПОРОЖДЕНИЯ АНГЛОЯЗЫЧНОГО РЕКЛАМНОГО ТЕКСТА)
The article deals with the linguistic database for a system that automatically generates English advertising texts on cosmetics and perfumery. The database for such a system includes two main blocks: an automatic dictionary (which contains semantic and morphological information for each word) and semantic-syntactical formulas of the texts in a special formal language, SEMSINT. The database is built on the results of an analysis of 30 English advertising texts on cosmetics and perfumery. First, each word was given a code: for example, N stands for nouns, A for adjectives, V for verbs, etc. Then the entire lexicon of the analyzed texts was divided into semantic categories, and each word was given a semantic code according to this classification. For example, the record N01 attributed to the word "lip" in the dictionary means that this word belongs to the nouns of the semantic category "part of a human's body". The second block of the database includes the semantic-syntactical formulas of the analyzed advertising texts written in SEMSINT. The author gives a brief description of this language, presenting its essence and structure. An example of one formalized advertising text in SEMSINT is also provided. The aim of this work is to develop the linguistic support for an automatic system that generates English-language advertising texts on cosmetics and perfumery, and subsequently to implement it as a computer program. The system is being designed on the principle of linguistically motivated technologies, which requires a broad range of linguistic knowledge about the structure and content of the generated text (databases, semantic and formal languages). The linguistic database of the system comprises the following components: an automatic dictionary of lexical units with semantic and morphological information, and semantic-syntactical formulas of the texts in the formal language SEMSINT.
The paper examines each component of this database. The dictionary of lexical units is built from an analysis of thirty original English-language advertising texts on cosmetics and perfumery belonging to three subject areas (lipstick, mascara, shampoo). An entry in the automatic dictionary has two zones: a zone of grammatical information and a zone of semantic information. The grammatical zone contains the part of speech of the lexical unit and its set of morphological features. The semantic zone contains the semantic feature of the lexical unit, i.e. its membership of a particular semantic subclass. To this end, all words of the texts under study were classified semantically and assigned corresponding codes. As an example, the paper presents the result of the semantic classification of the nouns in the advertising texts of the "lipstick" subject area. The second part of the database consists of the semantic-syntactical formulas of the texts in the formal language SEMSINT. The paper describes the components of SEMSINT and discusses its essence and the rules for its use. An example of a semantic-syntactical formula of a text created with this formal language is presented
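The dictionary coding described in this abstract can be illustrated with a minimal Python sketch. Only the letters N, A and V, the record N01, and its gloss for "lip" come from the abstract; the two-digit code format, the lookup tables, and the names `parse_code` and `DictionaryCode` are assumptions made for illustration, not the author's actual data model.

```python
import re
from dataclasses import dataclass

# Part-of-speech letters named in the abstract (the full inventory is not given).
POS_CODES = {"N": "noun", "A": "adjective", "V": "verb"}

# Only N01 is attested in the abstract; other categories are unknown here.
SEMANTIC_CATEGORIES = {("N", "01"): "part of a human's body"}

# Hypothetical dictionary: headword -> code, with the single attested example.
LEXICON = {"lip": "N01"}


@dataclass(frozen=True)
class DictionaryCode:
    pos: str       # part of speech, e.g. "noun"
    category: str  # semantic category, or "unknown" if not listed


def parse_code(code: str) -> DictionaryCode:
    """Split a code such as 'N01' into part of speech and semantic category.

    Assumes one capital letter followed by two digits (an assumption
    generalised from the single example N01).
    """
    match = re.fullmatch(r"([A-Z])(\d{2})", code)
    if match is None:
        raise ValueError(f"unrecognised dictionary code: {code!r}")
    letter, number = match.groups()
    if letter not in POS_CODES:
        raise ValueError(f"unknown part-of-speech letter: {letter!r}")
    category = SEMANTIC_CATEGORIES.get((letter, number), "unknown")
    return DictionaryCode(POS_CODES[letter], category)
```

Looking up `parse_code(LEXICON["lip"])` yields the noun reading with the body-part category; codes outside the attested tables fall back to an "unknown" category rather than guessing one.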
Machine learning research 1989-90
Multifunctional knowledge bases offer a significant advance in artificial intelligence because they can support numerous expert tasks within a domain. As a result they amortize the costs of building a knowledge base over multiple expert systems and they reduce the brittleness of each system. Due to the inevitable size and complexity of multifunctional knowledge bases, their construction and maintenance require knowledge engineering and acquisition tools that can automatically identify interactions between new and existing knowledge. Furthermore, their use requires software for accessing those portions of the knowledge base that coherently answer questions. Considerable progress was made in developing software for building and accessing multifunctional knowledge bases. A language was developed for representing knowledge, along with software tools for editing and displaying knowledge, a machine learning program for integrating new information into existing knowledge, and a question answering system for accessing the knowledge base
Automatic Generation of French and English Stock Market Reports (Génération automatique de rapports boursiers français et anglais)
It has become possible over the last few years to create automatic systems that generate, from a semantic representation, linguistically well-formed texts in a well-defined technical sublanguage. Such a system exists for the sublanguage of stock market reports: from New York Stock Exchange data, it automatically produces stock market summaries in English and in French. This article presents this English and French automatic text generation system and briefly describes the particularities of the stock market sublanguage
Writing Without Audiences: A Comprehensive Survey of State-Mandated Standards and Assessments
Writing studies professionals agree that students must learn to write for specific audiences. Despite this professional consensus, there is reason to believe that this skill is not widely tested in state-mandated writing assessments. In this study, we survey the state content standards for English Language Arts and the state-mandated writing tests for high school students in all 50 states and the District of Columbia. While all states have adopted standards that require students to write for specific audiences, only a small percentage test this skill on state-mandated assessments. We argue that the consequences of this misalignment between standards and assessment are potentially severe. Since teachers often narrow the curriculum to content that appears on state tests, it could be that pre-service writing teachers will encounter an educational environment in which students are not taught how to adapt their writing to specific audiences
The Challenge of Spoken Language Systems: Research Directions for the Nineties
A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for development of shared corpora and related resources, for computational support and for rapid communication among researchers. The successful development of this technology will increase accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area
GenLeNa: A System for Building Natural Language Generation Applications (GenLeNa: Sistema para la construcción de Aplicaciones de Generación de Lenguaje Natural)
This article proposes dividing the process of building natural language generation (NLG) systems into two stages: content planning (CP), which depends on the domain of the application being developed, and document structuring (DS). This division allows people who are not NLG experts to develop natural language generation systems by concentrating on building abstract representations of the information to be communicated (called messages). A specific architecture for the DS stage is also presented, which enables NLG researchers to work orthogonally on specific techniques and methodologies for converting messages into grammatically and syntactically correct text
Media multitasking in adolescence
Media use has been on the rise in adolescents overall, and in particular, the amount of media multitasking (multiple media consumed simultaneously, such as having a text message conversation while watching TV) has been increasing. In adults, heavy media multitasking has been linked with poorer performance on a number of laboratory measures of cognition, but no relationship has yet been established between media-multitasking behavior and real-world outcomes. Examining individual differences across a group of adolescents, we found that more frequent media multitasking in daily life was associated with poorer performance on statewide standardized achievement tests of math and English in the classroom, poorer performance on behavioral measures of executive function (working memory capacity) in the laboratory, and traits of greater impulsivity and lesser growth mindset. Greater media multitasking had a relatively circumscribed set of associations, and was not related to behavioral measures of cognitive processing speed, implicit learning, or manual dexterity, or to traits of grit and conscientiousness. Thus, individual differences in adolescent media multitasking were related to specific differences in executive function and in performance on real-world academic achievement measures: More media multitasking was associated with poorer executive function ability, worse academic achievement, and a reduced growth mindset. Bill & Melinda Gates Foundation