Text Segmentation Using Exponential Models

Author: Beeferman Doug
Berger Adam
Lafferty John
Publication date: 01/01/1997

This paper introduces a new statistical approach to partitioning text automatically into coherent segments. Our approach enlists both short-range and long-range language models to help it sniff out likely sites of topic changes in text. To aid its search, the system consults a set of simple lexical hints it has learned to associate with the presence of boundaries through inspection of a large corpus of annotated data. We also propose a new probabilistically motivated error metric for use by the natural language processing and information retrieval communities, intended to supersede precision and recall for appraising segmentation algorithms. Qualitative assessment of our algorithm as well as evaluation using this new metric demonstrate the effectiveness of our approach in two very different domains, Wall Street Journal articles and the TDT Corpus, a collection of newswire articles and broadcast news transcripts. Comment: 12 pages, LaTeX source and postscript figures for EMNLP-2 paper
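
The abstract above does not spell out the proposed error metric. As a rough illustration only, the sketch below implements a fixed-window variant that is commonly called P_k in the segmentation literature: it slides a window of width k over the text and counts how often the reference and the hypothesis disagree about whether the two ends of the window fall in the same segment. The function names, the input encoding (one segment id per sentence) and the default choice of k are assumptions made for this example, and the formulation in the paper itself may differ.

```python
def window_size(reference):
    """Half the mean reference segment length, rounded, and at least 1."""
    boundaries = sum(1 for a, b in zip(reference, reference[1:]) if a != b)
    return max(1, round(len(reference) / (2 * (boundaries + 1))))


def pk(reference, hypothesis, k=None):
    """Fraction of windows on which the two segmentations disagree.

    reference, hypothesis: sequences giving a segment id for each unit
    (for example, each sentence). Lower is better; 0.0 means that every
    window is judged the same way by both segmentations.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same units")
    if k is None:
        k = window_size(reference)
    n = len(reference)
    if n <= k:
        raise ValueError("text is too short for the chosen window")
    disagreements = sum(
        (reference[i] == reference[i + k]) != (hypothesis[i] == hypothesis[i + k])
        for i in range(n - k)
    )
    return disagreements / (n - k)


if __name__ == "__main__":
    gold = [0, 0, 0, 1, 1, 1]        # one boundary after the third unit
    no_boundaries = [0] * 6          # hypothesis that never segments
    print(pk(gold, gold, k=2))
    print(pk(gold, no_boundaries, k=2))
```

Unlike precision and recall on exact boundary positions, a window-based score of this kind gives partial credit to a boundary that is placed close to the true one.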

arXiv.org e-Print Archive

CiteSeerX

Wireless audio networking modifying the IEEE 802.11 standard to handle multi-channel real-time wireless audio networks

Author: Chousidis Christos
Publication date: 01/01/2014

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Audio networking is a rapidly growing field which introduces exciting new possibilities for the professional audio industry. When well established, it will drastically change the way live sound systems are designed, built and used. Today's networks have enough bandwidth to transfer hundreds of high-quality audio channels, replacing analogue cables and the intricate installations of conventional analogue audio systems. Currently there are many systems on the market that distribute audio over networks for live music and studio applications, but this technology is not yet widespread. The main reasons audio networks are not as popular as expected are the lack of interoperability between different vendors and the continued need for a wired network infrastructure. Therefore, the development of a wireless digital audio networking system based on existing, widespread wireless technology is a major research challenge. However, the IEEE 802.11 standard, which is the primary wireless networking technology today, appears unable to handle this type of application despite the large bandwidth available. Apart from the well-known drawbacks of interference and security, encountered in all wireless data transmission systems, the way that IEEE 802.11 arbitrates wireless channel access causes a significantly high collision rate, low throughput and long overall delay. The aim of this research was to identify what prevents this technology from supporting real-time wireless audio networks and to propose possible solutions. Initially the standard was tested thoroughly using a data traffic model which emulates a multi-channel real-time audio environment. Broadcasting was found to be the optimal communication method for satisfying live audio's intolerance of delay.
The results were analysed and the drawback was traced to an inherent weakness of the IEEE 802.11 standard in managing broadcasts from multiple sources in the same network. To resolve this, a series of modifications was proposed for the Medium Access Control algorithm of the standard. First, the extended use of the "CTS-to-Self" control message was introduced to act as a protection mechanism in broadcasting, similar to the RTS/CTS protection mechanism already used in unicast transmission. Then, an alternative "random backoff" method was proposed, taking into account the characteristics of live audio wireless networks. For this method a novel "Exclusive Backoff Number Allocation" (EBNA) algorithm was designed, aiming to minimize collisions. The results showed that significant improvement in throughput can be achieved using the above modifications, but further improvement in delay was needed to reach the internationally accepted standards for real-time audio delivery. Thus, a traffic-adaptive version of the EBNA algorithm was designed. This algorithm monitors the traffic in the network, calculates the probability of collision and accordingly switches between classic IEEE 802.11 MAC and EBNA, which is applied only between active stations rather than to all stations in the network. All amendments were designed to operate as an alternative mode of the existing technology rather than as an independent proprietary system. For this reason interoperability with classic IEEE 802.11 was also tested and analysed in the last part of this research. The results showed that the IEEE 802.11 standard, suitably modified, is able to support multiple broadcasting transmissions and can therefore be the platform upon which future wireless audio networks will be developed.
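
To give a feel for why random backoff struggles when many stations broadcast at once, the sketch below computes the probability that the first transmission attempt collides, under a deliberately simplified model: every station has a frame ready, each draws its backoff value independently and uniformly from the same contention window, and a collision occurs when two or more stations share the smallest value. The function name, the model and the window size of 16 slots are illustrative assumptions; this is not the traffic model or the EBNA algorithm from the thesis, whose details the abstract does not give.

```python
def first_collision_probability(stations, window):
    """P(two or more stations share the smallest backoff value).

    stations: number of stations contending at the same time.
    window:   number of backoff slots, drawn uniformly from 0..window-1.
    """
    if stations < 1 or window < 1:
        raise ValueError("stations and window must be positive")
    # The smallest draw is unique when exactly one station picks `slot`
    # and every other station picks a strictly larger slot.
    unique_minimum = sum(
        stations * (1 / window) * ((window - 1 - slot) / window) ** (stations - 1)
        for slot in range(window)
    )
    return 1.0 - unique_minimum


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, round(first_collision_probability(n, window=16), 3))
```

Under this model the collision probability grows with the number of contending stations for a fixed window. A scheme that hands each active station a distinct backoff number removes this particular source of collisions by construction, which is the intuition behind the exclusive allocation idea described above.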

Brunel University Research Archive

An efficient multichannel wireless sensor networks MAC protocol based on IEEE 802.11 distributed co-ordinated function.

Author: Campbell C.
Publication date: 01/01/2011

This research aimed to create new knowledge and pioneer a path in the area of future trends in WSNs by resolving some of the issues at the MAC layer in Wireless Sensor Networks. This work introduced a Multi-channel Distributed Coordinated Function (MC-DCF) which takes advantage of multi-channel assignment. The backoff algorithm of the IEEE 802.11 distributed coordination function (DCF) was modified to invoke channel switching, based on threshold criteria, in order to improve the overall throughput for wireless sensor networks. This work commenced by surveying different protocols: contention-based MAC protocols, transport layer protocols, cross-layered design and multichannel multi-radio assignments. A number of existing protocols were analysed, each attempting to resolve one or more problems faced by the current layers. 802.15.4 performed very poorly at high data rates and at long range. Therefore 802.15.4 is not suitable for sensor multimedia or surveillance systems with streaming data in future multichannel multi-radio systems. A survey of 802.11 DCF, which was designed mainly for wireless networks, confirms that it has a power-saving mechanism which is used to synchronise nodes. However, it uses a random back-off mechanism that cannot provide deterministic upper bounds on channel access delay and as such cannot support real-time traffic. The weaknesses identified by surveying this protocol form the backbone of this thesis. The overall aim of this thesis was to introduce multichannel operation with a single radio as a new paradigm for the IEEE 802.11 Distributed Coordinated Function (DCF) in wireless sensor networks (WSNs), which are used in a wide range of applications, including military applications, environmental monitoring, medical care, smart buildings and other industries, and to extend WSNs with multimedia capability, such as sensors that detect sound or motion and video sensors that capture video events of interest.
Traditionally WSNs do not need high data rates and throughput, since events are normally captured periodically. With the paradigm shift in technology, multimedia streaming has become more demanding than data sensing applications; hence the need for a high-data-rate protocol for WSNs, which is an emerging technology in this area. IEEE 802.11 can support data rates up to 54 Mbps, and 802.11 DCF was designed specifically for use in wireless networks. This thesis focused on designing an algorithm that applies multichannel operation to the IEEE 802.11 DCF back-off algorithm to reduce the waiting time of a node and increase throughput when attempting to access the medium. Data collection in WSNs tends to suffer from heavy congestion, especially at nodes nearer to the sink node. Therefore, this thesis proposes a contention-based MAC protocol to address this problem, inspired by the 802.11 DCF backoff algorithm and resulting from a comparison of IEEE 802.11 and IEEE 802.15.4 for Future Green Multichannel Multi-radio Wireless Sensor Networks.

Middlesex University Research Repository

Innovative energy-efficient wireless sensor network applications and MAC sub-layer protocols employing RTS-CTS with packet concatenation

Author: Barroca Norberto José Gil
Publication date: 01/01/2014

of energy-efficiency as well as the number of available applications. As a consequence there are challenges that need to be tackled for the future generation of WSNs. The research work from this Ph.D. thesis has involved the actual development of innovative WSN applications contributing to different research projects. In the Smart-Clothing project, contributions were made to the development of a Wireless Body Area Network (WBAN) to monitor the foetal movements of a pregnant woman in the last four weeks of pregnancy. The creation of an automatic wireless measurement system for remotely monitoring concrete structures was a contribution to the INSYSM project. This was accomplished by using an IEEE 802.15.4 network to remotely monitor the temperature and humidity within civil engineering structures. In the framework of the PROENEGY-WSN project, contributions were made to the identification of the spectrum opportunities for Radio Frequency (RF) energy harvesting through power density measurements from 350 MHz to 3 GHz. The design of the circuits to harvest RF energy and the requirements for creating a WBAN with electromagnetic energy harvesting and Cognitive Radio (CR) capabilities have also been addressed. A performance evaluation of state-of-the-art hardware WSN platforms has also been carried out. This is motivated by the fact that, even with optimized Medium Access Control (MAC) protocols, if the WSN platforms do not allow the energy consumption in the idle and sleeping states to be minimized, energy efficiency and long network lifetime will not be achieved. The research also involved the development of innovative mechanisms that try to reduce overhead, one of the fundamental reasons for the inefficiency of the IEEE 802.15.4 standard MAC. In particular, this Ph.D. thesis proposes an IEEE 802.15.4 MAC layer performance enhancement by employing RTS/CTS combined with packet concatenation.
The results have shown that the use of the RTS/CTS mechanism improves channel efficiency by decreasing the deferral time before transmitting a data packet. In addition, the Sensor Block Acknowledgment MAC (SBACK-MAC) protocol has been proposed, which allows the aggregation of several acknowledgment responses in one special Block Acknowledgment (BACK) Response packet. Two different solutions are considered. The first one considers the SBACK-MAC protocol in the presence of BACK Request (concatenation) while the second one considers the SBACK-MAC in the absence of BACK Request (piggyback). The proposed solutions address a distributed scenario with single-destination and single-rate frame aggregation. The throughput and delay performance is mathematically derived under both ideal conditions (a channel environment with no transmission errors) and non-ideal conditions (a channel environment with transmission errors). An analytical model is proposed, capable of taking into account the retransmission delays and the maximum number of backoff stages. The simulation results successfully validate our analytical model. For more than 7 TX (aggregated packets), all the MAC sub-layer protocols employing RTS/CTS with packet concatenation allow for the optimization of channel use in WSNs, v8-48 % improvement in the maximum average throughput and minimum average delay, and decreased energy consumption

UBibliorum repositorio digital da ubi

Decision Tree-based Syntactic Language Modeling

Author: Filimonov Denis
Publication date: 01/01/2011

Statistical Language Modeling is an integral part of many natural language processing applications, such as Automatic Speech Recognition (ASR) and Machine Translation. N-gram language models dominate the field, despite having an extremely shallow view of language: a Markov chain of words. In this thesis, we develop and evaluate a joint language model that incorporates syntactic and lexical information in an effort to "put language back into language modeling." Our main goal is to demonstrate that such a model is not only effective but can be made scalable and tractable. We utilize decision trees to tackle the problem of sparse parameter estimation, which is exacerbated by the use of syntactic information jointly with word context. While decision trees have been previously applied to language modeling, there has been little analysis of factors affecting decision tree induction and probability estimation for language modeling. In this thesis, we analyze several aspects that affect decision tree-based language modeling, with an emphasis on syntactic language modeling. We then propose improvements to the decision tree induction algorithm based on our analysis, as well as methods for constructing forest models, that is, models consisting of multiple decision trees. Finally, we evaluate the impact of our syntactic language model on large-scale Speech Recognition and Machine Translation tasks. In this thesis, we also address a number of engineering problems associated with the joint syntactic language model in order to make it tractable. In particular, we propose a novel decoding algorithm that exploits the decision tree structure to eliminate unnecessary computation. We also propose and evaluate an approximation of our syntactic model by word n-grams, an approximation that makes it possible to incorporate our model directly into the CDEC Machine Translation decoder rather than using the model for rescoring hypotheses produced using an n-gram model.

CiteSeerX

Digital Repository at the University of Maryland

ProQuest OAI Repository

Argument Mining with Structured SVMs and RNNs

Author: Cardie Claire
Niculae Vlad
Park Joonsuk
Publication date: 01/01/2017

We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure. (This is the case in over 20% of the web comments dataset we release.) Our model jointly learns elementary unit type classification and argumentative relation prediction. Moreover, our model supports SVM and RNN parametrizations, can enforce structure constraints (e.g., transitivity), and can express dependencies between adjacent relations and propositions. Our approaches outperform unstructured baselines in both web comments and argumentative essay datasets. Comment: Accepted for publication at ACL 2017. 11 pages, 5 figures. Code at https://github.com/vene/marseille and data at http://joonsuk.org

arXiv.org e-Print Archive

Crossref

Posterior Regularization for Learning with Side Information and Weak Supervision

Author: Ganchev Kuzman
Publication venue: ScholarlyCommons
Publication date: 01/01/2010

Supervised machine learning techniques have been very successful for a variety of tasks and domains including natural language processing, computer vision, and computational biology. Unfortunately, their use often requires creation of large problem-specific training corpora that can make these methods prohibitively expensive. At the same time, we often have access to external problem-specific information that we cannot always easily incorporate. We might know how to solve the problem in another domain (e.g. for a different language); we might have access to cheap but noisy training data; or a domain expert might be available who would be able to guide a human learner much more efficiently than by simply creating an IID training corpus. A key challenge for weakly supervised learning is then how to incorporate such kinds of auxiliary information arising from indirect supervision. In this thesis, we present Posterior Regularization, a probabilistic framework for structured, weakly supervised learning. Posterior Regularization is applicable to probabilistic models with latent variables and exports a language for specifying constraints or preferences about posterior distributions of latent variables. We show that this language is powerful enough to specify realistic prior knowledge for a variety of applications in natural language processing. Additionally, because Posterior Regularization separates model complexity from the complexity of structural constraints, it can be used for structured problems with relatively little computational overhead. We apply Posterior Regularization to several problems in natural language processing including word alignment for machine translation, transfer of linguistic resources across languages and grammar induction. Additionally, we find that we can apply Posterior Regularization to the problem of multi-view learning, achieving particularly good results for transfer learning.
We also explore the theoretical relationship between Posterior Regularization and other proposed frameworks for encoding this kind of prior knowledge, and show a close relationship to Constraint Driven Learning as well as to Generalized Expectation Constraints
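
As a reading aid, the idea of constraining posterior distributions of latent variables can be written as a penalised likelihood. The form below is a sketch in assumed notation that does not appear in the abstract above: $\mathcal{L}(\theta)$ is the model's log-likelihood, $X$ the observed data, $Z$ the latent variables, $\phi$ a vector of constraint features and $b$ a vector of bounds on their expectations.

```latex
J_Q(\theta) \;=\; \mathcal{L}(\theta) \;-\; \min_{q \in Q} \mathrm{KL}\bigl(q(Z) \,\|\, p_\theta(Z \mid X)\bigr),
\qquad
Q \;=\; \bigl\{\, q \;:\; \mathbb{E}_{q}[\phi(X, Z)] \le b \,\bigr\}
```

In this reading, the penalty is zero whenever the model's own posterior already lies in the constraint set $Q$, and otherwise it grows with the distance from the posterior to the nearest distribution that satisfies the constraints.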

CiteSeerX

ScholarlyCommons@Penn

Performance improvement of ad hoc networks using directional antennas and power control

Author: Bian Qilei
Publication date: 01/01/2009

Over the last decade, there has been remarkable interest in wireless ad hoc networks, which can organize themselves without infrastructure support. Potential uses of such networks exist in many scenarios, ranging from civil engineering and disaster relief to sensor networks and military applications. The Distributed Coordination Function (DCF) of the IEEE 802.11 standard is the dominant protocol for wireless ad hoc networks. However, DCF does not make efficient use of the shared channel and suffers from various problems, such as the exposed terminal and hidden terminal problems. Consequently, in recent years, various methods have been developed to address these problems, which has increased overall network throughput. These methods mainly comprise tuning the carrier sense threshold, replacing omnidirectional antennas with directional antennas, and using power control to transmit packets appropriately. Compared with omnidirectional antennas, directional antennas have many advantages and can improve the performance of ad hoc networks. These antennas focus their energy only toward the target direction and have a longer transmission and reception range for the same amount of power. This property can be exploited to adjust a transmitter's power when a directional antenna is used. Several directional MAC power control protocols have been proposed in the literature. Most of these proposals consider only directional transmission, and in their simulation results these studies usually assume that the transmission range of omnidirectional and directional antennas is the same. This assumption does not always hold in real situations.
Moreover, research that takes heterogeneity in ad hoc networks into account is insufficient. This thesis is devoted to proposing a MAC power control protocol for ad hoc networks with directional antennas that takes all of these problems into consideration. AUTHOR KEYWORDS: Ad hoc networks, Directional antennas, Power control

Archipel - Université du Québec à Montréal