723 research outputs found

    Text Segmentation Using Exponential Models

    This paper introduces a new statistical approach to partitioning text automatically into coherent segments. Our approach enlists both short-range and long-range language models to help it sniff out likely sites of topic changes in text. To aid its search, the system consults a set of simple lexical hints it has learned to associate with the presence of boundaries through inspection of a large corpus of annotated data. We also propose a new probabilistically motivated error metric for use by the natural language processing and information retrieval communities, intended to supersede precision and recall for appraising segmentation algorithms. Qualitative assessment of our algorithm as well as evaluation using this new metric demonstrate the effectiveness of our approach in two very different domains, Wall Street Journal articles and the TDT Corpus, a collection of newswire articles and broadcast news transcripts. Comment: 12 pages, LaTeX source and postscript figures for EMNLP-2 paper
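
The abstract does not define its error metric, so the following is only a minimal sketch of one common window-based formulation (often called Pk) of a probabilistic segmentation error: the fraction of position pairs a fixed distance `k` apart on which the reference and the hypothesis disagree about whether the two positions lie in the same segment. The function name `pk`, the input encoding (one non-decreasing segment id per sentence), and the default choice of `k` are assumptions made for illustration and are not taken from the paper.

```python
def pk(reference, hypothesis, k=None):
    """Window-based segmentation error (illustrative sketch).

    reference, hypothesis: equal-length lists giving a segment id for each
    unit (e.g. sentence). Ids are assumed to be non-decreasing, so two units
    are in the same segment exactly when their ids are equal.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same units")
    n = len(reference)
    if k is None:
        # Assumed convention: half the mean reference segment length.
        num_segments = 1 + sum(
            1 for i in range(1, n) if reference[i] != reference[i - 1]
        )
        k = max(1, round(n / num_segments / 2))
    if n <= k:
        raise ValueError("text is too short for the chosen window")

    disagreements = 0
    for i in range(n - k):
        same_in_ref = reference[i] == reference[i + k]
        same_in_hyp = hypothesis[i] == hypothesis[i + k]
        if same_in_ref != same_in_hyp:
            disagreements += 1
    return disagreements / (n - k)


if __name__ == "__main__":
    ref = [0, 0, 0, 0, 1, 1, 1, 1]   # one boundary after the fourth unit
    hyp = [0, 0, 0, 0, 0, 0, 0, 0]   # hypothesis that misses the boundary
    print(pk(ref, ref), pk(ref, hyp))
```

Unlike precision and recall, a score of this kind gives partial credit to a hypothesized boundary that falls close to a true one, because only the windows that straddle the gap between the two are counted as errors.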

    An efficient multichannel wireless sensor networks MAC protocol based on IEEE 802.11 distributed co-ordinated function.

    This research aimed to create new knowledge on future trends in wireless sensor networks (WSNs) by resolving some of the issues at the MAC layer. This work introduced a Multi-channel Distributed Coordinated Function (MC-DCF) which takes advantage of multi-channel assignment. The backoff algorithm of the IEEE 802.11 distributed coordination function (DCF) was modified to invoke channel switching, based on threshold criteria, in order to improve the overall throughput of wireless sensor networks. This work commenced by surveying different protocols: contention-based MAC protocols, transport layer protocols, cross-layered design and multichannel multi-radio assignments. A number of existing protocols were analysed, each attempting to resolve one or more problems faced by the current layers. IEEE 802.15.4 performed very poorly at high data rates and at long range. It is therefore not suitable for multimedia sensing or surveillance systems with streaming data in future multichannel multi-radio systems. A survey of 802.11 DCF, which was designed mainly for wireless networks, confirmed that it has a power saving mechanism which is used to synchronise nodes. However, it uses a random back-off mechanism that cannot provide deterministic upper bounds on channel access delay and as such cannot support real-time traffic. The weaknesses identified by surveying this protocol form the backbone of this thesis. The overall aim of this thesis was to introduce multichannel operation with a single radio as a new paradigm for the IEEE 802.11 DCF in WSNs, which are used in a wide range of applications including military applications, environmental monitoring, medical care, smart buildings and other industries, and to extend WSNs with multimedia capability, for instance sensors that detect sound or motion and video sensors that capture video events of interest.
Traditionally, WSNs have not needed high data rates and throughput, since events are normally captured periodically. With the paradigm shift in technology, multimedia streaming has become more demanding than data sensing applications, hence the need for a high data rate protocol for WSNs, an emerging technology in this area. IEEE 802.11 can support data rates up to 54 Mbps, and 802.11 DCF was designed specifically for use in wireless networks. This thesis focused on designing an algorithm that applies multichannel operation to the IEEE 802.11 DCF back-off algorithm to reduce the waiting time of a node and increase throughput when attempting to access the medium. Data collection in WSNs tends to suffer from heavy congestion, especially at nodes nearer to the sink node. Therefore, this thesis proposes a contention-based MAC protocol to address this problem, inspired by the 802.11 DCF backoff algorithm and resulting from a comparison of IEEE 802.11 and IEEE 802.15.4 for Future Green Multichannel Multi-radio Wireless Sensor Networks
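
The abstract describes a DCF-style backoff that triggers channel switching on a threshold criterion but does not state the criterion itself. The sketch below is therefore only an illustration under loudly stated assumptions: the function `next_backoff`, the choice of "backoff stage reached" as the threshold, and all numeric defaults (`cw_min`, `cw_max`, `switch_stage`, `num_channels`) are hypothetical and are not the thesis's MC-DCF parameters. It keeps the familiar DCF behaviour of drawing a uniform random slot count from a contention window that doubles after each failed attempt.

```python
import random


def next_backoff(stage, channel, *, cw_min=16, cw_max=1024,
                 switch_stage=3, num_channels=3, rng=random):
    """Return (new_stage, new_channel, backoff_slots) after a failed attempt.

    Hypothetical rule: once the backoff stage reaches `switch_stage`, the
    node hops to the next channel and restarts with the smallest window
    instead of continuing to widen the window on a congested channel.
    """
    stage += 1
    if stage >= switch_stage:
        # Assumed threshold criterion: too many consecutive failures here.
        channel = (channel + 1) % num_channels
        stage = 0
    # Binary exponential backoff: the window doubles per stage, up to cw_max.
    window = min(cw_min * 2 ** stage, cw_max)
    slots = rng.randrange(window)  # uniform in [0, window - 1]
    return stage, channel, slots


if __name__ == "__main__":
    rng = random.Random(0)
    stage, channel = 0, 0
    for attempt in range(5):
        stage, channel, slots = next_backoff(stage, channel, rng=rng)
        print(f"attempt {attempt}: stage={stage} channel={channel} wait={slots} slots")
```

The design point this illustrates is that switching channels bounds how far the contention window can grow on any one channel, which is one plausible way to shorten a node's waiting time under heavy contention near the sink.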

    Innovative energy-efficient wireless sensor network applications and MAC sub-layer protocols employing RTS-CTS with packet concatenation

    of energy-efficiency as well as the number of available applications. As a consequence there are challenges that need to be tackled for the future generation of WSNs. The research work for this Ph.D. thesis involved the development of innovative WSN applications contributing to different research projects. In the Smart-Clothing project, contributions were made to the development of a Wireless Body Area Network (WBAN) to monitor the foetal movements of a pregnant woman in the last four weeks of pregnancy. The creation of an automatic wireless measurement system for remotely monitoring concrete structures was a contribution to the INSYSM project. This was accomplished by using an IEEE 802.15.4 network that enables remote monitoring of the temperature and humidity within civil engineering structures. In the framework of the PROENERGY-WSN project, contributions were made to the identification of spectrum opportunities for Radio Frequency (RF) energy harvesting through power density measurements from 350 MHz to 3 GHz. The design of the circuits to harvest RF energy and the requirements for creating a WBAN with electromagnetic energy harvesting and Cognitive Radio (CR) capabilities have also been addressed. A performance evaluation of state-of-the-art hardware WSN platforms has also been carried out. This is motivated by the fact that, even with optimized Medium Access Control (MAC) protocols, if the WSN platforms do not allow for minimizing the energy consumption in the idle and sleeping states, energy efficiency and long network lifetime will not be achieved. The research also involved the development of new mechanisms that try to reduce overhead, one of the fundamental reasons for the inefficiency of the IEEE 802.15.4 standard MAC. In particular, this Ph.D. thesis proposes an IEEE 802.15.4 MAC layer performance enhancement by employing RTS/CTS combined with packet concatenation.
The results have shown that the use of the RTS/CTS mechanism improves channel efficiency by decreasing the deferral time before transmitting a data packet. In addition, the Sensor Block Acknowledgment MAC (SBACK-MAC) protocol has been proposed, which allows the aggregation of several acknowledgment responses in one special Block Acknowledgment (BACK) Response packet. Two different solutions are considered. The first one considers the SBACK-MAC protocol in the presence of BACK Request (concatenation) while the second one considers the SBACK-MAC in the absence of BACK Request (piggyback). The proposed solutions address a distributed scenario with single-destination and single-rate frame aggregation. The throughput and delay performance is mathematically derived under both ideal conditions (a channel environment with no transmission errors) and non-ideal conditions (a channel environment with transmission errors). An analytical model is proposed, capable of taking into account the retransmission delays and the maximum number of backoff stages. The simulation results successfully validate our analytical model. For more than 7 TX (aggregated packets), all the MAC sub-layer protocols employing RTS/CTS with packet concatenation allow for the optimization of channel use in WSNs, v8-48 % improvement in the maximum average throughput and minimum average delay, and decreased energy consumption

    Decision Tree-based Syntactic Language Modeling

    Statistical Language Modeling is an integral part of many natural language processing applications, such as Automatic Speech Recognition (ASR) and Machine Translation. N-gram language models dominate the field, despite having an extremely shallow view of language: a Markov chain of words. In this thesis, we develop and evaluate a joint language model that incorporates syntactic and lexical information in an effort to "put language back into language modeling." Our main goal is to demonstrate that such a model is not only effective but can be made scalable and tractable. We utilize decision trees to tackle the problem of sparse parameter estimation, which is exacerbated by the use of syntactic information jointly with word context. While decision trees have been previously applied to language modeling, there has been little analysis of factors affecting decision tree induction and probability estimation for language modeling. In this thesis, we analyze several aspects that affect decision tree-based language modeling, with an emphasis on syntactic language modeling. We then propose improvements to the decision tree induction algorithm based on our analysis, as well as to the methods for constructing forest models, that is, models consisting of multiple decision trees. Finally, we evaluate the impact of our syntactic language model on large scale Speech Recognition and Machine Translation tasks. In this thesis, we also address a number of engineering problems associated with the joint syntactic language model in order to make it tractable. In particular, we propose a novel decoding algorithm that exploits the decision tree structure to eliminate unnecessary computation. We also propose and evaluate an approximation of our syntactic model by word n-grams, an approximation that makes it possible to incorporate our model directly into the CDEC Machine Translation decoder rather than using the model for rescoring hypotheses produced using an n-gram model
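
To make the "Markov chain of words" view concrete, here is a minimal sketch of the n-gram baseline the abstract contrasts itself with, in its simplest bigram form with unsmoothed maximum-likelihood estimates. It is not the thesis's decision-tree model; the function name `train_bigram` and the boundary tokens `<s>` and `</s>` are illustrative choices.

```python
from collections import Counter


def train_bigram(sentences):
    """Return prob(word, prev) = count(prev, word) / count(prev)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        # Every token except the last one serves as a conditioning context.
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))

    def prob(word, prev):
        if context_counts[prev] == 0:
            return 0.0  # context never observed in training
        return bigram_counts[(prev, word)] / context_counts[prev]

    return prob


if __name__ == "__main__":
    prob = train_bigram(["the cat sat", "the dog sat"])
    print(prob("cat", "the"), prob("sat", "cat"), prob("cat", "dog"))
```

Any word pair that never occurred in training receives probability zero here, which is the sparse-estimation problem in its plainest form; conditioning on syntactic information as well as words enlarges the context space and makes that sparsity worse.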

    Argument Mining with Structured SVMs and RNNs

    We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure. (This is the case in over 20% of the web comments dataset we release.) Our model jointly learns elementary unit type classification and argumentative relation prediction. Moreover, our model supports SVM and RNN parametrizations, can enforce structure constraints (e.g., transitivity), and can express dependencies between adjacent relations and propositions. Our approaches outperform unstructured baselines in both web comments and argumentative essay datasets. Comment: Accepted for publication at ACL 2017. 11 pages, 5 figures. Code at https://github.com/vene/marseille and data at http://joonsuk.org

    Posterior Regularization for Learning with Side Information and Weak Supervision

    Supervised machine learning techniques have been very successful for a variety of tasks and domains including natural language processing, computer vision, and computational biology. Unfortunately, their use often requires creation of large problem-specific training corpora that can make these methods prohibitively expensive. At the same time, we often have access to external problem-specific information that we cannot always easily incorporate. We might know how to solve the problem in another domain (e.g. for a different language); we might have access to cheap but noisy training data; or a domain expert might be available who would be able to guide a human learner much more efficiently than by simply creating an IID training corpus. A key challenge for weakly supervised learning is then how to incorporate such kinds of auxiliary information arising from indirect supervision. In this thesis, we present Posterior Regularization, a probabilistic framework for structured, weakly supervised learning. Posterior Regularization is applicable to probabilistic models with latent variables and exports a language for specifying constraints or preferences about posterior distributions of latent variables. We show that this language is powerful enough to specify realistic prior knowledge for a variety of applications in natural language processing. Additionally, because Posterior Regularization separates model complexity from the complexity of structural constraints, it can be used for structured problems with relatively little computational overhead. We apply Posterior Regularization to several problems in natural language processing including word alignment for machine translation, transfer of linguistic resources across languages and grammar induction. Additionally, we find that we can apply Posterior Regularization to the problem of multi-view learning, achieving particularly good results for transfer learning.
We also explore the theoretical relationship between Posterior Regularization and other proposed frameworks for encoding this kind of prior knowledge, and show a close relationship to Constraint Driven Learning as well as to Generalized Expectation Constraints

    Performance improvement of ad hoc networks using directional antennas and power control

    Over the past decade, wireless ad hoc networks, which can organize themselves without infrastructure support, have attracted remarkable interest. Potential uses of such networks exist in many scenarios, ranging from civil engineering and disaster relief to sensor networks and military applications. The Distributed Coordination Function (DCF) of the IEEE 802.11 standard is the dominant protocol for wireless ad hoc networks. However, the DCF method does not make efficient use of the shared channel and suffers from various problems such as the exposed terminal and hidden terminal problems. Consequently, in recent years, various methods have been developed to address these problems, which has increased the aggregate throughput of networks. These methods mainly comprise tuning the carrier-sense threshold, replacing omnidirectional antennas with directional antennas, and controlling power so that packets are transmitted appropriately. Compared with omnidirectional antennas, directional antennas have many advantages and can improve the performance of ad hoc networks. These antennas concentrate their energy only toward the target direction and have a larger transmission and reception range for the same amount of power. This property can be exploited to adjust the power of a transmitter when a directional antenna is used. Several directional power-control MAC protocols have been proposed in the literature. Most of these proposals consider only directional transmission and, in their simulation results, these studies usually assume that the transmission range of omnidirectional and directional antennas is the same. Evidently, this assumption does not always hold in real situations.
Moreover, research that takes heterogeneity in ad hoc networks into account is insufficient. This thesis is devoted to proposing a power-control MAC protocol for ad hoc networks with directional antennas that takes all of these problems into consideration.
    Author keywords: Ad hoc networks, Directional antennas, Power control