Optimizing Dependency Parsing Throughput
Dependency parsing is considered a key technology for improving information extraction tasks. Research indicates that dependency parsers spend more than 95% of their total runtime on feature computations. Based on this insight, this paper investigates the potential of improving parsing throughput by designing feature representations that are optimized for combining single features into more complex feature templates and by optimizing parser constraints. Applying these techniques to MDParser increased its throughput fourfold, yielding Syntactic Parser, a dependency parser that outperforms comparable approaches by a factor of 25 to 400.
Advancing Hungarian Text Processing with HuSpaCy: Efficient and Accurate NLP Pipelines
This paper presents a set of industrial-grade text processing models for
Hungarian that achieve near state-of-the-art performance while balancing
resource efficiency and accuracy. Models have been implemented in the spaCy
framework, extending the HuSpaCy toolkit with several improvements to its
architecture. Compared to existing NLP tools for Hungarian, all of our
pipelines feature all basic text processing steps including tokenization,
sentence-boundary detection, part-of-speech tagging, morphological feature
tagging, lemmatization, dependency parsing and named entity recognition with
high accuracy and throughput. We thoroughly evaluated the proposed
enhancements, compared the pipelines with state-of-the-art tools and
demonstrated the competitive performance of the new models in all text
preprocessing steps. All experiments are reproducible and the pipelines are
freely available under a permissive license. Comment: Submitted to TSD 2023 Conference
Implementation and Optimization of PEG Parsers for Use on FPGAs
DARPA’s Guaranteed Architecture for Physical Security (GAPS) project requires a device to provably enforce security policies. As part of a solution that GE and Dartmouth have proposed for the GAPS project, parsers for Parsing Expression Grammars (PEGs) are required to run on a Field-Programmable Gate Array (FPGA). There exist programs, like Pegmatite, which produce PEG parsers written in VHDL, but these parsers have not yet been run on FPGAs. They have been run in simulators where they have been tested for correctness, but they need to be adapted for execution on FPGAs (Lucas et al., 2021).
This thesis explores the process of modifying existing VHDL PEG parsers to run on FPGAs and optimizing their performance. We contribute two techniques to achieve performance improvements: (1) exploiting data parallelism, and (2) parsing the input packet as it arrives instead of waiting for the entire packet to be received. We were not able to execute our solution consistently on FPGAs, so we present an analysis of these techniques through simulations.
HP4 High-Performance Programmable Packet Parser
Header parsing is a central topic in modern network systems, supporting many operations such as packet processing and security functions. Header parser design has a significant effect on network device performance (latency, throughput, and resource utilization). However, header parser design faces many difficulties, such as increasing network throughput and a variety of protocols. Therefore, programmable hardware packet parsing is the best solution to meet dynamic reconfiguration and speed needs. A Field Programmable Gate Array (FPGA) is an appropriate device for programmable high-speed packet parser implementation. This paper introduces a novel FPGA High-Performance Programmable Packet Parser architecture (HP4). HP4 is automatically generated from P4 (Programming Protocol-independent Packet Processors) to optimize speed, dynamic reconfiguration, and resource consumption. HP4 provides a pipelined packet parser with dynamic reconfiguration and low latency. In addition to high throughput (over 600 Gb/s), HP4's resource utilization is less than 7.5 percent of a Virtex-7 870HT, and its latency is about 88 ns. HP4 can be used in high-speed dynamic packet switches and network security.