Data-parallel processing is a key concept for increasing the scalability and elasticity of event streaming systems. Data parallelism is often realized in a splitter-merger architecture, in which the splitter divides incoming streams into partitions and forwards them to parallel operator instances. The splitter's performance is a limiting factor for both the system throughput and the achievable parallelization degree. This work studies how to leverage in-network computing to accelerate the splitter by implementing it as an in-network function. While dedicated hardware for in-network computing has high potential to improve splitter performance, in-network programming models such as the P4 language are highly limited in their expressiveness, which makes it difficult to support the corresponding parallelization models. We propose the P4 Splitter Switch (P4SS), which supports overlapping and non-overlapping count-based windows for multiple independent data streams and parallelizes them across a dynamically configurable number of operator instances. Using a prototype implementation, we validate our splitting strategy and evaluate its scalability in terms of switch resource consumption.
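
To make the notion of count-based window splitting concrete, the following is a minimal software sketch in Python. It is not the P4 implementation. It assumes that window k of a stream covers the events with per-stream sequence numbers from k * slide to k * slide + window_size - 1, and that windows are assigned to operator instances round-robin. The class name `CountWindowSplitter`, its methods, and its parameters are hypothetical names introduced only for illustration.

```python
from collections import defaultdict


class CountWindowSplitter:
    """Illustrative software model of a count-based window splitter.

    Assumptions (not taken from the paper): window k covers sequence
    numbers [k * slide, k * slide + window_size), and window k is
    processed by operator instance k % num_instances (round-robin).
    """

    def __init__(self, window_size, slide, num_instances):
        if window_size <= 0 or slide <= 0 or num_instances <= 0:
            raise ValueError("all parameters must be positive")
        self.window_size = window_size
        self.slide = slide
        self.num_instances = num_instances
        # One event counter per stream keeps the streams independent.
        self._next_seq = defaultdict(int)

    def windows_of(self, seq):
        """Return the indices of all windows that contain event `seq`."""
        # Smallest k with k * slide + window_size > seq (ceiling division).
        first = max(0, -(-(seq - self.window_size + 1) // self.slide))
        # Largest k with k * slide <= seq.
        last = seq // self.slide
        return range(first, last + 1)

    def route(self, stream_id):
        """Count the next event of `stream_id` and return (seq, targets)."""
        seq = self._next_seq[stream_id]
        self._next_seq[stream_id] = seq + 1
        targets = sorted({w % self.num_instances for w in self.windows_of(seq)})
        return seq, targets


if __name__ == "__main__":
    # Overlapping windows: size 4, slide 2, three operator instances.
    splitter = CountWindowSplitter(window_size=4, slide=2, num_instances=3)
    for _ in range(6):
        print(splitter.route("stream-a"))
```

With `slide` equal to `window_size` the windows do not overlap and each event is forwarded to exactly one instance. With `slide` smaller than `window_size`, an event that lies in several windows is forwarded to every instance that owns one of them.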