Synthetic data is essential for assessing clustering techniques,
complementing and extending real data, and allowing for a more complete
coverage of a given problem's space. In turn, synthetic data generators have
the potential to create vast amounts of data -- a crucial activity when
real-world data is at a premium -- while providing a well-understood generation
procedure and an interpretable instrument for methodically investigating
cluster analysis algorithms. Here, we present \textit{Clugen}, a modular
procedure for synthetic data generation, capable of creating multidimensional
clusters supported by line segments using arbitrary distributions.
\textit{Clugen} is open source, 100\% unit tested and fully documented, and is
available for the Python, R, Julia and MATLAB/Octave ecosystems. We demonstrate
that our proposal can produce rich and varied results across a range of
dimensionalities, is fit for use in the assessment of clustering algorithms, and has
the potential to be a widely used framework in diverse clustering-related
research tasks.