This thesis deals with estimation and classification problems for non-stationary processes in a few special cases. In Paper A and Paper D, we make strong assumptions about the observed signal: a specific model is assumed and its parameters are estimated. In Paper B, Paper C, and Paper E, more general assumptions are made about the structure of the observed processes, and the methods in these papers may be applied to a wider range of parameter estimation and classification scenarios. All papers handle non-stationary signals whose spectral power distribution may change over time. Here, we are interested in finding time-frequency representations (TFRs) of the signal that depict how the frequencies and corresponding amplitudes change.
In Paper A, we consider the estimation of the shape parameter of time- and frequency-translated Gaussian functions. The algorithm is based on the scaled reassigned spectrogram, where the spectrogram is calculated using a unit-norm Gaussian window. The spectrogram is then reassigned using a large set of candidate scaling factors. For the scaling factor that matches the shape parameter, the reassigned spectrogram of a Gaussian function is perfectly localized to a single point.
In Paper B, we expand on the concept in Paper A and allow it to be applied to any twice-differentiable transient function in any dimension. Provided that the matched window function is used when calculating the spectrogram, we prove that scaled reassignment moves all energy to a single point in the time-frequency domain. Given a parametric model of an observed signal, one may tune the parameter(s) to minimize the entropy of the matched reassigned spectrogram. We also present a classification scheme in which several parametric models are applied and evaluated to determine which of them best fits the data.
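To make the localization property concrete, the following is a minimal sketch of scaled reassignment for a Gaussian with matched window. The STFT sign convention, the symbols $F_x^{h}$, $c_t$, and $c_\omega$, and the choice of a unit shape parameter centred at the origin are assumptions made for this illustration; the papers may use different notation.

```latex
% Assumed convention: short-time Fourier transform of x with window h.
% F_x^{th} and F_x^{dh/dt} denote the same transform computed with the
% windows t h(t) and dh/dt, respectively.
\begin{align*}
F_x^{h}(t,\omega) &= \int x(s)\, h^{*}(s-t)\, e^{-i\omega s}\, ds, \\
% Scaled reassignment with scaling factors c_t and c_\omega
\hat{t} &= t + c_t \,\mathrm{Re}\!\left(\frac{F_x^{th}(t,\omega)}{F_x^{h}(t,\omega)}\right), \qquad
\hat{\omega} = \omega - c_\omega \,\mathrm{Im}\!\left(\frac{F_x^{dh/dt}(t,\omega)}{F_x^{h}(t,\omega)}\right). \\
% Worked case: x(s) = h(s) = e^{-s^2/2}
\frac{F_x^{th}}{F_x^{h}} &= -\tfrac{1}{2}\,(t + i\omega), \qquad
\frac{F_x^{dh/dt}}{F_x^{h}} = \tfrac{1}{2}\,(t + i\omega), \\
\hat{t} &= \left(1 - \tfrac{c_t}{2}\right) t, \qquad
\hat{\omega} = \left(1 - \tfrac{c_\omega}{2}\right) \omega .
\end{align*}
```

Under these assumptions, $c_t = c_\omega = 2$ sends every point $(t,\omega)$ to the origin, which is the perfect localization described above, whereas ordinary reassignment ($c_t = c_\omega = 1$) only halves the distance to the centre.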
In Paper C, we consider the problem of estimating the spectral content of signals whose spectrum is assumed to have a smooth structure. By dividing the spectral representation into a coarse grid and assuming that the spectrum within each segment is well approximated as linear, a smooth version of the Fourier transform is derived. Using this, we find the piecewise linear spectrum whose covariance matrix is closest, in the least-squares sense, to the sample covariance matrix of the observed signal. Additionally, we allow for constraints that make the solution obey common assumptions about spectral representations. We apply the algorithm to stationary signals in one and two dimensions, as well as to one-dimensional non-stationary processes.
In Paper D, we consider the problem of estimating the parameters of a multi-component chirp signal, where a harmonic structure may be imposed. The algorithm is based on a group-sparsity framework with sparse groups, in which a large dictionary of candidate parameters is constructed. An optimization problem is formulated to find harmonic groups of chirps while penalizing the number of harmonics within each group. Additionally, we add a non-linear least squares step to avoid the bias introduced by the spacing of the dictionary.
In Paper E, we propose that the Wigner-Ville distribution be used as input to convolutional neural networks, as opposed to the commonly used spectrogram. As the spectrogram may be expressed as a convolution between a kernel function and the Wigner-Ville distribution, we argue that the kernel function should not be chosen manually. Instead, this convolutional kernel should be optimized together with the rest of the kernels that make up the neural network.
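The relation between the spectrogram and the Wigner-Ville distribution referred to in Paper E can be written out as follows. This is a sketch of the standard identity, with frequency $f$ in Hz; the symbols $W_x$, $W_h$, and $S_x^{h}$ and the sign conventions are assumptions chosen for this illustration and are not taken from the paper.

```latex
% Wigner-Ville distribution of a signal x (assumed convention)
\begin{align*}
W_x(t,f) &= \int x\!\left(t + \tfrac{\tau}{2}\right) x^{*}\!\left(t - \tfrac{\tau}{2}\right) e^{-i 2\pi f \tau}\, d\tau, \\
% Spectrogram of x computed with window h
S_x^{h}(t,f) &= \left| \int x(s)\, h^{*}(s-t)\, e^{-i 2\pi f s}\, ds \right|^{2} \\
% Moyal's formula applied to x and the shifted window h(s-t) e^{i 2 \pi f s}
&= \iint W_x(s,\nu)\, W_h(s-t,\, \nu-f)\, ds\, d\nu .
\end{align*}
```

Fixing the window $h$ therefore fixes the smoothing kernel $W_h$. Feeding $W_x$ to the network and letting the first convolutional layer learn its kernel replaces that manual choice with a kernel fitted to the classification task.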