Revealing patterns in HIV viral load data and classifying patients via a novel machine learning cluster summarization method
HIV RNA viral load (VL) is an important outcome variable in studies of
HIV-infected persons. There exist only a handful of methods that classify
patients by viral load patterns. Most methods place limits on the use of viral
load measurements, are often specific to a particular study design, and do not
account for complex, temporal variation. To address this issue, we propose a
set of four unambiguous computable characteristics (features) of time-varying
HIV viral load patterns, along with a novel centroid-based classification
algorithm, which we use to classify a population of 1,576 HIV-positive clinic
patients into one of five different viral load patterns (clusters) often found
in the literature: durably suppressed viral load (DSVL), sustained low viral
load (SLVL), sustained high viral load (SHVL), high viral load suppression
(HVLS), and rebounding viral load (RVL). The centroid algorithm summarizes
these clusters in terms of their centroids and radii. We show that this allows
new viral load patterns to be assigned pattern membership based on the distance
from the centroid relative to its radius, which we term radial normalization
classification. This method provides an objective, quantitative way to assign
viral load pattern membership with a concise and interpretable model that aids
clinical decision making. It also facilitates meta-analyses by providing
computably distinct HIV categories.
Finally, we propose that this novel centroid algorithm could also be useful in
the areas of cluster comparison for outcomes research and data reduction in
machine learning.
Comment: 17-page paper with an additional 10 pages of references and
supplementary material. 7 figures and 9 supplementary figures.
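
The radial normalization idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the radius is computed or what the four features are, so the sketch assumes the radius is the largest member-to-centroid distance and uses made-up two-dimensional toy vectors. The function names `summarize` and `classify` are hypothetical.

```python
import math


def summarize(points):
    """Summarize a cluster as (centroid, radius).

    Assumption: the radius is the largest Euclidean distance from the
    centroid to any member. The abstract does not state the definition.
    """
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[i] for p in points) / n for i in range(dim)]
    radius = max(math.dist(p, centroid) for p in points)
    return centroid, radius


def classify(x, summaries):
    """Assign x to the cluster with the smallest distance-to-radius ratio.

    Every radius must be positive; a single-point cluster would need
    separate handling.
    """
    def normalized_distance(label):
        centroid, radius = summaries[label]
        return math.dist(x, centroid) / radius

    return min(summaries, key=normalized_distance)


# Toy feature vectors (illustrative only, not data from the paper).
clusters = {
    "DSVL": [[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]],
    "SHVL": [[10.0, 10.0], [10.0, 14.0], [14.0, 10.0], [14.0, 14.0]],
}
summaries = {label: summarize(pts) for label, pts in clusters.items()}

# The new point is closer to the DSVL centroid in raw distance, but the
# SHVL cluster is wider, so its normalized distance is smaller.
label = classify([5.0, 5.0], summaries)
```

In this toy case the point at (5, 5) has normalized distance 4.0 to DSVL and 3.5 to SHVL, so it is assigned to SHVL even though the DSVL centroid is nearer in raw distance. That is the effect of dividing by the radius: wider clusters reach further.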