Finding Inner Outliers in High Dimensional Space
Outlier detection in a large-scale database is a significant and complex
issue in the field of knowledge discovery. As data distributions are obscure and
uncertain in high dimensional space, most existing solutions approach the
issue from two intuitive assumptions: first, outliers are
extremely far away from other points in high dimensional space; second,
outliers are clearly distinguishable in projected subspaces.
However, in the more complicated case where outliers are hidden among the normal
points in every dimension, existing detection methods fail to find such inner
outliers. In this paper, we propose a method based on two rounds of dimension
projection, which integrates primary subspace outlier detection with a secondary
point projection between subspaces, and sums the resulting weight values for
each point. Each point is scored with a local density ratio, computed separately
in the twice-projected dimensions. After this process, the outliers are the points
with the largest weight values. The proposed method succeeds in finding all
inner outliers on the synthetic test datasets, with dimensionality varying from
100 to 10000. The experimental results also show that the proposed algorithm
works in low dimensional space and can achieve perfect performance in high
dimensional space. For this reason, the proposed approach has considerable
potential for multimedia applications, helping to process images or
video with large-scale attributes.
Comment: 9 pages, 9 figures, 3 tables
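The scoring idea in the abstract (a local density ratio computed per subspace, then summed into one weight per point) can be illustrated with a rough sketch. This is not the paper's algorithm: the abstract does not define the density ratio or the secondary point projection, so the code below assumes a LOF-style ratio (mean k-nearest-neighbour density divided by the point's own density) and covers only the primary subspace step. The function names, the choice of k, the subspace list, and the toy data are all hypothetical.

```python
import math
import random


def local_density_ratio(points, k=5):
    """LOF-style ratio (an assumption, not the paper's definition):
    mean density of a point's k nearest neighbours / its own density."""
    n = len(points)
    neighbours, density = [], []
    for i in range(n):
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        neighbours.append([j for _, j in ranked])
        mean_dist = sum(d for d, _ in ranked) / k
        density.append(1.0 / (mean_dist + 1e-12))  # inverse mean k-NN distance
    return [
        sum(density[j] for j in neighbours[i]) / (k * density[i])
        for i in range(n)
    ]


def subspace_scores(data, subspaces, k=5):
    """Project the data onto each subspace and sum the per-subspace ratios."""
    scores = [0.0] * len(data)
    for dims in subspaces:
        projected = [tuple(row[d] for d in dims) for row in data]
        for i, ratio in enumerate(local_density_ratio(projected, k)):
            scores[i] += ratio
    return scores


def make_toy_data(n_normal=40, sigma=0.02, seed=0):
    """Hypothetical 4-D toy set: normal points cluster near (0, 0) or (1, 1)
    in dims (0, 1) and, independently, in dims (2, 3)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_normal):
        a, b = i % 2, (i // 2) % 2
        rows.append((a + rng.gauss(0, sigma), a + rng.gauss(0, sigma),
                     b + rng.gauss(0, sigma), b + rng.gauss(0, sigma)))
    # The last row sits inside the value range of the normal points on every
    # single axis, but belongs to neither cluster in either 2-D subspace.
    rows.append((0.0, 1.0, 0.0, 1.0))
    return rows


data = make_toy_data()
scores = subspace_scores(data, subspaces=[(0, 1), (2, 3)], k=5)
top = max(range(len(scores)), key=scores.__getitem__)
print("highest-scoring index:", top)
```

In this toy setting the planted point looks ordinary along each individual axis and only stands out once pairs of dimensions are examined together, which is the kind of hidden point the abstract calls an inner outlier. The brute-force neighbour search is quadratic in the number of points, so a real implementation at the dimensionalities mentioned above would need an index structure and a principled way to choose subspaces.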