Multidimensional scaling (MDS) is a technique for visualising similarities in datasets. It works by projecting a high-dimensional dataset into a two-dimensional space. While the resulting visualisations clearly show whether samples are similar or dissimilar, they fail to communicate why. Furthermore, the visualisations usually contain some degree of error that isn’t visible, which inspires false confidence in the resulting projections.
This project tries to solve these problems by introducing a set of interaction and visualisation techniques to examine dimensionality-reduced datasets.
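To make the hidden error concrete, here is a minimal sketch in Python that compares the pairwise distances in the original space with those in a two-dimensional layout and sums the squared differences per sample. It is not the prototype's code, and the paper's error measure may differ. The function names, the toy data, and the "projection" (simply dropping the third dimension rather than running MDS) are all assumptions made for illustration.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def per_sample_error(high_dim, projected):
    """For each sample, sum the squared differences between its original
    and projected distances to every other sample."""
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(projected)
    n = len(high_dim)
    return [
        sum((d_high[i][j] - d_low[i][j]) ** 2 for j in range(n))
        for i in range(n)
    ]


# Hypothetical toy data: four samples with three dimensions each.
samples = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Stand-in for a real projection: keep the first two dimensions only.
layout = [(x, y) for x, y, _ in samples]

errors = per_sample_error(samples, layout)
worst = errors.index(max(errors))
print(f"sample with the largest projection error: {worst}")
```

In this toy layout the first and last samples land on the same point even though they are one unit apart in the original space. The scatterplot alone would suggest they are identical, while the per-sample error reveals that the last sample is the most distorted.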
The code for the prototype is available on github.com/julians/probing-projections.
Julian Stahnke; Marian Dörk; Boris Müller; Andreas Thom, ‘Probing Projections: Interaction Techniques for Interpreting Arrangements and Errors of Dimensionality Reductions’, in IEEE Transactions on Visualization & Computer Graphics, vol. PP, no. 99, pp. 1-1, doi: 10.1109/TVCG.2015.2467717
Abstract: We introduce a set of integrated interaction techniques to interpret and interrogate dimensionality-reduced data. Projection techniques generally aim to make a high-dimensional information space visible in the form of a planar layout. However, the meaning of the resulting data projections can be hard to grasp. It is seldom clear why elements are placed far apart or close together, and the inevitable approximation errors of any projection technique are not exposed to the viewer. Previous research on dimensionality reduction focuses on the efficient generation of data projections, interactive customisation of the model, and comparison of different projection techniques. There has been little research on how the visualization resulting from data projection is interacted with. We propose a set of interactive visualization methods to examine the dimensionality-reduced data as well as the projection itself. The methods let viewers see approximation errors, question the positioning of elements, compare them to each other, and visualize the influence of data dimensions on the projection space. We created a web-based system implementing these methods, and report on findings from an evaluation with data analysts using the prototype to examine multidimensional datasets.