In 2012 Google indexed an estimated 46 billion web pages (staticbrain, 2014). Although the Web of linked data was only roughly 3/10,000 of this size in the same period (Simpson and Brown, 2014), 15.9 million pages is not an insignificant number (Cheng et al., 2013). Furthermore, because linked data is an emerging and burgeoning technological field, these numbers have almost certainly grown significantly since then. With such a vast amount of data in these two Webs, the increasingly pressing concern is how to understand what is contained within them, particularly when it comes to seeing relationships within the content of each. This remains a problem even with linked data and, when warranted, the ontologies that connect such data, even though such data was built to make relationships explicit and available for exploration. Others have articulated this problem as well. For instance, Loskyll et al. ask, ‘How can one represent huge collections of knowledge (e.g. ontologies with over 10 million concepts) as browsable trees in a scalable manner and with a clear user interface?’ (2009, 385). While part of the solution would include appropriately curating the original data, this is not feasible with large datasets, and thus the solution must be borne by improved visualization tools and techniques. While there is a clear interest in using visualizations to navigate this information, there is no generally agreed-upon method to aid in this task. Should visualizations render information that would otherwise need to be inferred ‘actually explicitly visible’ (Howse et al., 2011, 259), provide ontologists with an ‘effective solution’ for understanding ‘relations between existing concepts’ (Kocbek et al., 2013, 34), or serve any number of other purposes? Balancing these concerns remains a challenge despite the fact that visualizations have played an important role in assisting the retrieval and dissemination of both qualitative and quantitative information for centuries (Tufte, 2006; 2001). In this paper, we ask,
What features should a Semantic Web visualization tool have to maximize the discovery of new knowledge?
This paper answers this question in two stages: it provides a review and evaluation of 30 existing Semantic Web–related visualization tools, and it reports on the production of a completed network visualization tool that is now entering the user testing phase.
Review of Linked Data and Ontology Visualization Tools
Before reviewing the features of the network visualization tool we have produced, we will summarize the results of a review of 30 existing tools or techniques that target visualizations of either linked data or ontologies. While surveys of visualization tools for linked data and ontologies have been carried out before (see Note 1), there have been none that we know of since 2007 and none conducted on this scale. The list of tools or techniques that are included in our review is displayed in Table 1.
Table 1. Linked data and ontology visualization tools.
A New Visualization Tool
In 2014, Lohman et al. pointed out, ‘While several visualizations for ontologies have been developed in the last couple of years, they either focus on specific ontology aspects or are hard to read for non-expert users. The silver bullet would be an ontology visualization that is equally comprehensive and comprehensible. It must be printable but also provide intuitive ways to interactively explore ontologies’ (395). While the focus of this comment is clearly on ontologies, the same is true of visualizations for linked data in general. This is just one set of criteria that a so-called silver-bullet visualization tool ostensibly should have. Others include key concept extraction; a rich set of navigation and visualization mechanisms; flexible zooming into and hiding of specific parts of an ontology; history browsing; saving and loading of customized ontology views; essential interface customization support, such as graphical zooming, font manipulation, and tree layout customization; clear representations of hierarchy and predicates; and the availability of multiple geometrical views (Motta et al., 2011; Sivakumar and Arivoli, 2011). Clearly, there are a lot of expectations to be met.
In constructing a new visualization tool for ontologies and linked data, our primary goal was a general one: build an interactive visualization tool that can be used by non-expert users to explore linked data. Of course, such general goals are often riddled with complexities when put into practice, and that was certainly the case here as we worked to address many of the criteria felt to be important by other developers (see above) as well as within our own team. The result has been the tool illustrated in Figure 1, which we are tentatively referring to as HuVis (short for Humanities Visualizer). As the figure makes clear, the interface of the tool is broken out into three areas: a central stage; a tabbed command construction history, documentation, and configuration window on the right; and a text-viewing area on the left that displays text to which the triples are linked.
Figure 1. The HuVis tool in action.
In the center of the display are all the entities of the dataset arranged in a circle around a central visualization stage. Entities, represented as nodes, are colour-coded by type and may be dragged onto the stage or reshelved in the ring with the mouse. When an entity is dragged to the stage, all the nodes immediately connected to it within the pre-loaded linked dataset are also added. Nodes may be removed from consideration by dragging them to a discard area, here shown by the small circle of blue dots in the lower right of the figure. To aid users with navigation, the default display of labels is limited to those attached to nodes that are either in the center stage area or within the range of the mouse pointer. A circular area around the mouse pointer also acts as a magnifying glass, making it easy to differentiate entities even if they are crowded into the display area.
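The behaviour of the ring, stage, and discard area can be illustrated with a minimal sketch. This is not the HuVis implementation: the class name Stage, its method names, and the sample identifiers are all hypothetical, and the sketch assumes the dataset is available as plain (subject, predicate, object) tuples.

```python
from collections import defaultdict


class Stage:
    """Toy model of the outer ring, central stage, and discard area.

    Hypothetical names throughout; this is an illustration, not HuVis code.
    """

    def __init__(self, triples):
        # Build an undirected neighbour index from (subject, predicate, object).
        self.neighbours = defaultdict(set)
        for s, _p, o in triples:
            self.neighbours[s].add(o)
            self.neighbours[o].add(s)
        self.shelved = set(self.neighbours)  # every entity starts in the ring
        self.on_stage = set()
        self.discarded = set()

    def drag_to_stage(self, node):
        """Move a node to the stage along with its immediate neighbours."""
        self.discarded.discard(node)
        # Discarded neighbours stay out of consideration.
        added = ({node} | self.neighbours[node]) - self.discarded
        self.shelved -= added
        self.on_stage |= added
        return added

    def reshelve(self, node):
        """Return a node to the outer ring."""
        self.on_stage.discard(node)
        self.discarded.discard(node)
        self.shelved.add(node)

    def discard(self, node):
        """Remove a node from consideration."""
        self.on_stage.discard(node)
        self.shelved.discard(node)
        self.discarded.add(node)

    def labelled(self, hovered=()):
        """Nodes whose labels are shown: on stage, or near the pointer."""
        return self.on_stage | (set(hovered) - self.discarded)
```

In this sketch, whether a discarded neighbour should reappear when a connected node is dragged to the stage is a design choice; the version above assumes it should not.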
The command panel in the upper right of the display is a powerful tool for controlling the content of the visualization. With just a few clicks, users may execute commands to display or hide groupings of nodes from the stage area and control the display of those entities. Such selections may be made in conjunction with direct selections of elements in the stage area or outer ring, based on entity type, or based on the predicates by which entities are connected. Other features, such as general instructions for the tool (the Help function) and the controls for the layout of the stage area (Settings), are available in the same window through a tab interface.
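Selection by entity type or by predicate, as described for the command panel, might be sketched as follows. The function select_nodes, its parameters, and the placeholder identifiers in the usage note are assumptions made for illustration and do not reflect the tool's actual command vocabulary.

```python
def select_nodes(triples, types, entity_type=None, predicate=None):
    """Return nodes matching an entity type and/or attached to a predicate.

    triples: iterable of (subject, predicate, object) tuples (assumed format).
    types:   mapping from node identifier to a type label (assumed format).
    """
    triples = list(triples)
    # Start from every node that appears in the dataset.
    nodes = {n for s, _p, o in triples for n in (s, o)}
    if entity_type is not None:
        nodes = {n for n in nodes if types.get(n) == entity_type}
    if predicate is not None:
        # Keep only nodes that sit at either end of a matching link.
        linked = {n for s, p, o in triples if p == predicate for n in (s, o)}
        nodes &= linked
    return nodes
```

For example, with the invented triples ("person_a", "wrote", "work_x") and ("person_a", "knew", "person_b"), calling select_nodes with predicate="knew" would return the two person nodes, and the resulting set could then be shown on or hidden from the stage.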
In the top left of the window is the ‘snippet display’. This area shows portions of the text from which the links in the center display area have been drawn. Snippets are revealed by clicking on a link between two nodes. These snippets are important because they allow scholars to examine the grounds on which the connections they are seeing are justified.
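One simple way to model the relationship between a link and its supporting text is a lookup keyed by triple. The sketch below assumes such a mapping exists; the function name snippets_for_link and the data layout are hypothetical and are not drawn from the tool itself.

```python
def snippets_for_link(snippets, subject, obj):
    """Return the text snippets behind any link joining two nodes.

    snippets: mapping from (subject, predicate, object) to a list of
              source-text excerpts (assumed layout, for illustration only).
    The link is treated as undirected, so either node may be the subject.
    """
    return [
        text
        for (s, _p, o), texts in snippets.items()
        if {s, o} == {subject, obj}
        for text in texts
    ]
```

Clicking a link between two nodes would then correspond to calling this function with the identifiers of the nodes at its two ends and displaying the returned excerpts.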
Conclusions
Our paper will report on the results of the user testing conducted in early 2015. We will share our insights, as well as a set of recommendations for ontology and linked-data visualization based on the results of this study.
Note
1. The review by Katifori et al. (2007) in particular has been highly influential in terms of ontology visualizations.