--- title: "005 - Inspecting Nodes and Edges" output: html_document --- ## Setup Ensure that the development version of **DiagrammeR** is installed. Load in the package with `library()`. Additionally, load in the **tidyverse** packages. ```{r load_packages, message=FALSE, warning=FALSE, include=FALSE, results=FALSE} #devtools::install_github("rich-iannone/DiagrammeR") library(DiagrammeR) library(tidyverse) ``` ## Part 1. Information on All Nodes and Edges When you have a graph object, sometimes you'll want to poke around and inspect some of the nodes, and some of the edges. There are very good reasons for doing so. There can be valuable information within the nodes and edges. Further graph construction may hinge on what's extant in the graph. Inspection is a good way to verify that graph modification has indeed taken place in the correct manner. First, let's build a graph to use for the examples. For the node data frame we will include values for the `type`, `label`, and `data` attributes. The edge data frame will contain the `rel`, `color`, and `weight` edge attributes. ```{r create_initial_graph} # Create a node data frame (ndf) with # 4 nodes ndf <- create_node_df( n = 4, type = "number", label = c( "one", "two", "three", "four"), data = c( 3.5, 2.6, 9.4, 2.7)) # Create an edge data frame (ndf) with # 4 edges edf <- create_edge_df( from = c(1, 2, 3, 4), to = c(4, 3, 1, 1), rel = c("P", "B", "L", "L"), color = c("green", "blue", "red", "red"), weight = c(2.1, 5.7, 10.1, 3.9)) graph <- create_graph( nodes_df = ndf, edges_df = edf) # Render the graph to see it in the RStudio Viewer graph %>% render_graph() ``` The `get_node_ids()` function simply returns a vector of node ID values. This is useful in many cases and is great when used as a sanity check. ```{r use_get_node_ids} graph %>% get_node_ids() ``` Using the `get_node_info()` function provides a data frame with detailed information on nodes and their interrelationships within a graph. It always returns the same columns, in the same order. It returns as many rows as there are nodes in the graph. It's useful when you want a quick summary of the node ID values, their labels and `type` values, and their degrees of connectness with other nodes. ```{r use_get_node_info} graph %>% get_node_info() ``` In the above table the base attributes of the nodes are provided first (`id`, `type`, and `label`) and then follow the columns with degree information (`deg`, `indeg`, and `outdeg`). The node degree (`deg`) describes the number of edges to or from the node. The indegree and outdegree are the number of edges coming in to the node and out from the node, respectively. Finally, the `loops` column provides the number of self edges for the node (this is an edge that starts and terminates at the same node, so the degree for that is 2). The `get_edges()` function returns all of the node ID values related to each edge in the graph: ```{r use_get_edges} graph %>% get_edges() ``` Like nodes, edges also have ID values. This is important for distinguishing between edges when a pair of nodes has multiple edges between them (and especially if they are in the same direction in a directed graph). To get all edge ID values in the graph, use the `get_edge_ids()` function. ```{r use_get_edge_ids} graph %>% get_edge_ids() ``` The `get_edge_info()`, like the `get_node_info()` function, always returns a data frame with a set number of columns. In this case, it is the edge ID value `id`, the node ID values `from` and `to` that define the links, and, the relationship (`rel`) labels for the edges. ```{r use_get_edge_info} graph %>% get_edge_info() ``` ## Part 2. Inspecting Nodes, Edges, and their Attributes Two of a graph object's main components are its node data frame (ndf) and its edge data frame (edf). These can be obtained as individual data frames using the `get_node_df()` and `get_edge_df()` functions: ```{r use_get_node_df} # Get the graph's ndf with the `get_node_df()` function graph %>% get_node_df() ``` ```{r use_get_edge_df} # Get the graph's edf with the `get_edge_df()` function graph %>% get_edge_df() ``` For the ndf, the `id`, `type`, and `label` columns will always be present and in that prescribed order. For the edf, it is the `id`, `from`, `to`, and `rel` columns will always be present. Any additional columns can be either parameters recognized by the graph rendering engine (e.g., `color`, `fontname`, etc.) or non-aesthetic properties of the nodes or edges (e.g., a node `data` value or an edge `weight`). ## Part 3. Determining Existence of Nodes or Edges There may be cases where you need to verify that a certain node ID exists in the graph or that an edge definition is present. The `is_node_present()` and `is_edge_present()` functions will provide a `TRUE` or `FALSE` value as confirmation. Get the node ID values present in the graph with the `get_node_ids()` function. ```{r use_get_node_id_2} graph %>% get_node_ids() ``` Is node with ID `1` in the graph? Use `node_present()` to find out. ```{r is_node_1_present} graph %>% is_node_present(node = 1) ``` Is node with ID `5` in the graph? ```{r is_node_5_present} graph %>% is_node_present(node = 5) ``` Get the node ID values associated with the edges present in the graph (with the `get_edges()` function). ```{r use_get_edges_2} graph %>% get_edges() ``` To determine whether an edge is present, the `edge_present()` function takes 2 arguments after `graph`: `from` and `to`. So, to find out whether the edge `1->4` is present, the following can be used: ```{r is_edge_1_4_present} graph %>% is_edge_present( from = 1, to = 4) ``` Since the the edge `2->4` does not exist, the following will return FALSE: ```{r is_edge_2_4_present} graph %>% is_edge_present( from = 2, to = 4) ```