Some data sets are relational data sets, they represent relations between their units and which unit is connected to which ones, with a wide range of reasons for two units to be considered linked or not (family/kinship, trade, membership in a club etc.). Other data sets have a geographical, spatial component: each unit represents a part of the physical world, like a river, a city, a country, a landmark. Some data sets are both: relational data sets (networks) can be about units that have an inherent spatial aspect - trade networks between regions, relationships between users for which location data is available, traffic relationships between areas and so on. Network visualization is hard and most methods to visualize a network focus on displaying vertices and edges (loosely speaking, units of the network and relationships between them, respectively) in a way that highlights the properties of the network itself. When vertices already have a spatial component, though, a “natural” visualization is just to plot the vertices in their spatial position and display the edges between them. This may make the network more understandable, though not necessarily easier to read.
The netmap
package doesn’t attempt to reinvent the
wheel, so it uses both the sf
package to handle spatial
data files (from shapefiles to KML files to whatnot) and the
ggnetwork
package to plot network data using
ggplot2
’s grammar of graphics. It will work with network
objects as produced by either the network
or the
igraph
package, without the need to specify the object
class. Elements of the sf
objects are called features
(i.e. a city, a state of a country), while network objects have vertices
(the elements that may or may not be linked to each other) and edges
(the connections between the vertices).
The stable version of the package can be installed from CRAN:
In alternative, the latest version can be installed from GitHub:
The main function is ggnetmap
. It will need both a
network object and a sf
object and it will produce a data
frame (a fortified data frame, as produced by
fortify.network
or fortify.igraph
in
ggnetwork
). It will most commonly used like this
(net
is a network
or igraph
object, map
a sf
object, lkp_tbl
is a lookup table, in case there is no direct match between the two
objects, m_name
and n_name
are the variable
names for linking the two objects):
fortified_df=ggnetmap(net, map, lkp_tbl, m_name="spatial_id", n_name="vertex.names")
ggplot() +
geom_sf(data=map) + #this will be the map on which the network will be overlayed
geom_edges(data=fortified_df, aes(x=x,y=y, xend=xend, yend=yend), colour="red") + #network edges
geom_nodes(data=fortified_df, aes(x=x,y=y)) + #network vertices
geom_nodetext(data=fortified_df, aes(x=x,y=y, label = spatial_id), fontface = "bold") + #vertex labels
theme_blank()
The data frame returned by ggnetmap
will describe the
edges of the network, from which vertex information can also be derived.
It will contain both vertex identifiers (from the network and the
sf
object), the coordinates of the edge’s start (which will
coincide with the vertex identifier) and the coordinates of the edge’s
end. Since both identifiers are included, it is rather straightforward
to add further variables by merging with other data frames.
The plot itself, it is based on two different data sources, the
sf
object and the data frame produced by
ggnetmap
, the latter overlayed to the former. We are
trying to represent a network by using the spatial component of its
vertices, so ggnetmap
will only include elements that are
both vertices of the network and features of the sf
object,
that is, the intersection between the set of the network vertices and
the set of geographical features.
An actual example would look like this:
library(ggplot2)
library(netmap)
data(fvgmap)
routes=network::network(matrix(c(0, 1, 1, 0,
1, 0, 1, 0,
1, 1, 0, 1,
0, 0, 1, 0), nrow=4, byrow=TRUE))
network::set.vertex.attribute(routes, "names", value=c("Trieste", "Gorizia", "Udine", "Pordenone"))
routes_df=netmap::ggnetmap(routes, fvgmap, m_name="Comune", n_name="names")
ggplot() +
geom_sf(data=fvgmap) +
ggnetwork::geom_edges(data=routes_df, aes(x=x,y=y, xend=xend, yend=yend), colour="red") +
ggnetwork::geom_nodes(data=routes_df, aes(x=x,y=y)) +
ggnetwork::geom_nodetext(data=routes_df, aes(x=x,y=y, label = Comune), fontface = "bold") +
theme_blank()
Further aesthetics can be passed to geom_edges
,
geom_nodes
and geom_nodetext
, like different
line types based on edge attributes.
When analyzing a network, measures of centrality like degree,
betweenness and closeness are often used. The netmap
package offers a convenient way of obtaining these centrality measures,
with the ggcentrality
function. This creates an
sf
object that acts as an additional layer that can be
combined with the background sf
object and the network
visualization itself, as the following example shows:
routes2=network::network(matrix(c(0, 1, 1, 0, 0, 1 ,
1, 0, 1, 0, 0, 1,
1, 1, 0, 1, 1, 1,
0, 0, 1, 0, 1, 1,
0, 0, 1, 1, 0, 0,
1, 1, 1, 1, 0, 0), nrow=6, byrow=TRUE))
network::set.vertex.attribute(routes2, "names",
value=c("Trieste", "Gorizia", "Udine", "Pordenone",
"Tolmezzo", "Grado"))
lkpt=data.frame(Pro_com=c(32006, 31007, 30129, 93033, 30121, 31009),
names=c("Trieste", "Gorizia", "Udine", "Pordenone", "Tolmezzo",
"Grado"))
routes2_df=netmap::ggnetmap(routes2, fvgmap, lkpt, m_name="Pro_com", n_name="names")
map_centrality=netmap::ggcentrality(routes2, fvgmap, lkpt, m_name="Pro_com",
n_name="names", par.deg=list(gmode="graph"))
ggplot() +
geom_sf(data=fvgmap) +
geom_sf(data=map_centrality, aes(fill=degree)) +
ggnetwork::geom_edges(data=routes2_df, aes(x=x,y=y, xend=xend, yend=yend), colour="red") +
ggnetwork::geom_nodes(data=routes2_df, aes(x=x,y=y)) +
ggnetwork::geom_nodetext(data=routes2_df, aes(x=x,y=y, label = names), fontface = "bold") +
theme_blank()
While the above example shows the degree of the vertices, different centrality measures can be represented just by changing the aesthetic.
It’s also possible to plot just the network using the geographical
position of the nodes as layout without using ggplot2
, but
instead resorting to the plot.network
and
plot.igraph
functions. In this case, the layout function
network.layout.extract_coordinates
should be passed to the
plotting functions or the wrapper netmap_plot
should be
used instead, as in the following example. Note that the features and
the vertices should have the same order for
network.layout.extract_coordinates
to produce correct
results; otherwise, use netmap_plot
. Please also note that
it’s not possible to overlay the network on the map this way.
routes2=network::network(matrix(c(0, 1, 1, 0, 0, 1 ,
1, 0, 1, 0, 0, 1,
1, 1, 0, 1, 1, 1,
0, 0, 1, 0, 1, 1,
0, 0, 1, 1, 0, 0,
1, 1, 1, 1, 0, 0), nrow=6, byrow=TRUE))
network::set.vertex.attribute(routes2, "names",
value=c("Trieste", "Gorizia", "Udine", "Pordenone",
"Tolmezzo", "Grado"))
lkpt=data.frame(Pro_com=c(32006, 31007, 30129, 93033, 30121, 31009),
names=c("Trieste", "Gorizia", "Udine", "Pordenone", "Tolmezzo",
"Grado"))
netmap::netmap_plot(routes2, fvgmap, lkpt, m_name="Pro_com", n_name="names")