We have already discussed the network measure degree (see Network theory). It is one of the simplest and most intuitive measures for describing individual nodes. This and many other nodal network measures can also be used to describe entire networks. A common approach is to average the measure over all nodes and use that average as a representative value for the network.
The clustering coefficient is one of the most important network measures and has been used in a wide range of studies. It was first introduced in 1998 and represents the probability that two neighbours of a node (nodes connected to it by an edge) are also connected to each other, i.e. that they form a closed triangle. In social media it would represent the chance that a person's friends are friends themselves. In this way, the clustering coefficient describes how cliquish friendship circles are.
The measure itself takes on values between 0 and 1, where a clustering coefficient of 1 is only achieved if the neighbourhood of a node is fully connected, i.e. that all of a person's friends are also friends with each other.
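As a minimal sketch of this definition, the clustering coefficient of a single node can be computed by counting how many pairs of its neighbours are themselves connected. The friendship network and the function name `local_clustering` below are hypothetical and chosen only for illustration. Returning 0 for nodes with fewer than two neighbours is a convention assumed here, since the measure is undefined in that case.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are connected to each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: no pair of neighbours exists
    # Count neighbour pairs that share an edge (closed triangles through `node`).
    closed = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    possible = k * (k - 1) / 2
    return closed / possible

# Hypothetical undirected friendship network stored as adjacency sets.
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}

for person in friends:
    print(person, round(local_clustering(friends, person), 2))
```

Here all of bob's friends know each other, so his coefficient is 1, whereas only one of the three possible pairs among ann's friends is connected.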
A similar measure, though defined for the entire network, is called transitivity. It is calculated over all nodes at once, and some consider it more representative of the entire network than the average clustering coefficient. An example in which the two measures show different trends when the network is slightly altered can be seen here:
The network on the right is similar to the one on the left, except that two more of the possible triangles in the network are closed. Closing these additional triangles reduces the average clustering coefficient (0.87→0.67), whereas transitivity (0.60→0.67) reflects the closing by an increase.
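The opposite trends can be checked numerically. The sketch below assumes a five-node "bow-tie" graph (a hub `h` shared by two triangles) and closes two more triangles by adding two edges. This graph is an assumption made for illustration and is not necessarily the one in the figure, although it yields the same rounded values. Transitivity is computed here as the number of closed neighbour pairs divided by the number of all neighbour pairs, summed over the whole network.

```python
from itertools import combinations

def build(edges):
    """Turn an undirected edge list into adjacency sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def closed_and_possible(adj, node):
    """Connected neighbour pairs and all neighbour pairs of one node."""
    nbrs = adj[node]
    k = len(nbrs)
    closed = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return closed, k * (k - 1) // 2

def average_clustering(adj):
    """Mean of the per-node clustering coefficients."""
    total = 0.0
    for node in adj:
        closed, possible = closed_and_possible(adj, node)
        total += closed / possible if possible else 0.0
    return total / len(adj)

def transitivity(adj):
    """Closed neighbour pairs over all neighbour pairs, pooled over the network."""
    closed = possible = 0
    for node in adj:
        c, p = closed_and_possible(adj, node)
        closed += c
        possible += p
    return closed / possible if possible else 0.0

# Assumed example graphs (not taken from the figure).
left_edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"), ("a", "b"), ("c", "d")]
right_edges = left_edges + [("b", "c"), ("a", "d")]  # closes two more triangles
left, right = build(left_edges), build(right_edges)

print(f"average clustering: {average_clustering(left):.2f} -> {average_clustering(right):.2f}")
print(f"transitivity:       {transitivity(left):.2f} -> {transitivity(right):.2f}")
```

The average clustering coefficient drops because the four outer nodes, which each had a fully connected neighbourhood, gain a neighbour that is not connected to all the others. Transitivity rises because the pooled share of closed pairs grows.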
Modularity tries to determine how well a network can be separated into communities (modules). This is a challenging problem and many different approaches have been developed over the years. One possible approach splits a network into two parts and continues to split these subnetworks into smaller and smaller communities, as long as a measure of separation increases.
This example illustrates that the division into two modules is not always uniquely defined. Node a, for example, can belong to either module, and any measure of separation takes the same value in both cases. Therefore, most community-finding algorithms are run multiple times on the same network, and the average measure of separation can be used to describe the network.
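The ambiguity can be made concrete with one particular measure of separation. The text does not name a specific measure, so the sketch below assumes Newman's modularity Q, which compares the fraction of edges inside each community with the fraction expected from the node degrees alone. The seven-node graph (two triangles joined through a node `a`) is a hypothetical stand-in for the figure.

```python
def modularity(edges, communities):
    """Newman's modularity Q for an undirected, unweighted edge list (assumed measure)."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = 0.0
    for community in communities:
        # Edges with both end points inside the community.
        inside = sum(1 for u, v in edges if u in community and v in community)
        total_degree = sum(degree[n] for n in community)
        q += inside / m - (total_degree / (2 * m)) ** 2
    return q

# Hypothetical graph: two triangles, linked only through node "a".
edges = [
    ("l1", "l2"), ("l1", "l3"), ("l2", "l3"),  # left triangle
    ("r1", "r2"), ("r1", "r3"), ("r2", "r3"),  # right triangle
    ("a", "l3"), ("a", "r1"),                  # node a touches both sides
]

a_on_left = [{"l1", "l2", "l3", "a"}, {"r1", "r2", "r3"}]
a_on_right = [{"l1", "l2", "l3"}, {"r1", "r2", "r3", "a"}]

print(modularity(edges, a_on_left), modularity(edges, a_on_right))
```

Both assignments of node `a` give the same value of Q, so an algorithm that maximises this measure has no basis for preferring one over the other.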
Characteristic path-length and efficiency
Some network measures are calculated from the shortest distances between nodes. In some cases, these can be physical distances, such as the distance between the cities in which a person and his or her friend live. However, sometimes these distances are unknown and/or the topological distances are of interest. Topological distances can be derived, for example, from the weights (strengths) of edges, by using a function that maps a weight to a distance, or from the number of "hops" in binarised networks. In brain analyses, these distances are often set to the inverse edge weight. In the following graph, the distances from node f to all other nodes are given by the blue numbers above each node.
Once these distances are defined, a network measure called characteristic path-length may be calculated. For each node, it is given by the average distance to all other nodes (1.75 for node f in the previous example). It represents how "central" a node is in the network.
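For a binarised network, this per-node average can be sketched with a breadth-first search that counts hops. The five-node graph below is hypothetical: it was chosen so that node f has hop distances 1, 1, 2 and 3 to the other nodes, which gives the quoted average of 1.75, but the graph in the figure may differ. The function names are also illustrative.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first search: number of hops from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def mean_distance(adj, source):
    """Average distance from `source` to all other nodes (infinite if any is unreachable)."""
    dist = hop_distances(adj, source)
    others = [n for n in adj if n != source]
    return sum(dist.get(n, float("inf")) for n in others) / len(others)

# Hypothetical graph: f is linked to e and d, d to c, and c to b.
graph = {
    "f": {"e", "d"},
    "e": {"f"},
    "d": {"f", "c"},
    "c": {"d", "b"},
    "b": {"c"},
}

print(mean_distance(graph, "f"))  # (1 + 1 + 2 + 3) / 4 = 1.75
```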
A closely related network measure is called efficiency. Instead of averaging the distances, it averages the inverse distances. The intuition is that the closer two nodes are, the more efficient their communication (information transport) is. Node f, for example, has an efficiency of 0.71. The advantage of efficiency over characteristic path-length is that it remains meaningful in networks with two separate (unconnected) components: the efficiency of information transport between the two components is 0, whereas the characteristic path-length is infinite, which causes problems when calculating a global network measure.
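The difference between the two measures in a disconnected network can be sketched as follows. The graph is a hypothetical one in which node f has hop distances 1, 1, 2 and 3 to the other nodes; a second, unconnected component (nodes x and y) is then added. Unreachable nodes are assumed to contribute an inverse distance of 0.

```python
from collections import deque

def hop_distances(adj, source):
    """Number of hops from `source` to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def mean_distance(adj, source):
    """Average distance to all other nodes; infinite if any node is unreachable."""
    dist = hop_distances(adj, source)
    others = [n for n in adj if n != source]
    return sum(dist.get(n, float("inf")) for n in others) / len(others)

def efficiency(adj, source):
    """Average inverse distance to all other nodes; unreachable nodes count as 0."""
    dist = hop_distances(adj, source)
    others = [n for n in adj if n != source]
    return sum(1 / dist[n] for n in others if n in dist) / len(others)

# Hypothetical connected graph.
connected = {
    "f": {"e", "d"}, "e": {"f"}, "d": {"f", "c"}, "c": {"d", "b"}, "b": {"c"},
}
# Same graph plus a separate component made of nodes x and y.
split = dict(connected, x={"y"}, y={"x"})

print(f"{efficiency(connected, 'f'):.2f}")  # (1 + 1 + 1/2 + 1/3) / 4, about 0.71
print(mean_distance(split, "f"))            # inf
print(f"{efficiency(split, 'f'):.2f}")      # still finite
```

In the split network the average distance from f is infinite, while the efficiency simply becomes smaller, so it can still be averaged over nodes to obtain a global measure.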
Betweenness centrality provides a measure of a node's importance by counting how many of the shortest paths, not starting or ending at that particular node, pass through it. It was introduced by Freeman in 1977 and can be interpreted with respect to the information flow within a network.
When considering the flow of information or messages from node to node along the edges, betweenness centrality is a measure which relates to the amount of information passing through a certain node (assuming that information travels along the shortest paths). Betweenness centrality considers the load of a node, instead of how well it is connected. Therefore it can be seen as a measure of importance with respect to network functionality. It also means that a node with just two connections might have a high betweenness centrality in a network, for example, if it connects two communities of the network as in the previous figure.
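The counting behind betweenness centrality can be sketched for a small unweighted, undirected network. For every pair of nodes, the code below adds to each intermediate node the fraction of that pair's shortest paths that pass through it. This is a straightforward pairwise version written for clarity rather than speed, and the graph (two triangles joined through a node `a` with only two connections) is a hypothetical example.

```python
from collections import deque
from itertools import combinations

def build(edges):
    """Turn an undirected edge list into adjacency sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_counts(adj, source):
    """Hop distances and number of shortest paths from `source` to each reachable node."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                paths[nbr] = 0
                queue.append(nbr)
            if dist[nbr] == dist[node] + 1:
                # `node` is a predecessor of `nbr` on a shortest path.
                paths[nbr] += paths[node]
    return dist, paths

def betweenness(adj):
    """Unnormalised betweenness; each unordered pair of end points is counted once."""
    info = {n: bfs_counts(adj, n) for n in adj}
    score = {n: 0.0 for n in adj}
    for s, t in combinations(adj, 2):
        dist_s, paths_s = info[s]
        dist_t, paths_t = info[t]
        if t not in dist_s:
            continue  # s and t lie in different components
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on a shortest s-t path if the two partial distances add up.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += paths_s[v] * paths_t[v] / paths_s[t]
    return score

# Hypothetical graph: two communities connected only through node "a".
adj = build([
    ("l1", "l2"), ("l1", "l3"), ("l2", "l3"),
    ("r1", "r2"), ("r1", "r3"), ("r2", "r3"),
    ("a", "l3"), ("a", "r1"),
])

for node, value in sorted(betweenness(adj).items(), key=lambda item: -item[1]):
    print(node, len(adj[node]), value)
```

Node `a` has only two connections but the highest score, because every shortest path between the two communities runs through it.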
There are many more measures than those described here. For a more detailed and mathematical discussion of the measures above, and references to other measures, please see Developing Brain Connectivity.