/**
* Graphology Modularity
* ======================
*
* Implementation of network modularity for graphology.
*
* Modularity is a bit of a tricky problem because there is a wide array
* of different definitions and implementations. The current implementation
* tries to stay true to Newman's original definition and considers both the
* undirected & directed cases as well as the weighted one. The current
* implementation should also be aligned with the Louvain algorithm's
* definition of the metric.
*
* Regarding the directed version, one has to understand that the undirected
* version basically considers the graph as a directed one where all
* edges would be mutual.
*
* There is one exception to this, though: self-loops. To conform with the
* definition of density used by modularity, and to stay true to the matrix
* formulation of modularity, self-loops only count once in both the
* undirected and directed cases. This means that a k-clique in which one node
* has a self-loop will not have the same modularity in the undirected and
* mutual cases. Indeed, in both cases the modularity of a k-clique with one
* loop and minus one internal edge should be equal.
*
* This also means that, as with the naive density formula regarding loops,
* one should increment M when considering a loop. Also, to remain coherent
* in this regard, the degree should not be doubled because of the loop,
* otherwise the loop would weigh too much in the considered proportions.
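*
* To make the loop convention concrete, here is a minimal sketch on a plain
* edge list, independent of graphology. The `collectSketch` helper and its
* `[source, target, weight]` input format are assumptions made for this
* illustration only:
*
```javascript
// Hypothetical standalone helper, not part of this module's API.
// Each edge is given as [source, target, weight], nodes being integer ids.
function collectSketch(order, edges) {
  var degrees = new Array(order).fill(0);
  var M = 0;

  edges.forEach(function (edge) {
    var source = edge[0];
    var target = edge[1];
    var weight = edge[2];

    // A loop increments M once, like any other edge
    M += weight;

    degrees[source] += weight;

    // A loop must not count twice in the degree of its node
    if (source !== target) degrees[target] += weight;
  });

  return {M: M, degrees: degrees};
}

// One regular edge (0-1) and one loop on node 1
var collected = collectSketch(2, [
  [0, 1, 1],
  [1, 1, 1]
]);

console.log(collected.M, collected.degrees);
```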
*
* Hence, here are the retained formulas:
*
* For dense weighted undirected network:
* --------------------------------------
*
* Q = 1/2m * [ ∑ij[Aij - (di.dj / 2m)] * ∂(ci, cj) ]
*
* where:
* - i & j are a pair of nodes
* - m is the sum of edge weights
* - Aij is the weight of the ij edge (or 0 if absent)
* - di is the weighted degree of node i
* - ci is the community to which node i belongs
* - ∂ is Kronecker's delta function (1 if x = y else 0)
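*
* As an illustration only, the dense undirected formula can be sketched on a
* plain symmetric weight matrix, independently of graphology. The
* `denseUndirectedSketch` name and the matrix input are assumptions made for
* this example, and self-loops are left out for simplicity:
*
```javascript
// Hypothetical standalone helper, not part of this module's API.
// A is a symmetric weight matrix without self-loops and communities[i] is
// the community of node i.
function denseUndirectedSketch(A, communities) {
  var n = A.length;
  var degrees = new Array(n).fill(0);
  var m2 = 0; // Will hold 2m, i.e. the sum of all the matrix cells
  var i, j;

  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
      degrees[i] += A[i][j];
      m2 += A[i][j];
    }
  }

  var S = 0;

  // The whole matrix is processed, lower part included
  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
      // Kronecker's delta
      if (communities[i] !== communities[j]) continue;

      S += A[i][j] - (degrees[i] * degrees[j]) / m2;
    }
  }

  return S / m2;
}

// Two triangles (0-1-2 and 3-4-5) joined by the 2-3 edge
var A = [
  [0, 1, 1, 0, 0, 0],
  [1, 0, 1, 0, 0, 0],
  [1, 1, 0, 1, 0, 0],
  [0, 0, 1, 0, 1, 1],
  [0, 0, 0, 1, 0, 1],
  [0, 0, 0, 1, 1, 0]
];

var denseQ = denseUndirectedSketch(A, [0, 0, 0, 1, 1, 1]);

console.log(denseQ.toFixed(4)); // 0.3571, i.e. 5/14
```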
*
* For dense weighted directed network:
* ------------------------------------
*
* Qd = 1/m * [ ∑ij[Aij - (dini.doutj / m)] * ∂(ci, cj) ]
*
* where:
* - dini is the in degree of node i
* - douti is the out degree of node i
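*
* The directed variant can be sketched the same way on a plain matrix where
* A[i][j] holds the weight of the edge going from i to j. The
* `denseDirectedSketch` name and its input are assumptions made for this
* example and are not part of the module:
*
```javascript
// Hypothetical standalone helper, not part of this module's API.
// A[i][j] is the weight of the edge going from i to j (0 if absent).
function denseDirectedSketch(A, communities) {
  var n = A.length;
  var inDegrees = new Array(n).fill(0);
  var outDegrees = new Array(n).fill(0);
  var m = 0;
  var i, j;

  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
      outDegrees[i] += A[i][j];
      inDegrees[j] += A[i][j];
      m += A[i][j];
    }
  }

  var S = 0;

  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
      // Kronecker's delta
      if (communities[i] !== communities[j]) continue;

      S += A[i][j] - (inDegrees[i] * outDegrees[j]) / m;
    }
  }

  return S / m;
}

// Two directed 3-cycles (0→1→2→0 and 3→4→5→3) joined by the 2→3 edge
var D = [
  [0, 1, 0, 0, 0, 0],
  [0, 0, 1, 0, 0, 0],
  [1, 0, 0, 1, 0, 0],
  [0, 0, 0, 0, 1, 0],
  [0, 0, 0, 0, 0, 1],
  [0, 0, 0, 1, 0, 0]
];

var denseQd = denseDirectedSketch(D, [0, 0, 0, 1, 1, 1]);

console.log(denseQd.toFixed(4)); // 0.3673, i.e. 18/49
```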
*
* For sparse weighted undirected network:
* ---------------------------------------
*
* Q = ∑c[ (∑cinternal / 2m) - (∑ctotal / 2m)² ]
*
* where:
* - c is a community
* - ∑cinternal is the total weight of a community's internal edges
* - ∑ctotal is the total weight of edges connected to a community
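*
* A minimal sketch of the sparse undirected formula on a plain edge list
* follows. The `sparseUndirectedSketch` name and the
* `[source, target, weight]` input are assumptions made for this example.
* As in the code of this module, an internal edge adds twice its weight to
* ∑cinternal, once per endpoint:
*
```javascript
// Hypothetical standalone helper, not part of this module's API.
// Each edge is given as [source, target, weight] and communities[node] is
// the community of the node.
function sparseUndirectedSketch(edges, communities) {
  var internal = {};
  var total = {};
  var m = 0;

  communities.forEach(function (c) {
    internal[c] = 0;
    total[c] = 0;
  });

  edges.forEach(function (edge) {
    var sc = communities[edge[0]];
    var tc = communities[edge[1]];
    var weight = edge[2];

    m += weight;
    total[sc] += weight;
    total[tc] += weight;

    // An internal edge counts twice, once per endpoint
    if (sc === tc) internal[sc] += weight * 2;
  });

  var Q = 0;

  for (var c in internal)
    Q += internal[c] / (2 * m) - Math.pow(total[c] / (2 * m), 2);

  return Q;
}

// Two triangles (0-1-2 and 3-4-5) joined by the 2-3 edge
var undirectedEdges = [
  [0, 1, 1],
  [0, 2, 1],
  [1, 2, 1],
  [2, 3, 1],
  [3, 4, 1],
  [3, 5, 1],
  [4, 5, 1]
];

var sparseQ = sparseUndirectedSketch(undirectedEdges, [0, 0, 0, 1, 1, 1]);

// Each community yields 6/14 - (7/14)², hence Q = 5/14
console.log(sparseQ.toFixed(4)); // 0.3571
```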
*
* For sparse weighted directed network:
* -------------------------------------
*
* Qd = ∑c[ (∑cinternal / m) - (∑cintotal * ∑couttotal / m²) ]
*
* where:
* - ∑cintotal is the total weight of edges pointing towards a community
* - ∑couttotal is the total weight of edges going from a community
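*
* The sparse directed formula can be sketched in the same fashion. The
* `sparseDirectedSketch` name and the `[source, target, weight]` input are
* assumptions made for this example. Here an internal edge only adds its
* weight once to ∑cinternal:
*
```javascript
// Hypothetical standalone helper, not part of this module's API.
// Each edge is given as [source, target, weight] and communities[node] is
// the community of the node.
function sparseDirectedSketch(edges, communities) {
  var internal = {};
  var totalIn = {};
  var totalOut = {};
  var m = 0;

  communities.forEach(function (c) {
    internal[c] = 0;
    totalIn[c] = 0;
    totalOut[c] = 0;
  });

  edges.forEach(function (edge) {
    var sc = communities[edge[0]];
    var tc = communities[edge[1]];
    var weight = edge[2];

    m += weight;
    totalOut[sc] += weight;
    totalIn[tc] += weight;

    if (sc === tc) internal[sc] += weight;
  });

  var Q = 0;

  for (var c in internal)
    Q += internal[c] / m - (totalIn[c] * totalOut[c]) / (m * m);

  return Q;
}

// Two directed 3-cycles (0→1→2→0 and 3→4→5→3) joined by the 2→3 edge
var directedEdges = [
  [0, 1, 1],
  [1, 2, 1],
  [2, 0, 1],
  [2, 3, 1],
  [3, 4, 1],
  [4, 5, 1],
  [5, 3, 1]
];

var sparseQd = sparseDirectedSketch(directedEdges, [0, 0, 0, 1, 1, 1]);

// Each community yields 3/7 - 12/49 = 9/49, hence Qd = 18/49
console.log(sparseQd.toFixed(4)); // 0.3673
```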
*
* Note that the dense version runs in O(N²) while the sparse version runs in
* O(N + E), N being the number of nodes and E the number of edges. So the
* dense version is mostly here to guarantee the validity of the sparse one.
* As such it is not used as default.
*
* For undirected delta computation:
* ---------------------------------
*
* ∆Q = (dic / 2m) - ((∑ctotal * di) / 2m²)
*
* where:
* - dic is the degree of the node in community c
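*
* As a sanity check, the undirected delta can be compared with a modularity
* difference computed by hand. The `undirectedDeltaSketch` function below is
* a hypothetical standalone copy of the formula, written for this example.
* As noted in the code of this module, dic is given as twice the number of
* edges linking the node to the community:
*
```javascript
// Hypothetical standalone copy of the formula above, for illustration only.
function undirectedDeltaSketch(m, communityTotal, degree, communityDegree) {
  return communityDegree / (2 * m) - (communityTotal * degree) / (2 * m * m);
}

// Two triangles (0-1-2 and 3-4-5) joined by the 2-3 edge, hence m = 7.
// Node 2 sits alone in its own community and is moved into {0, 1}:
// - the total weight of {0, 1} is 2 + 2 = 4
// - the degree of node 2 is 3
// - node 2 has 2 edges towards {0, 1}, given as twice that count, i.e. 4
var undirectedGain = undirectedDeltaSketch(7, 4, 3, 4);

// Modularity computed by hand with the sparse undirected formula:
// - before the move, with {0, 1}, {2} and {3, 4, 5}: 38/196
// - after the move, with {0, 1, 2} and {3, 4, 5}: 70/196
var expectedUndirectedGain = 70 / 196 - 38 / 196;

console.log(undirectedGain.toFixed(4), expectedUndirectedGain.toFixed(4));
```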
*
* For directed delta computation:
* -------------------------------
*
* ∆Qd = (dic / m) - (((douti * ∑cintotal) + (dini * ∑couttotal)) / m²)
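*
* The directed delta can be checked by hand in the same way. The
* `directedDeltaSketch` function below is a hypothetical standalone copy of
* the formula. This sketch assumes that dic is the plain sum of the weights
* of the edges linking the node and the community, in both directions, which
* is the reading that agrees with the hand computation:
*
```javascript
// Hypothetical standalone copy of the formula above, for illustration only.
function directedDeltaSketch(
  m,
  communityTotalIn,
  communityTotalOut,
  inDegree,
  outDegree,
  communityDegree
) {
  return (
    communityDegree / m -
    (outDegree * communityTotalIn + inDegree * communityTotalOut) / (m * m)
  );
}

// Two directed 3-cycles (0→1→2→0 and 3→4→5→3) joined by 2→3, hence m = 7.
// Node 2 sits alone in its own community and is moved into {0, 1}:
// - the total in weight of {0, 1} is 2 (2→0 and 0→1)
// - the total out weight of {0, 1} is 2 (0→1 and 1→2)
// - node 2 has an in degree of 1 (1→2) and an out degree of 2 (2→0, 2→3)
// - 2 edges link node 2 and {0, 1} (1→2 and 2→0)
var directedGain = directedDeltaSketch(7, 2, 2, 1, 2, 2);

// Modularity computed by hand with the sparse directed formula:
// - before the move, with {0, 1}, {2} and {3, 4, 5}: 10/49
// - after the move, with {0, 1, 2} and {3, 4, 5}: 18/49
var expectedDirectedGain = 18 / 49 - 10 / 49;

console.log(directedGain.toFixed(4), expectedDirectedGain.toFixed(4));
```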
*
* Gephi's version of undirected delta computation:
* ------------------------------------------------
*
* ∆Qgephi = dic - (di * Ztot) / 2m
*
* Note that the above formula is erroneous and should really be:
*
* ∆Qgephi = dic - (di * Ztot) / m
*
* because then: ∆Qgephi = ∆Q * 2m
*
* It is used because it is faster to compute. Since Gephi's error is only by
* a constant factor, it does not make the result incorrect.
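*
* The relation ∆Qgephi = ∆Q * 2m for the corrected formula can be checked
* numerically. Both functions below are hypothetical standalone copies of
* the formulas, and the sketch assumes that Ztot denotes the same quantity
* as ∑ctotal:
*
```javascript
// Hypothetical standalone copies of the formulas, for illustration only.
function plainDeltaSketch(m, total, di, dic) {
  return dic / (2 * m) - (total * di) / (2 * m * m);
}

function correctedGephiDeltaSketch(m, total, di, dic) {
  return dic - (di * total) / m;
}

// Arbitrary values: m = 7, community total of 4, node degree of 3 and a
// node degree of 4 in the community
var plainDelta = plainDeltaSketch(7, 4, 3, 4);
var gephiDelta = correctedGephiDeltaSketch(7, 4, 3, 4);

// Both printed values should agree
console.log(gephiDelta.toFixed(4), (plainDelta * 2 * 7).toFixed(4));
```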
*
* [Latex]
*
* Sparse undirected
* Q = \sum_{c} \bigg{[} \frac{\sum\nolimits_{c\,in}}{2m} - \left(\frac{\sum\nolimits_{c\,tot}}{2m}\right )^2 \bigg{]}
*
* Sparse directed
* Q_d = \sum_{c} \bigg{[} \frac{\sum\nolimits_{c\,in}}{m} - \frac{\sum_{c\,tot}^{in}\sum_{c\,tot}^{out}}{m^2} \bigg{]}
*
* [Articles]
* Newman, M. E. J. "Modularity and community structure in networks".
* Proc. Natl. Acad. Sci. USA, vol. 103, no. 23, 2006, pp. 8577–8582.
* https://dx.doi.org/10.1073%2Fpnas.0601602103
*
* Newman, M. E. J. "Community detection in networks: Modularity optimization
* and maximum likelihood are equivalent". Physical Review E, vol. 94, no. 5,
* November 2016, p. 052315. doi:10.1103/PhysRevE.94.052315.
* https://arxiv.org/pdf/1606.02319.pdf
*
* Blondel, Vincent D., et al. "Fast unfolding of communities in large
* networks". Journal of Statistical Mechanics: Theory and Experiment,
* vol. 2008, no. 10, October 2008, p. P10008.
* doi:10.1088/1742-5468/2008/10/P10008.
* https://arxiv.org/pdf/0803.0476.pdf
*
* Dugué, Nicolas, and Anthony Perez. "Directed Louvain: maximizing modularity
* in directed networks". [Research Report] Université d'Orléans, 2015.
* hal-01231784.
* https://hal.archives-ouvertes.fr/hal-01231784
*
* Lambiotte, R., J.-C. Delvenne, and M. Barahona. "Laplacian Dynamics and
* Multiscale Modular Structure in Networks".
* doi:10.1109/TNSE.2015.2391998.
* https://arxiv.org/abs/0812.1770
*
* [Links]:
* https://math.stackexchange.com/questions/2637469/where-does-the-second-formula-of-modularity-comes-from-in-the-louvain-paper-the
* https://www.quora.com/How-is-the-formula-for-Louvain-modularity-change-derived
* https://github.com/gephi/gephi/blob/master/modules/StatisticsPlugin/src/main/java/org/gephi/statistics/plugin/Modularity.java
* https://github.com/igraph/igraph/blob/eca5e809aab1aa5d4eca1e381389bcde9cf10490/src/community.c#L906
*/
var resolveDefaults = require('graphology-utils/defaults');
var isGraph = require('graphology-utils/is-graph');
var inferType = require('graphology-utils/infer-type');
var getters = require('graphology-utils/getters');
var DEFAULTS = {
  getNodeCommunity: 'community',
  getEdgeWeight: 'weight',
  resolution: 1
};
function collectForUndirectedDense(graph, options) {
  var communities = new Array(graph.order);
  var weightedDegrees = new Float64Array(graph.order);
  var weights = {};
  var M = 0;

  var getEdgeWeight = getters.createEdgeWeightGetter(
    options.getEdgeWeight
  ).fromEntry;
  var getNodeCommunity = getters.createNodeValueGetter(
    options.getNodeCommunity
  ).fromEntry;

  // Collecting communities
  var i = 0;
  var ids = {};
  graph.forEachNode(function (node, attr) {
    ids[node] = i;
    communities[i++] = getNodeCommunity(node, attr);
  });

  // Collecting weights
  graph.forEachUndirectedEdge(function (edge, attr, source, target, sa, ta, u) {
    var weight = getEdgeWeight(edge, attr, source, target, sa, ta, u);

    M += weight;
    weights[edge] = weight;

    weightedDegrees[ids[source]] += weight;

    // NOTE: we double degree only if we don't have a loop
    if (source !== target) weightedDegrees[ids[target]] += weight;
  });

  return {
    weights: weights,
    communities: communities,
    weightedDegrees: weightedDegrees,
    M: M
  };
}
function collectForDirectedDense(graph, options) {
  var communities = new Array(graph.order);
  var weightedInDegrees = new Float64Array(graph.order);
  var weightedOutDegrees = new Float64Array(graph.order);
  var weights = {};
  var M = 0;

  var getEdgeWeight = getters.createEdgeWeightGetter(
    options.getEdgeWeight
  ).fromEntry;
  var getNodeCommunity = getters.createNodeValueGetter(
    options.getNodeCommunity
  ).fromEntry;

  // Collecting communities
  var i = 0;
  var ids = {};
  graph.forEachNode(function (node, attr) {
    ids[node] = i;
    communities[i++] = getNodeCommunity(node, attr);
  });

  // Collecting weights
  graph.forEachDirectedEdge(function (edge, attr, source, target, sa, ta, u) {
    var weight = getEdgeWeight(edge, attr, source, target, sa, ta, u);

    M += weight;
    weights[edge] = weight;

    weightedOutDegrees[ids[source]] += weight;
    weightedInDegrees[ids[target]] += weight;
  });

  return {
    weights: weights,
    communities: communities,
    weightedInDegrees: weightedInDegrees,
    weightedOutDegrees: weightedOutDegrees,
    M: M
  };
}
function undirectedDenseModularity(graph, options) {
  var resolution = options.resolution;

  var result = collectForUndirectedDense(graph, options);

  var communities = result.communities;
  var weightedDegrees = result.weightedDegrees;

  var M = result.M;

  var nodes = graph.nodes();

  var i, j, l, Aij, didj, e;

  var S = 0;

  var M2 = M * 2;

  for (i = 0, l = graph.order; i < l; i++) {
    // NOTE: it is important to process the whole matrix here, diagonal and
    // lower part included. Many implementations differ here because
    // they only process a part of the matrix
    for (j = 0; j < l; j++) {
      // NOTE: Kronecker's delta
      // NOTE: we could go from O(n^2) to O(avg.C^2)
      if (communities[i] !== communities[j]) continue;

      e = graph.undirectedEdge(nodes[i], nodes[j]);

      Aij = result.weights[e] || 0;
      didj = weightedDegrees[i] * weightedDegrees[j];

      // We add twice if we have a self loop
      if (i === j && typeof e !== 'undefined')
        S += (Aij - (didj / M2) * resolution) * 2;
      else S += Aij - (didj / M2) * resolution;
    }
  }

  return S / M2;
}
function directedDenseModularity(graph, options) {
  var resolution = options.resolution;

  var result = collectForDirectedDense(graph, options);

  var communities = result.communities;
  var weightedInDegrees = result.weightedInDegrees;
  var weightedOutDegrees = result.weightedOutDegrees;

  var M = result.M;

  var nodes = graph.nodes();

  var i, j, l, Aij, didj, e;

  var S = 0;

  for (i = 0, l = graph.order; i < l; i++) {
    // NOTE: it is important to process the whole matrix here, diagonal and
    // lower part included. Many implementations differ here because
    // they only process a part of the matrix
    for (j = 0; j < l; j++) {
      // NOTE: Kronecker's delta
      // NOTE: we could go from O(n^2) to O(avg.C^2)
      if (communities[i] !== communities[j]) continue;

      e = graph.directedEdge(nodes[i], nodes[j]);

      Aij = result.weights[e] || 0;
      didj = weightedInDegrees[i] * weightedOutDegrees[j];

      // NOTE: no doubling is needed here since the whole matrix is processed
      S += Aij - (didj / M) * resolution;
    }
  }

  return S / M;
}
function collectCommunitesForUndirected(graph, options) {
  var communities = {};
  var totalWeights = {};
  var internalWeights = {};

  var getNodeCommunity = getters.createNodeValueGetter(
    options.getNodeCommunity
  ).fromEntry;

  graph.forEachNode(function (node, attr) {
    var community = getNodeCommunity(node, attr);
    communities[node] = community;

    if (typeof community === 'undefined')
      throw new Error(
        'graphology-metrics/modularity: the "' +
          node +
          '" node is not in the partition.'
      );

    totalWeights[community] = 0;
    internalWeights[community] = 0;
  });

  return {
    communities: communities,
    totalWeights: totalWeights,
    internalWeights: internalWeights
  };
}
function collectCommunitesForDirected(graph, options) {
  var communities = {};
  var totalInWeights = {};
  var totalOutWeights = {};
  var internalWeights = {};

  var getNodeCommunity = getters.createNodeValueGetter(
    options.getNodeCommunity
  ).fromEntry;

  graph.forEachNode(function (node, attr) {
    var community = getNodeCommunity(node, attr);
    communities[node] = community;

    if (typeof community === 'undefined')
      throw new Error(
        'graphology-metrics/modularity: the "' +
          node +
          '" node is not in the partition.'
      );

    totalInWeights[community] = 0;
    totalOutWeights[community] = 0;
    internalWeights[community] = 0;
  });

  return {
    communities: communities,
    totalInWeights: totalInWeights,
    totalOutWeights: totalOutWeights,
    internalWeights: internalWeights
  };
}
function undirectedSparseModularity(graph, options) {
  var resolution = options.resolution;

  var result = collectCommunitesForUndirected(graph, options);

  var M = 0;

  var totalWeights = result.totalWeights;
  var internalWeights = result.internalWeights;
  var communities = result.communities;

  var getEdgeWeight = getters.createEdgeWeightGetter(
    options.getEdgeWeight
  ).fromEntry;

  graph.forEachUndirectedEdge(function (
    edge,
    edgeAttr,
    source,
    target,
    sa,
    ta,
    u
  ) {
    var weight = getEdgeWeight(edge, edgeAttr, source, target, sa, ta, u);

    M += weight;

    var sourceCommunity = communities[source];
    var targetCommunity = communities[target];

    totalWeights[sourceCommunity] += weight;
    totalWeights[targetCommunity] += weight;

    if (sourceCommunity !== targetCommunity) return;

    internalWeights[sourceCommunity] += weight * 2;
  });

  var Q = 0;
  var M2 = M * 2;

  for (var C in internalWeights)
    Q +=
      internalWeights[C] / M2 - Math.pow(totalWeights[C] / M2, 2) * resolution;

  return Q;
}
function directedSparseModularity(graph, options) {
  var resolution = options.resolution;

  var result = collectCommunitesForDirected(graph, options);

  var M = 0;

  var totalInWeights = result.totalInWeights;
  var totalOutWeights = result.totalOutWeights;
  var internalWeights = result.internalWeights;
  var communities = result.communities;

  var getEdgeWeight = getters.createEdgeWeightGetter(
    options.getEdgeWeight
  ).fromEntry;

  graph.forEachDirectedEdge(function (
    edge,
    edgeAttr,
    source,
    target,
    sa,
    ta,
    u
  ) {
    var weight = getEdgeWeight(edge, edgeAttr, source, target, sa, ta, u);

    M += weight;

    var sourceCommunity = communities[source];
    var targetCommunity = communities[target];

    totalOutWeights[sourceCommunity] += weight;
    totalInWeights[targetCommunity] += weight;

    if (sourceCommunity !== targetCommunity) return;

    internalWeights[sourceCommunity] += weight;
  });

  var Q = 0;

  for (var C in internalWeights)
    Q +=
      internalWeights[C] / M -
      ((totalInWeights[C] * totalOutWeights[C]) / Math.pow(M, 2)) * resolution;

  return Q;
}
// NOTE: the formula is a bit ambiguous here: nodeCommunityDegree should be
// given as twice the number of edges linking the node to the community
function undirectedModularityDelta(
  M,
  communityTotalWeight,
  nodeDegree,
  nodeCommunityDegree
) {
  return (
    nodeCommunityDegree / (2 * M) -
    (communityTotalWeight * nodeDegree) / (2 * (M * M))
  );
}

function directedModularityDelta(
  M,
  communityTotalInWeight,
  communityTotalOutWeight,
  nodeInDegree,
  nodeOutDegree,
  nodeCommunityDegree
) {
  return (
    nodeCommunityDegree / M -
    (nodeOutDegree * communityTotalInWeight +
      nodeInDegree * communityTotalOutWeight) /
      (M * M)
  );
}
function denseModularity(graph, options) {
  if (!isGraph(graph))
    throw new Error(
      'graphology-metrics/modularity: given graph is not a valid graphology instance.'
    );

  if (graph.size === 0)
    throw new Error(
      'graphology-metrics/modularity: cannot compute modularity of an empty graph.'
    );

  if (graph.multi)
    throw new Error(
      'graphology-metrics/modularity: cannot compute modularity of a multi graph. Cast it to a simple one beforehand.'
    );

  var trueType = inferType(graph);

  if (trueType === 'mixed')
    throw new Error(
      'graphology-metrics/modularity: cannot compute modularity of a mixed graph.'
    );

  options = resolveDefaults(options, DEFAULTS);

  if (trueType === 'directed') return directedDenseModularity(graph, options);

  return undirectedDenseModularity(graph, options);
}
function sparseModularity(graph, options) {
  if (!isGraph(graph))
    throw new Error(
      'graphology-metrics/modularity: given graph is not a valid graphology instance.'
    );

  if (graph.size === 0)
    throw new Error(
      'graphology-metrics/modularity: cannot compute modularity of an empty graph.'
    );

  if (graph.multi)
    throw new Error(
      'graphology-metrics/modularity: cannot compute modularity of a multi graph. Cast it to a simple one beforehand.'
    );

  var trueType = inferType(graph);

  if (trueType === 'mixed')
    throw new Error(
      'graphology-metrics/modularity: cannot compute modularity of a mixed graph.'
    );

  options = resolveDefaults(options, DEFAULTS);

  if (trueType === 'directed') return directedSparseModularity(graph, options);

  return undirectedSparseModularity(graph, options);
}
var modularity = sparseModularity;
modularity.sparse = sparseModularity;
modularity.dense = denseModularity;
modularity.undirectedDelta = undirectedModularityDelta;
modularity.directedDelta = directedModularityDelta;
module.exports = modularity;