In our experiments, we evaluate the performance of Greedy
both in terms of quality of the solution found and in terms of its running time. All algorithms compared in our experiments are implemented in C++, building on the open-source NetworKit
[53] framework. The experiments were done on a machine equipped with 256 GB RAM and a 2.7 GHz Intel XeonCPU E5-2680 having 2 sockets with 8 cores each. To make the comparison with previous workmore meaningful, we use only one of the 16 cores. The machine runs 64 bit SUSE Linux and we compiled our code with g++-4.8.1 and OpenMP 3.1. For our experiments, we consider a set of real-world networks belonging to different do-mains, taken from SNAP [35], KONECT [32], Pajek [3], and the 10th DIMACS ImplementationChallenge [2]. The properties of the networks are reported in the tables below.
In the “Solution Quality” section, we evaluate Greedy in terms of accuracy and we compare it both with the optimum and with some alternative baselines. To speed up the computation of Greedy (and therefore to target larger graphs), we do not re-compute betweenness from scratch in Line 4 of Algorithm 1 (see full paper [1] for algorithmic description), but we use the dynamic algorithm. Notice that this does not affect the solution found by the algorithm, only its running time.
Since computing the optimum by examining all possible k-tuples would be too expensive even on small graphs, we use an Integer Programming (IP) formulation, described in the full version of the paper [1]. We solve the IP with the Mixed-Integer Nonlinear Programming Solver Couenne [5]_ and measure the approximation ratio of the greedy algorithm on three types of randomly generated directed networks, namely, directed Preferential Attachment (in short, PA
) [11], Copying (in short, COPY
) [31], Compressible Web (in short, COMP
) [14]. For each graph type, we generate five different instances with the same size. We focus our attention on twenty vertices \(v\), which have been chosen on the basis of their original betweenness ranking. In particular, we divide the list of vertices, sorted by their original ranking, in four intervals, and choose five random vertices uniformly at random in each interval. In each experiment, we add \(k={1,2,...,7}\) edges. We evaluate the quality of the solution produced by the greedy algorithm by measuring its approximation ratio and we report the results.
We also analyze the performance of Greedy
on the real-world directed networks of Figure 1. Since finding the optimum on these networks would take too long, we compare the solution of Greedy with the following three baselines:
Top-k Degree: the algorithm that connects theknodes having the highest degree to v
Top-k Betweenness: the algorithm that connects theknodes having the highest betweenness centrality to v
Random: the algorithm that connects k nodes extracted uniformly at random to v.
For each graph, we pick one node at random, compute its betweenness on the initial graph and try to increase it with the four heuristics. We refer to the selected node as pivot. Since the results may vary depending on the initial betweenness of the pivot, we also repeat each experiment withfive different pivots.
Although it was proven that Greedy
has anunbounded approximation ratio for undirected graphs [17], it is still not clear how it actually performs in practice. In Reference [17], the authors performed some preliminary experiments in which they showed that the greedy algorithm provides a solution slightly better than the Top-k Betweenness algorithm. However, they analyzed only very small networks (with few hundredsof nodes), due to the high complexity of a straightforward implementation of the greedy algorithm. In what follows, we compare Greedy
with other baselines on graphs with up to 104 nodes and 105 edges. This is made possible by combining Greedy
with the dynamic algorithm (see full paper [1] for description). In particular, we compare Greedy
with Top-k Betweenness, Top-k Degree, and Random, also on several undirected real-world networks listed in Figure 1.
In the “Running Time” section, we evaluate the running time of the dynamic algorithm for betweenness centrality computation. We used the same experimental setting used in “Solution Quality”. Since some of the algorithms we use for comparison work only on unweighted graphs, all the tested networks are unweighted.
In the following, we refer to our incremental algorithm for the update of the betweenness of a single node as SI(Single-node Incremental). Since there are no other algorithms specifically designed to computeor update the betweenness of a single node, we also use the static algorithm by Brandes [13] and the dynamic algorithm by Bergamini et al. [8] for a comparison. Indeed, the algorithm by Brandes (which we refer to as Stat
, from Static) is the best known algorithm for static computation of betweenness. The one by Bergamini et al. (which we name AI, from All-nodes Incremental) has been shown to outperform other dynamic algorithms [8].To compare the running times of the algorithms for betweenness centrality, we choose a node \(x\) at random and we assume we want to compute the betweenness of \(x\). Then, we add an edge to the graph, also chosen uniformly at random among the node pairs \((u,v)\) such that \((u,v)\in E\). After the insertion, we use the three algorithms to update the betweenness centrality of \(x\) and compare their running times. We recall that Stat
is a static algorithm, which means that we can only run it from scratch on the graph after the edge insertion. On each graph, we repeat this 100 times and report the average running time obtained by each of the algorithms.
Greedy
outperforms all other heuristics in terms of solution quality, both on directed and on undirected graphs (although we recall that the theoretical guarantee on the approximation ratio holds only for directed graphs). In this section, we report the running times of Greedy
, using SI to recompute betweenness.