# Experiments

In our experiments, we evaluate the performance of Greedy both in terms of quality of the solution found and in terms of its running time. All algorithms compared in our experiments are implemented in C++, building on the open-source NetworKit 53 framework. The experiments were done on a machine equipped with 256 GB RAM and a 2.7 GHz Intel XeonCPU E5-2680 having 2 sockets with 8 cores each. To make the comparison with previous workmore meaningful, we use only one of the 16 cores. The machine runs 64 bit SUSE Linux and we compiled our code with g++-4.8.1 and OpenMP 3.1. For our experiments, we consider a set of real-world networks belonging to different do-mains, taken from SNAP 35, KONECT 32, Pajek 3, and the 10th DIMACS ImplementationChallenge 2. The properties of the networks are reported in the tables below.

Figure 1:

Real-World Instances and Average Running Times of the Betweenness Algorithms

## Solution Quality

In the “Solution Quality” section, we evaluate Greedy in terms of accuracy and we compare it both with the optimum and with some alternative baselines. To speed up the computation of Greedy (and therefore to target larger graphs), we do not re-compute betweenness from scratch in Line 4 of Algorithm 1 (see full paper 1 for algorithmic description), but we use the dynamic algorithm. Notice that this does not affect the solution found by the algorithm, only its running time.

### IP Formulation for MBI on Directed Graphs

Since computing the optimum by examining all possible k-tuples would be too expensive even on small graphs, we use an Integer Programming (IP) formulation, described in the full version of the paper 1. We solve the IP with the Mixed-Integer Nonlinear Programming Solver Couenne [5]_ and measure the approximation ratio of the greedy algorithm on three types of randomly generated directed networks, namely, directed Preferential Attachment (in short, PA) 11, Copying (in short, COPY) 31, Compressible Web (in short, COMP) 14. For each graph type, we generate five different instances with the same size. We focus our attention on twenty vertices $$v$$, which have been chosen on the basis of their original betweenness ranking. In particular, we divide the list of vertices, sorted by their original ranking, in four intervals, and choose five random vertices uniformly at random in each interval. In each experiment, we add $$k={1,2,...,7}$$ edges. We evaluate the quality of the solution produced by the greedy algorithm by measuring its approximation ratio and we report the results.

### Results for Real-World Directed Networks

We also analyze the performance of Greedy on the real-world directed networks of Figure 1. Since finding the optimum on these networks would take too long, we compare the solution of Greedy with the following three baselines:

• Top-k Degree: the algorithm that connects theknodes having the highest degree to v

• Top-k Betweenness: the algorithm that connects theknodes having the highest betweenness centrality to v

• Random: the algorithm that connects k nodes extracted uniformly at random to v.

For each graph, we pick one node at random, compute its betweenness on the initial graph and try to increase it with the four heuristics. We refer to the selected node as pivot. Since the results may vary depending on the initial betweenness of the pivot, we also repeat each experiment withfive different pivots.

### Results for Real-World Undirected Graphs

Although it was proven that Greedy has anunbounded approximation ratio for undirected graphs 17, it is still not clear how it actually performs in practice. In Reference 17, the authors performed some preliminary experiments in which they showed that the greedy algorithm provides a solution slightly better than the Top-k Betweenness algorithm. However, they analyzed only very small networks (with few hundredsof nodes), due to the high complexity of a straightforward implementation of the greedy algorithm. In what follows, we compare Greedy with other baselines on graphs with up to 104 nodes and 105 edges. This is made possible by combining Greedy with the dynamic algorithm (see full paper 1 for description). In particular, we compare Greedy with Top-k Betweenness, Top-k Degree, and Random, also on several undirected real-world networks listed in Figure 1.

## Running Time

In the “Running Time” section, we evaluate the running time of the dynamic algorithm for betweenness centrality computation. We used the same experimental setting used in “Solution Quality”. Since some of the algorithms we use for comparison work only on unweighted graphs, all the tested networks are unweighted.

### Evaluation of the Dynamic Algorithm for the Betweenness of One Node

In the following, we refer to our incremental algorithm for the update of the betweenness of a single node as SI(Single-node Incremental). Since there are no other algorithms specifically designed to computeor update the betweenness of a single node, we also use the static algorithm by Brandes 13 and the dynamic algorithm by Bergamini et al. 8 for a comparison. Indeed, the algorithm by Brandes (which we refer to as Stat, from Static) is the best known algorithm for static computation of betweenness. The one by Bergamini et al. (which we name AI, from All-nodes Incremental) has been shown to outperform other dynamic algorithms 8.To compare the running times of the algorithms for betweenness centrality, we choose a node $$x$$ at random and we assume we want to compute the betweenness of $$x$$. Then, we add an edge to the graph, also chosen uniformly at random among the node pairs $$(u,v)$$ such that $$(u,v)\in E$$. After the insertion, we use the three algorithms to update the betweenness centrality of $$x$$ and compare their running times. We recall that Stat is a static algorithm, which means that we can only run it from scratch on the graph after the edge insertion. On each graph, we repeat this 100 times and report the average running time obtained by each of the algorithms.

### Running Times of the Greedy Algorithm for Betweenness Maximization

Greedy outperforms all other heuristics in terms of solution quality, both on directed and on undirected graphs (although we recall that the theoretical guarantee on the approximation ratio holds only for directed graphs). In this section, we report the running times of Greedy, using SI to recompute betweenness.

1(1,2,3)

Bergamini, E., Crescenzi, P., D’Angelo, G., Meyerhenke, H., Severini, L., Velaj, Y. 2018. Improving the Betweenness Centrality of a Node by Adding Links. In ACM J. Exp. Algorithmics. ACM, Vol. 23 2018

2

David A. Bader, Henning Meyerhenke, Peter Sanders, Christian Schulz, Andrea Kappes, and Dorothea Wagner. 2014. Benchmarking for graph clustering and partitioning. In Encyclopedia of Social Network Analysis and Mining. Springer,73–82.

3

Vladimir Batagelj and Andrej Mrvar. 2006. Pajek datasets. Retrieved July 23, 2018 from http://vlado.fmf.uni-lj.si/pub/networks/data.

8(1,2)

Elisabetta Bergamini, Henning Meyerhenke, Mark Ortmann, and Arie Slobbe. 2017. Faster betweenness centralityupdates in evolving networks. In 16th International Symposium on Experimental Algorithms (SEA’17), C. S. Iliopoulos, S. P. Pissis, S. J. Puglisi, and R. Raman (Eds.). Vol. 75 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 23:1–23:16.

11

Béla Bollobás, Christian Borgs, Jennifer Chayes, and Oliver Riordan. 2003. Directed scale-free graphs. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’03). SIAM, 132–139.

13

Ulrik Brandes. 2001. A faster algorithm for betweenness centrality. J. Math. Soc.25 (2001), 163–177.

14

Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Alessandro Panconesi, and Prabhakar Raghavan. 2009. Models forthe compressible web. In Proceedings of the 50th Annual Symposium on Foundations of Computer Science (FOCS’09). IEEE, 331–340.

17(1,2)

Gianlorenzo D’Angelo, Lorenzo Severini, and Yllka Velaj. 2016. On the maximum betweenness improvement problem.InProceedings of the 16th Italian Conference on Theoretical Computer Science (ICTCS’15) (Electr. Notes Theor. Comput.Sci.), Vol. 322. 153–168.

31

Favi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, D. Sivakumar, Andrew Tomkins, and Eli Upfal. 2000. Sto-chastic models for the web graph. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science(FOCS’00). IEEE, 57–65.

32

Jérôme Kunegis. 2013. KONECT: The Koblenz network collection. In Proceedings of the 22nd International World WideWeb Conference (WWW’13). 1343–1350. http://dl.acm.org/citation.cfm?id=2488173

35

Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. Retrieved June 2014 from http://snap.stanford.edu/data.

53

Christian L. Staudt, Aleksejs Sazonovs, and Henning Meyerhenke. 2016. NetworKit: A tool suite for high-performance network analysis. Network Science 4, 4 (2016), 508–530.