python - Plotting log-binned network degree distributions


I have often encountered and made long-tailed degree distributions/histograms from complex networks, like the figures below. They make the heavy end of these tails, well, very heavy and crowded with many observations:

classic long-tailed degree distribution

However, many publications I read have much cleaner degree distributions that don't have this clumpiness at the end of the distribution, and the observations are more evenly spaced.

classic long-tailed degree distribution

How do you make a chart like this using NetworkX and Matplotlib?

Use log binning. Here is code that takes a Counter object representing a histogram of the degree values and log-bins the distribution to produce a sparser and smoother distribution.

    import numpy as np

    def drop_zeros(a_list):
        # Keep only positive values, since log10 is undefined at zero.
        return [i for i in a_list if i > 0]

    def log_binning(counter_dict, bin_count=35):
        # Materialize keys and values as lists so this also works on Python 3.
        keys = list(counter_dict.keys())
        values = list(counter_dict.values())

        max_x = np.log10(max(keys))
        max_y = np.log10(max(values))
        max_base = max([max_x, max_y])

        min_x = np.log10(min(drop_zeros(keys)))

        # Bin edges evenly spaced on a log10 scale.
        bins = np.logspace(min_x, max_base, num=bin_count)

        # Based off of: http://stackoverflow.com/questions/6163334/binning-data-in-python-with-scipy-numpy
        # Empty bins give 0/0 = NaN (with a NumPy warning).
        counts = np.histogram(keys, bins)[0]
        bin_means_y = np.histogram(keys, bins, weights=values)[0] / counts
        bin_means_x = np.histogram(keys, bins, weights=keys)[0] / counts

        return bin_means_x, bin_means_y
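The `log_binning` function above averages the counts that fall into each bin. A common variation, which is not what the code above does, is to sum the counts in each log-spaced bin and divide by the bin width, so that wide bins at the tail are not over-weighted. The following is only a minimal standard-library sketch of that idea. The name `log_binned_density` is hypothetical, and it assumes the histogram contains at least two distinct positive degree values.

```python
import math
from collections import Counter

def log_binned_density(degree_counts, bin_count=20):
    """Hypothetical helper: log-bin a {degree: count} histogram into a density."""
    # Zero degrees cannot be placed on a log axis, so drop them.
    items = [(k, c) for k, c in degree_counts.items() if k > 0]
    total = sum(c for _, c in items)
    lo = min(k for k, _ in items)
    hi = max(k for k, _ in items)
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    step = (log_hi - log_lo) / bin_count  # bin width in log10 units

    binned = [0] * bin_count
    for k, c in items:
        # Index of the log-spaced bin containing k; the maximum goes in the last bin.
        idx = min(int((math.log10(k) - log_lo) / step), bin_count - 1)
        binned[idx] += c

    xs, ys = [], []
    for i, c in enumerate(binned):
        if c == 0:
            continue  # skip empty bins instead of emitting zeros or NaNs
        left = 10 ** (log_lo + i * step)
        right = 10 ** (log_lo + (i + 1) * step)
        xs.append(math.sqrt(left * right))       # geometric center of the bin
        ys.append(c / (total * (right - left)))  # share of nodes per unit of degree
    return xs, ys

# Toy histogram built from a short list of degrees.
counts = Counter([1, 1, 1, 1, 2, 2, 3, 5, 10, 100])
xs, ys = log_binned_density(counts, bin_count=4)
```

The returned `xs` and `ys` can be passed to `plt.scatter` on log-log axes in the same way as the output of `log_binning`. Because the values are densities rather than mean counts, the y-axis limits would need adjusting.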

Generating a classic scale-free network in NetworkX and then plotting it:

    import networkx as nx
    import matplotlib.pyplot as plt
    from collections import Counter

    ba_g = nx.barabasi_albert_graph(10000, 2)
    ba_c = nx.degree_centrality(ba_g)
    # To convert normalized degrees to raw degrees
    #ba_c = {k: int(round(v * (len(ba_g) - 1))) for k, v in ba_c.items()}
    ba_c2 = dict(Counter(ba_c.values()))

    ba_x, ba_y = log_binning(ba_c2, 50)

    plt.xscale('log')
    plt.yscale('log')
    # Binned distribution in red squares, raw distribution in blue crosses.
    plt.scatter(ba_x, ba_y, c='r', marker='s', s=50)
    plt.scatter(list(ba_c2.keys()), list(ba_c2.values()), c='b', marker='x')
    plt.xlim((1e-4, 1e-1))
    plt.ylim((.9, 1e4))
    plt.xlabel('Connections (normalized)')
    plt.ylabel('Frequency')
    plt.show()

This produces the following plot showing the overlap between the "raw" distribution in blue and the "binned" distribution in red.

comparison between raw and log-binned
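The plotting code above bins normalized degree centralities, and its commented-out line hints at converting them back to raw degrees. As a rough sketch of building the `{degree: count}` histogram from raw integer degrees directly, the hypothetical helper `degree_histogram` below counts degrees from a plain undirected edge list. It assumes every node of interest appears in at least one edge, so isolated nodes are not counted.

```python
from collections import Counter

def degree_histogram(edges):
    """Hypothetical helper: map each raw degree to the number of nodes having it."""
    node_degree = Counter()
    for u, v in edges:
        # Each undirected edge adds one to the degree of both endpoints.
        node_degree[u] += 1
        node_degree[v] += 1
    # Count how many nodes share each degree value.
    return Counter(node_degree.values())

# Toy graph: node 0 has degree 3, node 3 has degree 2, the rest have degree 1.
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
hist = degree_histogram(edges)
```

The result has the same shape as `ba_c2`, so it could be passed to `log_binning` in its place. The axis limits and the x-axis label in the plotting code would then need to change to match raw degrees.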

Any thoughts on how to improve this approach, or feedback if I've missed something obvious, are welcome.

