Where is my internet?

02/01/05

After working on identifying the geographic location of a web site's visitors, and then analysing the logs of a few servers I'm responsible for, I began to wonder where the internet is from a geographic point of view. My server is in Japan, I am in France, Americans are everywhere, Koreans are bandwidth fat, Chinese and Indians are many but not too wired, ... you've heard it all before. I was surprised by the geographic logs of my servers and I wanted a reference point to compare them with.

Where am I?

Running the Sawmill log analyzer on a friend's computer, I came up with the following data about the geographic location of the visitors of habett.com and habett.org.

Rank  On habett.org                        On habett.com
1     France                      28.2 %   United States       32.1 %
2     United States               27.4 %   France              26.5 %
3     Korea, Republic of          11.6 %   Korea, Republic of   9.1 %
4     United Kingdom               3.1 %   Canada               7.4 %
5     Iran, Islamic Republic of    2.4 %   United Kingdom       5.3 %
6     China                        2.3 %   China                3.5 %
7     Canada                       2.2 %   Macao                2.6 %
8     Germany                      1.7 %   Switzerland          1.1 %
9     Tunisia                      1.7 %   India                1.1 %
10    Switzerland                  1.5 %   Italy                0.8 %
11    United Arab Emirates         1.4 %   Belgium              0.8 %
12    Italy                        1.1 %   Germany              0.8 %
13    Japan                        0.9 %   Netherlands          0.5 %
14    Morocco                      0.9 %   Denmark              0.5 %
15    Belgium                      0.8 %   Romania              0.5 %
16    Algeria                      0.7 %   Congo                0.4 %
17    Egypt                        0.7 %   Algeria              0.4 %
18    Malaysia                     0.7 %   Morocco              0.4 %
19    India                        0.6 %   Portugal             0.3 %
20    Brazil                       0.6 %   Australia            0.3 %

Right, we have it. Content on both sites is always available in French and in English, so countries like France, Switzerland or Belgium are over-represented. Korea was expected to be high, but not that high. The Islamic Republic of Iran comes as a surprise, as do Macao, Congo or the United Arab Emirates. I belong to the Long Tail, so I should expect weird statistics, but how weird are these figures?

Where are the IPs?

Getting a true world representation of the internet would be tricky. There are local legal restrictions on internet access, there are high-bandwidth countries, there are internet-aware countries, ... My idea was to run a statistical analysis based on the http://ip-to-country.webhosting.info database. We already know about this fine file that lists IP ranges and tells us where they are located. Once analyzed, it gives us a picture of how IPs are split between countries. Here comes the small Perl data cruncher:

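use strict;
use warnings;

# Sum the number of addresses in each range, per country.
open(my $fh, '<', 'ip-to-country.csv') or die "Cannot open ip-to-country.csv: $!";
my ($total, %geo) = (0);
while (my $line = <$fh>) {
  $line =~ s/\r?\n$//;
  # Fields: first IP, last IP, two unused fields, country name.
  # The limit of 5 keeps names that contain a comma in one piece.
  my ($beg, $end, undef, undef, $pays) = split(/,/, $line, 5);
  next unless defined $pays;
  s/"//g for $beg, $end, $pays;
  my $sum = $end - $beg + 1;   # both ends of the range are counted
  $total += $sum;
  $geo{$pays} += $sum;
}
close($fh);

# Print the 40 countries holding the most addresses.
my @orda = sort { $geo{$b} <=> $geo{$a} } keys %geo;
foreach my $i (0..39) {
  last if $i > $#orda;
  my $percent = 100 * $geo{$orda[$i]} / $total;
  printf("%s = %1.2f\n", $orda[$i], $percent);
}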
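
One detail is worth watching when parsing this file: some country names, such as "Korea, Republic of", contain a comma, so a plain split on commas cuts them in two. Here is a minimal sketch of a safer line parser, assuming each line holds five comma-separated, double-quoted fields with the country name last. The helper name parse_range_line and the sample line are made up for illustration and are not taken from the real database.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one CSV line into (first IP, last IP, country).
# Assumes five comma-separated, double-quoted fields, country name last.
sub parse_range_line {
  my ($line) = @_;
  $line =~ s/\r?\n\z//;
  # Limit the split to 5 pieces so commas inside the name are preserved.
  my ($beg, $end, undef, undef, $country) = split /,/, $line, 5;
  return unless defined $country;   # not a data line
  s/"//g for $beg, $end, $country;
  return ($beg, $end, $country);
}

# Made-up sample line, for illustration only (not real data).
my ($beg, $end, $country) =
  parse_range_line(qq{"1000","1999","KR","KOR","KOREA, REPUBLIC OF"\n});
print "$country: ", $end - $beg + 1, " addresses\n";
```

The third argument to split is what matters here: everything after the fourth comma stays in the last field, whatever it contains.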

And here is the output:

Rank  Country               Share of IPs
1     United States         69.02 %
2     Japan                  4.52 %
3     United Kingdom         3.20 %
4     Germany                2.66 %
5     Canada                 2.43 %
6     China                  2.27 %
7     Australia              1.98 %
8     France                 1.73 %
9     Netherlands            1.35 %
10    Korea, Republic of     1.29 %
11    Italy                  0.91 %
12    Sweden                 0.69 %
13    Switzerland            0.65 %
14    Spain                  0.59 %
15    Taiwan                 0.56 %
16    Brazil                 0.49 %
17    Norway                 0.40 %
18    Finland                0.38 %
19    Russian Federation     0.37 %
20    South Africa           0.31 %
21    Mexico                 0.28 %
22    Poland                 0.28 %
23    Austria                0.27 %
24    Belgium                0.24 %
25    Denmark                0.24 %
26    Hong Kong              0.22 %
27    India                  0.19 %
28    Israel                 0.16 %
29    New Zealand            0.16 %
30    Turkey                 0.15 %
31    Czech Republic         0.13 %
32    Chile                  0.11 %
33    Hungary                0.10 %
34    Ireland                0.10 %
35    Argentina              0.10 %
36    Singapore              0.09 %
37    Portugal               0.09 %
38    Greece                 0.09 %
39    Malaysia               0.09 %
40    Thailand               0.09 %

We now have a global view, but the USA is over-represented because it hosts so many server farms. Keep in mind that these statistics only reflect the number of IPs, not the number of users.

How do we compare?

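We will now compare the global structure to our own. For each country, we divide its share of visitors on each site by its share of the world's IPs, which brings out the real importance of each location. A ratio above 1 means the country visits more than its share of IPs would suggest, and a ratio below 1 means it visits less. This is the relevant ratio of marginal popularity. Countries are listed by their rank in the IP table.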

Rank  Country                    habett.org  habett.com
1     United States                    0.40        0.47
2     Japan                            0.20        0.05
3     United Kingdom                   3.20        1.64
4     Germany                          0.64        0.30
5     Canada                           0.91        3.06
6     China                            1.01        1.55
7     Australia                        0.25        0.15
8     France                          16.30       15.29
9     Netherlands                      0.14        0.37
10    Korea, Republic of               8.99        7.05
11    Italy                            1.21        0.92
12    Sweden                           0.43        0.32
13    Switzerland                      2.31        1.69
14    Spain                            0.51        0.44
15    Taiwan                           0.18        0.04
16    Brazil                           1.22        0.42
17    Norway                           0.25        0.57
18    Finland                          0.03        0.04
19    Russian Federation               0.54        0.18
20    South Africa                     0.21        0.05
21    Mexico                           1.19        0.15
22    Poland                           0.48        0.41
23    Austria                          0.41        0.07
24    Belgium                          3.55        3.44
25    Denmark                          0.63        1.92
26    Hong Kong                        2.20        0.49
27    India                            3.20        5.76
28    Israel                           1.01        0.94
29    New Zealand                      0.33        0.05
30    Turkey                           1.82        0.44
31    Czech Republic                   0.64        2.30
32    Chile                            0.68        1.36
33    Hungary                          0.19        0.42
34    Ireland                          2.75        2.92
35    Argentina                        0.67        1.54
36    Singapore                        2.61        2.39
37    Portugal                         1.59        3.80
38    Greece                           0.94        0.64
39    Malaysia                         7.60        1.15
40    Thailand                         4.00        0.77
41    Romania                          4.79        8.17
46    Egypt                           15.90        0.76
52    Iran, Islamic Republic of       81.92        0.39
58    United Arab Emirates            67.93        8.04
76    Morocco                        140.12       54.11
85    Macao                            0.00      550.43
89    Tunisia                        430.85       55.68
90    Algeria                        195.32      108.13
198   Congo                           44.00     4147.47
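
As a quick check, the ratios for the best known countries can be recomputed from the rounded percentages in the first two tables. This is only a sketch: the helper name popularity_ratio is made up, and the original figures were presumably computed from unrounded values, so other rows may differ slightly in the last digit.

```perl
use strict;
use warnings;

# Hypothetical helper: how over- or under-represented a country is on a
# site, relative to its share of the IP address space.
sub popularity_ratio {
  my ($site_percent, $global_percent) = @_;
  return undef unless $global_percent > 0;   # avoid dividing by zero
  return $site_percent / $global_percent;
}

# Rounded figures taken from the tables above.
my %site   = ('France' => 28.2, 'United States' => 27.4);    # habett.org
my %global = ('France' => 1.73, 'United States' => 69.02);   # share of IPs

foreach my $country (sort keys %site) {
  printf "%s = %.2f\n", $country,
    popularity_ratio($site{$country}, $global{$country});
}
```

This prints 16.30 for France and 0.40 for the United States, the same values as the habett.org column.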