
Web pages as Graphs, with source code

Webpages as Graphs
With this fun applet, you can judge the complexity of a web page just by generating its graph! My homepage is way too complex compared to Google, for example 🙂
http://websitesasgraphs.waltercedric.com
What do the colors mean?
blue: for links (the A tag)
red: for tables (TABLE, TR and TD tags)
green: for the DIV tag
violet: for images (the IMG tag)
yellow: for forms (FORM, INPUT, TEXTAREA, SELECT and OPTION tags)
orange: for linebreaks and blockquotes (BR, P, and BLOCKQUOTE tags)
black: the HTML tag, the root node
gray: all other tags
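The legend above amounts to a lookup from tag name to color. Here is a minimal sketch of that mapping in PHP; it is not the applet's actual code, and the function name tagColor() is my own invention for illustration.

```php
<?php
/**
 * Hypothetical helper: map an HTML tag name to the color used in the legend.
 * Tags that are not listed fall back to gray.
 */
function tagColor($tag)
{
    static $colors = array(
        'a'          => 'blue',
        'table'      => 'red',
        'tr'         => 'red',
        'td'         => 'red',
        'div'        => 'green',
        'img'        => 'violet',
        'form'       => 'yellow',
        'input'      => 'yellow',
        'textarea'   => 'yellow',
        'select'     => 'yellow',
        'option'     => 'yellow',
        'br'         => 'orange',
        'p'          => 'orange',
        'blockquote' => 'orange',
        'html'       => 'black',
    );

    // HTML tag names are case-insensitive, so normalize before the lookup.
    $tag = strtolower($tag);
    return isset($colors[$tag]) ? $colors[$tag] : 'gray';
}

/**
 * Hypothetical helper: count how many nodes of each color a list of tags yields.
 */
function countColors(array $tags)
{
    $counts = array();
    foreach ($tags as $tag) {
        $color = tagColor($tag);
        $counts[$color] = isset($counts[$color]) ? $counts[$color] + 1 : 1;
    }
    return $counts;
}
```

A page dominated by red nodes is laid out with tables, while one dominated by green nodes relies on DIV tags.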
Nothing new, some of you will cry, as this Java applet has been available since 2007. Yes, but...
The main difference is that I provide the last bit of code needed to make it work on your own server, or locally in any PHP environment. The tricky part is the function that retrieves the HTML content of any page and passes it to the applet.
For this task I will present a solution with cURL, since some web hosts forbid opening remote URLs with PHP's fopen() (the allow_url_fopen setting; I recommend disabling it to reduce the risk of remote file inclusion).
PHP supports libcurl, a library created by Daniel Stenberg, that allows you to connect and communicate to many different types of servers with many different types of protocols. libcurl currently supports the http, https, ftp, gopher, telnet, dict, file, and ldap protocols. libcurl also supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading (this can also be done with PHP’s ftp extension), HTTP form based upload, proxies, cookies, and user+password authentication. [PHP Manual]
So I’ve created a small script called display.php that returns the content of a web page.
Get Data From URL With Curl
<?php
// Read the target URL from the query string, falling back to a default page.
$input = $_GET;
$name = 'url';
$url = isset($input[$name]) ? $input[$name] : "http://www.waltercedric.com";
$timeout = 10;
$show_errors = true;

if (function_exists('curl_init')) {
    getDataFromUrlWithCurl($url, $timeout, $show_errors);
} else {
    // Fallback when the cURL extension is missing; this helper must be defined separately.
    getDataFromUrlWithFopen($url, $timeout);
}

/**
 * CURL function to retrieve data from a URL.
 */
function getDataFromUrlWithCurl($url, $timeout = 10, $show_errors = false)
{
    $ch = curl_init();
    $agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)";
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HTTPGET, 1);
    curl_setopt($ch, CURLOPT_CRLF, 1);
    // Return the data into a variable instead of printing it out.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Give the server a time limit in seconds to reply.
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    // curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // don't validate the SSL certificate
    $result = curl_exec($ch);
    if ($show_errors && curl_error($ch)) {
        printf("Curl error %s: %s", curl_errno($ch), curl_error($ch));
        // Escape the URL before echoing it back into the page.
        print(' <a href="' . htmlspecialchars($url) . '" target="_blank">This is the url</a><br>');
    }
    curl_close($ch);
    print($result);
}
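The script calls getDataFromUrlWithFopen() when the cURL extension is unavailable, but that helper is not listed here. Below is a minimal sketch of what it could look like, assuming allow_url_fopen is enabled on the host. The body is my assumption, not the original code; only the function name and its two parameters come from the call above.

```php
<?php
/**
 * Hypothetical fallback: fetch a URL through PHP's stream wrappers and print it.
 * Only works for remote URLs when allow_url_fopen is enabled in php.ini.
 */
function getDataFromUrlWithFopen($url, $timeout = 10)
{
    // Apply the timeout (in seconds) and a user agent to HTTP requests.
    $context = stream_context_create(array(
        'http' => array(
            'timeout'    => $timeout,
            'user_agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)',
        ),
    ));

    // file_get_contents() returns false on failure; @ suppresses the warning.
    $result = @file_get_contents($url, false, $context);
    if ($result === false) {
        print('Could not read ' . htmlspecialchars($url));
        return;
    }
    print($result);
}
```

Like the cURL version, this prints the page content directly, so the caller does not need to handle a return value.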
You can look at all cURL options here: http://us2.php.net/curl_setopt
Get the source code to make your own site
All credit goes to the original author
About the author (Sala) of this applet
http://www.aharef.info/static/htmlgraph/?url=http://www.google.com
Flickrmania
Take a screenshot of your site graph below, put it on Flickr and tag it websitesasgraphs.