Monday, September 25, 2006

Parsing the Yahoo catalog to get all links and site info

As promised, today I am posting the PHP sources for parsing the Yahoo catalog.

You can download the source code from this direct link.

The file get-yahoo.zip contains 3 files:
sql.sql - MySQL script that creates the tables.
lib_getWeb.php - PHP class for downloading the specified pages.
get-yahoo.php - script that downloads and analyzes each Yahoo page.

As a result, we will get a database filled with all the links and domains from the Yahoo catalog. All data is stored in the yahoo_links table, and the host field holds our VIP domains. After that we need a script that gets whois information for each domain in our list; I will publish it next week. This week the get-yahoo script will run and fill our database.
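To give an idea of the "analyze" step, here is a minimal sketch of pulling links and their hosts out of one downloaded page. It is not the code from get-yahoo.zip: the function name extractLinks and the returned array keys are my own assumptions for illustration, and it uses modern PHP type declarations (PHP 7 or later). Each returned row carries the URL and the host, which is the kind of value that ends up in the host field of yahoo_links.

```php
<?php
// Hypothetical helper, NOT part of get-yahoo.zip: extracts absolute links
// and their host names from the HTML of a single page.
function extractLinks(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely well-formed, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $url  = trim($anchor->getAttribute('href'));
        $host = parse_url($url, PHP_URL_HOST);
        // Relative links, mailto: links and malformed URLs have no host.
        if (!is_string($host) || $host === '') {
            continue;
        }
        // Key by URL so that a link repeated on the page is kept only once.
        $rows[$url] = ['url' => $url, 'host' => strtolower($host)];
    }

    return array_values($rows);
}
```

Calling extractLinks() on the HTML of each downloaded page gives a list of rows that can then be inserted into the database. Host names are lowercased so the same domain is not stored twice under different spellings.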

During this week I will post a few very interesting articles that will help us in the future.
