How to get Data #2

Remember when I said that one can either crawl vast amounts of data and then analyze it, or crawl something that has already been analyzed? Well, I’ll make a further distinction:
1-Asking people to send data (social networks)
2-Crawling huge amounts of data that may or may not have been analyzed (search engines)
3-Crawling small amounts of on-demand data that have already been analyzed (meta-search engines, flight price search engines)

This differs from my last post: types 2 and 3 now differ not only in whether their data has been tagged/classified, but also in that type 3 data is fetched on demand. A user on a type 3 website searches for a term, and only THEN does that website look for it on type 1 and type 2 websites.
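To make the on-demand part concrete, here is a rough sketch of how a type 3 site might behave. Everything in it is made up for illustration: `make_stub_source` and `meta_search` are hypothetical names, and the "sources" are in-memory stand-ins for type 1 and type 2 websites, where a real site would make HTTP requests instead.

```python
from typing import Callable, Dict, List

# A "source" stands in for a type 1 or type 2 website: given a search term,
# it returns results that have already been tagged. Assumption: a real
# source would be an HTTP call, not a dictionary lookup.
Source = Callable[[str], List[Dict[str, str]]]


def make_stub_source(name: str, pages: Dict[str, List[str]]) -> Source:
    """Build a fake source from a {url: [tags]} mapping (illustration only)."""
    def search(term: str) -> List[Dict[str, str]]:
        return [
            {"url": url, "source": name}
            for url, tags in pages.items()
            if term.lower() in (t.lower() for t in tags)
        ]
    return search


def meta_search(term: str, sources: List[Source]) -> List[Dict[str, str]]:
    """Type 3 behaviour: nothing is fetched until a user asks for `term`."""
    seen = set()
    merged = []
    for source in sources:
        for hit in source(term):
            if hit["url"] not in seen:  # drop duplicates across sources
                seen.add(hit["url"])
                merged.append(hit)
    return merged


# Two hypothetical sources with made-up URLs and tags.
social = make_stub_source("social", {"http://a.example/1": ["python", "crawler"]})
engine = make_stub_source("engine", {
    "http://a.example/1": ["python"],
    "http://b.example/2": ["Python", "search"],
})

results = meta_search("python", [social, engine])
```

The point of the sketch is the order of events: the type 3 site stores nothing up front and only queries the other sites once a term arrives.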

I just crawled 1,000 websites and still don’t see the tags I want… so this narrower definition will help me. Also, WordPress is no longer a candidate: their tag-search results are rendered in Flash, and extracting text from Flash is well above my pay grade 😉
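For what it's worth, here is a minimal sketch of one way to check a crawled page for tags. It assumes the site marks its tag links with `rel="tag"`, which many blogs do but which is not guaranteed, and it is not necessarily what my crawler does. `TagLinkParser` and `extract_tags` are hypothetical names, and the sample HTML is invented.

```python
from html.parser import HTMLParser


class TagLinkParser(HTMLParser):
    """Collect the link text of every <a> whose rel attribute contains "tag"."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self._in_tag_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # rel may hold several space-separated values, e.g. "category tag"
            rel = dict(attrs).get("rel") or ""
            self._in_tag_link = "tag" in rel.split()

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_tag_link = False

    def handle_data(self, data):
        if self._in_tag_link and data.strip():
            self.tags.append(data.strip())


def extract_tags(html):
    """Return the tag names found in one page's HTML."""
    parser = TagLinkParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Invented sample page: two tag links and one ordinary link.
sample = (
    '<p><a href="/tag/crawler" rel="tag">crawler</a> '
    '<a href="/about">about</a> '
    '<a rel="category tag" href="/t/data">data</a></p>'
)
found = extract_tags(sample)
```

A page whose tags are drawn inside Flash would give an empty list here, which is exactly the WordPress problem.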
