Tag Archives: Website

Thoughts about my new website…

I’ll fix the issue with getting files from other webpages onto my host by processing them locally (client-side).

This will free up my website’s bandwidth. When you think about it, even if doing it from PHP worked, someday someone would request analytics for a file too large to transfer to my host. By having the client download the file, I can also ask it to send and/or analyze only specific parts of the file. For example, instead of fetching a whole 1 GB movie, I could ask for just the first hundred bytes, conclude that they are the header of a video, and assign the file the tag “video”.
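Here is a rough sketch of that header check. In practice it would run client-side; Python is used here only to show the logic. The function name `guess_tag` is made up for illustration, and the signature list is a small, incomplete assumption covering a few common video containers.

```python
def guess_tag(head: bytes) -> str:
    """Guess a coarse tag from the first bytes of a file.

    Hypothetical helper: only a handful of container signatures are checked,
    so "unknown" does not mean the file is not a video.
    """
    # MP4/MOV family: an "ftyp" box right after a 4-byte size field.
    # Caveat: audio-only files such as M4A share this layout.
    if len(head) >= 8 and head[4:8] == b"ftyp":
        return "video"
    # AVI: a RIFF container whose form type is "AVI ".
    if head[:4] == b"RIFF" and head[8:12] == b"AVI ":
        return "video"
    # Matroska/WebM: EBML header magic number.
    if head[:4] == b"\x1a\x45\xdf\xa3":
        return "video"
    # Flash Video.
    if head[:3] == b"FLV":
        return "video"
    return "unknown"
```

Only the first hundred bytes or so need to be passed in, for example `guess_tag(first_bytes[:100])`, so the 1 GB movie never has to leave the client.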

Another idea that hit me, though, is a search engine that could filter WordPress posts by multiple tags. It would still require a user (likely the blog owner) to run an open-source Python script to extract tags from posts and upload them to my index. But it would probably be easier than going straight to my auto-tagger/analytics project.

The question though is: is there any demand for either of these websites?

I can see myself using the analytics website IF it could give significant insight into videos and/or foreign accents in text. But those are algorithms I haven’t developed (not even drafts).

A website that allowed me to filter “Short story” AND “Fiction”? Hmmmm
I guess the tag “Flash Fiction” already does that (it combines those two)… Well… I guess there is no tag unifying “Fantasy” and “Short story”… but I would need lots of posts to get any decent results from a multi-tag search engine…
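A minimal sketch of the multi-tag AND filter, assuming the index is just a mapping from tag to the set of posts carrying it. The names `build_index` and `search_all` and the sample posts are invented for illustration.

```python
from collections import defaultdict


def build_index(posts):
    """Map each lowercased tag to the set of post IDs that carry it."""
    index = defaultdict(set)
    for post_id, tags in posts.items():
        for tag in tags:
            index[tag.lower()].add(post_id)
    return index


def search_all(index, tags):
    """Return the post IDs that carry ALL of the given tags."""
    sets = [index.get(tag.lower(), set()) for tag in tags]
    if not sets:
        return set()
    return set.intersection(*sets)


# Made-up sample data: post ID -> tags.
posts = {
    "post-1": ["Fantasy", "Short story"],
    "post-2": ["Fiction", "Short story"],
    "post-3": ["Fantasy", "Novel"],
}
index = build_index(posts)
matches = search_all(index, ["Fantasy", "Short story"])
```

With only three posts the intersection is tiny, which is exactly the worry: the filter only becomes useful once the index holds lots of posts.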

The promising side is that neither would take much of my time, so there is no reason not to do both if my internet connection allows. (I need to look at Python’s and PHP’s online references to code.)



Filed under Uncategorized

How to get Data #2

Remember when I said that one can either crawl vast amounts of data and then analyze it, or crawl something that has already been analyzed? Well, I’ll make a further distinction:
1- Asking people to send data (social networks)
2- Crawling huge amounts of data that may or may not have been analyzed (search engines)
3- Crawling small amounts of on-demand data that have been analyzed (meta-search engines, flight price search engines)

This differs from my last post in that types 2 and 3 now differ not only in whether their data has been tagged/classified, but also in that type 3 data comes on demand: a user on a type 3 website searches for a term, and only THEN does the website look for that term on type 1 and type 2 websites.
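The on-demand flow of a type 3 website could be sketched roughly like this. The two source functions are fake stand-ins for a type 1 and a type 2 website, and the example.com URLs are placeholders, not real endpoints.

```python
def meta_search(term, sources):
    """Query each source only when a user asks, then merge the results.

    Results keep the order in which they arrive; duplicates are dropped.
    """
    merged = []
    seen = set()
    for source in sources:
        for url in source(term):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


def fake_social_network(term):
    # Stand-in for a type 1 website (data sent in by people).
    return [f"https://social.example.com/{term}/1"]


def fake_search_engine(term):
    # Stand-in for a type 2 website (bulk-crawled data).
    return [
        f"https://search.example.com/{term}/a",
        f"https://social.example.com/{term}/1",  # duplicate of the result above
    ]


results = meta_search("fantasy", [fake_social_network, fake_search_engine])
```

Nothing is crawled ahead of time here: the sources are only contacted once a term arrives, which is what separates type 3 from type 2.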

I just crawled 1,000 websites and still don’t see the tags I want… so this further narrowing of the definition will help me… Also, WordPress is no longer a candidate: their tag-search results are in Flash, and extracting text from Flash is well above my pay grade 😉



So, I’ve been thinking about going back to my idea of organizing tags; I could put my recent experience with crawling to good use.
