Tag Archives: Python

Python Editors and Auto Completion

I just tested an editor called “PyCharm”, recommended for its auto-completion feature, and discovered it can’t complete “m = hashlib.sha1()” (it offers hashlib but not sha1).

In fact, it couldn’t auto-complete urllib.urlopen either. Neither can IDLE (Python’s built-in editor).

Looking at https://wiki.python.org/moin/PythonEditors, there are tons of editors. Am I supposed to try every one of them? LOL

Edit: PyCharm actually does complete sha1 in “m = hashlib.sha1”, but not m.something (update, digest, etc.).
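
When an editor can’t complete the methods, the interpreter itself can list them. A small sketch using only the standard library (the variable names are just for illustration):

```python
import hashlib

m = hashlib.sha1()

# List the public attributes of the hash object, i.e. what an editor
# would ideally offer after typing "m."
public_names = [name for name in dir(m) if not name.startswith("_")]
print(public_names)

# update() feeds data in; hexdigest() is the printable form of digest()
m.update(b"hello")
print(m.hexdigest())
```

The same dir() trick works on a module such as urllib when completion comes up empty.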

Filed under Uncategorized

VB rocks

When it comes down to GUI and cryptography, but it sucks at everything else. I have a huge program and I’m nowhere near finishing the header-adding part; in Python I would have finished it already with a few lines. Given that most users won’t need to add headers (only “webmasters”), I will move to Python for this part. The worst part is that I don’t even know whether VB.NET strings are binary-safe. That would ease my life somewhat.

Cryptography

Aside from GUI, this seems to be the second major weakness in Python: the standard library has no public/private key generator. To make things worse, the Visual Studio documentation doesn’t seem simple either. It has all the stuff I probably need, but it’s confusing. Does CngKey refer to a public or a private key? And how do I store it in a plain text file?

Here is what I really needed:

getPrivateKey()
Gets a new private key that is random/prime/secure/etc

getPublicKey(privateKey)
Gets the matching public key

getSharedSecret(privateKey,friendPublicKey)
Gets a shared secret (that I’ll use in a keyed hash, like HMAC)

Done. I could implement that in Python but I am 99% sure I would make a serious mistake and expose the private key.

I don’t want a certificate, or to associate either key or the shared secret with any company or person. I just want that link between the private and public keys.
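
For what it’s worth, those three functions map directly onto a Diffie-Hellman exchange. Below is a toy sketch using only Python’s standard library. The prime and generator are assumptions picked to keep the example short, and they are far too small to be secure; the snake_case names are just a Python spelling of the functions listed above. For anything real, a vetted library is the safer route.

```python
import hashlib
import hmac
import secrets

# TOY PARAMETERS (assumption, illustration only): a 127-bit Mersenne prime
# is far too small for real use.
P = 2**127 - 1
G = 3

def get_private_key():
    """Return a random private exponent in the range [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2

def get_public_key(private_key):
    """Return the public value matching the private exponent."""
    return pow(G, private_key, P)

def get_shared_secret(private_key, friend_public_key):
    """Return key bytes that both sides can derive independently."""
    shared = pow(friend_public_key, private_key, P)
    # Hash the shared number so the result is fixed-length key material
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

# Two parties only ever exchange their public keys
alice_private = get_private_key()
bob_private = get_private_key()
alice_secret = get_shared_secret(alice_private, get_public_key(bob_private))
bob_secret = get_shared_secret(bob_private, get_public_key(alice_private))

# The shared secret used as the key of a keyed hash
tag = hmac.new(alice_secret, b"message", hashlib.sha256).hexdigest()
```

Both sides end up with the same secret without ever sending a private key, and no certificate or identity is involved.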

Python GUI

I’ve been looking at screenshots of Python GUIs over the last few days, and it’s amazing how Visual Basic 6.0 still looks better than them after all this time.

Crawling WordPress

“301 Moved Permanently” is what I get when I try to fetch wordpress.com/tag/X/ from PHP. However, Python’s urllib still works.
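
One possible explanation, though only a guess: Python’s urllib follows redirects such as 301 automatically, so a script may never notice that one happened. A minimal sketch (Python 3’s urllib.request; the User-Agent string and the tag in the example URL are made up for illustration) that reports where a request finally landed:

```python
import urllib.request

# Hypothetical User-Agent, not something WordPress is known to require
USER_AGENT = "tag-crawler-example/0.1"

def build_request(url):
    """Create a request object carrying an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})

def fetch(url):
    """Fetch url, following redirects; return (final_url, status, body)."""
    with urllib.request.urlopen(build_request(url), timeout=10) as response:
        return response.geturl(), response.status, response.read()

# Example call (needs network access):
# final_url, status, body = fetch("https://wordpress.com/tag/python/")
# print(final_url, status, len(body))
```

If final_url differs from the URL that was requested, a redirect was followed along the way.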

The tip I gave about getting /page/2 of WordPress tags no longer works…

I could pay for some serious hosting, but I’m under the impression they would make crawling impossible sooner rather than later. Also, without crawling page 2 and beyond, there is no way I would be able to do any real search.

Hard to believe

When I tried getting post URLs from the WordPress reader, I first tried finding the posts’ names in Mozilla’s source code viewer. Being unable to do that, I concluded the website was made with Flash/JavaScript and that I would need a way around it.

After months of thinking about how to crawl WordPress sites by jumping from blog to blog, or simply asking bloggers to register their blogs, or even asking random users to index a post, I was quite upset. No alternative seemed good enough. Relevant content needs to be updated continually; it needs to be fresh. I tried finding a feed for specific tags on wordpress.com, and failed. Today I even thought about filtering Bing’s results for a given tag and selecting only *.wordpress.com websites, then showing those to my users. But then I tried getting a wordpress.com/some_tag page using a Python script (not Mozilla). And all the post URLs showed up, as well as titles and even descriptions: a crawlable website. Not only that, but I can also get older pages by adding “/page/x”, where x is in [2..Inf].
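
A small sketch of how those page URLs could be generated. The layout wordpress.com/tag/<tag>/ with /page/x/ appended is an assumption based on what these posts describe and may not match what the site serves today; the function names are made up for illustration.

```python
from urllib.parse import quote

# Assumed base URL for tag listings
BASE = "https://wordpress.com/tag"

def tag_page_url(tag, page=1):
    """Return the URL of a tag listing; pages from 2 on get /page/x/ appended."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    url = f"{BASE}/{quote(tag)}/"
    if page > 1:
        url += f"page/{page}/"
    return url

def tag_page_urls(tag, last_page):
    """Return the URLs of pages 1..last_page for a tag, oldest pages last."""
    return [tag_page_url(tag, page) for page in range(1, last_page + 1)]

for url in tag_page_urls("python", 3):
    print(url)
```

A crawler would fetch each URL in turn and stop at the first page that comes back empty or with an error.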

Unbelievable. That’s exactly what I wanted. Now let’s hope PHP can also get those crawler-friendly pages so I don’t need to pay for a Python-enabled host.

Python – Slicing with two Delimiters

While making a crawler for blogs I faced the following problem:

——————————-

Info_That_I_Want_Follows: “Important Information”

——————————-

Problem: How to extract “Important Information”?

The ideal way would probably be parsing the HTML and seeing which tag contains “Important Information”, but I don’t know how to do that, so I wrote the following “hack”:

i1 = s1.find("Want_Follows: \"")
i2 = s1[i1 + len("Want_Follows: \""):].find("\"")
FinalString = s1[i1 + len("Want_Follows: \""):i1 + len("Want_Follows: \"") + i2]

But now that I think about it, a simpler way would be:

s2 = s1.split("Want_Follows: \"", 1)[1]
FinalString = s2.split("\"", 1)[0]

A lot easier to read. Yet another cool thing would be to make a function like this:

def Slice2Ways(Delimiter1, Delimiter2, StringToSlice):
    # Raises IndexError if Delimiter1 does not occur in StringToSlice
    String2 = StringToSlice.split(Delimiter1, 1)[1]
    return String2.split(Delimiter2, 1)[0]

This kind of slicing comes up frequently when extracting data without building a full-blown parser.
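
Crawled pages don’t always contain the delimiters, so here is a sketch of a variant that returns None instead of failing when one is missing. It uses str.partition from the standard library; the name slice_between is made up for this example.

```python
def slice_between(text, start_delimiter, end_delimiter):
    """Return the text between the two delimiters, or None if either is missing."""
    _, found_start, rest = text.partition(start_delimiter)
    if not found_start:
        return None
    wanted, found_end, _ = rest.partition(end_delimiter)
    if not found_end:
        return None
    return wanted

s1 = 'Info_That_I_Want_Follows: "Important Information"'
print(slice_between(s1, 'Want_Follows: "', '"'))  # Important Information
print(slice_between(s1, 'Not_There: "', '"'))     # None
```

Using single-quoted Python strings also avoids having to escape the double quote inside the delimiter.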

Thoughts about my new website…

I’ll fix the issue with getting files from other webpages to my host by doing it locally (client-side).

This will free up my webpage’s bandwidth. When you think about it, even if doing it from PHP worked, someday someone would request analytics for a file too large to feasibly transfer to my host. By asking the client to download the file, I can also ask it to send and/or analyze specific parts of the file. For example, as opposed to getting a whole 1 GB movie, I could just ask for the first hundred bytes, conclude that they are the header of a video, and therefore assign it the tag “video”.
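
A rough sketch of that header check, assuming the client has already read the first hundred bytes. The function name and the short list of signatures are my own choice for illustration; the three checks are well-known container signatures, but they only give a coarse guess, since the same containers can also hold audio-only files.

```python
def sniff_tag(first_bytes):
    """Guess a coarse tag from the first bytes of a file, or None if unknown."""
    # MP4 and other ISO base media files: "ftyp" at byte offset 4
    if first_bytes[4:8] == b"ftyp":
        return "video"
    # AVI: a RIFF container whose form type is "AVI "
    if first_bytes[:4] == b"RIFF" and first_bytes[8:12] == b"AVI ":
        return "video"
    # Matroska / WebM: EBML magic number
    if first_bytes[:4] == b"\x1a\x45\xdf\xa3":
        return "video"
    return None

# Example on a local file (the path is hypothetical):
# with open("movie.mp4", "rb") as f:
#     print(sniff_tag(f.read(100)))
```

Only those hundred bytes, or just the resulting tag, would then need to travel to the host.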

Another idea that hit me, though, is to have a search engine that could filter on multiple tags of WordPress posts. It would still require a user (likely the blog owner) to run an open-source Python script to extract tags from posts and upload them to my index. But it would probably be easier than going straight to my auto-tagger/analytics project.

The question though is: is there any demand for either of these websites?

I can see myself using the analytics website IF it could give significant insight into videos and/or foreign accents in text. But those are algorithms I haven’t developed (not even drafts).

A website that allowed me to filter “Short story” AND “Fiction”? Hmmmm
I guess the tag “Flash Fiction” already does that (it combines those two)… Well… I guess there is no tag unifying “Fantasy” and “Short story”… but I would need lots of posts to get any decent results with a multiple-tag search engine…
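
The core of such a multi-tag filter can be sketched as an index from tag to posts, with the AND done by set intersection. The class name, the lower-casing of tags, and the example.com URLs are all made up for illustration.

```python
from collections import defaultdict

class TagIndex:
    """Maps each tag to the set of post URLs that carry it."""

    def __init__(self):
        self._posts_by_tag = defaultdict(set)

    def add_post(self, url, tags):
        """Register a post under every one of its tags (case-insensitive)."""
        for tag in tags:
            self._posts_by_tag[tag.lower()].add(url)

    def search(self, *tags):
        """Return the URLs of posts that carry ALL of the given tags."""
        if not tags:
            return set()
        sets = [self._posts_by_tag.get(tag.lower(), set()) for tag in tags]
        return set.intersection(*sets)

index = TagIndex()
index.add_post("https://example.com/a", ["Short story", "Fiction"])
index.add_post("https://example.com/b", ["Short story", "Fantasy"])
index.add_post("https://example.com/c", ["Fiction"])

print(index.search("Short story", "Fiction"))   # {'https://example.com/a'}
print(index.search("Fantasy", "Short story"))   # {'https://example.com/b'}
```

How useful the results are depends entirely on how many posts get uploaded to the index.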

The promising side is that neither would take much of my time, so there is no reason not to do both if my internet connection allows. (I need to look at Python’s and PHP’s online references to code.)
