Python – Slicing with Two Delimiters

While making a crawler for blogs I faced the following problem:

——————————-

Info_That_I_Want_Follows: "Important Information"

——————————-

Problem: How to extract “Important Information”?

The ideal way would probably be to parse the HTML and see which tag contains “Important Information”, but I don’t know how to do that, so I wrote the following “hack”:

i1 = s1.find("Want_Follows: \"")
i2 = s1[i1+len("Want_Follows: \""):].find("\"")
FinalString = s1[i1+len("Want_Follows: \""):i1+len("Want_Follows: \"")+i2]
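To see the snippet above in action, here is a small runnable sketch on a sample string modeled on the example. The names `start_marker`, `start`, and `final_string` are just illustrative. One thing worth knowing: `str.find` returns -1 when the text is missing, so without a check the slice silently produces garbage instead of failing. This version adds that check and uses the optional start argument of `find` instead of slicing first.

```python
# Sample input modeled on the example above; names are illustrative.
s1 = 'Info_That_I_Want_Follows: "Important Information"'
start_marker = 'Want_Follows: "'

i1 = s1.find(start_marker)
if i1 == -1:
    final_string = None  # opening delimiter not present
else:
    start = i1 + len(start_marker)  # index of the first wanted character
    i2 = s1.find('"', start)        # search from `start` for the closing quote
    final_string = s1[start:i2] if i2 != -1 else None

print(final_string)  # Important Information
```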

But now that I think about it, a simpler way would be:

s2 = s1.split("Want_Follows: \"", 1)[1]
FinalString = s2.split("\"", 1)[0]

That is a lot easier to read. Another nice step would be to wrap it in a function like this:

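def Slice2Ways(Delimiter1, Delimiter2, StringToSlice):
    # Keep everything after the first occurrence of Delimiter1...
    String2 = StringToSlice.split(Delimiter1, 1)[1]
    # ...then keep everything before the first occurrence of Delimiter2.
    return String2.split(Delimiter2, 1)[0]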
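One caveat with Slice2Ways as written: if Delimiter1 is not in the string, `split` returns a one-element list and the `[1]` raises an IndexError, and if Delimiter2 is missing, everything after Delimiter1 comes back. Below is a possible variant, not the original function, that uses `str.partition` and returns None in both cases. The name `slice_between` and the choice to return None are my own assumptions; raising an exception might suit a crawler just as well.

```python
def slice_between(delimiter1, delimiter2, string_to_slice):
    """Return the text between delimiter1 and the next delimiter2, or None."""
    # partition returns (before, separator, after); separator is '' if not found.
    _, found1, rest = string_to_slice.partition(delimiter1)
    if not found1:
        return None
    middle, found2, _ = rest.partition(delimiter2)
    return middle if found2 else None


sample = 'Info_That_I_Want_Follows: "Important Information"'
print(slice_between('Want_Follows: "', '"', sample))  # Important Information
print(slice_between('Not_There: "', '"', sample))     # None
```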

This kind of slicing comes up frequently when extracting data without building a full-blown parser, so a small helper like this is worth having.
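For comparison, the tag-based approach mentioned earlier could look roughly like this with the standard library's `html.parser`. This is only a sketch: the sample markup, the `info` class, and the `TagTextExtractor` class are made up for illustration, and real blog pages will almost certainly be structured differently.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collect text inside <span class="info"> elements (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are inside the wanted tag
        self.parts = []      # text chunks seen inside the wanted tag

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "span" and ("class", "info") in attrs:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)


# Assumed page fragment, invented for this example.
page = '<p>Info_That_I_Want_Follows: <span class="info">Important Information</span></p>'
parser = TagTextExtractor()
parser.feed(page)
parser.close()
result = "".join(parser.parts)
print(result)  # Important Information
```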
