Tag Archives: PHP

PHP(ish) function for Visual Studio 2010

Explode function in VB:

 

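Private Function phpishExplode(ByVal del As String, ByVal s1 As String) As ArrayList
    ' PHP's explode() rejects an empty delimiter; do the same here.
    If String.IsNullOrEmpty(del) Then
        Throw New ArgumentException("Delimiter must not be empty.", "del")
    End If

    Dim returnValue As New ArrayList
    Dim lastMatchIndex As Integer = 0
    Dim i As Integer = 0

    While i <= s1.Length - del.Length
        ' Compare the delimiter against s1 starting at position i.
        Dim match As Boolean = True
        For i2 As Integer = 0 To del.Length - 1
            If s1(i + i2) <> del(i2) Then
                match = False
                Exit For
            End If
        Next

        If match Then
            returnValue.Add(s1.Substring(lastMatchIndex, i - lastMatchIndex))
            lastMatchIndex = i + del.Length
            ' Jump past the delimiter so overlapping matches are not counted twice.
            i = lastMatchIndex
        Else
            i += 1
        End If
    End While

    ' Whatever follows the last delimiter is the final piece.
    returnValue.Add(s1.Substring(lastMatchIndex))

    Return returnValue
End Function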
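
To check what the function above should return for edge cases, the same left-to-right scan can be sketched in Python and compared with str.split. The name phpish_explode is only illustrative, and this is a sketch of the intended behaviour under that assumption, not a port of the VB code.

```python
def phpish_explode(delimiter, text):
    """Split text on delimiter, scanning left to right without overlaps."""
    if delimiter == "":
        raise ValueError("delimiter must not be empty")
    pieces = []
    last_match = 0
    i = 0
    while i <= len(text) - len(delimiter):
        if text[i:i + len(delimiter)] == delimiter:
            # Everything since the previous delimiter is one piece.
            pieces.append(text[last_match:i])
            last_match = i + len(delimiter)
            i = last_match  # jump past the delimiter
        else:
            i += 1
    # The tail after the last delimiter is always added, even if empty.
    pieces.append(text[last_match:])
    return pieces


if __name__ == "__main__":
    for sample in ["a,b,,c", ",a,", ""]:
        print(repr(sample), phpish_explode(",", sample) == sample.split(","))
```

Adjacent or trailing delimiters produce empty strings in the result, which is also what str.split does when given an explicit separator.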

Filed under Uncategorized

Crawling WordPress

“301 Moved Permanently” is what I get when I request wordpress.com/tag/X/ from PHP. Python’s urllib, however, still works.

The tip I gave about fetching /page/2 of a WordPress tag no longer works…

I could pay for serious hosting, but I’m under the impression they would make crawling impossible sooner rather than later. Also, without crawling page 2 and beyond there is no way I could offer any real search.

Filed under Uncategorized

Hard to believe

When I first tried getting post URLs from the WordPress reader, I looked for the post names in Mozilla’s source viewer. Unable to find them, I concluded the site was built with Flash/JavaScript and that I would need a way around it.

After months of thinking about how to crawl WordPress blogs, whether by jumping from blog to blog, asking bloggers to register their blogs, or even asking random users to index a post, I was quite upset. No alternative seemed good enough. Relevant content needs to be updated continually; it needs to be fresh. I tried to find a feed for specific tags on wordpress.com and failed. Today I even thought about filtering Bing’s results for a given tag, selecting only *.wordpress.com websites, and showing those to my users. But then I tried fetching a wordpress.com/some_tag page with a Python script (not Mozilla), and all the post URLs showed up, along with titles and even descriptions: a crawlable website. Not only that, but I can also get older pages by appending “/page/x”, where x is 2 or greater.
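
The tag listing described above can be sketched in Python, the language it was tested with. The https scheme, the /tag/<name>/page/<n>/ layout and every function name here are assumptions made for illustration. The page markup is not known, so the sketch simply keeps every link whose host ends in .wordpress.com.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urlparse
from urllib.request import urlopen


def tag_page_url(tag, page=1):
    """Build the URL of a tag listing page (assumed URL layout)."""
    base = "https://wordpress.com/tag/" + quote(tag, safe="") + "/"
    return base if page <= 1 else base + "page/" + str(page) + "/"


class BlogLinkCollector(HTMLParser):
    """Collect href values that point at *.wordpress.com hosts."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        host = urlparse(href).hostname or ""
        # Keep each blog link once, in the order it appears on the page.
        if host.endswith(".wordpress.com") and href not in self.links:
            self.links.append(href)


def extract_blog_links(html):
    """Return the blog links found in one page of HTML."""
    collector = BlogLinkCollector()
    collector.feed(html)
    return collector.links


def fetch_tag_links(tag, page=1):
    """Download one tag page and return the blog links found in it."""
    with urlopen(tag_page_url(tag, page)) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_blog_links(html)
```

Calling fetch_tag_links("php", 2) would request the second page of the tag. Whether the site still answers such requests the same way is not something this sketch can guarantee.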

Unbelievable. That’s exactly what I wanted. Now let’s hope PHP can also get those crawler-friendly pages so I don’t need to pay for a Python-enabled host.

Filed under Uncategorized

Windows Market Share: 91%

http://www.netmarketshare.com/operating-system-market-share.aspx?qprid=8&qpcustomd=0

Just discovered that my free web host won’t allow me to use PHP’s file_get_contents() on other websites.

That, coupled with this statistic, makes going back to Visual Basic more appealing.

Filed under Uncategorized

The crawler lib idea is probably not going to work: crawling and analysing are way too intertwined. I have, however, separated the graph creation part into its own file. The function Create_Edge(node, node, typeOfConnection) creates two nodes (or edits them if they already exist) and one edge. Below are a file that uses the library (graphTest.php) and the library itself (libGraph.php).

graphTest.php

<html>
<body>

<?php

include "libGraph.php";

Create_Edge("node1","node2","relates",FALSE);
Create_Edge("node1","node3","Parent_Of");
Create_Edge("node3","node1","Child_Of");

?>

</body>
</html>

libGraph.php

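<?php

// Node files are named after the URL-safe base64 form of the node name.
// Plain base64 can contain "/", which is not valid inside a file name.
function Node_File($node)
{
    return "./graph/" . strtr(base64_encode($node), "+/", "-_");
}

function Create_Edge($nodeA, $nodeB, $edgeType, $Directed_RETURN_TRIP = TRUE)
{
    // Create the storage directory only once; mkdir() warns if it already exists.
    if (!is_dir("./graph/")) {
        mkdir("./graph/");
    }

    // Load node A if it was stored before. A missing or empty file (such as
    // the placeholder written for node B below) counts as a node without edges.
    $nodeAfile = Node_File($nodeA);
    $node = array();
    $raw_node = file_exists($nodeAfile) ? file_get_contents($nodeAfile) : "";
    if ($raw_node !== FALSE && $raw_node !== "") {
        $stored = unserialize($raw_node);
        if (is_array($stored)) {
            $node = $stored;
        }
    }

    // Record the edge A -> B and write the node back.
    $node[$edgeType][$nodeB] = TRUE;
    file_put_contents($nodeAfile, serialize($node));

    // Make sure node B exists, even if it has no outgoing edges yet.
    $nodeBfile = Node_File($nodeB);
    if (!file_exists($nodeBfile)) {
        touch($nodeBfile);
    }

    // FALSE means "undirected": also store the edge B -> A. The recursive call
    // passes TRUE only to stop the recursion; round_trip would be a better name.
    if ($Directed_RETURN_TRIP == FALSE) {
        Create_Edge($nodeB, $nodeA, $edgeType, TRUE);
    }

    $fLog1 = fopen("graph.log", "a");
    fwrite($fLog1, "Created edge: " . $nodeA . "\x19" . $edgeType . "\x19" . $nodeB . "\n");
    fclose($fLog1);
}

/*
Also, create file: nodeA+edgeType+nodeB

Todo: separate $Direction and $RoundTrip
nodeA[Edge][nodeB] = $Direction (POINTER, ARROW, NONE)

if $Direction == NONE and RT == false
    call Create_Edge with RT as true
*/

?>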


Filed under Uncategorized

Composite Hedges and Graphs

So, my tagtree website is going, but now that I have crawled a few hundred posts I would like to connect them… meaning I need graphs, and probably some visual information, to make solid design decisions.

What surprises me, however, is the total lack of libraries for dealing with graphs (at least in default PHP, without extensions)… I have been thinking about developing my own graph library, and even a crawler library, to help me focus on the stuff that matters without recoding everything all the time.

The same can be said about composite hedges, something basic to financial analysis yet missing from the websites I have visited so far. The closest one can get is finance.yahoo: you can make a composite chart, but changing the percentages of the stocks in your portfolio, or actually measuring the difference between two quotes/indexes, is impossible (you can see the difference with the naked eye, but no numeric value is given). I could also make a website that does that.
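
The missing numbers are simple arithmetic. Here is a minimal sketch, assuming the composite is a weighted sum of each quote’s percent change since the first data point (one possible definition, not necessarily the one any charting site uses). The symbols, prices and function names are made up, and a negative weight is assumed to stand for a short position.

```python
def percent_change(prices):
    """Percent change of each quote relative to the first one."""
    first = prices[0]
    return [(price / first - 1.0) * 100.0 for price in prices]


def composite_change(series_by_symbol, weights):
    """Weighted sum of the percent changes of several quote series."""
    changes = {symbol: percent_change(series_by_symbol[symbol]) for symbol in weights}
    length = min(len(series) for series in changes.values())
    return [
        sum(weights[symbol] * changes[symbol][i] for symbol in weights)
        for i in range(length)
    ]


def spread(series_a, series_b):
    """Difference, in percentage points, between two percent-change series."""
    return [a - b for a, b in zip(series_a, series_b)]


if __name__ == "__main__":
    # Hypothetical quotes, not real market data.
    quotes = {"AAA": [100.0, 110.0, 121.0], "BBB": [50.0, 50.0, 45.0]}
    weights = {"AAA": 0.5, "BBB": 0.5}
    print(composite_change(quotes, weights))
    print(spread(percent_change(quotes["AAA"]), percent_change(quotes["BBB"])))
```

Changing the portfolio mix is then just a matter of passing a different weights dictionary, and spread gives the numeric gap that a chart only shows visually.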

Both would help set up a more productive environment for me and others, but they are unlikely to earn me cash in and of themselves.

Filed under Uncategorized