
How do I open another site's page and strip text from it for my page?


babysitter


I want to fetch a specific page from another site, read its HTML, pick out specific parts of it, and display those parts on my own page.

I have used the following on a website to collect the latitude and longitude from another site for use in Google Maps, because in the UK we cannot use Geocode: the postcode/zipcode data is copyrighted.

I am afraid I can't work out exactly what it is doing to get the lat and long. My reverse engineering skills are letting me down.

Can anyone help, please?

 

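<?php
// Only run when a postcode was passed as ?s=...
// ($HTTP_GET_VARS was removed in PHP 5.4; $_GET replaces it)
if (isset($_GET['s'])) {
    $html = "";

    // file() returns the remote page as an array of lines (needs allow_url_fopen)
    $URL = file("http://www.schoolswebdirectory.co.uk/schoolinfo2.php?ref=29571".str_replace(" ","",$_GET['s'])."&advanced=&client=public&addr2=&quicksearch=".urlencode($_GET['s'])."&addr3=&addr1=");

    // Join the lines back into one string
    foreach ($URL as $url) {
        $html = $html.$url;
    }

    // Drop all tags, then split the remaining text on "("
    $html = strip_tags($html);
    $page = explode("(", $html);

    // The text before the first ")" in pieces 2 and 3 is taken as lat and long
    $lat = explode(")", $page[2]);
    $longg = explode(")", $page[3]);

    // explode() returns an array, so print its first element
    echo $lat[0];
    echo $longg[0];
}
?>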


It is breaking up the page at the "(" and ")" characters. It relies on there being a fixed number of "(" characters before the latitude and longitude turn up. It's a very unsophisticated method :)

Try printing out each variable used during the processing to get a better understanding of what's going on. In particular, print out $page with var_dump($page).
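
To see the splitting in isolation, here is a minimal sketch. The sample string and the coordinate values in it are made up for illustration (the real page text is not shown in this thread); only the explode() steps mirror the posted code.

```php
<?php
// Hypothetical stand-in for the page text after strip_tags();
// the wording and the coordinates are invented for this example.
$html = "Map of LA9 4PH (postcode) Lat: (54.3280) Lon: (-2.7460) more text";

// Split on "(": every piece after the first starts just past a "(".
$page = explode("(", $html);
var_dump($page);

// Pieces 2 and 3 begin with the numbers; cut each one at its ")".
$lat = explode(")", $page[2]);
$longg = explode(")", $page[3]);

// explode() returns arrays, so the values are in element 0.
echo $lat[0], "\n";
echo $longg[0], "\n";
```

If the remote page ever gains or loses a "(" before the coordinates, the indexes 2 and 3 will point at the wrong pieces, which is why the method is fragile.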


Of course, it would help if I had included the original code and not the piece I was working on! Doh!

Thank you for your replies, much appreciated.

 

 

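<?php
// Only run when a postcode was passed as ?s=...
// ($HTTP_GET_VARS was removed in PHP 5.4; $_GET replaces it)
if (isset($_GET['s'])) {
    $html = "";

    // file() returns the remote page as an array of lines (needs allow_url_fopen)
    $URL = file("http://www.multimap.com/map/browse.cgi?client=public&search_result=&db=pc&lang=&keepicon=true&pc=".str_replace(" ","",$_GET['s'])."&advanced=&client=public&addr2=&quicksearch=".urlencode($_GET['s'])."&addr3=&addr1=");

    // Join the lines back into one string
    foreach ($URL as $url) {
        $html = $html.$url;
    }

    // Drop all tags, then split the remaining text on "("
    $html = strip_tags($html);
    $page = explode("(", $html);

    // The text before the first ")" in pieces 2 and 3 is taken as lat and long
    $lat = explode(")", $page[2]);
    $longg = explode(")", $page[3]);
}
?>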


I am getting increasingly frustrated trying to extract information from the following HTML. The text I want is surrounded by so much markup that I cannot write a regular expression that works. Can anyone help me with regular expressions to extract the highlighted bits below, please?

 

 

<tr align="left">

            <td width="80" height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">School: </font></div></td>

            <td width="227" valign="bottom" class="listtext1"><font size=2>Kendal Nursery School</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Street: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>Queens Road</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Town: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>Kendal</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">County: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>Cumbria</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Postcode: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>LA9 4PH</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="top" class="listtext1"> </td>

            <td height="22" valign="bottom" class="listtext1"> </td>

<td height="22" valign="top" class="listtext1">

 

 

 

 

      </td>

          </tr>

          <tr align="left">

            <td height="23" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">School Website: </font> </div></td>

            <td height="23" colspan="2" valign="bottom" class="listtext1"><font size="2"><A HREF='http://www.kendalnurseryschoolbrantfield.co.uk/'>http://www.kendalnurseryschoolbrantfield.co.uk/</A><BR></font></td>

The full HTML is below if required.

<body bgcolor="#FEF5DE">

 

<CENTER>

  <table width="120" height="8" border="0" cellspacing="0" cellpadding="0">

    <tr>

      <td> </td>

    </tr>

  </table>

  <table width="615" height="363" border="1" cellpadding="0" cellspacing="0" bordercolor="#999999" bgcolor="#FFFFFF">

    <tr>

      <td width="611" height="361"><table width="96%" height="431" border="0" align="center" cellpadding="2" cellspacing="0" bgcolor="#FFFFFF">

        <tbody>

          <tr align="left">

            <td colspan="2" valign="top" class="pagetext2"></td>

            <td valign="top" class="pagetext2"></td>

          </tr>

          <tr align="left" valign="top">

            <td height="64" colspan="3" class="pagetext2"><div align="left"><font color="#FF0000" size="6" face="Arial, Helvetica, sans-serif">schools</font><font color="#009900" size="6" face="Arial, Helvetica, sans-serif"><font color="#FF6600">web</font><font color="#FF0000">directory</font></font><font color="#FF6600" size="4">.co.uk</font></div>

                <font size="2" color="#000000"><u>School

                  Information</u></font>

                </td>

          </tr>

          <tr align="left" valign="middle">

            <td height="23" valign="bottom" class="listtext1"><div align="left">

              <div align="right"><font color="#000000" size="1">Our Ref No : </font></div>

            </div></td>

            <td height="23" valign="bottom" class="pagetext2"><font size= 2 color = red >31443</font> </td>

            <td width="268" rowspan="6" class="pagetext2"><div align="center"> <a href="image-link.php?ref=31443"> <img src="images/180x120.jpg" width="180" height="120" border="1" /></a> <br />

            </div></td>

          </tr>

          <tr align="left">

            <td width="80" height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">School: </font></div></td>

            <td width="227" valign="bottom" class="listtext1"><font size=2>Kendal Nursery School</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Street: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>Queens Road</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Town: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>Kendal</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">County: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>Cumbria</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Postcode: </font></div></td>

            <td height="22" valign="bottom" class="listtext1"><font size=2>LA9 4PH</font></td>

          </tr>

          <tr align="left">

            <td height="22" valign="top" class="listtext1"> </td>

            <td height="22" valign="bottom" class="listtext1"> </td>

<td height="22" valign="top" class="listtext1">

 

 

 

 

      </td>

          </tr>

          <tr align="left">

            <td height="23" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">School Website: </font> </div></td>

            <td height="23" colspan="2" valign="bottom" class="listtext1"><font size="2"><A HREF='http://www.kendalnurseryschoolbrantfield.co.uk/'>http://www.kendalnurseryschoolbrantfield.co.uk/</A><BR></font></td>

            </tr>

          <tr align="left">

            <td height="23" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Alumni Website: </font> </div></td>

            <td height="23" colspan="2" valign="bottom" class="listtext1"><font size="2"><A HREF=''></A><BR></font></td>

            </tr>

          <tr align="left">

            <td height="23" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Tel: </font> </div></td>

            <td height="23" valign="bottom" class="listtext1"><font size="2">01539 773626</font></td>

            <td height="23" valign="top" class="listtext1"><font size="2">click for <a href="http://uk8.multimap.com/map/browse.cgi?client=public&db=pc&addr1=&client=public&addr2=&advanced=&addr3=&pc=LA9 4PH" target="_blank" class="link_submenu">map</a></font></td>

          </tr>

          <tr align="left">

            <td height="23" valign="bottom" class="listtext1" ><div align="right" class="listtext1">Fax:</div></td>

            <td height="23" valign="bottom" class="listtext1"><font size="2">01539 773626</font></td>

            <td valign="top" class="pagetext2"></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Type:</font></div></td>

            <td height="22" valign="bottom" class="listtext1">

nur   <font size = -1 color = red>Independent</font></td>

            <td valign="top" class="pagetext2"><span class="tabletext1"><a href="i-spy360.php"></a></span></td>

          </tr>

          <tr align="left">

            <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">LEA:</font></div></td>

            <td height="22" valign="bottom" class="listtext1">Cumbria</td>

            <td valign="top" class="pagetext2"><a href="i-spy360.php"></a></td>

          </tr>

          <tr align="left">

            <td height="23" valign="bottom" class="tabletext1"> </td>

            <td height="23" valign="bottom" class="pagetext2"> </td>

            <td valign="top" class="pagetext2"><div align="right"><a href="javascript: history.back(1)" class="link_submenu">BACK</a> </div></td>

             

<!-- http://www.multimap.com/map/browse.cgi?client=public&pc=NR2%204DX -->

          </tr>

          <tr align="left" valign="middle">

            <td height="74" colspan="3" class="pagetext2"><div  class="listtext2" align="center"><font face="Geneva, Arial, Helvetica, sans-serif" size="1"><br />

                  </font><font face="Geneva, Arial, Helvetica, sans-serif">To correct or change any of the above information</font><font face="Geneva, Arial, Helvetica, sans-serif" size="1"><span class="listtext2">please email us <a href="mailto:updates@schoolswebdirectory.co.uk?Subject=School Ref 31443, Kendal Nursery School, LA9 4PH - Updates!">here</a></span> </font></font></div>

              <p> </p>

              <div align="center"><font face="Geneva, Arial, Helvetica, sans-serif" size="1">Copyright

              © Deepspace Web Services Ltd 1999-2007, All rights reserved</font></div></td>

          </tr>

        </tbody>

      </table></td>

    </tr>

  </table>

  <p> </p>

</CENTER>

<p> </p>

</body>
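
One possible approach, offered as a rough sketch rather than a tested solution for the live page: match each label cell and capture the cell that follows it. The sample input is two rows copied from the HTML above; the variable names ($pattern, $info) are my own, and the pattern assumes every label sits in a font tag with exactly the attributes shown.

```php
<?php
// Two rows copied from the posted HTML, used as sample input.
$html = <<<'HTML'
<tr align="left">
  <td width="80" height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">School: </font></div></td>
  <td width="227" valign="bottom" class="listtext1"><font size=2>Kendal Nursery School</font></td>
</tr>
<tr align="left">
  <td height="22" valign="bottom" class="listtext1"><div align="right"><font color="#000000" size="1">Postcode: </font></div></td>
  <td height="22" valign="bottom" class="listtext1"><font size=2>LA9 4PH</font></td>
</tr>
HTML;

// Group 1: the label text before the colon.
// Group 2: the raw contents of the next <td>.
// Modifier s lets "." span newlines; i ignores tag case.
$pattern = '~<font color="#000000" size="1">\s*([^<:]+?)\s*:\s*</font>.*?</td>\s*<td\b[^>]*>(.*?)</td>~is';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

// Build label => value, stripping any tags left inside the value cell.
$info = array();
foreach ($matches as $m) {
    $info[trim($m[1])] = trim(strip_tags($m[2]));
}

print_r($info);
```

Regular expressions against HTML break as soon as the markup changes, so if the DOM extension is available, loading the page with DOMDocument and walking the table cells may prove more robust.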

 
