
How to parse specific text from a file?


yusufali


Let's say I have a file located here: http://www.chess.com/groups/team_match?id=108425.html

*Everything highlighted in red occurs once on the page.

Assuming a is an integer, how can I get the value of a?

<div class="content left">

  <span class="left big-icon players"></span>

  <h1 class="page-title">World League 2012 R1: "Team Russia" vs. "Team Australia"</h1>

  <ul class="list no-border clearfix bottom-16 left-170">

    <li>

      <aside class="rail"><span id="startingpositiondiagram" class="chessfenboard20 right" style="margin: 0 auto; height: 162px; width: 162px;"></span></aside>

 

            <table class="default simple border-top clearfix alternate">

        <tr>

          <td>Players per Team:</td>

          <td><strong>a</strong></td>

          <td> </td>

            <td>Started On:</td>

          <td><strong>Mar 10, 2012</strong></td>

          </tr>

        <tr>

 

Also, on the same page:

&nbsp;</strong></td><td class="align-right"><strong>= b</strong></td><td> </td><td class="align-left"><strong>= c</strong>

 

How could I get the values of b and c, assuming they are integers, in separate variables of course?

I tried using something like

$url = file('http://....');

and tried looping through each line to somehow get close.

I know there is a way to use delimiters (like in Java),

but I can never understand the tutorials that I've read.


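Something like this:

<?php
// Download the whole page as a single string.
$raw_html = file_get_contents('http://www.chess.com/groups/team_match?id=108425.html');
// Capture the digits inside <strong> in the cell that follows "Players per Team:".
$expression = "#<td>Players per Team:</td>\s*<td><strong>(\d+)</strong></td>#i";
preg_match_all($expression, $raw_html, $results);
// $results[0] holds the full matches, $results[1] the captured digits (as strings).
print_r($results);
?>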
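
If you only need the single number in a variable, a minimal sketch along these lines might help. It assumes the label occurs once on the page (as the question says), and it runs against a copy of the markup from the question instead of the live site. The value 25 is a made-up placeholder for a.

```php
<?php
// Markup copied from the question; 25 is a hypothetical stand-in for "a".
// On the live page this string would come from file_get_contents().
$raw_html = <<<'HTML'
<tr>
  <td>Players per Team:</td>
  <td><strong>25</strong></td>
  <td> </td>
  <td>Started On:</td>
  <td><strong>Mar 10, 2012</strong></td>
</tr>
HTML;

$expression = '#<td>Players per Team:</td>\s*<td><strong>(\d+)</strong></td>#i';

// preg_match() stops at the first match, which is enough when the label occurs once.
// $match[1] is the first capture group, returned as a string, so cast it.
if (preg_match($expression, $raw_html, $match) === 1) {
    $a = (int) $match[1];
} else {
    $a = null; // pattern not found, e.g. the page layout changed
}

var_dump($a);
```

Checking the return value matters because preg_match() returns 0 when nothing matches, and then $match[1] would not exist.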


Alright, that works perfectly. Now, when I try to get the value of b,

I used

 

"#nbsp;</strong></td><td class=\"align-right\"><strong>(\d+)</strong></td>#i";

 

as my $expression,

and as my output I got

Array ( [0] => Array ( ) [1] => Array ( ) )

which isn't the value of b.


<?php
$raw_html = file_get_contents('http://www.chess.com/groups/team_match?id=108425.html');
// The cell contains "= " before the number, so the pattern has to include it as well.
$expression = "#nbsp;</strong></td><td class=\"align-right\"><strong>= (\d+)</strong></td>#i";
preg_match_all($expression, $raw_html, $results);
print_r($results);
?>

 

Gives me

 

Array
(
    [0] => Array
        (
            [0] => nbsp;</strong></td><td class="align-right"><strong>= 212</strong></td>
        )

    [1] => Array
        (
            [0] => 212
        )

)
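
To get b and c into separate variables in one pass, something like the sketch below could work. It is only an illustration built on the fragment quoted in the question: 212 is the value of b from the output above, 198 is a made-up value for c, and the pattern assumes the middle `<td>` contains no nested tags.

```php
<?php
// Fragment from the question; 198 is a hypothetical stand-in for "c".
$raw_html = '&nbsp;</strong></td><td class="align-right"><strong>= 212</strong></td>'
          . '<td> </td><td class="align-left"><strong>= 198</strong>';

// Two capture groups: the align-right cell (b) and the align-left cell (c).
// [^<]* lets the middle cell hold a space, an entity, or nothing at all.
$expression = '#<td class="align-right"><strong>=\s*(\d+)</strong></td>'
            . '\s*<td>[^<]*</td>\s*'
            . '<td class="align-left"><strong>=\s*(\d+)</strong>#i';

$b = null;
$c = null;
if (preg_match($expression, $raw_html, $match) === 1) {
    $b = (int) $match[1]; // first capture group
    $c = (int) $match[2]; // second capture group
}

var_dump($b, $c);
```

Both values stay null if the pattern does not match, so it is worth checking for that before using them.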
