
How to parse specific text from a file?


yusufali


Let's say I have a file located here: http://www.chess.com/groups/team_match?id=108425.html

 

*The values a, b, and c below (highlighted in red in the original post) each occur once on the page.

 

Assuming a is an integer, how can I get the value of a?

<div class="content left">

  <span class="left big-icon players"></span>

  <h1 class="page-title">World League 2012 R1: "Team Russia" vs. "Team Australia"</h1>

  <ul class="list no-border clearfix bottom-16 left-170">

    <li>

      <aside class="rail"><span id="startingpositiondiagram" class="chessfenboard20 right" style="margin: 0 auto; height: 162px; width: 162px;"></span></aside>

 

            <table class="default simple border-top clearfix alternate">

        <tr>

          <td>Players per Team:</td>

          <td><strong>a</strong></td>

          <td> </td>

            <td>Started On:</td>

          <td><strong>Mar 10, 2012</strong></td>

          </tr>

        <tr>

 

Also, on the same page:

&nbsp;</strong></td><td class="align-right"><strong>= b</strong></td><td> </td><td class="align-left"><strong>= c</strong>

 

How could I get the values of b and c, assuming they are integers, in separate variables of course?

 

I tried using something like

$url = file('http://....');

and tried looping through each line to somehow get close.

I know there is a way to use delimiters (like in Java), but I can never understand the tutorials that I've read.


Something like this:

 

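<?php
// Download the whole page as a single string.
$raw_html = file_get_contents('http://www.chess.com/groups/team_match?id=108425.html');
// Capture the digits in the <strong> cell that follows the "Players per Team:" label.
$expression = '#<td>Players per Team:</td>\s*<td><strong>(\d+)</strong></td>#i';
preg_match_all($expression, $raw_html, $results);
// $results[0] holds the full matches, $results[1] the captured numbers.
print_r($results);
?>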
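
To get a into an integer variable rather than a printed array, something along these lines might work. It is only a sketch: the `$raw_html` string below is a hypothetical stand-in for the downloaded page, and the value 25 is made up for illustration. Since the label occurs once on the page, `preg_match()` (first match only) should be enough.

```php
<?php
// Hypothetical sample standing in for the downloaded page; the value 25 is made up.
$raw_html = <<<'HTML'
<tr>
  <td>Players per Team:</td>
  <td><strong>25</strong></td>
</tr>
HTML;

// Single quotes pass \s and \d to the regex engine unchanged.
$expression = '#<td>Players per Team:</td>\s*<td><strong>(\d+)</strong></td>#i';

// preg_match() stops at the first match and returns 1 on success, 0 on no match.
if (preg_match($expression, $raw_html, $match)) {
    $a = (int) $match[1]; // capture group 1 holds the digits
} else {
    $a = null; // label not found, for example because the page layout changed
}

var_dump($a);
```

Checking the return value before reading `$match[1]` avoids an undefined-index notice when the pattern does not match.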

Alright, that works perfectly. Now, when I try to get the value of b,

I used

 

"#nbsp;</strong></td><td class=\"align-right\"><strong>(\d+)</strong></td>#i";

 

as my $expression,

and as my output I got

Array ( [0] => Array ( ) [1] => Array ( ) )

which isn't the value of b

<?php
$raw_html = file_get_contents('http://www.chess.com/groups/team_match?id=108425.html');
// The cell content is "= 212", so the pattern must include the "= " before the digits.
$expression = '#nbsp;</strong></td><td class="align-right"><strong>= (\d+)</strong></td>#i';
preg_match_all($expression, $raw_html, $results);
print_r($results);
?>

 

Gives me

 

Array
(
    [0] => Array
        (
            [0] => nbsp;</strong></td><td class="align-right"><strong>= 212</strong></td>
        )

    [1] => Array
        (
            [0] => 212
        )

)
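
If b and c are both needed in separate variables, one pattern with two capture groups is one way to do it. This is a rough sketch and not tested against the live page: the `$raw_html` fragment is a hypothetical sample modelled on the markup quoted in the question, 212 is the value from the output above, and 188 is made up.

```php
<?php
// Hypothetical fragment modelled on the markup quoted in the question.
// 212 comes from the output above; 188 is a made-up value for c.
$raw_html = '&nbsp;</strong></td><td class="align-right"><strong>= 212</strong></td>'
          . '<td> </td><td class="align-left"><strong>= 188</strong>';

// Group 1 captures b (align-right cell), group 2 captures c (align-left cell).
// <td>.*?</td> skips the spacer cell whatever it contains; the "s" flag lets "." span newlines.
$expression = '#<td class="align-right"><strong>= (\d+)</strong></td>'
            . '<td>.*?</td>'
            . '<td class="align-left"><strong>= (\d+)</strong>#is';

$b = null;
$c = null;
if (preg_match($expression, $raw_html, $match)) {
    $b = (int) $match[1];
    $c = (int) $match[2];
}

var_dump($b, $c);
```

Anchoring on the `align-right` and `align-left` classes instead of the leading `nbsp;` is a design choice made here for readability; it assumes those two cells appear in this order only once on the page, as the question states.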

Alright, I figured out what I did wrong; thanks for the help.

But if the value of b or c is a decimal (e.g., 80.5), as in this example:

 

http://www.chess.com/groups/team_match.html?id=108517

 

it returns that odd output I was getting before.

Is there any fix for this?
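
A likely cause is that `\d+` matches digits only, so it stops at the decimal point and the following `</strong>` no longer lines up. One possible fix is to allow an optional fractional part with `\d+(?:\.\d+)?` and cast to float. The sketch below assumes the page writes decimals with a dot, as in 80.5; the `$raw_html` fragment is a hypothetical sample, and 63 is a made-up value for c.

```php
<?php
// Hypothetical fragment; 80.5 is the example from the question, 63 is made up.
$raw_html = '&nbsp;</strong></td><td class="align-right"><strong>= 80.5</strong></td>'
          . '<td> </td><td class="align-left"><strong>= 63</strong>';

// \d+(?:\.\d+)? matches "80" or "80.5": digits, then optionally a dot and more digits.
$number = '(\d+(?:\.\d+)?)';

$expression = '#<td class="align-right"><strong>= ' . $number . '</strong></td>'
            . '<td>.*?</td>'
            . '<td class="align-left"><strong>= ' . $number . '</strong>#is';

$b = null;
$c = null;
if (preg_match($expression, $raw_html, $match)) {
    $b = (float) $match[1]; // cast to float so whole and fractional scores share one type
    $c = (float) $match[2];
}

var_dump($b, $c);
```

The same pattern still matches whole numbers such as 212, so it should work for both example pages.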

Archived

This topic is now archived and is closed to further replies.
