
regex not working


SidewinderX


OK, so I have this script that displays the content of a webpage:

[code] <?php
$url = "https://www.novaworld.com/Players/Stats.aspx?id=33680801261&p=616065";

//Gets the content of the submitted URL
$content = file_get_contents($url);

//Strips the tags from $content
$strip = strip_tags($content);

//Keeps everything from the first 'Hawk Down' onward (case-insensitive)
$remove = stristr($strip, 'Hawk Down');

//Stores the trimmed text as $stats
$stats = $remove;
echo $stats;
?> [/code]

What I want to do is parse the name Ozymandias. from the $stats string. While trying to build my regex I used pseudo-code (a static test string) so I could avoid the long load of getting the actual content of the website. The pseudo-code is below:

[code]<?php
$string = 'Down Ozymandias. Rank: 1-Star General ( 14 ) PCID: A13-3E71AC Player Created: June 9, 2006 TABLE.statsx ';
preg_match('#Down (.*?) Rank#', $string, $matches);
$str = $matches[1];
echo $str;
?>[/code]

The $string is a word-for-word excerpt from the $stats string in the first code. So I figured I could use $string to test out my regex and, once it worked, just replace the static $string with the dynamic string ($stats). The regex above works fine with the static string, but when I incorporated it into the larger script it does not work.

[code]<?php
$url = "https://www.novaworld.com/Players/Stats.aspx?id=33680801261&p=616065";

//Gets the content of the submitted URL
$content = file_get_contents($url);

//Strips the tags from $content
$strip = strip_tags($content);

//Keeps everything from the first 'Hawk Down' onward (case-insensitive)
$remove = stristr($strip, 'Hawk Down');

//Stores the trimmed text as $stats
$stats = $remove;

preg_match('#Down (.*?) Rank#', $stats, $matches);
$str = $matches[1];
echo $str;

?>[/code]

I have NO idea what's wrong; this is the only part of the code that is holding me back from completing this script. If someone could please help me solve this problem I would really appreciate it.

Thank you

You need to do some debugging. I got an error (failed to open stream) when using your code; although, this could be a configuration issue on my part. Are your errors on? Have you echoed out each step to see that it did what you expected?
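
A minimal sketch of that kind of debugging, in case it helps. The helper name fetch_page() is hypothetical (it is not part of the original script); the idea is only to turn errors on and make a failed fetch stop the script instead of passing false down to the regex.

[code]<?php
// Show every notice and warning while debugging.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: fetch a URL or file and fail loudly if it cannot be read.
function fetch_page(string $url): string
{
    // file_get_contents() returns false on failure, so check for it explicitly.
    $content = @file_get_contents($url);
    if ($content === false) {
        $err = error_get_last();
        $reason = $err['message'] ?? 'unknown error';
        throw new RuntimeException("Could not read $url: $reason");
    }
    return $content;
}
?>[/code]

With something like this in place, each later step (strip_tags, stristr, preg_match) can be echoed one at a time, knowing the input was actually fetched.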

I believe that's a configuration issue on your part. Check your phpinfo() output and see if https is a registered stream; I don't think it is.

For me:
[quote][b]Registered PHP Streams[/b]  php, file, http, ftp, compress.zlib, compress.bzip2, https, ftps [/quote]
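
The same check can be done from code rather than from the phpinfo() page. This is only a sketch: has_stream_wrapper() is a hypothetical helper name, while stream_get_wrappers() is the built-in that returns the registered list. As far as I know, the https wrapper is only registered when the openssl extension is loaded, and allow_url_fopen has to be on for file_get_contents() to open URLs at all.

[code]<?php
// Hypothetical helper: is the given stream wrapper registered in this PHP build?
function has_stream_wrapper(string $scheme): bool
{
    return in_array($scheme, stream_get_wrappers(), true);
}

if (!has_stream_wrapper('https')) {
    echo "https is not a registered stream, so https:// URLs cannot be opened here.\n";
}
?>[/code]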


EDIT:
Well, at first my errors weren't on, but I enabled them and there were only a few warnings due to undefined variables. I fixed those, but as suspected that didn't fix the overall problem. And yes, I have echoed every step of the way and it does exactly what I want. In fact, I have parsed 100 stats from this page using a different parsing method and they all work fine. However, because I am parsing a name, which can be up to 16 alphanumeric characters including hyphens, underscores and spaces (and not just a number), only a regex would work for this.

Thank You for your input, any other ideas?

p.s. when i echo $stats in the first code it yields this:
[quote]Hawk Down Ozymandias. Rank: 1-Star General ( 14 ) PCID: A13-3E71AC Player Created: June 9, 2006 TABLE.statsx TD, TABLE.statsx TH { height: 24px; font-size: 10pt; white-space: nowrap; } TABLE.statsx TD.left { background: url(/Images/Stats/cont_grn_01.gif); width: 2px; } TABLE.statsx TH { background: url(/Images/Stats/cont_grn_02.gif); text-align: left; font-weight: bold; } TABLE.statsx TD { background: url(/Images/Stats/cont_grn_02.gif); text-align: right; } TABLE.statsx TD.right { background: url(/Images/Stats/cont_grn_03.gif); width: 2px; } TABLE.statsx TR.spacer TD { background: url(/Images/Stats/spacer.gif); height: 2px; } Total Team Games Played: 568 Total Time Played: 4d 6h 50m Total Kills: 9596 Team Win Percentage: 60.21% Favorite Weapon Class: Assault Rifle Minutes in Zone (KOTH) 1300 Flags Captured (FB) 2 Targets Destroyed (A&D) 258 Awards Received Army Commendation Medal with 1 Bronze Oak Leaf Cluster Bronze Star with 1 Bronze Oak Leaf Cluster CQB Badge 1st Award Headhunter's Medal 1st Award Hill Giant Medal 1st Award Combat Infantryman Badge 1st Award Marksman Badge Combat Medical Badge 1st Award Purple Heart (12) Sapper's Badge 1st Award Overall Statistics TABLE.statsy TD, TABLE.statsy TH { height: 24px; font-size: 10pt; } TABLE.statsy TD.left { background: url(/Images/Stats/cont_blue_01.gif); width: 2px; } TABLE.statsy TH { background: url(/Images/Stats/cont_blue_02.gif); text-align: left; font-weight: bold; } TABLE.statsy TD { background: url(/Images/Stats/cont_blue_02.gif); text-align: right; } TABLE.statsy TD.right { background: url(/Images/Stats/cont_blue_03.gif); width: 2px; } TABLE.statsy TR.spacer TD { background: url(/Images/Stats/spacer.gif); height: 2px; } TABLE.statsz TD, TABLE.statsz TH { height: 24px; font-size: 10pt; } TABLE.statsz TD.left { background: url(/Images/Stats/cont_blk_01.gif); width: 2px; } TABLE.statsz TH { background: url(/Images/Stats/cont_blk_02.gif); text-align: left; font-weight: bold; white-space: nowrap; } 
TABLE.statsz TD { background: url(/Images/Stats/cont_blk_02.gif); text-align: center; } TABLE.statsz TD.right { background: url(/Images/Stats/cont_blk_03.gif); width: 2px; } TABLE.statsz TR.spacer TD { background: url(/Images/Stats/spacer.gif); height: 2px; } TABLE.statsz TR.header TH { background: url(/Images/Stats/spacer.gif); text-align: center; } IMG[/quote]
and the regex is set up perfectly to parse that because it works in my pseudo code

That's exactly the problem. I know the regex works fine; it just has to be that the content of the string isn't matching the regex pattern.

When the content is passed through these functions

$content = file_get_contents($url);
$strip = strip_tags($content);

and then echoed, the output (echo $strip;) must be different than the actual content ($strip), no?

Like, $strip might equal "Down<br>Ozymandias.<br>Rank", but when it is output the HTML is not displayed or something...

Try [tt]echo $stats;[/tt] and then view the source to see what's really there. The HTML display is not necessarily an accurate depiction. It sounds like you may also need the [tt]/s[/tt] switch so that the . matches new lines.
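
A rough sketch of both points. The helper show_invisibles() and the $sample string are made up for illustration (the sample just assumes line breaks sit between "Down" and the name); addcslashes() and the [tt]/s[/tt] modifier are standard PHP/PCRE.

[code]<?php
// Hypothetical helper: print control characters as escapes (\n, \t, ...)
// so they show up even when the browser collapses whitespace.
function show_invisibles(string $text): string
{
    return addcslashes($text, "\0..\37");
}

// Assumed sample: line breaks, not a single space, separate the pieces.
$sample = "Hawk Down\n\n\nOzymandias.\nRank: 1-Star General";

echo show_invisibles($sample), "\n";

// Without /s the dot stops at a line break, so this finds nothing.
$without = preg_match('#Down(.*?)Rank#', $sample);

// With /s the dot also matches line breaks; trim() drops the padding.
$with = preg_match('#Down(.*?)Rank#s', $sample, $m);
$name = $with === 1 ? trim($m[1]) : null;
?>[/code]

Note that the original pattern has a literal space after "Down", so [tt]/s[/tt] on its own would not be enough if the real text has line breaks there instead.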

The source looks like this for the portion I'm trying to parse (the whole source is attached):

[quote]Player Statistics for Delta Force - Black Hawk Down










Ozymandias.
Rank: 1-Star General ( 14 )
[/quote]

EDIT: OK, this script works with the regex and does exactly what I want, but I need some help cleaning it up (don't laugh LOL):

[code]<?php
$url = "https://www.novaworld.com/Players/Stats.aspx?id=8542545489&p=616065";

//Gets the content of the submitted URL
$content = file_get_contents($url);

//Strips the tags from $content
$strip = strip_tags($content);
preg_match('#Down










(.*?)
Rank#', $strip, $matches);
$str = $matches[1];
echo $str;
?>[/code]
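
One possible cleanup, offered as a sketch rather than the definitive fix: replace the literal line breaks in the pattern with \s+, which matches any run of spaces, tabs, or line breaks. The function name extract_player_name() is hypothetical, and this assumes only whitespace separates "Down", the name, and "Rank:" in the stripped text.

[code]<?php
// Hypothetical helper: return the text between "Down" and "Rank:", or null.
function extract_player_name(string $text): ?string
{
    // \s+ absorbs however many spaces or line breaks surround the name;
    // (.+?) lazily captures the name itself, which may contain spaces.
    if (preg_match('#Down\s+(.+?)\s+Rank:#s', $text, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Usage with the page from the script above (not run here):
// $content = file_get_contents($url);
// $name = $content === false ? null : extract_player_name(strip_tags($content));
?>[/code]

Because the pattern no longer depends on an exact count of line breaks, the same function should handle both the one-line test string and the multi-line page source.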

[attachment deleted by admin]
