Capturing Username


Hate

I'm trying to parse HTML and get the usernames of registered users. I can get most of the usernames, but a few have stars beside their names inside the a tag, which throws my expression off.

 

Here's an example of what I mean:

<a href="user.php?i=252341">Marssz3<img src="/images/i67.gif"></a>

 

Here's my current code:

preg_match_all('/<a href=\"user\.php\?i=.*">(.*?)<\/a>/ism', $row, $title);

 

How could I change my expression to ignore the "<img src="/images/i67.gif">" if it is present (it is not always present) and, in this particular case, grab the username "Marssz3" (though that's not static)?


First off, there's no need to escape that double quote, as it's not a metacharacter in either regular expressions or single-quoted PHP strings.

 

Then to your issue: What you want to look for is everything that's not a "<", instead of everything up to the "</a>" tag. In other words [^<]+ is what you're looking for. :)



I'm still confused. Could you explain a bit better? I learn from example.


Swap out the part of your regex that matches the name with the regex ChristianF gave you. Do you know which part of your regex matches the username? Were you the one who wrote this code?

 

I wrote my current code. I tried swapping what ChristianF gave me, but it's not working.


This matches any single character in the range inside the brackets, a-z:

[a-z]

 

This matches any single character except those in the range inside the brackets, a-z:

[^a-z]

 

The code he showed you matches every character except "<", so it stops where the next tag starts, which is where the username in your example ends.

[^<]+
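Since an example was asked for, here is a minimal sketch of the three patterns above. The string 'abc123' is made up for illustration; the second string is the inside of the link from the question.

```php
<?php
// Made-up sample string, used only for this sketch.
$sample = 'abc123';

// [a-z] matches one character in the range a to z.
preg_match_all('/[a-z]/', $sample, $letters);

// [^a-z] matches one character that is NOT in that range.
preg_match_all('/[^a-z]/', $sample, $others);

// [^<]+ matches a run of one or more characters, stopping before the next "<".
preg_match('/[^<]+/', 'Marssz3<img src="/images/i67.gif">', $name);

print_r($letters[0]); // the letters a, b, c
print_r($others[0]);  // the digits 1, 2, 3
echo $name[0], "\n";  // Marssz3
```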

 

If it's not working, it's probably because you still have "<\/a>" at the end. "[^<]+" stops at the "<" of the <img> tag, so the "<\/a>" that follows it in your regex cannot match there. You could either remove "<\/a>" from the end of your regex or add a part that matches the img tag.
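A rough sketch of the second option, adding the img tag to the regex as an optional part. The second link ("Plainuser") is a hypothetical row without the image, and matching the id with \d+ assumes the ids are always numeric.

```php
<?php
// First link is from the question; the second is a hypothetical row without the image.
$html = '<a href="user.php?i=252341">Marssz3<img src="/images/i67.gif"></a>'
      . '<a href="user.php?i=252342">Plainuser</a>';

// Requiring </a> right after [^<]+ misses the first link:
// [^<]+ stops at the "<" of <img>, and "<img" is not "</a>".
$strict = '#<a href="user\.php\?i=\d+">([^<]+)</a>#i';

// Allow one optional <img ...> tag between the name and </a>.
$withImg = '#<a href="user\.php\?i=\d+">([^<]+)(?:<img[^>]*>)?</a>#i';

preg_match_all($strict, $html, $a);
preg_match_all($withImg, $html, $b);

print_r($a[1]); // only Plainuser
print_r($b[1]); // Marssz3 and Plainuser
```

Using # as the delimiter means the slash in </a> does not need escaping.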


$regEx = '#<a href="user\.php\?i=[^>]+>([^<]*)#ism';
preg_match_all($regEx, $text, $title);

 

Although, I will offer another suggestion. I have sometimes had a similar problem and found that building the perfect regex is either not possible, too much work, or not efficient. In those cases, look to other means to post-process the data. In this instance, if there were no good regex for what you needed, you could simply have used strip_tags() on the result to remove the image tag.
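A minimal sketch of that post-processing route, assuming the captured text only needs its markup removed. The pattern here is a simplified variant for illustration, not the one from the question.

```php
<?php
$row = '<a href="user.php?i=252341">Marssz3<img src="/images/i67.gif"></a>';

// Capture everything inside the link, image tag included.
preg_match_all('#<a href="user\.php\?i=[^"]+">(.*?)</a>#is', $row, $title);

// Then strip any markup from each captured value.
$names = array_map('strip_tags', $title[1]);

print_r($names); // Marssz3
```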
