
How to convert ereg_replace into preg_replace


facido1


I need to convert the following two ereg_replace calls to preg_replace:

$output = ereg_replace('\[anchor="([[:graph:]]+)"\]', '<a name="\\1"></a>', $output);

$output = ereg_replace('\[link="([[:graph:]]+)"\]', '<a href="\\1">', $output);

 

 


You have bigger problems than the conversion:

  • Your pattern was always wrong, because the [[:graph:]]+ part is greedy and may consume the next anchor or link as well. If the code seemingly “worked” in the past, that's because the graph class doesn't include whitespace. But this is pure luck. Try input without whitespace to see the pattern fail miserably.
  • There are no security measures whatsoever. If the input comes from the users or can be manipulated, you're wide open to cross-site scripting attacks. Anchor elements are particularly nasty in this regard, because simple HTML-escaping isn't enough; people can still inject code with javascript: or data: URLs.
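To make the first point concrete, here is a small sketch of the greedy match going wrong. The input string and the stricter `[^"\s]+` pattern are my own illustration, not part of the original code, and the stricter pattern only addresses the matching problem, not the XSS one:

```php
<?php

// The pattern from the question, wrapped in PCRE delimiters.
$greedy = '/\[link="([[:graph:]]+)"\]/';

// Two tags with no whitespace between them.
$input = '[link="a.html"][link="b.html"]';

// [[:graph:]]+ also matches quotes and brackets, so it runs on to the
// last "] in the string and both tags collapse into one broken element.
$greedyResult = preg_replace($greedy, '<a href="$1">', $input);
echo $greedyResult, "\n"; // <a href="a.html"][link="b.html">

// Possible fix (assumption): do not allow a quote or whitespace in the value.
$strict = '/\[link="([^"\s]+)"\]/';
$strictResult = preg_replace($strict, '<a href="$1">', $input);
echo $strictResult, "\n"; // <a href="a.html"><a href="b.html">
```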

Inventing your own language and trying to parse it with regex gymnastics is rarely a good idea. Use a standard markup language like Markdown and a proper parser.

 

Parsedown looks OK. Unfortunately, it doesn't check for unsafe URLs either, so you need to extend the class a bit:

<?php

require_once '/path/to/parsedown/or/autoloader';



class SafeMarkdown extends Parsedown
{
    protected $allowedURLSchemes;

    public function __construct($allowedURLSchemes = ['http', 'https', 'mailto'])
    {
        // disable embedded HTML markup by default
        $this->setMarkupEscaped(true);

        // only accept specific URL schemes to prevent XSS attacks through javascript: or data: URIs
        $this->allowedURLSchemes = $allowedURLSchemes;
    }

    protected function inlineLink($excerpt)
    {
        $linkData = parent::inlineLink($excerpt);

        // the parent returns nothing when the excerpt is not a valid link
        if (!isset($linkData['element']['attributes']['href']))
        {
            return $linkData;
        }

        // only allow specific URL schemes
        $href = $linkData['element']['attributes']['href'];
        $url = parse_url($href);

        if ($url === false)
        {
            // report the raw href; $url itself is just false at this point
            throw new RuntimeException('Malformed URL while parsing link: '.$href);
        }

        if (isset($url['scheme']) && !in_array(strtolower($url['scheme']), $this->allowedURLSchemes, true))
        {
            throw new RuntimeException('Unexpected URL scheme while parsing link: '.$url['scheme'].' (allowed: '.implode(', ', $this->allowedURLSchemes).')');
        }

        return $linkData;
    }
}

Usage:

<?php

require_once '/path/to/class/or/autoloader';



$markdownParser = new SafeMarkdown();

echo $markdownParser->text("[I'm an inline-style link](https://www.google.com)");

// test unsafe URL scheme
echo $markdownParser->text("[I'm an inline-style link](javascript:alert('XSS'))");

What have you tried so far? Do you know what those regexes are doing now? Have you learned about PCRE? Have you checked the documentation?

 

Is the following conversion correct?

 

$output = preg_replace('/\[anchor="([[:graph:]]+)"\]/', '<a name="\\1"></a>', $output);

 

$output = preg_replace('/\[link="([[:graph:]]+)"\]/', '<a href="\\1">', $output);
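
As far as the syntax goes, the converted patterns are valid PCRE: POSIX classes such as [[:graph:]] are supported inside a bracket expression, so they should behave like the ereg versions, including the greedy matching and the missing escaping. If the custom tags have to stay for now, one possible direction is sketched below. The function name `convertLinkTags`, the allow-list, and the decision to require an explicit scheme (so relative URLs are rejected) are assumptions for illustration only. It also assumes the rest of the text is escaped elsewhere.

```php
<?php

// Hypothetical helper, not part of the original code: converts [link="..."]
// tags and keeps only URLs whose scheme is on an allow-list.
function convertLinkTags(string $text, array $allowedSchemes = ['http', 'https', 'mailto']): string
{
    $result = preg_replace_callback(
        // [^"\s]+ cannot run past the closing quote, unlike [[:graph:]]+
        '/\[link="([^"\s]+)"\]/',
        function (array $m) use ($allowedSchemes): string {
            $scheme = parse_url($m[1], PHP_URL_SCHEME);

            // No scheme, malformed URL, or a scheme that is not allowed:
            // emit the tag as escaped text instead of building a link.
            if (!is_string($scheme) || !in_array(strtolower($scheme), $allowedSchemes, true)) {
                return htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8');
            }

            // Escape the URL before placing it in the attribute.
            return '<a href="' . htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8') . '">';
        },
        $text
    );

    if ($result === null) {
        throw new RuntimeException('Regex error while converting link tags');
    }

    return $result;
}

echo convertLinkTags('[link="https://example.com/"]'), "\n";
echo convertLinkTags('[link="javascript:alert(1)"]'), "\n";
```

The anchor tag could be handled the same way, with a tighter character class for the name instead of a URL check.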


