
DomDocument issues. Urgent


GameYin


I have a form where users enter a URL, and the script needs to pull data from that page (all of the information is within a div that has class="page-main-content"). I need to select the first occurrence of an H1 element, along with a handful of other HTML elements. Can anyone help? I have this code as my test.php page. Then, in the URL field of the form, I enter the URL of test2.php.

 

test.php

<?php
if(true)
{
  if(!isset($_POST['submit']))
  {
  ?>
    <form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]); ?>" method="post">
    <label for="url">Enter the URL of the article:</label> <input id="url" name="URL" type="text" />
    <label for="submit"><input id="submit" class="button" name="submit" type="submit" /></label></form>
  <?php
  }
  else if(filter_var($_POST['URL'], FILTER_VALIDATE_URL) === false)
  {
  ?>
    <div class="error"><p>Error: The URL you entered was invalid. Please try again</p></div>
    <form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]); ?>" method="post">
    <label for="url">Enter the URL of the article:</label> <input id="url" name="URL" type="text" />
    <label for="submit"><input id="submit" class="button" name="submit" type="submit" /></label></form>
  <?php
  }
  else
  {
    $url=$_POST['URL'];
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = FALSE;
    $doc->loadHTMLFile($url);
    $emailContents=array();
    $xpath=new DomXPath($doc);


    $h1Found=false;


    //Find element with class="page-main-content"
    $results=$xpath->query("//*[contains(@class, 'page-main-content')]");
    if ($results !== false) // DOMXPath::query() returns false on failure, not null
    {
      foreach ($results as $element)
      {
        $nodes = $element->childNodes;
        foreach ($nodes as $node)
        {
          if(trim($node->textContent, " \n\r\t\0\xC2\xA0")!=='' && $node->nodeName==='h1' && !$h1Found)
          {
            echo "THIS IS FINDING THE H1-END".$node->textContent."<br>";
            $h1Found=true;
          }
          elseif(trim($node->textContent, " \n\r\t\0\xC2\xA0")!=='')
          {
            echo $node->textContent. "<br>";
          }
        }
      }
    }
  }
}
?>

Please ignore the stupid stuff like if(true) because I removed the condition for security reasons.
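
A side note on the trim() calls in test.php: trim() treats its second argument as a list of single bytes, so "\xC2\xA0" strips any leading or trailing 0xC2 or 0xA0 byte, not only a whole non-breaking space. That is harmless for the empty-string check, but it could mangle text that ends in a character such as "à" (bytes C3 A0) if the trimmed value is ever reused. A minimal sketch of a character-aware alternative follows; the function name trimUnicodeSpace is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): strips ASCII whitespace,
// NUL, and U+00A0 (non-breaking space) from both ends of a UTF-8 string,
// working on whole characters rather than individual bytes.
function trimUnicodeSpace(string $text): string
{
    $trimmed = preg_replace('/^[\s\x{00A0}\x00]+|[\s\x{00A0}\x00]+$/u', '', $text);

    // preg_replace() returns null if $text is not valid UTF-8;
    // in that case fall back to the untouched input.
    return $trimmed ?? $text;
}

// Byte-wise trim() cuts the trailing 0xA0 byte out of "à" (C3 A0):
$broken = trim("caf\xC3\xA0", " \n\r\t\0\xC2\xA0");   // "caf\xC3", invalid UTF-8

// The character-aware version leaves "à" intact:
$intact = trimUnicodeSpace("\xC2\xA0 caf\xC3\xA0 \n"); // "caf\xC3\xA0"
```

The check in the loop could then read `trimUnicodeSpace($node->textContent) !== ''`, assuming the page is parsed as UTF-8.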

 

test2.php

<html>
<head>
<title>My Page</title>
</head>
<body>
<div class="page-main-content">
<h1>h1 test</h1>
<h1>h1 test</h1>
<p><a href="mypage1.html">Hello World!</a></p>
<p><a href="mypage2.html">Another Hello World!</a></p>
</div>
<p>THIS SHOULD NOT BE OUTPUTTED</p>
</body>
</html>

Again, ignore the poor HTML, this is purely for testing purposes. Please help. 
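
For reference, here is a minimal sketch of one way to get the first H1 and the links inside the div directly with XPath, instead of walking childNodes by hand. It is only an illustration under some assumptions: the markup is loaded from a string that mirrors test2.php (in test.php it would come from loadHTMLFile($url)), the variable names are made up for this example, and "a handful of other HTML elements" is taken to mean the links inside the paragraphs.

```php
<?php
// Sample markup mirroring test2.php from the post.
$html = <<<'HTML'
<html>
<head><title>My Page</title></head>
<body>
<div class="page-main-content">
<h1>h1 test</h1>
<h1>h1 test</h1>
<p><a href="mypage1.html">Hello World!</a></p>
<p><a href="mypage2.html">Another Hello World!</a></p>
</div>
<p>THIS SHOULD NOT BE OUTPUTTED</p>
</body>
</html>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match "page-main-content" as a whole class token, not as a substring
// (a plain contains() would also match e.g. "page-main-content-wide").
$container = "//div[contains(concat(' ', normalize-space(@class), ' '), ' page-main-content ')]";

// The parenthesised expression followed by [1] selects only the first h1
// (in document order) found inside the container.
$h1Nodes = $xpath->query("($container//h1)[1]");
$firstH1 = ($h1Nodes !== false && $h1Nodes->length > 0)
    ? trim($h1Nodes->item(0)->textContent)
    : null;

// Gather the links that sit in paragraphs inside the container.
$links = [];
$anchors = $xpath->query("$container//p/a");
if ($anchors !== false) {
    foreach ($anchors as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
}
```

With the test2.php markup, $firstH1 holds the text of the first heading only, and the paragraph outside the div is never visited because every query starts from the container expression.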


Urgent? Really? And this code is just for testing purposes and it's urgent? Really? Urgent?

What are you talking about? This will IMMEDIATELY go into the live application. Surely you are just pretending to be stupid, right?

Edited by GameYin

To quote you: 'testing purposes'.

 

The real point is - don't use urgent on forum posts. Do you really think we are all going to drop everything to help you out? The use of urgent just shows us who's new to the whole idea of forums.

Why does any of that matter? I'm looking for an answer to my question. If you cannot provide any help, I ask that you please stop cluttering my thread. Please and thank you. This feature needs to go live ASAP, so to me it is urgent. Obviously this is still in testing, which is why the HTML code you see is for testing purposes. I cannot post the actual HTML file for security reasons. None of that should matter to you.

Edited by GameYin
