Raex


  1. Hi, I have PHP code that extracts the categories and displays them. However, I still can't extract the number that goes with each category (without the brackets). This is my code:

     <?php
     $grep = new DOMDocument();
     @$grep->loadHTMLFile("http://www.lelong.com.my/Auc/List/BrowseAll.asp");
     $finder = new DOMXPath($grep);
     $class = "CatLevel1";
     $nodes = $finder->query("//*[contains(@class, '$class')]");
     foreach ($nodes as $node) {
         $span = $node->childNodes;
         echo $span->item(0)->nodeValue."<br>";
     }
     ?>

     This is my desired output:

     Arts, Antiques & Collectibles : 9768
     B2B & Industrial Products : 2342
     Baby : 3453
     etc...

     Any help is appreciated. Thanks!
  2. Hi, I need to extract the integers from this source code:

     <span class=CatLevel1><a onclick="Javascript:ShowMeu('21');">Arts</a> (9768)</span>
     <span class=CatLevel1><a onclick="Javascript:ShowMeu('271');">Industrial Products</a> (9321)</span>
     <span class=CatLevel1><a onclick="Javascript:ShowMeu('1273');">Baby</a> (11407)</span>

     What pattern can I use to retrieve all the integers in the brackets (9768, 9321, 11407)? This is my PHP code:

     <!DOCTYPE html>
     <html>
     <body>
     <?php
     $file_string = file_get_contents('http://www.lelong.com.my/');
     preg_match('/<title>(.*)<\/title>/i', $file_string, $title); //pattern
     $title_out = $title[1];
     echo $title_out;
     ?>
     </body>
     </html>

     Thanks
  3. Hi, I'm trying to retrieve/scrape some information from a website using the class name and the tag name. Below is the example in VB:

     Dim htmL_cat As HTMLDocument
     Dim objTableL_cat As Object, objDatL_cat As Object, objItemL_cat As Object, objKeyL_cat As Object
     Dim intRowL_cat As Long

     Set htmL_cat = New HTMLDocument

     With CreateObject("MSXML2.XMLHTTP")
         .Open "GET", "http://www.lelong.com.my/Auc/List/BrowseAll.asp", False
         .send
         htmL_cat.body.innerHTML = .responseText
     End With

     With htmL_cat
         Set objTableL_cat = .getElementsByClassName("CatLevel1") 'Find elements with class name first
         For Each objDatL_cat In objTableL_cat
             Set objKeyL_cat = objDatL_cat.getElementsByTagName("a") 'Next, find elements with tag name
             For Each objItemL_cat In objKeyL_cat
                 Sheets("Analytics").Range("E6").Offset(intRowL_cat, 0) = objItemL_cat.innerText
                 intRowL_cat = intRowL_cat + 1
             Next
         Next
     End With

     Set htmL_cat = Nothing
     Set objTableL_cat = Nothing
     Set objKeyL_cat = Nothing

     How do I do the same using PHP? Thanks.
  4. Hi, I have PHP code that finds all the images on a website using simple_html_dom. When I tried to run it, this error occurred:

     Fatal error: Call to a member function find() on a non-object in C:\xampp\htdocs\myPHP\index.php on line 20

     This is my code:

     <!DOCTYPE html>
     <html>
     <body>
     <?php
     include_once("simple_html_dom.php");

     //use curl to get html content
     function getHTML($url,$timeout)
     {
         $ch = curl_init($url); // initialize curl with given url
         curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]); // set useragent
         curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // write the response to a variable
         curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
         curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // max. seconds to wait while connecting
         curl_setopt($ch, CURLOPT_FAILONERROR, 1); // stop when it encounters an error
         return @curl_exec($ch);
     }

     $html=getHTML("http://www.website.com",10);

     // Find all images on webpage
     foreach($html->find("img") as $element)
         echo $element->src . '<br>';
     ?>
     </body>
     </html>

     I have the simple_html_dom.php file in my directory. How can I correct this? Thanks
  5. Hi, I'm trying to retrieve the integer value inside a <span> tag from HTML source code. HTML source code:

     <span> (3861822) </span>

     This is the PHP code:

     <!DOCTYPE html>
     <html>
     <body>
     <?php
     //use curl to get html content
     function getHTML($url,$timeout)
     {
         $ch = curl_init($url); // initialize curl with given url
         curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]); // set useragent
         curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // write the response to a variable
         curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
         curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // max. seconds to wait while connecting
         curl_setopt($ch, CURLOPT_FAILONERROR, 1); // stop when it encounters an error
         return @curl_exec($ch);
     }

     $html=getHTML("http://www.alibaba.com/Products",10);
     preg_match("/<span>(.*)<\/span>/i", $html, $match);
     $title = $match[1];
     echo $title;
     ?>
     </body>
     </html>

     Whenever I try to run it, this error comes out:

     Notice: Undefined offset: 1 in C:\xampp\htdocs\myPHP\index.php on line 19

     How can I correct it so that it displays all the integer values within the tag, but without the brackets? Thanks
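
Several of the posts above ask for the same thing: the category name and the number in brackets from each CatLevel1 span. One possible approach, sketched below, is to select the spans with DOMXPath and then pull the digits out of each span's text with preg_match. It runs against the sample markup quoted in the second post rather than the live page, so the real page may need adjustments. The function name extractCategoryCounts is hypothetical, made up for this illustration.

```php
<?php
// Sample markup copied from the second post; the live page may differ.
$html = <<<'HTML'
<span class=CatLevel1><a onclick="Javascript:ShowMeu('21');">Arts</a> (9768)</span>
<span class=CatLevel1><a onclick="Javascript:ShowMeu('271');">Industrial Products</a> (9321)</span>
<span class=CatLevel1><a onclick="Javascript:ShowMeu('1273');">Baby</a> (11407)</span>
HTML;

// Hypothetical helper: returns an array mapping category name => item count.
function extractCategoryCounts($html, $class = 'CatLevel1')
{
    $counts = array();

    // A failed download (curl_exec() or file_get_contents() returning false)
    // ends up here, so bail out instead of parsing a non-string.
    if (!is_string($html) || $html === '') {
        return $counts;
    }

    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query("//span[contains(@class, '$class')]") as $span) {
        // The category name is the text of the first <a> inside the span.
        $link = $xpath->query('.//a', $span)->item(0);
        if ($link === null) {
            continue;
        }
        // The span's full text looks like "Arts (9768)"; capture only the
        // digits between the brackets at the end of that text.
        if (preg_match('/\((\d+)\)\s*$/', $span->textContent, $m)) {
            $counts[trim($link->textContent)] = (int) $m[1];
        }
    }

    return $counts;
}

foreach (extractCategoryCounts($html) as $name => $count) {
    echo $name . ' : ' . $count . PHP_EOL;
}
```

The early return matters for the curl-based posts as well: with CURLOPT_RETURNTRANSFER set, curl_exec() returns a string on success and false on failure, never an object, so the result has to be checked (and, for simple_html_dom, parsed into an object first) before calling methods on it or reading $match[1].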