
Extract Tag based on the Class Value.


gmcust3


Assuming you can't put an id on the div element, you would do something like this:

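<script>
window.onload = function () {
    var divs = document.getElementsByTagName('div');
    var name; // stays undefined if no div matches

    // Use an index loop: for...in on an HTMLCollection also visits
    // non-element properties such as "length".
    for (var i = 0; i < divs.length; i++) {
        if (divs[i].className == "name") {
            name = divs[i].innerHTML;
        }
    }

    alert(name);
};
</script>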
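
One caveat with comparing className against a single string: an element can carry several classes, as in class="entry_title clearfix" from the markup posted in this thread, and an exact comparison then fails. Below is a minimal sketch of a token-based check. The helper name hasClass is hypothetical and not part of the code above; the sample values are class attributes copied from the thread's markup.

```javascript
// Returns true when `wanted` is one of the whitespace-separated tokens
// in a class attribute value such as "entry_title clearfix".
function hasClass(classAttr, wanted) {
  if (!wanted) {
    return false; // an empty name never matches
  }
  var tokens = String(classAttr || '').split(/\s+/);
  return tokens.indexOf(wanted) !== -1;
}

// Class values copied from the sample markup in this thread.
var samples = ['entry clearfix ', 'entry_title clearfix', 'phone_number  ', 'address'];

// Keep only the values that include the "clearfix" token.
var matches = samples.filter(function (value) {
  return hasClass(value, 'clearfix');
});

console.log(matches);
```

In the loop above, the condition could then read hasClass(divs[i].className, "name") instead of the exact comparison.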


<DIV class="entry clearfix " id=entry_9>

<DIV class="entry_title clearfix">

<H1 class=" ">Smith Colin P</H1></DIV>

<DIV class=full_listing>

<DIV class=blocks>

<DIV class="block indent-level-0" id=entry_9_block_0>

<DIV class=share_link wpol:contactPointId="711357117V00W"

wpol:entryId="711357117V00W">

<DIV class=save_menu>

<DIV class=icon></DIV></DIV>

<DIV class=share_menu>

<DIV class=icon></DIV></DIV><A class=screen_reader_only

href="http://192.168.0.2/mobile/send-to-mobile-accessible?entryId=711357117V00W&listingId=711357117V00W&searchType=R&channel=WP"

rel=nofollow name=Smith>Send this listing to your mobile</A></DIV>

<SPAN class="phone_number  ">(03) 9650 4978</SPAN>

<DIV class=address><SPAN class=street_line>118 Russell St</SPAN>

<SPAN class=locality>Melbourne</SPAN><SPAN class=state>VIC</SPAN>

<SPAN class=postcode>3000</SPAN></DIV>

 

 

I am now trying to parse the HTML above.

 


function itemProxy(elType, classNeeded) {
    // getElementsByTagName is a method of document
    var objElements = document.getElementsByTagName(elType);
    var arrResult = [];
    // stop at length - 1; objElements[length] is undefined
    for (var i = 0; i < objElements.length; i++) {
        if (objElements[i].className == classNeeded) {
            arrResult.push(objElements[i]);
        }
    }
    return arrResult;
}

 

Basically....

 

Or you can just get jQuery....
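
If pulling in jQuery is not an option, current browsers also offer the native document.querySelectorAll, which accepts the same kind of CSS selector. The sketch below is only an illustration: buildSelector is a hypothetical helper, and it assumes the tag and class names are plain identifiers that need no escaping. The browser call is guarded so the snippet also runs outside a page.

```javascript
// Build a selector such as "div.address, span.address" from tag and class lists.
function buildSelector(tags, classes) {
  var parts = [];
  tags.forEach(function (tag) {
    classes.forEach(function (cls) {
      parts.push(tag + '.' + cls);
    });
  });
  return parts.join(', ');
}

var selector = buildSelector(['div', 'span'], ['address', 'phone_number']);
console.log(selector);

// In a browser, hand the selector to the native API.
if (typeof document !== 'undefined') {
  var found = document.querySelectorAll(selector);
  console.log(found.length + ' matching elements');
}
```

Unlike an exact className comparison, a class selector also matches elements that carry additional classes.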


Thanks for the code.

Is there a working example?

My goal: I have saved the HTML content in a text file. Now I want to read that file in a loop, extract the values between the tags, and display them.

Content saved: http://www.whitepages.com.au/resSearch.do?subscriberName=smith&givenName=&location=Melbourne+VIC

Using:

 

 

// create a new cURL resource 
$ch = curl_init(); 
// this is the file where we save the information
$fp = fopen(dirname(__FILE__) . '/a.txt', 'w+');

// set URL and other appropriate options 
curl_setopt($ch, CURLOPT_URL, "http://www.whitepages.com.au/resSearch.do?subscriberName=smith&givenName=&location=Melbourne+VIC"); 
curl_setopt($ch, CURLOPT_FILE, $fp); 
curl_setopt($ch, CURLOPT_HEADER, 0); 

// run the request (the response body is written to $fp), then clean up
curl_exec($ch);
curl_close($ch);
fclose($fp);


I tried:

 


$file = fopen("a.txt", "r") or exit("Unable to open file!"); 

while(!feof($file)) 
  { 
   $regex="/clearfix\"><h1 class(.*)<\/h1><\/div><div class/";    
   preg_replace($regex,"",fgets($file)); 
  } 

 

 

How do I get the address when there are so many span tags?

<span class="street_line">398 Lonsdale St</span>
<span class="locality">Melbourne</span>
<span class="state">VIC</span>
<span class="postcode">3000</span>

<div class="address"><span class="street_line">25 Spring St</span><span class="locality">Melbourne</span><span class="state">VIC</span><span class="postcode">3000</span></div>

This div holds the address, but there are other divs as well.

I am wondering how to extract only the address. The one I need is <div class="address">, but many other divs come before it.
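
One possible way to isolate only the address block from the saved text is sketched below. It is written in JavaScript rather than PHP, and it rests on assumptions: the address div contains only span elements (no nested divs), and each span has a single class. Regular expressions are fragile for HTML in general, so treat this as an illustration. The name extractAddresses is hypothetical.

```javascript
// Sample markup copied from the post above.
var html = '<div class="address"><span class="street_line">25 Spring St</span>' +
  '<span class="locality">Melbourne</span><span class="state">VIC</span>' +
  '<span class="postcode">3000</span></div>';

// Return one object per address div, keyed by each span's class name.
function extractAddresses(source) {
  var addresses = [];
  // Quotes around the class value are optional; the "i" flag accepts <DIV> too.
  var divPattern = /<div class="?address"?>([\s\S]*?)<\/div>/gi;
  var divMatch;
  while ((divMatch = divPattern.exec(source)) !== null) {
    var fields = {};
    var spanPattern = /<span class="?([\w-]+)"?>([^<]*)<\/span>/gi;
    var spanMatch;
    while ((spanMatch = spanPattern.exec(divMatch[1])) !== null) {
      // Trim surrounding whitespace from the span text.
      fields[spanMatch[1]] = spanMatch[2].replace(/^\s+|\s+$/g, '');
    }
    addresses.push(fields);
  }
  return addresses;
}

var addresses = extractAddresses(html);
console.log(addresses[0].street_line + ', ' + addresses[0].locality);
```

Because the outer pattern only matches a div whose class is exactly address, the other divs that come before it are skipped.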


 

I also tried the code below:

 

 


<?php 


/** 
* 
* @get text between tags 
* 
* @param string $tag The tag name 
* 
* @param string $html The XML or XHTML string 
* 
* @param int $strict Whether to use strict mode 
* 
* @return array 
* 
*/ 
function getTextBetweenTags($tag, $html, $strict=0) 
{ 
    /*** a new dom object ***/ 
    $dom = new domDocument; 

    /*** load the html into the object ***/ 
    if($strict==1) 
    { 
        $dom->loadXML($html); 
    } 
    else 
    { 
        $dom->loadHTML($html); 
    } 

    /*** discard white space ***/ 
    $dom->preserveWhiteSpace = false; 

    /*** the tag by its tag name ***/ 
    $content = $dom->getElementsByTagname($tag); 

    /*** the array to return ***/ 
    $out = array(); 
    foreach ($content as $item) 
    { 
        /*** add node value to the out array ***/ 
        $out[] = $item->nodeValue; 
    } 
    /*** return the results ***/ 
    return $out; 
} 
?> 


<?php 

// create a new cURL resource 
$ch = curl_init(); 
$fp = fopen (dirname(__FILE__) . '/a.txt', 'w+');//This is the file where we save the information 

// set URL and other appropriate options 
curl_setopt($ch, CURLOPT_URL, "http://www.whitepages.com.au/resSearch.do?subscriberName=smith&givenName=&location=Melbourne+VIC"); 
curl_setopt($ch, CURLOPT_FILE, $fp); 
curl_setopt($ch, CURLOPT_HEADER, 0); 

// grab URL and pass it to the browser 
$data = curl_exec($ch); 

$file = fopen("a.txt", "r") or exit("Unable to open file!"); 
//$fileTxt = "<pre>".htmlspecialchars(file_get_contents("a.txt"))."</pre>"; 

while(!feof($file)) 
  { 
   $html = fgets($file); 
   $content = getTextBetweenTags('h1', $html); 
   $content1 = getTextBetweenTags('street_line', $html); 
   foreach( $content as $itemName ) 
   { 
       echo "Name : " .$itemName.'<br />'; 
      foreach( $content1 as $itemAdd) 
      { 
          echo "Address : " .$itemAdd.'<br />'; 
      }  
   }  
} 

fclose($file); 

// close cURL resource, and free up system resources 
curl_close($ch); 

?> 

 

But that function is probably not designed to find a tag by its class name, only by its tag type.


OK, I got that much.

I can write a JS object that does that, but you need to know how to implement it.

Will you be able to do that?

Basically, you will need to:

  1: Add my script to an HTML page

  2: Add the HTML that needs to be parsed

  3: Set up my script (I will explain how)

  4: Display (or do whatever you like with) the results the script gives you (which will basically be the DOM elements you need)


OK, so this is the object:

var omirion = {
		arrResult: [],
		tags: ['div', 'h1'],
		neededClass: ['need', 'greed'],
		getElements: function () {
			var tags = this.tags;
			var neededClass = this.neededClass;
			var i, x, y, temp; // declared locally so they do not leak as globals
			this.arrResult = []; // start fresh on every call
			for (i = 0; i < tags.length; i++) {
				temp = document.getElementsByTagName(tags[i]);
				for (x = 0; x < temp.length; x++) {
					for (y = 0; y < neededClass.length; y++) {
						if (temp[x].className == neededClass[y]) {
							this.arrResult.push(temp[x]);
						}
					}
				}
			}
			return this.arrResult;
		}
	};

 

There are a few main things here. The tags to be searched are listed here:

tags: ['div', 'h1'],

And the classes you need matched are listed here:

neededClass: ['need', 'greed'],

You can have one or more entries in both.

So let's say you need only divs searched; you'll modify the tags entry like so:

tags: ['div'],

Let's say you need divs, h1 and h6; you'll do this:

tags: ['div','h1','h6'],

You can have any number of tags. You need to separate them with commas, as in the example above, and each one needs to be enclosed in ' ' or " " quotes.

Be VERY careful not to delete anything outside the [] brackets or you WILL break the code.

Now, as in your example, you need the class name matched, so you do this:

neededClass: ['name'],

Let's say you need address too. You do this:

neededClass: ['name','address'],

Again, separated with commas and enclosed in quotes. AND BE VERY CAREFUL NOT to change anything outside the [] brackets.

Once you have everything set up, you call the function in the window.onload event like so:

 

window.onload = function () {
var eles = omirion.getElements()
}

Now the variable eles holds every matched DOM element. This is useful because you can access the text inside each element like so. Let's say we have only 2 matched elements:

eles[0].innerHTML
eles[1].innerHTML

 

So yeah, I hope this helps. As you can see, it will be pretty simple to loop through the array.
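
Looping through the result could look roughly like the sketch below. The eles array here holds plain stand-in objects so the snippet runs on its own; in a page it would come from omirion.getElements(). The helper name collectText and the sample values are assumptions made for illustration.

```javascript
// Stand-ins for matched DOM elements; in a browser use omirion.getElements().
var eles = [
  { className: 'name', innerHTML: ' Smith Colin P ' },
  { className: 'address', innerHTML: '118 Russell St' }
];

// Build one "class: text" line per matched element.
function collectText(elements) {
  var out = [];
  for (var i = 0; i < elements.length; i++) {
    // Trim surrounding whitespace from the element's content.
    var text = elements[i].innerHTML.replace(/^\s+|\s+$/g, '');
    out.push(elements[i].className + ': ' + text);
  }
  return out;
}

var lines = collectText(eles);
console.log(lines.join('\n'));
```

Note that innerHTML returns any nested markup as well, so for an element with child tags (such as the address div with its spans) the string will include those tags.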

 

If you have no further questions, please mark the thread as solved (the yellow button at the bottom left).


Thanks a lot for all the trouble you took to code it.

 


<script type="text/javascript">

var omirion = {
		arrResult: [],
		tags: ['div', 'h1'],
		neededClass: ['need', 'greed'],
		getElements: function () {
			var tags = this.tags;
			var neededClass = this.neededClass;
			for (i = 0; i < tags.length; i++) {
				var temp = document.getElementsByTagName(tags[i]);
				for (x = 0; x < temp.length; x++) {
					for (y = 0; y < neededClass.length; y++) {
						if (temp[x].className == neededClass[y]) {
							this.arrResult.push(temp[x])
						}
					}
				}
			}
			return this.arrResult;
		}
	}

</script>

<?php

// create a new cURL resource 
$ch = curl_init(); 
$fp = fopen (dirname(__FILE__) . '/a.txt', 'w+');//This is the file where we save the information 

// set URL and other appropriate options 
curl_setopt($ch, CURLOPT_URL, "http://www.whitepages.com.au/resSearch.do?subscriberName=smith&givenName=&location=Melbourne+VIC"); 
curl_setopt($ch, CURLOPT_FILE, $fp); 
curl_setopt($ch, CURLOPT_HEADER, 0); 

// grab URL and pass it to the browser 
$data = curl_exec($ch); 

$file = fopen("a.txt", "r") or exit("Unable to open file!"); 
//$fileTxt = "<pre>".htmlspecialchars(file_get_contents("a.txt"))."</pre>"; 

while(!feof($file)) 
{ 
   $html = fgets($file);
   $content = getTextBetweenTags('h1', $html); 
   foreach( $content as $itemName ) 
   { 
       echo "<SCRIPT>var eles = omirion.getElements()</SCRIPT>";
   }
} 

fclose($file);

// close cURL resource, and free up system resources 
curl_close($ch); 
?>

 

This is the code I am trying!

 

 
