[SOLVED] splitting an html element

Darghon · May 18, 2009

Hello all, I'm trying to split an HTML element into its arguments

example:

into:

Array("tagName" : "img" , "src" : "Style/Imag...png", "height" : "75px", "alt" : "Civilian Img");

but I haven't got a clue how to do it.

any tips/tricks on how to pull this off?

Adam · May 18, 2009

I'd probably split the HTML by white space, then run a few regular expressions on each part to determine whether it's the opening tag, an attribute, the end, etc. You'd probably need to watch out for things like "checked", i.e. one-word arguments with no value. The best place to start is to work out what's distinct about each part and how you can recognize it. For example, the start tag will have a '<' on the left, and attributes will have an equals sign: 'something=something', etc.

More information on regular expressions

Using regular expressions with JavaScript

Good luck!

Glad to help further if you run into troubles.

Ken2k7 · May 18, 2009

Well, splitting by spaces isn't the best idea. Your alt attribute will get messed up, since its value contains a space.

I would just use Regex on the whole thing.
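To illustrate the alt problem, here is a minimal sketch. The tag string is the img example used elsewhere in this thread; the variable names are only for illustration.

```javascript
// Splitting on whitespace cuts the quoted alt value in two,
// because "Civilian Img" contains a space.
var tag = '<img src="Style/Images/Sprites/Civilian/base-male.png" height="75px" alt="Civilian Img" />';
var parts = tag.split(/\s+/);

// parts[3] is 'alt="Civilian' and parts[4] is 'Img"',
// so the attribute would have to be glued back together afterwards.
console.log(parts);
```

A pattern that treats everything between the quotes as one unit avoids that extra step.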

Adam · May 18, 2009

Ah yeah you're right! Wouldn't work if some of the HTML is a little messed up as well.

Yup, Regex all the way!

Darghon · May 18, 2009

Well, what I tried in the meantime was a loop that checks every character.

Each time it passes a space, it dumps its buffer into an array; if it passes a quote, it ignores spaces until it finds the matching quote. That all works incredibly well.

The result is that elements are converted correctly and reassembled into DOM objects as they should be. But I'd also like to strip all the newlines and tabs from the tags. I tried something like => if(tag.substr(i,1) == "\n" || tag.substr(i,1) == "\t" || tag.substr(i,1) == "")

but with no result. Here is my splitTag function:

function splitTag(tag){
	var list = [];
	var buffer = "";
	var openedQuote = "";       // the quote character we are inside of, "" when outside quotes
	var ignoreNextChar = false; // set right after a backslash, so the next character is taken literally
	for(var i = 0; i < tag.length; i++){
		var c = tag.charAt(i);
		// newlines, carriage returns and tabs are handled like spaces instead of being dropped,
		// so "<img\nsrc=..." does not collapse into "<imgsrc=..."
		var isSpace = (c == " " || c == "\n" || c == "\r" || c == "\t");
		if(isSpace && openedQuote == ""){
			// whitespace outside quotes ends the current part
			if(buffer.length > 0){
				list.push(buffer);
				buffer = "";
			}
		}
		else if(isSpace){
			// whitespace inside quotes belongs to the value; line breaks and tabs become a plain space
			buffer += " ";
		}
		else if(ignoreNextChar){
			buffer += c;
			ignoreNextChar = false;
		}
		else if(c == "'" || c == "\""){
			if(openedQuote == ""){
				openedQuote = c;
			}
			else if(openedQuote == c){
				openedQuote = "";
			}
			else{
				// the other kind of quote inside a quoted value is kept
				buffer += c;
			}
		}
		else if(c == "\\"){
			ignoreNextChar = true;
			buffer += c;
		}
		else{
			buffer += c;
		}
	}
	if(buffer.length > 0){
		list.push(buffer);
	}
	return list;
}

Now how can I pull this last thing off?

Right now, if I put a newline in my source code, the resulting layout drops a line as well, and I don't want that.

Thx in advance

Ken2k7 · May 18, 2009

Wow, I won't even begin to parse all that.

Try something along the lines of:

var str = '<img src="Style/Images/Sprites/Civilian/base-male.png" height="75px" alt="Civilian Img" />';
var matches = str.match(/([a-z]+?=\"[^\"]+\")/ig);

Darghon · May 19, 2009

Thx for the code, it seems to work nicely, only it always skips the tag name (like img in the example).
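One way to pick up the tag name as well is to match it separately and then scan the remainder for attributes. The following is only a sketch: parseTag is a hypothetical helper, not something from this thread, and it assumes a single opening tag with reasonably clean markup. It also accepts single-quoted, unquoted, and valueless attributes such as checked.

```javascript
// Hypothetical helper: turns one opening tag into a plain object
// with a tagName key plus one key per attribute.
function parseTag(tag) {
	var result = {};
	// The tag name is the run of letters/digits right after "<".
	var nameMatch = /^\s*<\s*([a-z][a-z0-9]*)/i.exec(tag);
	if (!nameMatch) {
		return null; // not an opening tag
	}
	result.tagName = nameMatch[1].toLowerCase();
	// Only scan what follows the tag name.
	var rest = tag.slice(nameMatch[0].length);
	// Matches name="value", name='value', name=value, or a bare name.
	var attrPattern = /([a-z][a-z0-9_:-]*)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>\/]+)))?/ig;
	var m;
	while ((m = attrPattern.exec(rest)) !== null) {
		var value = "";
		if (m[2] !== undefined) { value = m[2]; }      // double-quoted
		else if (m[3] !== undefined) { value = m[3]; } // single-quoted
		else if (m[4] !== undefined) { value = m[4]; } // unquoted
		result[m[1]] = value; // bare attributes get an empty string
	}
	return result;
}

var info = parseTag('<img src="Style/Images/Sprites/Civilian/base-male.png" height="75px" alt="Civilian Img" />');
console.log(info.tagName, info.alt);
```

Note that an attribute that happens to be called tagName would overwrite the tag name in this sketch; keeping the attributes in a nested object would avoid that.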

But all I still need right now is a way to make sure all special chars are filtered out, like line breaks in the source code.

for example, if my source has =>

<div id='div1'>
<div id='div2'>stuff here</div>
</div>

I will get a blank line before "stuff here"

if I have the source =>

<div id='div1'><div id='div2'>stuff here</div></div>

then I don't have a blank line

any solutions for this problem?
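One possible approach, sketched under the assumption that the markup is available as a string before it gets split up: the line break between the two opening div tags is probably being carried through as text, so removing whitespace that sits only between a ">" and a "<" should make both sources behave the same. stripWhitespaceBetweenTags is a hypothetical name used just for this example.

```javascript
// Hypothetical helper: removes whitespace (spaces, tabs, line breaks)
// that appears only between the end of one tag and the start of the next,
// and trims the ends of the string. Text such as "stuff here" is untouched.
function stripWhitespaceBetweenTags(html) {
	return html.replace(/>\s+</g, "><").replace(/^\s+|\s+$/g, "");
}

var source = "<div id='div1'>\n<div id='div2'>stuff here</div>\n</div>";
var cleaned = stripWhitespaceBetweenTags(source);
console.log(cleaned);
```

Be aware that this also removes a single space between two inline elements, and that it would change the contents of a pre element, so it is only safe where that whitespace carries no meaning.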
