jimmyelewis Posted November 6, 2006

I'm trying to use regular expressions to remove some of the non-standard tags from text that is copied from Microsoft Word into a text area. So far I have:
[code]$search = array(
    '/<city[^>]*>(.*?)<\/city[^>]*>/is',
    '/<place[^>]*>(.*?)<\/place[^>]*>/is',
    '/mso-[^"]*|mso-[^;]*mso-[^"]*/is',
    '/<formulas>(.*?)<\/formulas>/is',
    '/o:[^=]*="[^"]*"/is'
);
$replace = array('$1', '$1', '', '', '');
echo preg_replace($search, $replace, $row['contents']);[/code]
One of the values for $row['contents'] is:
[code]<p class="MsoNormal"><span class="EmailStyle41"><font face="Arial" color="#003300" size="2"><span style="FONT-SIZE: 10pt">The Semiconductor Power and Electronics Center (SPEC) welcomes Dr. Engelbert Hetzmannseder as he presents an overview of the Eaton Corporation focused on the Eaton Electric Group and an overview of the mission and capabilities of the <place w:st="on"><placename w:st="on">Eaton</placename><placename w:st="on">Innovation</placename><placetype w:st="on">Center in Milwaukee, Wisconsin</placetype></place>. 
</span></font></span></p><span class="EmailStyle41"><font face="Arial" color="#003300" size="2"><span style="FONT-SIZE: 10pt"><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style="mso-bidi-font-weight: normal"><span style="FONT-SIZE: 14pt; mso-bidi-font-size: 10.0pt"><font color="#000000"><font face="Times New Roman">The Speaker</font></font></span></strong></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style="mso-bidi-font-weight: normal"><span style="FONT-SIZE: 14pt; mso-bidi-font-size: 10.0pt"><font color="#000000"><font face="Times New Roman">Engelbert Hetzmannseder</font></font></span></strong></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style="mso-bidi-font-weight: normal"><span style="FONT-SIZE: 14pt; mso-bidi-font-size: 10.0pt"></span></strong></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style="mso-bidi-font-weight: normal"><span style="FONT-SIZE: 14pt; mso-bidi-font-size: 10.0pt"></span></strong><font size="3"><font color="#000000"><font face="Times New Roman"><strong style="mso-bidi-font-weight: normal">Engelbert Hetzmannseder</strong> was born in Klaffer,<place w:st="on">Upper Austria</place>.<span style="mso-spacerun: yes"> </span>He received his Dipl.-Ing. (B.S., M.S., 1990) and Dr. techn. 
(Ph.D., 1994) degree in Electrical Engineering from the Technical University of Vienna, Austria.<span style="mso-spacerun: yes"> </span>Since February 1995 he has been with Eaton Corporation /<placename w:st="on">Innovation</placename><placetype w:st="on">Center</placetype>in<place w:st="on"><city w:st="on">Milwaukee</city>, <state w:st="on">WI</state>, </place>, involved with fundamental and applied research on contacts, switching arc phenomena, and arc fault detection for industrial, aerospace, and automotive products.<span style="mso-spacerun: yes"> </span>He holds 8 patents and published 20 papers at international conferences.<span style="mso-spacerun: yes"> </span></font></font></font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font size="3"><font color="#000000"><font face="Times New Roman"><span style="mso-spacerun: yes"></span><br />Since 2000 he has been Technology Manager of the Electrical Architecture & Systems department at the<place w:st="on"><placename w:st="on">Eaton</placename><placename w:st="on">Innovation</placename><placetype w:st="on">Center</placetype></place>.<span style="mso-spacerun: yes"> </span>Capabilities of the EAS group include:</font></font></font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; mso-list: l0 level1 lfo1"><span style="COLOR: black; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"><span style="mso-list: Ignore"><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Electric Power Management</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; mso-list: l0 level1 lfo1"><span style="COLOR: black; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"><span style="mso-list: Ignore"><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> 
</span></span></span><font face="Times New Roman" color="#000000" size="3">Power electronic, Power quality, and power conversion architectures</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; mso-list: l0 level1 lfo1"><span style="COLOR: black; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"><span style="mso-list: Ignore"><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Diagnostics & prognostics of electrical components and systems</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; mso-list: l0 level1 lfo1"><span style="COLOR: black; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"><span style="mso-list: Ignore"><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Electro-mechanical switching technologies, Arc & plasma science, Contact physics</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; mso-list: l0 level1 lfo1"><span style="COLOR: black; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"><span style="mso-list: Ignore"><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Power systems modeling: magnetic, electric, thermal, electro-mechanical</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; mso-list: l0 level1 lfo1"><span style="COLOR: black; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"><span style="mso-list: Ignore"><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New 
Roman" color="#000000" size="3">Mechanism synthesis and modeling</font></p><p class="MsoHeader" style="MARGIN: 24pt 0in 0pt; LINE-HEIGHT: normal; tab-stops: .5in"><p><font face="Times New Roman" color="#000000" size="3"> <stroke joinstyle="miter"></stroke><formulas><f eqn="if lineDrawn pixelLineWidth 0"></f><f eqn="sum @0 1 0"></f><f eqn="sum 0 0 @1"></f><f eqn="prod @2 1 2"></f><f eqn="prod @3 21600 pixelWidth"></f><f eqn="prod @3 21600 pixelHeight"></f><f eqn="sum @0 0 1"></f><f eqn="prod @6 1 2"></f><f eqn="prod @7 21600 pixelWidth"></f><f eqn="sum @8 21600 0"></f><f eqn="prod @7 21600 pixelHeight"></f><f eqn="sum @10 21600 0"></f></formulas><path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"></path><lock v:ext="edit" aspectratio="t"></lock><shape id="_x0000_s1026" style="MARGIN-TOP: 48.4pt; Z-INDEX: 1; MARGIN-LEFT: 401.55pt; WIDTH: 157.1pt; POSITION: absolute; HEIGHT: 213.55pt" type="#_x0000_t75"><imagedata o:title="ENGELBERT 2 005 head" src="file:///C:\DOCUME~1\jakirk\Local%20Settings\Temp\msohtml1\01\clip_image001.jpg"></imagedata></shape> <stroke joinstyle="miter"></stroke><formulas><f eqn="if lineDrawn pixelLineWidth 0"></f><f eqn="sum @0 1 0"></f><f eqn="sum 0 0 @1"></f><f eqn="prod @2 1 2"></f><f eqn="prod @3 21600 pixelWidth"></f><f eqn="prod @3 21600 pixelHeight"></f><f eqn="sum @0 0 1"></f><f eqn="prod @6 1 2"></f><f eqn="prod @7 21600 pixelWidth"></f><f eqn="sum @8 21600 0"></f><f eqn="prod @7 21600 pixelHeight"></f><f eqn="sum @10 21600 0"></f></formulas><path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"></path><lock v:ext="edit" aspectratio="t"></lock><shape id="_x0000_s1026" style="MARGIN-TOP: 48.4pt; Z-INDEX: 1; MARGIN-LEFT: 401.55pt; WIDTH: 157.1pt; POSITION: absolute; HEIGHT: 213.55pt" type="#_x0000_t75"><imagedata o:title="ENGELBERT 2 005 head" src="file:///C:\DOCUME~1\jakirk\Local%20Settings\Temp\msohtml1\01\clip_image001.jpg"></imagedata></shape></font></p></p><p class="MsoNormal"><p> 
</p></p></span></font></span>[/code]
What's returned from the preg_replace:
[code]<p class="MsoNormal"><span class="EmailStyle41"><font face="Arial" color="#003300" size="2"><span style="FONT-SIZE: 10pt">The Semiconductor Power and Electronics Center (SPEC) welcomes Dr. Engelbert Hetzmannseder as he presents an overview of the Eaton Corporation focused on the Eaton Electric Group and an overview of the mission and capabilities of the <placename w:st="on">EatonInnovationCenter in Milwaukee, Wisconsin</place>. </span></font></span></p><span class="EmailStyle41"><font face="Arial" color="#003300" size="2"><span style="FONT-SIZE: 10pt"><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style=""><span style="FONT-SIZE: 14pt; "><font color="#000000"><font face="Times New Roman">The Speaker</font></font></span></strong></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style=""><span style="FONT-SIZE: 14pt; "><font color="#000000"><font face="Times New Roman">Engelbert Hetzmannseder</font></font></span></strong></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style=""><span style="FONT-SIZE: 14pt; "></span></strong></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><strong style=""><span style="FONT-SIZE: 14pt; "></span></strong><font size="3"><font color="#000000"><font face="Times New Roman"><strong style="">Engelbert Hetzmannseder</strong> was born in Klaffer,Upper Austria.<span style=""> </span>He received his Dipl.-Ing. (B.S., M.S., 1990) and Dr. techn. 
(Ph.D., 1994) degree in Electrical Engineering from the Technical University of Vienna, Austria.<span style=""> </span>Since February 1995 he has been with Eaton Corporation /InnovationCenterinMilwaukee, <state w:st="on">WI</state>, , involved with fundamental and applied research on contacts, switching arc phenomena, and arc fault detection for industrial, aerospace, and automotive products.<span style=""> </span>He holds 8 patents and published 20 papers at international conferences.<span style=""> </span></font></font></font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><font size="3"><font color="#000000"><font face="Times New Roman"><span style=""></span><br />Since 2000 he has been Technology Manager of the Electrical Architecture & Systems department at the<placename w:st="on">EatonInnovationCenter</place>.<span style=""> </span>Capabilities of the EAS group include:</font></font></font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; "><span style="COLOR: black; FONT-FAMILY: Symbol; "><span style=""><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Electric Power Management</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; "><span style="COLOR: black; FONT-FAMILY: Symbol; "><span style=""><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Power electronic, Power quality, and power conversion architectures</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; "><span style="COLOR: black; FONT-FAMILY: Symbol; "><span style=""><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Diagnostics & prognostics of electrical 
components and systems</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; "><span style="COLOR: black; FONT-FAMILY: Symbol; "><span style=""><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Electro-mechanical switching technologies, Arc & plasma science, Contact physics</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; "><span style="COLOR: black; FONT-FAMILY: Symbol; "><span style=""><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Power systems modeling: magnetic, electric, thermal, electro-mechanical</font></p><p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.35in; TEXT-INDENT: -0.2in; LINE-HEIGHT: normal; "><span style="COLOR: black; FONT-FAMILY: Symbol; "><span style=""><font size="3">·</font><span style="FONT: 7pt "Times New Roman""> </span></span></span><font face="Times New Roman" color="#000000" size="3">Mechanism synthesis and modeling</font></p><p class="MsoHeader" style="MARGIN: 24pt 0in 0pt; LINE-HEIGHT: normal; tab-stops: .5in"><p><font face="Times New Roman" color="#000000" size="3"> <stroke joinstyle="miter"></stroke><path gradientshapeok="t" ></path><lock v:ext="edit" aspectratio="t"></lock><shape id="_x0000_s1026" style="MARGIN-TOP: 48.4pt; Z-INDEX: 1; MARGIN-LEFT: 401.55pt; WIDTH: 157.1pt; POSITION: absolute; HEIGHT: 213.55pt" type="#_x0000_t75"><imagedata src="file:///C:\DOCUME~1\jakirk\Local%20Settings\Temp\msohtml1\01\clip_image001.jpg"></imagedata></shape> <stroke joinstyle="miter"></stroke><path gradientshapeok="t" ></path><lock v:ext="edit" aspectratio="t"></lock><shape id="_x0000_s1026" style="MARGIN-TOP: 48.4pt; Z-INDEX: 1; MARGIN-LEFT: 401.55pt; WIDTH: 157.1pt; POSITION: absolute; HEIGHT: 213.55pt" type="#_x0000_t75"><imagedata 
src="file:///C:\DOCUME~1\jakirk\Local%20Settings\Temp\msohtml1\01\clip_image001.jpg"></imagedata></shape></font></p></p><p class="MsoNormal"><p> </p></p></span></font></span>[/code]
Things like:
[code]<placename w:st="on">EatonInnovationCenter in Milwaukee, Wisconsin</place>
<placename w:st="on">EatonInnovationCenter</place>[/code]
remain while the others are replaced. I think it might have something to do with those tags being embedded in other tags. I've tried different things, but I'm not sure what to do.
rea|and Posted November 8, 2006

Yes, regex engines usually don't handle nested patterns well. But in this case I guess you could define one more pattern that matches <place*: first replace the tags whose names start with <place followed by more letters, such as <placename and <placetype (something like '/<place\B[^>]*>(.*?)<\/place\B[^>]*>/is'), and then replace the remaining ones (those that match <place exactly).
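The two-pass idea in the reply above can also be collapsed into a single pattern. In the original attempt, '<place[^>]*>' also matches '<placename ...>' and '<placetype ...>', and preg_replace does not rescan text it has already replaced, which is why an opening <placename> and a closing </place> are left behind. The following is only a sketch under stated assumptions: the function name strip_word_smart_tags and the list of tag names are hypothetical and taken from the sample data in this thread, not from any library. It removes opening and closing tags independently, so nesting no longer matters, and uses \b so that "place" cannot match the start of "placename".

```php
<?php
// Hypothetical helper (name is an assumption, not from the original post).
// Removes Word smart-tag elements but keeps the text between them.
function strip_word_smart_tags(string $html): string
{
    // Tag names seen in the sample data above; extend the list as needed.
    $names = 'place|placename|placetype|city|state';

    // Match one opening OR closing tag whose name is exactly one of $names.
    // \b after the name stops "place" from matching the start of "placename".
    $pattern = '/<\/?(?:' . $names . ')\b[^>]*>/i';

    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}

// Fragment copied from the sample contents in the question.
$sample = 'at the<place w:st="on"><placename w:st="on">Eaton</placename>'
        . '<placename w:st="on">Innovation</placename>'
        . '<placetype w:st="on">Center</placetype></place>.';

echo strip_word_smart_tags($sample), "\n";
// prints: at theEatonInnovationCenter.
```

The words run together in the output because the pasted markup has no spaces between the tags; that is a property of the input, not of the pattern. Because each tag is removed on its own, this could stand in for the first two entries of $search without needing a second pass.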