Jump to content

Excel and php dom charset problem


feri_soft

Recommended Posts

Hello i use a simple script for decoding excel exported xml. But i have problems with my encoding (windows-1251).

 

Here is the code i think i got it from ibm dev network and its pretty simple just trying to get it working:

 

<?php
  $data = array();
  
  function add_person( $first, $middle, $last, $email )
  {
  global $data;
  
  $data []= array(
  'first' => $first,
  'middle' => $middle,
  'last' => $last,
  'email' => $email 
  );
  }
  
  if ( $_FILES['file']['tmp_name'] )
  {
  $dom = DOMDocument::load($_FILES['file']['tmp_name']);
  $rows = $dom->getElementsByTagName( 'Row' );
  $first_row = true;
  foreach ($rows as $row)
  {
  if ( !$first_row )
  {
  $first = "";
  $middle = "";
  $last = "";
  $email = "";
  
  $index = 2;
  $cells = $row->getElementsByTagName( 'Cell' );
  foreach( $cells as $cell )
  { 
  $ind = $cell->getAttribute( 'Index' );
  if ( $ind != null ) $index = $ind;
  
  if ( $index == 1 ) $first = $cell->nodeValue;
  if ( $index == 2 ) $middle = $cell->nodeValue;
  if ( $index == 3 ) $last = $cell->nodeValue;
  if ( $index == 4 ) $email = $cell->nodeValue;
  
  $index += 1;
  }
  add_person( $first, $middle, $last, $email );
  }
  $first_row = false;
  }
  }
  ?>
  <html>
  <head>
  <META http-equiv="Content-Type" content="text/html; charset=windows-1251">
  </head>
  <body>
  <table>
  <tr>
  <th>First</th>
  <th>Middle</th>
  <th>Last</th>
  <th>Email</th>
  </tr>
  <?php foreach( $data as $row ) { ?>
  <tr>
  <td><?php echo( $row['first'] ); ?></td>
  <td><?php echo( $row['middle'] ); ?></td>
  <td><?php echo( $row['last'] ); ?></td>
  <td><?php echo( $row['email'] ); ?></td>
  </tr>
  <?php } ?>
  </table>
  </body>
  </html>

 

Firslty the line

<?xml version="1.0" encoding="UTF-8" ?>

was without the encoding statement so i manually added it. I have tried both with utf and windows-1251 but its the same just changing the symbols, however not to the correct ones. I tried searching the php manuall for some  domload charset atribute but i didnt find it and i supose the function gets the charset from the xml but why not working correctly then?

 

                              Thaks a lot to everyone who can help!  ;)

 

PS

I need this one asap so if you have anything in mind please help.

Link to comment
Share on other sites

Yes here is the xml:

 

<?xml version="1.0" encoding="UTF-8" ?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Author>Ilonka</Author>
  <LastAuthor>asdasd</LastAuthor>
  <LastPrinted>2005-06-27T07:18:10Z</LastPrinted>
  <Created>2003-06-10T11:09:22Z</Created>
  <LastSaved>2007-03-01T13:33:54Z</LastSaved>
  <Company>Fenix</Company>
  <Version>10.2625</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
  <DownloadComponents/>
  <LocationOfComponents HRef="file:///I:\Microsoft.Office.XP.Pro\"/>
</OfficeDocumentSettings>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>7320</WindowHeight>
  <WindowWidth>9720</WindowWidth>
  <WindowTopX>120</WindowTopX>
  <WindowTopY>120</WindowTopY>
  <AcceptLabelsInFormulas/>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
<Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font x:CharSet="1251"/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
  <Style ss:ID="s21">
   <Alignment ss:Horizontal="Left" ss:Vertical="Bottom"/>
   <Borders>
    <Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Left" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Right" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
   </Borders>
   <Font ss:FontName="Times New Roman" x:CharSet="1251" x:Family="Roman"
    ss:Color="#800080"/>
   <Interior/>
  </Style>
  <Style ss:ID="s22">
   <Alignment ss:Horizontal="Left" ss:Vertical="Bottom"/>
   <Borders>
    <Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Left" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Right" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
   </Borders>
   <Font ss:FontName="Times New Roman" x:CharSet="1251" x:Family="Roman"
    ss:Color="#800080"/>
   <Interior/>
   <NumberFormat ss:Format="0%"/>
  </Style>
  <Style ss:ID="s23">
   <Alignment ss:Horizontal="Left" ss:Vertical="Bottom"/>
   <Borders>
    <Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Left" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Right" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
   </Borders>
   <Font ss:FontName="Times New Roman" x:CharSet="1251" x:Family="Roman"
    ss:Color="#800080" ss:Italic="1"/>
   <Interior/>
  </Style>
  <Style ss:ID="s24">
   <Alignment ss:Horizontal="Left" ss:Vertical="Bottom"/>
   <Borders>
    <Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Left" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Right" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
   </Borders>
   <Font ss:FontName="Times New Roman" x:CharSet="1251" x:Family="Roman"
    ss:Color="#800080"/>
   <Interior ss:Color="#FFFF00" ss:Pattern="Solid"/>
  </Style>
</Styles>
<Worksheet ss:Name="9305858">
  <Table ss:ExpandedColumnCount="12" ss:ExpandedRowCount="12" x:FullColumns="1"
   x:FullRows="1" ss:StyleID="s21">
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="37.5"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="20.25"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="41.25"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="28.5"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="33"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="100.5"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="19.5"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="21"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="22.5"/>
   <Column ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="64.5"/>
   <Column ss:Index="12" ss:StyleID="s21" ss:AutoFitWidth="0" ss:Width="27.75"/>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell>
    <Cell><Data ss:Type="Number">38000</Data></Cell>
    <Cell><Data ss:Type="Number">38</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">бул. В.Левски</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="Number">2</Data></Cell>
    <Cell><Data ss:Type="String">отл</Data></Cell>
    <Cell><Data ss:Type="String">Златина</Data></Cell>
    <Cell><Data ss:Type="Number">56</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell>
    <Cell ss:Index="6"><Data ss:Type="String">гр. Банско</Data></Cell>
    <Cell ss:Index="10"><Data ss:Type="String">комплекс</Data></Cell>
    <Cell ss:Index="12"><Data ss:Type="Number">54</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="3"><Data ss:Type="Number">23700</Data></Cell>
    <Cell><Data ss:Type="Number">39</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">гр. Несебър</Data></Cell>
    <Cell ss:Index="8"><Data ss:Type="String">ло</Data></Cell>
    <Cell><Data ss:Type="Number">2</Data></Cell>
    <Cell><Data ss:Type="String">лукс</Data></Cell>
    <Cell ss:Index="12"><Data ss:Type="Number">68</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="3"><Data ss:Type="Number">31900</Data></Cell>
    <Cell><Data ss:Type="Number">38</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">гр. Несебър</Data></Cell>
    <Cell ss:Index="9"><Data ss:Type="Number">2</Data></Cell>
    <Cell><Data ss:Type="String">шп/зам</Data></Cell>
    <Cell ss:StyleID="s22"><Data ss:Type="String">Данчо</Data></Cell>
    <Cell><Data ss:Type="Number">65</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell>
    <Cell><Data ss:Type="String">E 62000</Data></Cell>
    <Cell><Data ss:Type="Number">40</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">к.к. Слънчев бряг</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell ss:Index="10"><Data ss:Type="String">обз/лукс</Data></Cell>
    <Cell ss:Index="12"><Data ss:Type="Number">46</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell>
    <Cell ss:Index="6"><Data ss:Type="String">к.к. Слънчев бряг</Data></Cell>
    <Cell ss:Index="10"><Data ss:Type="String">комплекс</Data></Cell>
    <Cell ss:Index="12"><Data ss:Type="Number">55</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="3"><Data ss:Type="Number">45500</Data></Cell>
    <Cell><Data ss:Type="Number">48</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">Лозенец</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="Number">10</Data></Cell>
    <Cell><Data ss:Type="String">с трапез.</Data></Cell>
    <Cell><Data ss:Type="String">Злати</Data></Cell>
    <Cell><Data ss:Type="Number">67</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell>
    <Cell><Data ss:Type="Number">29000</Data></Cell>
    <Cell><Data ss:Type="Number">48</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">Надежда 5</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="Number">6</Data></Cell>
    <Cell><Data ss:Type="String">преустр.</Data></Cell>
    <Cell ss:Index="12"><Data ss:Type="Number">61</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="3"><Data ss:Type="Number">46600</Data></Cell>
    <Cell><Data ss:Type="Number">60</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">Овча купел 1</Data></Cell>
    <Cell ss:Index="8"><Data ss:Type="String">пр</Data></Cell>
    <Cell><Data ss:Type="Number">7</Data></Cell>
    <Cell><Data ss:Type="String">в строеж</Data></Cell>
    <Cell ss:Index="12"><Data ss:Type="Number">70</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell>
    <Cell><Data ss:Type="Number">13500</Data></Cell>
    <Cell><Data ss:Type="Number">32</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">с. Кранево</Data></Cell>
    <Cell><Data ss:Type="String">-</Data></Cell>
    <Cell ss:Index="9"><Data ss:Type="Number">1</Data></Cell>
    <Cell><Data ss:Type="String">до плажа</Data></Cell>
    <Cell ss:StyleID="s23"/>
    <Cell ss:StyleID="s23"><Data ss:Type="Number">37</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell>
    <Cell><Data ss:Type="String">Е 26000</Data></Cell>
    <Cell><Data ss:Type="Number">48</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">Х.Димитър</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="Number">7</Data></Cell>
    <Cell><Data ss:Type="String">чист</Data></Cell>
    <Cell><Data ss:Type="String">Вили</Data></Cell>
    <Cell><Data ss:Type="Number">51</Data></Cell>
   </Row>
   <Row ss:StyleID="s24">
    <Cell ss:Index="3"><Data ss:Type="Number">90600</Data></Cell>
    <Cell><Data ss:Type="Number">60</Data></Cell>
    <Cell><Data ss:Type="String">1-ст.</Data></Cell>
    <Cell><Data ss:Type="String">ул. Л.Каравелов</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="String">+</Data></Cell>
    <Cell><Data ss:Type="Number">1</Data></Cell>
    <Cell><Data ss:Type="String">южен</Data></Cell>
    <Cell><Data ss:Type="String">Велислава</Data></Cell>
    <Cell><Data ss:Type="Number">71</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <Print>
    <ValidPrinterInfo/>
    <HorizontalResolution>600</HorizontalResolution>
    <VerticalResolution>0</VerticalResolution>
   </Print>
   <Selected/>
   <Panes>
    <Pane>
     <Number>3</Number>
     <ActiveRow>7</ActiveRow>
     <ActiveCol>5</ActiveCol>
    </Pane>
   </Panes>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
  <Sorting xmlns="urn:schemas-microsoft-com:office:excel">
   <Sort>Column F</Sort>
   <Sort>Column C</Sort>
  </Sorting>
</Worksheet>
</Workbook>

 

Those are real estate listings. Every non latin character is listed wrong.

Link to comment
Share on other sites

Thanks; the UTF-8 seems to be working properly. Can you tell me what some of the characters should be, or attach the Excel? For example, if <Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell> is incorrect, what should it be and what does it look like in Excel? What version are you using? What steps did you take to export it?

Link to comment
Share on other sites

<Cell ss:Index="2"><Data ss:Type="String">не</Data></Cell> не is actually the correct thing. but i am getting characters like: РіСЂ. Банско

 

I am using office 2002 and just did save as --> xml spreadsheet

 

                                                                                          Thanks!

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.