
Question about UTF-8


KingNeil


I have a PHP script that downloads web pages and stores them in an SQL database, so that another PHP script can display them later.

 

I have heard that I should do the following to avoid "weird" characters showing up, such as ʉ۪ in place of an apostrophe:

 

1. encode PHP script with UTF-8 in Notepad

 

2. set the database table, including its fields, to the utf8_general_ci collation
 
3. use header in PHP script, 
header('Content-Type: text/html; charset=utf-8');
 
4. set the MySQL connection encoding to UTF-8 in PHP
mysql_query("SET NAMES UTF8",$connect);
 
---
 
My question is: how does this work on the server side? Will this ensure that my script works fine, or do different websites use different encodings?

Is it safe to go all-UTF-8, so long as I'm only downloading English-language sites?

I just want proper apostrophes to show up.. lol
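
For what it's worth, the "weird characters" usually come from bytes written in one encoding being read back in another. Here is a minimal sketch of that effect, assuming PHP 7+ with the mbstring extension; the choice of Windows-1252 as the "wrong" encoding is only an illustration, not something taken from the script in question.

```php
<?php
// A right single quotation mark (U+2019), the "curly" apostrophe many sites use.
$apostrophe = "\u{2019}";

// In UTF-8 this one character is stored as three bytes.
$utf8Bytes = strtoupper(bin2hex($apostrophe)); // "E28099"

// Treat those three bytes as if they were Windows-1252 text and re-encode
// the result as UTF-8: each byte becomes a separate, unrelated character.
$misread = mb_convert_encoding($apostrophe, 'UTF-8', 'Windows-1252');

echo $utf8Bytes, PHP_EOL;
echo mb_strlen($apostrophe, 'UTF-8'), ' character before, ',
     mb_strlen($misread, 'UTF-8'), ' characters after', PHP_EOL;
```

So a single apostrophe turning into several junk characters is a sign that some step (download, database connection, or output header) disagrees with the others about the encoding.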

I do a similar thing with my website, where I pull data from another source and some of the text is in another language.

 

You would just need to properly escape the apostrophes before inserting them into your database.

header('Content-Type: text/html; charset=utf-8'); 
mysqli_query($link, "SET NAMES 'UTF8'") or die("ERROR: ". mysqli_error($link));

You can try mb_detect_encoding() and mb_convert_encoding().

 

I use iconv

 

You will most likely have to combine multiple methods of detection and checking. It's a real mess.
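
A rough sketch of what "multiple methods" could look like, assuming the mbstring and iconv extensions are loaded. The function name toUtf8() and the $declaredCharset parameter are made up for illustration (the idea is that you pass in whatever charset the remote site announced in its Content-Type header or meta tag), and the Windows-1252 fallback is only a guess that tends to fit older English-language sites.

```php
<?php
/**
 * Hypothetical helper: return the downloaded page as UTF-8.
 * $declaredCharset is the charset the remote site claimed, or null if unknown.
 */
function toUtf8(string $html, ?string $declaredCharset = null): string
{
    // 1. If the bytes are already valid UTF-8, leave them alone.
    if (mb_check_encoding($html, 'UTF-8')) {
        return $html;
    }

    // 2. Otherwise try the charset the site declared, if there is one.
    //    iconv() returns false for unknown charsets or invalid input.
    if ($declaredCharset !== null && $declaredCharset !== '') {
        $converted = @iconv($declaredCharset, 'UTF-8', $html);
        if ($converted !== false) {
            return $converted;
        }
    }

    // 3. Last resort (an assumption, not a detection): treat it as Windows-1252.
    return mb_convert_encoding($html, 'UTF-8', 'Windows-1252');
}

// Byte 0x92 is a curly apostrophe in Windows-1252 and invalid on its own in UTF-8.
echo toUtf8("it\x92s", 'Windows-1252'), PHP_EOL;
```

mb_detect_encoding() could be slotted in as another step, but it only guesses from the bytes, so a declared charset is usually the better first choice.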

 

 

Since you are trying to save HTML web pages, you will also have to worry about fixing relative links.
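
As a rough idea of what fixing a relative link involves, here is a simplified sketch. absoluteUrl() is a hypothetical helper, it assumes the page URL is a full http(s) URL with a host, and it does not cover every corner of the URL resolution rules (for example a base tag in the page is ignored).

```php
<?php
/**
 * Hypothetical helper: resolve a link found on a page against that page's URL.
 */
function absoluteUrl(string $base, string $link): string
{
    // Links that already carry a scheme (http:, https:, mailto:, ...) are left as-is.
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $link)) {
        return $link;
    }

    $b        = parse_url($base);
    $root     = $b['scheme'] . '://' . $b['host'] . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';

    // Protocol-relative link such as //cdn.example.com/x.js
    if (strpos($link, '//') === 0) {
        return $b['scheme'] . ':' . $link;
    }

    // Split the link into its path and its "?query#fragment" tail.
    $cut    = strcspn($link, '?#');
    $path   = substr($link, 0, $cut);
    $suffix = substr($link, $cut);

    // Empty, query-only or fragment-only links point at the page itself.
    if ($path === '') {
        $keepQuery = ($suffix === '' || $suffix[0] === '#') && isset($b['query']);
        return $root . $basePath . ($keepQuery ? '?' . $b['query'] : '') . $suffix;
    }

    // Document-relative links start from the directory of the page.
    if ($path[0] !== '/') {
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $path;
    }

    // Collapse "." and ".." segments.
    $segments = explode('/', $path);
    $last     = end($segments);
    $out      = [];
    foreach ($segments as $seg) {
        if ($seg === '.') {
            continue;
        }
        if ($seg === '..') {
            array_pop($out);
            continue;
        }
        $out[] = $seg;
    }
    if ($last === '.' || $last === '..') {
        $out[] = ''; // keep the trailing slash of a directory reference
    }

    return $root . '/' . ltrim(implode('/', $out), '/') . $suffix;
}

echo absoluteUrl('http://example.com/news/index.html', '../style.css'), PHP_EOL;
```

You would run something like this over every href and src before storing the page, otherwise images and stylesheets will point at your own server when you view the saved copy.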

Never, I repeat, never use SET NAMES. It's so sad that everybody blindly copies and pastes this code snippet without understanding the consequences.

 

What this does is silently change the character encoding of the database connection without telling PHP about it. That means critical functions like mysql_real_escape_string() will assume you're still using the original encoding and may no longer work. As a result, you could break your database security entirely.

 

Always use mysql_set_charset(). Or even better, get rid of the old MySQL extension and enter the 21st century: PDO.
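
A minimal sketch of the PDO route, under a few assumptions: PHP 5.3.6 or later (where the charset option in the MySQL DSN is honoured), the pdo_mysql driver, and a made-up table called pages with url and html columns. The helper names, host, database name and credentials are all placeholders. The default of utf8mb4 is my own choice because it covers all of Unicode; pass 'utf8' if you want to match the setup described earlier in the thread.

```php
<?php
// Hypothetical helper: build a DSN that declares the connection charset,
// so the driver itself knows the encoding (unlike a raw SET NAMES query).
function mysqlDsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: open the connection with exceptions and real prepared statements.
function connectUtf8(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(mysqlDsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_EMULATE_PREPARES => false,
    ]);
}

// Hypothetical table and column names. A prepared statement replaces manual escaping.
function savePage(PDO $pdo, string $url, string $html): void
{
    $stmt = $pdo->prepare('INSERT INTO pages (url, html) VALUES (?, ?)');
    $stmt->execute([$url, $html]);
}

echo mysqlDsn('localhost', 'crawler'), PHP_EOL;
```

If you stay on mysqli instead, the equivalent of the charset option is mysqli_set_charset($link, 'utf8mb4') right after connecting.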

 

There's a lot more to say about your task, but I'm starting to think this forum isn't the right platform for in-depth information.
