I have a PHP script that downloads web pages, and puts them in an SQL database, to later view them in another PHP script.

 

I have heard that I should do the following to avoid "weird" characters showing up, such as ʉ۪ in place of an apostrophe:

 

1. save the PHP script as UTF-8 in Notepad

 

2. set the database table, including its fields, to the utf8_general_ci collation
 
3. send a UTF-8 header from the PHP script:
header('Content-Type: text/html; charset=utf-8');
 
4. set the MySQL connection encoding to UTF-8 in PHP:
mysql_query("SET NAMES UTF8",$connect);
 
---
 
My question is: how does this work on the server side? Will this ensure that my script works fine, or do different websites use different encodings?

Is it safe to go all-UTF-8, so long as I'm using English-language sites?

I just want proper apostrophes to show up.. lol

I do a similar thing with my website, where I pull data from another source and some of the text is in another language.

 

You would just need to properly escape the apostrophes when inserting into your database.

header('Content-Type: text/html; charset=utf-8'); 
mysqli_query($link, "SET NAMES 'UTF8'") or die("ERROR: ". mysqli_error($link));

You can try mb_detect_encoding() and mb_convert_encoding().

 

I use iconv

 

You will most likely need several methods of detection and checking; it's a real mess.
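To make the detect-and-convert idea a bit more concrete, here is a rough sketch of normalising a downloaded page to UTF-8 before it goes into the database. The function name to_utf8() and the $declared parameter are made up for illustration; $declared stands for whatever charset the remote server announced, if the caller managed to read one. Because mb_detect_encoding() can only guess, this sketch skips it: it checks whether the bytes are already valid UTF-8 and otherwise falls back to Windows-1252, which is an assumption that tends to hold for English-language sites but is not guaranteed.

```php
<?php
// Hypothetical helper: return $raw as UTF-8, converting only when needed.
function to_utf8(string $raw, ?string $declared = null): string
{
    // Bytes that are already valid UTF-8 are returned untouched.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    // Assumption: Windows-1252 is the most likely legacy encoding
    // for English-language pages that are not UTF-8.
    $from = 'Windows-1252';

    // Use the declared charset only if mbstring knows it by that name.
    if ($declared !== null) {
        foreach (mb_list_encodings() as $known) {
            if (strcasecmp($known, $declared) === 0) {
                $from = $known;
                break;
            }
        }
    }

    return mb_convert_encoding($raw, 'UTF-8', $from);
}
```

For example, to_utf8("It\x92s") turns the Windows-1252 curly apostrophe byte into a proper UTF-8 apostrophe, while a page that is already UTF-8 passes through unchanged.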

 

 

Since you are trying to save HTML web pages, you also have to worry about fixing relative links.
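As a rough illustration of the relative-link problem, the sketch below resolves a link found in a saved page against the URL the page was downloaded from. resolve_url() is a hypothetical helper, not a built-in, and it is deliberately simplified: it assumes the base URL has a scheme and host, and it does not cover every corner case of the URL resolution rules.

```php
<?php
// Hypothetical helper: make $href absolute, relative to the page URL $base.
function resolve_url(string $base, string $href): string
{
    // A link that already has a scheme (http:, https:, mailto:) is kept as is.
    if (parse_url($href, PHP_URL_SCHEME) !== null) {
        return $href;
    }

    $b = parse_url($base);

    // Protocol-relative link such as //cdn.example.com/x.js
    if (strpos($href, '//') === 0) {
        return $b['scheme'] . ':' . $href;
    }

    $origin = $b['scheme'] . '://' . $b['host']
        . (isset($b['port']) ? ':' . $b['port'] : '');

    // Split off ?query and #fragment so only the path is normalised.
    $cut = strcspn($href, '?#');
    $hrefPath = substr($href, 0, $cut);
    $suffix = substr($href, $cut);

    $basePath = $b['path'] ?? '/';
    if ($hrefPath === '') {
        $path = $basePath;                       // only a query or fragment
    } elseif ($hrefPath[0] === '/') {
        $path = $hrefPath;                       // root-relative
    } else {
        // Path-relative: replace the last segment of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $hrefPath;
    }

    // Collapse "." and ".." segments.
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }

    return $origin . '/' . ltrim(implode('/', $out), '/') . $suffix;
}
```

The idea would be to run every href and src value through something like this before storing the page, so the stored copy still points at the original site.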

Never, I repeat, never use SET NAMES. It's so sad that everybody blindly copies and pastes this code snippet without understanding the consequences.

 

What this does is silently change the character encoding of the database connection without telling PHP about it. That means critical functions like mysql_real_escape_string() will assume you're still using the original encoding and may no longer work. As a result, you could break your database security entirely.

 

Always use mysql_set_charset(). Or even better, get rid of the old MySQL extension and enter the 21st century: PDO.
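For what it's worth, here is a minimal sketch of what the PDO route could look like. The charset goes into the DSN, so the driver itself knows the connection encoding, and a prepared statement replaces manual escaping. The helper names, the pages table and its url and html columns are all placeholders invented for this example. It also uses utf8mb4 rather than the utf8 from the original question, since MySQL's utf8 only covers characters of up to three bytes; that choice is mine, not something the thread settled on.

```php
<?php
// Hypothetical helper: build a MySQL DSN that declares the connection charset.
function utf8_dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Hypothetical helper: open the connection; credentials are placeholders.
function connect_utf8(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(utf8_dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,   // throw on SQL errors
        PDO::ATTR_EMULATE_PREPARES => false,           // use real prepared statements
    ]);
}

// Hypothetical helper: store one downloaded page.
// Assumes a table `pages` with `url` and `html` columns.
function save_page(PDO $pdo, string $url, string $html): void
{
    $stmt = $pdo->prepare('INSERT INTO pages (url, html) VALUES (:url, :html)');
    $stmt->execute([':url' => $url, ':html' => $html]);
}
```

If you stay on mysqli for now, mysqli_set_charset($link, 'utf8mb4') plays the same role as the charset part of the DSN.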

 

There's a lot more to say about your task, but I'm starting to think this forum isn't the right platform for in-depth information.
