jasonc Posted September 3, 2016 Share Posted September 3, 2016 I have been using an outdated method for sometime to store and retrieve records from my database. I am wanting to know the best way to convert or translate, whatever, the text to a format that I store in the database and then what I should be doing to get it back before it is displayed on the webpage. I tried htmlentities and stuff like that but not realy understanding the flags and stuff and not getting it. Please help me or at least guide me to the right commands that I should be using for what. I do not really want someone to do it, just tel me what commands and how to use them. The adding to the database and getting back is something I know. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/ Share on other sites More sharing options...
NotionCommotion Posted September 3, 2016 Share Posted September 3, 2016 I think it is best to store the data in its raw format, but escape when presenting the content. That way whether inputted via the application or directly into the database, the viewer is protected. Of course, escape for SQL injection using PDO prepared statements. To escape, use the following (or better yet, start using a template engine such as Twig or Smarty). htmlentities($yourContentFromTheDB, ENT_QUOTES, 'UTF-8'); Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537053 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 I currently use MySQLi and $mysqli->real_escape_string($string) to escape the string before adding to the database. So I just add the text as they entered it without altering it in anyway but escape it using $mysqli->real_escape_string($string) Then when grabbing it from the table to display on the webpage I just unescape it using htmlentities($yourContentFromTheDB, ENT_QUOTES, 'UTF-8'); Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537054 Share on other sites More sharing options...
NotionCommotion Posted September 3, 2016 Share Posted September 3, 2016 There are two types of escaping going on, and they are not related. real_escape_string() protects against SQL injection where the bad guy sends content which can compromise your database. See https://www.owasp.org/index.php/SQL_Injection for more. After you escaped it, the data is not changed (unless it was a bad guy attack in which it will store the escaped content, but don't worry about that), and there is nothing to escape back. htmlentities() escapes against cross-site scripting and is to protect the viewer of your site. See https://www.owasp.org/index.php/Cross-site_Scripting_(XSS) for more. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537055 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 So if I have understood this the htmlentities will convert all special caracters to their ACSII numbers and still show the text as say... <script>...</script> to the viewer but in the source code it shows as the ASCII version. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537057 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 I was just checking the PHP.net site for the htmlentities it seems I may have found out why I was not able to convert/remove the special characters before as I was getting a blank string back. I may have forgotten to use the | ENT_IGNORE part. I am going to find the code I used to test this out. <?php$str = "\x8F!!!";// Outputs an empty stringecho htmlentities($str, ENT_QUOTES, "UTF-8");// Outputs "!!!"echo htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");?> Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537058 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 (edited) I just entered the text as is in the table using my script. escaping the string first.. ❤French style nest of table , oh la la £40❤ and it shows as this in the database... â¤French style nest of table , oh la la £40⤠But the £ and the heart do not show up even when I set the page as UTF-8 I used echo htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8"); to output it on the page, but the page is blank still and nothing in the source code. in the script I placed this at the start... <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><title>Untitled Document</title><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head><body> and the following at the end... </body></html> Edited September 3, 2016 by jasonc Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537059 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 ok I use the following to add the entry as it and escape the text in case it contains something it should not... $query="INSERT INTO `tablename` (`id`, `title`) VALUES ('1','".$mysqli->real_escape_string('<script>alert("1");< /script>')."')"; db_query($mysqli, $query); Then to get it back I use... echo(htmlentities($title, ENT_QUOTES | ENT_IGNORE, "UTF-8"); But I get a blank page, not even any source code... so I change it to this... echo($title); and sure enough the alert box shows up. Am I missing something here ? How do I stop code like this from working as the bad poster expected... Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537060 Share on other sites More sharing options...
mac_gyver Posted September 3, 2016 Share Posted September 3, 2016 But I get a blank page, not even any source code... because, for the example code you are posting, it has a php syntax error in it, and the code never runs. do you have php's error_reporting set to E_ALL and display_errors set to ON in the php.ini on your development system so that php will help you by reporting and displaying all the errors it detects? note: for a php syntax error in your main file. you cannot put the error_reporting/display_errors settings into your code since your code never runs in this case to make use of those settings. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537061 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 Yes both of these are already set and in php.ini file. But no errors are logged Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537063 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 ah after splitting up the line in to sections I found the error... the missing ) all working now, and I see who this works too. Thank you Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537064 Share on other sites More sharing options...
mac_gyver Posted September 3, 2016 Share Posted September 3, 2016 since there was a php syntax error, that means the php.ini where you have placed the error_reporting/display_errors settings likely isn't the one that php is using. you need to make sure that your development system is set up to report and display all errors before you waste any more time on problems. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537065 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 Now the big issue I have is to convert everything back to normal text in the database after I have incorrectly converted most if not all special characters to their hash and other formats. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537067 Share on other sites More sharing options...
jasonc Posted September 3, 2016 Author Share Posted September 3, 2016 (edited) ok all seem ok, apart from the £ sign, i get the diamond with a question mark in it. ❤French style nest of table , oh la la �40❤ it should be ❤French style nest of table , oh la la £40❤ ok what I have done is echod out all the ord() codes... 226 �157 �164 �70 F114 r101 e110 n99 c104 h32115 s116 t121 y108 l101 e32110 n101 e115 s116 t32111 o102 f32116 t97 a98 b108 l101 e3244 ,32111 o104 h32108 l97 a32108 l97 a32163 �52 448 0226 �157 �164 � Edited September 3, 2016 by jasonc Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537068 Share on other sites More sharing options...
Jacques1 Posted September 3, 2016 Share Posted September 3, 2016 I'm afraid this isn't the only issue. The malformed output you posted earlier indicates that there are other encoding-related problems. Make sure the database is configured for UTF-8. Note that the database itself, every table and each individual column can have their own encodings. Make sure the database connection is configured for UTF-8. Make sure the Content-Type HTTP header declares the document as UTF-8 text. Either let your webserver do that or set the header in PHP with the header() function. Get rid of the ENT_IGNORE flag. As the manual explicitly says, this is a security risk. Use ENT_SUBSTITUTE to deal with invalid encoding sequences. It's also time to switch to HTML5. The days of HTML 4.01 are over. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537069 Share on other sites More sharing options...
NotionCommotion Posted September 3, 2016 Share Posted September 3, 2016 It's also time to switch to HTML5. The days of HTML 4.01 are over. Thank God! Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537071 Share on other sites More sharing options...
jasonc Posted September 4, 2016 Author Share Posted September 4, 2016 Yes my page is html5, but some of the code was outdated. It seems that so far all information that I add as new is being stored as is, no change to the text apart from escaping before being added and when displayed all looks ok and of the scripting text I tried this also shows as html entities in the source code. All good so far. But still there is the issue of all the entries in the database that was incorrectly stored. How would I start to convert it to normal text as it would be if I had entered it as of now. £ instead of the chr(163) which shows on the page as a diamond with question mark in it. and the other unicode characters. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537074 Share on other sites More sharing options...
Jacques1 Posted September 4, 2016 Share Posted September 4, 2016 The fact that “£” is still ISO-8859-1 encoded means you have not fully converted your application to UTF-8 yet. Re-read my reply #15 and go through each of the four steps. We cannot do that for you. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537075 Share on other sites More sharing options...
jasonc Posted September 4, 2016 Author Share Posted September 4, 2016 (edited) All tables and databases are set to utf-8. Adding/editing new entries is working ok. The actual data in the database which was added ages ago is still in the old format. Is there a way to convert the previous existing data before I changed to the correct method of storing the data. Edited September 4, 2016 by jasonc Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537076 Share on other sites More sharing options...
Jacques1 Posted September 4, 2016 Share Posted September 4, 2016 (edited) CONVERT TO Edited September 4, 2016 by Jacques1 Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537077 Share on other sites More sharing options...
jasonc Posted September 4, 2016 Author Share Posted September 4, 2016 (edited) ALTER TABLE `tablename` CONVERT TO CHARACTER SET utf8 [COLLATE utf8_general_ci] I used this but still the text in the data is not showing on the webpage as that of a new entry that is added via my CMS area. Edited September 4, 2016 by jasonc Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537078 Share on other sites More sharing options...
jasonc Posted September 4, 2016 Author Share Posted September 4, 2016 The £ is showing as a diamond for one, I'll have to check for the other characters. Quote Link to comment https://forums.phpfreaks.com/topic/302074-correctly-store-and-retrieve-records-from-db-table-that-may-contain-unicode/#findComment-1537079 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.