Foreign Characters create symbols in PHP and MySQL

Applies to:
What
Another article on displaying foreign characters on a website without the question-mark-in-a-diamond symbols, and how I solved the problem in my case.

Why?
My company has started using international country and region names, which include foreign characters. When we copy and paste these names into our website, our webpages display a question mark inside a diamond shape instead of the foreign character.

How does it happen? Have I tried the other solutions on the web? I have tried adding the following to my headers:
Did I miss anything?
None of the above worked. If there is another solution out there, I didn't find one that worked; otherwise I'd have included it in this article.

How?
The fix is on the PHP side and depends on the versions of PHP and MySQL in use. As a note on the PHP manual page for htmlspecialchars puts it:
"As of PHP 5.4 they changed default encoding from "ISO-8859-1" to "UTF-8". So if you get null from htmlspecialchars or htmlentities..." Source: PHP Manual
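To make the quoted behaviour concrete, here is a minimal sketch of what happens when ISO-8859-1 bytes are handed to htmlspecialchars as if they were UTF-8. The sample string and variable names are my own assumptions for illustration. Note that the function returns an empty string here rather than null.

```php
<?php
// "\xE9" is the single byte for "é" in ISO-8859-1; on its own it is not valid UTF-8.
$latin1 = "caf\xE9";

// Read as UTF-8 (the default since PHP 5.4), the invalid byte makes the call return an empty string.
$as_utf8 = htmlspecialchars($latin1, ENT_COMPAT, 'UTF-8');

// Naming the encoding the bytes are really in keeps the text intact.
$as_latin1 = htmlspecialchars($latin1, ENT_COMPAT, 'ISO-8859-1');

// ENT_SUBSTITUTE swaps the invalid byte for U+FFFD, the replacement character.
$substituted = htmlspecialchars($latin1, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

var_dump($as_utf8 === '');
var_dump($as_latin1 === $latin1);
var_dump(bin2hex($substituted));
```

U+FFFD is the character browsers draw as a question mark inside a diamond, so seeing it usually means that bytes in one encoding were read as another.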

My fix (in some cases):
$my_description = html_entity_decode(htmlentities($my_description, ENT_COMPAT, 'ISO-8859-1', true), ENT_COMPAT, 'UTF-8');
A more up-to-date version (ISO-8859-15 also covers the euro sign):
$my_description = html_entity_decode(htmlentities($my_description, ENT_COMPAT, 'ISO-8859-15', true), ENT_COMPAT, 'UTF-8');
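As far as I can tell, the two one-liners above amount to a charset conversion: htmlentities reads the bytes in the legacy encoding and turns the characters it has names for into HTML entities, and html_entity_decode writes those entities back out as UTF-8. The sketch below wraps that round trip in a helper so the two source encodings can be compared. The function name legacy_to_utf8 and the sample bytes are hypothetical, chosen only for this illustration.

```php
<?php
// Hypothetical helper wrapping the one-liner above; the name is made up for this sketch.
function legacy_to_utf8(string $text, string $from): string
{
    // Step 1: read the bytes as $from and replace characters that have a named HTML entity.
    $entities = htmlentities($text, ENT_COMPAT, $from, true);
    // Step 2: turn the entities back into characters, this time written out as UTF-8.
    return html_entity_decode($entities, ENT_COMPAT, 'UTF-8');
}

$e_acute  = legacy_to_utf8("caf\xE9", 'ISO-8859-1');  // byte E9 is "é" in both charsets
$currency = legacy_to_utf8("\xA4", 'ISO-8859-1');     // byte A4 is "¤" in ISO-8859-1
$euro     = legacy_to_utf8("\xA4", 'ISO-8859-15');    // byte A4 is "€" in ISO-8859-15

var_dump(bin2hex($e_acute));
var_dump(bin2hex($currency));
var_dump(bin2hex($euro));
```

The same byte comes out as a different character depending on the source charset you name, so it matters which encoding the text was really stored in. If the mbstring extension is available, mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1') should be a more direct way to do the same conversion.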
My other fix (for Croatian characters and Western languages):
// insert after database connection and prior to database query (where $db_conn is your mysqli connection)
mysqli_set_charset($db_conn,"utf8");

// round-trip the text through HTML entities and back to UTF-8 before display
$my_description = html_entity_decode(htmlentities($my_description, ENT_COMPAT, 'UTF-8', true), ENT_COMPAT, 'UTF-8');
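One caveat with the snippet above: after html_entity_decode the string contains raw characters again, so characters such as & and < are not escaped for HTML. If you also need HTML-safe output, a possible approach is to escape at the point of display with the encoding named explicitly. This is a sketch, not part of my original fix: the helper name html_safe is hypothetical, and it assumes the text is already UTF-8 (for example because mysqli_set_charset was called) and that this file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper name; assumes $text is already valid UTF-8.
function html_safe(string $text): string
{
    // Escape &, <, > and both quote styles; invalid bytes become U+FFFD instead of emptying the string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$name = "Šibenik & Čakovec <HR>";
echo html_safe($name), PHP_EOL;
```

The Croatian letters pass through untouched because htmlspecialchars only rewrites the handful of characters that are special in HTML.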

Other things to consider:
Conclusion:
In my case, the problem came down to not passing the extra parameters (flags and encoding) to the htmlentities or htmlspecialchars PHP functions. My database stored the foreign characters correctly, so the fault lay somewhere between PHP reading from MySQL and displaying the characters on the webpage.
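To narrow down where between MySQL and the browser the characters go wrong, it can help to look at the raw bytes PHP is holding at each step. The sketch below is one way to do that with core functions only. The helper name describe_bytes is hypothetical and the sample strings are assumptions for illustration.

```php
<?php
// Hypothetical diagnostic helper; not from the original fix.
function describe_bytes(string $text): string
{
    // preg_match() with the /u modifier returns false when the subject is not valid UTF-8.
    $valid = preg_match('//u', $text) === 1;
    return ($valid ? 'valid UTF-8' : 'not UTF-8') . ': ' . bin2hex($text);
}

echo describe_bytes("caf\xC3\xA9"), PHP_EOL; // "é" stored as the two UTF-8 bytes c3 a9
echo describe_bytes("caf\xE9"), PHP_EOL;     // "é" stored as the single ISO-8859-1 byte e9
```

Calling something like this on a value straight after it is fetched from the database, and again just before it is echoed, should show at which step the bytes stop being what you expect.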

Category: Personal Home Page :: Article: 637