Rahul Dev
Posted January 25, 2011

Hello guys,

I have a problem when I screen scrape a piece of text from a URL and save it to my DB. The text is in French and contains special characters like é. When I scrape it, I receive it in this form: &eacute;. For example, the website shows the word région, but after scraping it becomes r&eacute;gion.

I want to store the text as it is displayed because I need to perform some operations on it after saving it in the DB. Is there any way to store the scraped text in the form in which it is displayed, or to convert it to the form I want (like this: région)?

My code is as follows:

$html = file_get_dom('http://www.defimedia.info/news/8425/Grosses-averses-%3A-les-pompiers-inond%C3%A9s-d%E2%80%99appels-');

foreach ($html->find('div[class=PostContent]') as $element) {
    // Markup fragments to strip from the scraped element before saving.
    $tags = array(
        '<div class="PostContent">',
        '<!-- The Adsense will automatically be inserted half way through the content. Applies for both Side and Middle options. -->',
        '<font face="Georgia">',
        '<font size="2">',
        ''
    );
    $new_element = str_replace($tags, "", $element);

    // Save the cleaned text for this article.
    $sql1 = "UPDATE articles SET original_text = '" . mysql_real_escape_string($new_element) . "' WHERE article_id = '$item_id'";
    $result1 = mysql_query($sql1) or die('Query failed: ' . mysql_error());
}
AbraCadaver
Posted January 25, 2011

It is &eacute; in the HTML source of the page you are scraping (check it out). In order to display in a browser it will need to be &eacute;, so why do you want to translate it? If you must, then try html_entity_decode().
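
A minimal sketch of that html_entity_decode() suggestion, assuming the scraped markup contains named entities such as &eacute; and that the text should end up as UTF-8. The sample string and variable names are hypothetical stand-ins for the scraped element, not part of the original code.

```php
<?php
// Hypothetical sample standing in for the scraped markup.
$scraped = 'La r&eacute;gion est inond&eacute;e';

// ENT_QUOTES also decodes quote entities; passing 'UTF-8' explicitly
// matters because older PHP versions default to ISO-8859-1.
$decoded = html_entity_decode($scraped, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n";
```

In the loop from the question, the decoded string would then be passed to mysql_real_escape_string() in place of $new_element. This only round-trips cleanly if the database connection and the original_text column also use UTF-8, which is an assumption here.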