manuel2 Posted July 27, 2007
This is driving me mad! I have tried cURL and the well-known HTTPRequest class (which uses fsockopen) to scrape translate.google.com/translate_t, and I always get bogus UTF-8 files. Any clue? I have scraped many pages with UTF-8 content before and never ran into this. HELP! Code is in here: http://www.phpfreaks.com/forums/index.php/topic,138145.0.html
Link to comment https://forums.phpfreaks.com/topic/61966-google-translate-and-utf8/
btherl Posted July 27, 2007
Can you give more detail please? How do you know the UTF-8 is bogus?
manuel2 Posted July 27, 2007
Hello. Thanks for your comment. I get ������� instead of readable UTF-8 text (and I do set charset=utf-8 in the meta tags of the page I use to display the result).
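One way to answer "how do you know the UTF-8 is bogus?" is to test the fetched bytes directly instead of judging by what the browser renders. The following is a minimal sketch, assuming the mbstring extension is installed; `looks_like_utf8` is a hypothetical helper name and the sample strings are illustrative, not data from the thread.

```php
<?php
// Hypothetical helper: reports whether a byte string is well-formed UTF-8.
// Requires the mbstring extension.
function looks_like_utf8(string $bytes): bool
{
    return mb_check_encoding($bytes, 'UTF-8');
}

// "é" in ISO-8859-1 is the single byte 0xE9, which is not valid UTF-8 on its own.
$latin1 = "caf\xE9";
// The same character in UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "caf\xC3\xA9";

var_dump(looks_like_utf8($latin1)); // bool(false)
var_dump(looks_like_utf8($utf8));   // bool(true)
```

If the check fails on the scraped response, the body is probably in some other encoding, and declaring charset=utf-8 on the display page would then produce exactly this kind of replacement character.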
btherl Posted July 27, 2007
Please post your complete code.
manuel2 Posted July 27, 2007
$lang = "ar"; // example target language
$text = "Hello world"; // example input (not defined in the original snippet)
$useragent = "Mozilla/5.0"; // example value (not defined in the original snippet)
$url = "http://translate.google.com/translate_t";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true); // boolean switch; the original passed 4
// URL-encode the text so spaces, "&" and non-ASCII bytes survive the POST
$postdata = "hl=en&ie=UTF8&langpair=en|" . $lang . "&text=" . urlencode($text);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
manuel2 Posted July 27, 2007
Isn't it strange? Thanks in advance for any help. Regards
manuel2 Posted July 27, 2007
Any help? Thanks. This is really weird. Just run the code above and check for yourself... Is Google sending pages in a strange format?!
manuel2 Posted July 27, 2007
Please, some help! I have used several methods (besides cURL) to fetch the page and still can't get a decent UTF-8 page...
per1os Posted July 27, 2007
Maybe the page isn't UTF-8 encoded?
manuel2 Posted July 27, 2007
Damn. Is this a trick by Google to protect itself from scrapers and automatic translation scripts? Indeed, I don't see a UTF-8 meta tag set on http://translate.google.com/translate_t. How can I figure out how the page is encoded? By sniffing the HTTP headers?
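Reading the response's Content-Type header is a common way to find the declared encoding. The sketch below assumes the mbstring extension is available. The helper names `charset_from_content_type` and `body_to_utf8` are hypothetical, and the ISO-8859-1 header value is only an illustration: the thread never states which encoding Google actually returned.

```php
<?php
// Hypothetical helper: pull the charset parameter out of a Content-Type value
// such as "text/html; charset=ISO-8859-1". Returns null when none is declared.
function charset_from_content_type(string $contentType): ?string
{
    if (preg_match('/charset\s*=\s*"?([A-Za-z0-9._:-]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: convert a body to UTF-8 from the declared charset.
// Bodies with no declared charset, or already in UTF-8, are returned unchanged.
function body_to_utf8(string $body, ?string $charset): string
{
    if ($charset === null || $charset === 'UTF-8') {
        return $body;
    }
    return mb_convert_encoding($body, 'UTF-8', $charset);
}

// With cURL, the header value is available after curl_exec() via
// curl_getinfo($ch, CURLINFO_CONTENT_TYPE); a literal stands in for it here.
$contentType = 'text/html; charset=ISO-8859-1'; // illustrative value only
$body = "caf\xE9"; // "café" as ISO-8859-1 bytes

$charset = charset_from_content_type($contentType);
echo $charset, "\n";                               // ISO-8859-1
echo bin2hex(body_to_utf8($body, $charset)), "\n"; // 636166c3a9
```

Converting on the way in keeps the rest of the script working in UTF-8 only, so the charset=utf-8 meta tag on the display page stays truthful whatever the remote server sends.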
manuel2 Posted July 28, 2007
I solved it, I solved it! It is indeed a Google problem. Forget the Accept-Charset: utf-8 header, it will never work... The solution is rather tricky, lol. I wasted hours trying everything.
This topic is now archived and is closed to further replies.