Garbled Chinese text when reading files in PHP (converting to UTF-8)
$opts = array("file" => array("encoding" => "utf-8"));
// Note: this second assignment replaces the first one.
$opts = array("http" => array("encoding" => "utf-8"));
$ctxt = stream_context_create($opts);
$content = file_get_contents($filePath, FILE_TEXT, $ctxt);
The simplest approach is to convert GB2312 → UTF-8:
$str = iconv("gb2312", "utf-8", $str);
This does not work:
$content = mb_convert_encoding($content, "UTF-8", "auto");
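A small experiment may help explain why a hard-coded source encoding is fragile: the call is only right when the input really is in that encoding. The sketch below is an illustration, not part of the original post. It assumes the script itself is saved as UTF-8 and that the local iconv build supports GBK. As far as I know, "auto" in mb_convert_encoding has a related weakness, since it relies on mbstring's configured detection order, which normally does not include GBK.

```php
<?php
// Illustration only: a fixed source encoding is correct for one kind of input
// and wrong for every other kind.
$utf8 = "中文";                        // assumes this script is saved as UTF-8
$gbk  = iconv("UTF-8", "GBK", $utf8);  // simulate the bytes of an ANSI (GBK) file

// Correct assumption: GBK bytes converted from GBK round-trip cleanly.
$ok = iconv("GBK", "UTF-8", $gbk);

// Wrong assumption: UTF-8 bytes treated as GBK give mojibake, or false if
// iconv rejects a byte sequence (the @ hides the notice in that case).
$wrong = @iconv("GBK", "UTF-8", $utf8);

var_dump($ok === $utf8);     // the round trip restores the original text
var_dump($wrong === $utf8);  // the mis-converted text does not match
```

The first comparison holds and the second does not, which is why the encoding has to be determined per file before converting.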
********** The approaches above are unreliable; the correct method is below. **********
define("UTF32_BIG_ENDIAN_BOM", chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
define("UTF32_LITTLE_ENDIAN_BOM", chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
define("UTF16_BIG_ENDIAN_BOM", chr(0xFE) . chr(0xFF));
define("UTF16_LITTLE_ENDIAN_BOM", chr(0xFF) . chr(0xFE));
define("UTF8_BOM", chr(0xEF) . chr(0xBB) . chr(0xBF));

$text = file_get_contents($newPath);
$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // a UTF-32 BOM is 4 bytes long

// UTF-32 is checked before UTF-16 because the UTF-32LE BOM starts with the UTF-16LE BOM.
$encodType = "";
if ($first3 === UTF8_BOM) $encodType = "UTF-8";
else if ($first4 === UTF32_BIG_ENDIAN_BOM) $encodType = "UTF-32BE";
else if ($first4 === UTF32_LITTLE_ENDIAN_BOM) $encodType = "UTF-32LE";
else if ($first2 === UTF16_BIG_ENDIAN_BOM) $encodType = "UTF-16BE";
else if ($first2 === UTF16_LITTLE_ENDIAN_BOM) $encodType = "UTF-16LE";

// Convert only when a non-UTF-8 BOM was found; files without a BOM are left unchanged here.
$content = $text;
if ($encodType != "" && $encodType != "UTF-8") {
    $content = iconv($encodType, "UTF-8", $text);
}
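The order of the BOM checks matters, and a table-driven helper makes that easier to see. The following is only a sketch: detectBom is a hypothetical name that does not appear in the original post, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original post): returns an encoding name
// for a recognised BOM, or "" when the data starts with no known BOM.
function detectBom(string $data): string
{
    // UTF-32 comes first: the UTF-32LE BOM (FF FE 00 00) begins with the
    // UTF-16LE BOM (FF FE), so testing UTF-16 first would misreport it.
    $boms = [
        "UTF-32BE" => "\x00\x00\xFE\xFF",
        "UTF-32LE" => "\xFF\xFE\x00\x00",
        "UTF-8"    => "\xEF\xBB\xBF",
        "UTF-16BE" => "\xFE\xFF",
        "UTF-16LE" => "\xFF\xFE",
    ];
    foreach ($boms as $name => $bom) {
        // strncmp is binary safe, so BOMs containing NUL bytes compare correctly.
        if (strncmp($data, $bom, strlen($bom)) === 0) {
            return $name;
        }
    }
    return "";
}

var_dump(detectBom("\xFF\xFE\x00\x00A\x00\x00\x00")); // UTF-32LE sample
var_dump(detectBom("\xFF\xFEA\x00"));                 // UTF-16LE sample
var_dump(detectBom("plain ascii"));                   // no BOM
```

With the 4-byte signatures tested first, the UTF-32LE sample is not mistaken for UTF-16LE.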
Final version:
$text = file_get_contents($filePath);
//$encodType = mb_detect_encoding($text);

define("UTF32_BIG_ENDIAN_BOM", chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
define("UTF32_LITTLE_ENDIAN_BOM", chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
define("UTF16_BIG_ENDIAN_BOM", chr(0xFE) . chr(0xFF));
define("UTF16_LITTLE_ENDIAN_BOM", chr(0xFF) . chr(0xFE));
define("UTF8_BOM", chr(0xEF) . chr(0xBB) . chr(0xBF));

$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // a UTF-32 BOM is 4 bytes long

$encodType = "";
if ($first3 === UTF8_BOM) $encodType = "UTF-8 BOM";
else if ($first4 === UTF32_BIG_ENDIAN_BOM) $encodType = "UTF-32BE";
else if ($first4 === UTF32_LITTLE_ENDIAN_BOM) $encodType = "UTF-32LE";
else if ($first2 === UTF16_BIG_ENDIAN_BOM) $encodType = "UTF-16BE";
else if ($first2 === UTF16_LITTLE_ENDIAN_BOM) $encodType = "UTF-16LE";

// The check below mainly handles ANSI-encoded files.
if ($encodType == "") {
    // No BOM: a txt file created with the default ANSI encoding
    $content = iconv("GBK", "UTF-8", $text);
} else if ($encodType == "UTF-8 BOM") {
    // Already UTF-8, no conversion needed
    $content = $text;
} else {
    // Every other detected encoding is converted to UTF-8
    $content = iconv($encodType, "UTF-8", $text);
}
The final version above handles txt files created on Chinese-language Windows in the ANSI, UTF-8 and Unicode (UTF-16) encodings. Any file without a BOM is treated as GBK.
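One case the BOM check cannot cover is UTF-8 text saved without a BOM, which many editors produce. A possible extension, sketched below, validates the bytes as UTF-8 before falling back to GBK. bomlessToUtf8 is a hypothetical name, the sketch assumes the mbstring and iconv extensions are loaded, and short GBK byte sequences can occasionally also be valid UTF-8, so this is a heuristic rather than a guarantee.

```php
<?php
// Hypothetical helper (not from the original post): decide how to treat text
// that has no BOM. Assumes the mbstring and iconv extensions are available.
function bomlessToUtf8(string $text, string $fallback = "GBK"): string
{
    // Valid UTF-8 (which includes plain ASCII) is returned unchanged.
    if (mb_check_encoding($text, "UTF-8")) {
        return $text;
    }
    // Otherwise assume the legacy code page, as the final version does.
    $converted = iconv($fallback, "UTF-8", $text);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $fallback");
    }
    return $converted;
}

$utf8 = "中文";                        // assumes this script is saved as UTF-8
$gbk  = iconv("UTF-8", "GBK", $utf8);  // simulate the bytes of an ANSI (GBK) file

var_dump(bomlessToUtf8($utf8) === $utf8); // BOM-less UTF-8 passes through
var_dump(bomlessToUtf8($gbk) === $utf8);  // GBK bytes are converted
```

In the final version this check would replace the unconditional iconv("GBK", ...) call in the branch where no BOM was found.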
Other encodings will be added later.
As always, discussion is welcome.
O(∩_∩)O