How to use ReadAllText when the file encoding is unknown

Shared by user 月光再亮终究冰凉.

I am reading a file with ReadAllText:

    // Requires: using System.IO;
    // Read the whole file and split it into fields on ';'.
    String[] values = File.ReadAllText(@"c:\c\file.txt").Split(';');
    int i = 0;
    foreach (String s in values)
    {
        System.Console.WriteLine("output: {0} {1} ", i, s);
        i++;
    }

If I try to read some files, I sometimes get the wrong characters back (for ÖÜÄÀ...). The output shows '?' instead, because of a problem with the encoding:

output: 0 TEST
output: 1 A??O?

One solution would be to set the encoding in ReadAllText, for example ReadAllText(@"c:\c\file.txt", Encoding.UTF8), which could fix the problem. But what if I still get '?' as output? What if I don't know the encoding of the file? And what if every single file has a different encoding? What would be the best way to handle this in C#? Thank you.

Recommended answer

The only way to do this reliably is to look for a byte order mark (BOM) at the start of the text file. (These few bytes primarily indicate the endianness of the character encoding, but they also identify the encoding itself, e.g. UTF-8, UTF-16, UTF-32.) Unfortunately, this method only works for Unicode-based encodings, not for the older encodings that preceded them (for which much less reliable methods must be used).
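To make the byte order mark check concrete, here is a minimal sketch that inspects the leading bytes by hand. The class name `BomSniffer` and the method `DetectFromBom` are hypothetical names chosen for illustration; the byte sequences are the standard BOMs for UTF-8, UTF-16 and UTF-32. It returns null when no BOM is present, which is exactly the case where this technique cannot help.

```csharp
using System;
using System.Text;

public static class BomSniffer
{
    // Hypothetical helper: returns the encoding indicated by a byte order mark
    // at the start of 'b', or null if no known BOM is found.
    public static Encoding DetectFromBom(byte[] b)
    {
        // Check the 4-byte UTF-32 marks first: the UTF-32 LE mark (FF FE 00 00)
        // begins with the UTF-16 LE mark (FF FE).
        if (b.Length >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00)
            return new UTF32Encoding(bigEndian: false, byteOrderMark: true);
        if (b.Length >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF)
            return new UTF32Encoding(bigEndian: true, byteOrderMark: true);
        if (b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
            return new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);
        if (b.Length >= 2 && b[0] == 0xFF && b[1] == 0xFE)
            return new UnicodeEncoding(bigEndian: false, byteOrderMark: true);
        if (b.Length >= 2 && b[0] == 0xFE && b[1] == 0xFF)
            return new UnicodeEncoding(bigEndian: true, byteOrderMark: true);
        return null; // No BOM: the encoding cannot be determined this way.
    }
}
```

For a small file, `BomSniffer.DetectFromBom(File.ReadAllBytes(path))` would be enough to try it out; only the first four bytes are actually examined.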

The StreamReader type supports detecting these marks to determine the encoding. You simply need to pass true for the corresponding constructor parameter, like this:

    new System.IO.StreamReader("path", true)

You can then check the value of streamReader.CurrentEncoding to determine the encoding used by the file. Read from the reader first, because detection only happens on the first read. Note, however, that if no byte order mark exists, CurrentEncoding falls back to the reader's default encoding (UTF-8 for this constructor overload, which is also what Encoding.Default returns on .NET Core and later), so it no longer tells you the file's actual encoding.
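As a sketch of how the StreamReader approach could replace the ReadAllText call from the question: the class `TextFileReader`, the method `ReadAllTextDetect`, and its `fallback` parameter are hypothetical names introduced here, not part of the original answer. The fallback encoding is only used when the file has no byte order mark, and choosing it remains a guess on the caller's side.

```csharp
using System;
using System.IO;
using System.Text;

public static class TextFileReader
{
    // Hypothetical helper: reads the whole file, letting StreamReader pick the
    // encoding from a byte order mark if one is present. 'fallback' is used
    // only when the file has no BOM.
    public static string ReadAllTextDetect(string path, Encoding fallback, out Encoding used)
    {
        using (var reader = new StreamReader(path, fallback, detectEncodingFromByteOrderMarks: true))
        {
            string text = reader.ReadToEnd();
            // Detection happens on the first read, so CurrentEncoding is only
            // meaningful after the content has been read.
            used = reader.CurrentEncoding;
            return text;
        }
    }
}
```

With this in place, the original loop could start from `TextFileReader.ReadAllTextDetect(@"c:\c\file.txt", Encoding.UTF8, out Encoding used).Split(';')`, and `used` can be logged to see which encoding was applied to each file.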

See also the CodeProject solution for detecting encodings.
