KXmlParser throws "Unexpected token" exception at the start of RSS parsing

Shared by user 罗曼蒂克.

I'm trying to parse an RSS feed from Monster on Android v.17 using this URL:

http://rss.jobsearch.monster.com/rssquery.ashx?q=java

To get the content I'm using HttpURLConnection in the following fashion:

this.conn = (HttpURLConnection) url.openConnection();
this.conn.setConnectTimeout(5000);
this.conn.setReadTimeout(10000);
this.conn.setUseCaches(true);
conn.addRequestProperty("Content-Type", "text/xml; charset=utf-8");
is = new InputStreamReader(url.openStream());

What comes back is, as far as I can tell (and I validated it too), a legitimate RSS feed. The response headers are:

Cache-Control:private
Connection:Keep-Alive
Content-Encoding:gzip
Content-Length:5958
Content-Type:text/xml
Date:Wed, 06 Mar 2013 17:15:20 GMT
P3P:CP=CAO DSP COR CURa ADMa DEVa IVAo IVDo CONo HISa TELo PSAo PSDo DELa PUBi BUS LEG PHY ONL UNI PUR COM NAV INT DEM CNT STA HEA PRE GOV OTC
Server:Microsoft-IIS/7.5
Vary:Accept-Encoding
X-AspNet-Version:2.0.50727
X-Powered-By:ASP.NET

It starts like this (click the URL above if you want to see complete XML):

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>Monster Job Search Results java</title>
    <description>RSS Feed for Monster Job Search</description>
    <link>http://rss.jobsearch.monster.com/rssquery.ashx?q=java</link>

But when I attempt to parse it:

final XmlPullParser xpp = getPullParser();
xpp.setInput(is);
for (int type = xpp.getEventType(); type != XmlPullParser.END_DOCUMENT; type = xpp.next()) { /* parsing goes here */ }

The code immediately chokes on type = xpp.next() with the following exception:

03-06 09:27:27.796: E/AbsXmlResultParser(13363): org.xmlpull.v1.XmlPullParserException: 
   Unexpected token (position:TEXT @1:2 in java.io.InputStreamReader@414b4538) 

This really means it can't handle the second character on line 1, which should be the start of <?xml version="1.0" encoding="utf-8"?>

Here are the offending lines in KXmlParser.java (lines 425-426). The type == TEXT check evaluates to true:

if (depth == 0 && (type == ENTITY_REF || type == TEXT || type == CDSECT)) {
    throw new XmlPullParserException("Unexpected token", this, null);
}

Any help? I also tried setting XmlPullParser.FEATURE_PROCESS_DOCDECL to false on the parser, but that didn't help.

I did research this on the web and here and can't find anything that helps

Recommended answer

The reason you are getting the error is that the XML file doesn't actually start with <?xml version="1.0" encoding="utf-8"?>. It starts with three special bytes, EF BB BF, which are the UTF-8 byte order mark (BOM). The parser sees the decoded BOM as text before the root element, hence "Unexpected token" at depth 0.
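
To see this for yourself, a minimal sketch along the following lines can help. It is not part of the original answer: the class name BomCheck, its two helper methods, and the in-memory sample body are made up for illustration, and the sample stands in for the real feed. It checks whether a byte array begins with EF BB BF and shows which character a plain UTF-8 InputStreamReader hands to the parser first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class BomCheck {
    private static final Charset UTF_8 = Charset.forName("UTF-8");

    /** Returns true if data begins with the UTF-8 byte order mark EF BB BF. */
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** Decodes data as UTF-8 and returns the first char the reader yields, or -1 if empty. */
    static int firstDecodedChar(byte[] data) throws IOException {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), UTF_8)) {
            return reader.read();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the feed body: a BOM followed by the XML declaration.
        byte[] body = "\uFEFF<?xml version=\"1.0\" encoding=\"utf-8\"?><rss/>".getBytes(UTF_8);
        System.out.println(startsWithUtf8Bom(body));
        // The reader does not strip the BOM; it surfaces as the character U+FEFF.
        System.out.printf("U+%04X%n", firstDecodedChar(body));
    }
}
```

With the sample body, the first character the reader returns is U+FEFF rather than '<', which matches the parser complaining about text at the very start of line 1.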

InputStreamReader doesn't handle these bytes automatically, so you have to handle them manually. The simplest way to do it is to use BOMInputStream, available in the Apache Commons IO library:

this.conn = (HttpURLConnection) url.openConnection();
this.conn.setConnectTimeout(5000);
this.conn.setReadTimeout(10000);
this.conn.setUseCaches(true);
conn.addRequestProperty("Content-Type", "text/xml; charset=utf-8");
is = new InputStreamReader(new BOMInputStream(conn.getInputStream(), false, ByteOrderMark.UTF_8));  

I've checked the code above and it works well for me.
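
If adding Commons IO is not an option, the same idea can be sketched with the standard library alone. This is an alternative to the answer's approach, not what the answer tested: the class BomSkipper and the method skipUtf8Bom are hypothetical names, and the sketch only handles the UTF-8 BOM. It peeks at the first three bytes through a PushbackInputStream and pushes them back when they are not EF BB BF.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper class, for illustration only.
final class BomSkipper {
    /** Wraps in so that a leading UTF-8 BOM (EF BB BF), if present, is consumed. */
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until 3 bytes or EOF.
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean bom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!bom && n > 0) {
            // Not a BOM: return the bytes to the stream so nothing is lost.
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for conn.getInputStream(): BOM followed by the start of a document.
        byte[] body = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'r', 's', 's', '/', '>'};
        InputStream clean = skipUtf8Bom(new ByteArrayInputStream(body));
        System.out.println((char) clean.read());
    }
}
```

In the question's code this would presumably be used as xpp.setInput(BomSkipper.skipUtf8Bom(conn.getInputStream()), "UTF-8"), passing the stream and encoding to the parser directly instead of going through an InputStreamReader.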
