How can Python and JavaScript work together? (Python, JavaScript)

Shared by user 妃唇宛若帝王血

I am working on a Scrapy app to scrape some data from a web page.

However, some of the data is loaded via AJAX, so Python alone cannot execute the JavaScript needed to get it.

Is there any library that simulates the behavior of a browser?

Recommended answer

For that you'd have to use a full-blown JavaScript engine (like Google's V8 in Chrome) to get the real functionality of a browser and the way it interacts with the page. However, you could possibly get some information by looking up all the URLs in the page source and making a request to each one, hoping for some valid data. Overall, though, you're stuck without a full JavaScript engine.
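The "look up all the URLs and request each one" fallback could be sketched roughly as follows. This is only an illustration under some assumptions: the regular expression is a crude heuristic for quoted URLs, the AJAX endpoint is assumed to return JSON, and the names `extract_candidate_urls`, `fetch_text`, and `probe_for_json` are made up for this example rather than taken from Scrapy or any other library.

```python
import json
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Heuristic: quoted strings that start with http(s):// or with a slash.
URL_PATTERN = re.compile(r"""["']((?:https?://|/)[^"'\s<>]+)["']""")


def extract_candidate_urls(page_source, base_url):
    """Return the unique URLs quoted in the page source, in order of appearance."""
    urls = []
    for match in URL_PATTERN.finditer(page_source):
        url = urljoin(base_url, match.group(1))  # resolve relative paths
        if url not in urls:
            urls.append(url)
    return urls


def fetch_text(url):
    """Download a URL and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def probe_for_json(urls, fetch=fetch_text):
    """Request each URL and keep only the responses that parse as JSON."""
    found = {}
    for url in urls:
        try:
            found[url] = json.loads(fetch(url))
        except (OSError, ValueError):
            # Network failure, or the body was not JSON: skip this URL.
            continue
    return found
```

In a spider you would pass the downloaded page body and its URL to `extract_candidate_urls`, then inspect whatever `probe_for_json` returns. It only finds endpoints whose URLs appear literally in the source, so URLs that the script assembles at runtime are missed.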

Something like python-spidermonkey, a wrapper around Mozilla's JavaScript engine, might work. Using it could be rather complicated, but that depends on your specific application.

You'd basically have to build a browser, but it seems Python people have made that simpler. With PyWebkitGtk you'd get the DOM, and by using either python-spidermonkey (mentioned above) or PyV8 (mentioned by Duncan) you'd theoretically get the full functionality needed for a browser/web scraper.
