python.su forum
In response to a GET request, the server sends me a page whose source contains data in JSON format that I need to parse out.
Which Grab tool is the most convenient for this?
Here is part of the page source:
…………
{"phoenician_salesman_hint":false,"militia_hint":false,"player_protection_finished_hint":true,"build_time_reduction_hint":false,"age_of_wonder_hint":false,"display_new_year_hint":false,"world_ends_hint":false,"grepolympia_hint":false};
Layout.militia_hint_shown = false;
Layout.show_confirmation_popup = true;
Layout.displayServerTime();
ITowns.initialize({"groups":null,"towns":[{"id":694,"name":"55 Seganapa-7 F","island_x":537,"island_y":516,"plenty":"iron","rare":"wood","has_conqueror":false,"researches":{"berth":true,"conscription":true,"mathematics":true},"favor":500},{"id":57457,"name":"55 \u0421\u043f\u0430\u0440\u0442\u0430","island_x":537,"island_y":516,"plenty":"iron","rare":"wood","has_conqueror":false,"researches":{"berth":true,"conscription":true,"mathematics":true},"favor":500}],"tmpl":"<div class=\"box top left\">\n\t<div class=\"box top right\">\n\t\t<div class=\"box top center\"><\/div>\n\t<\/div>\n<\/div>\n<div class=\"box middle left\">\n\t<div class=\"box middle right\">\n\t\t<div class=\"
……………
Edited by Seganapa (Aug. 23, 2012 08:24:39)
I found one way:
f = g.rex_text('towns":(.+),"tmpl')
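For reference, the same extraction can be sketched with only the standard library: capture the text with `re`, then hand it to `json.loads`. The `page` string below is a shortened, made-up stand-in for the real response, and `extract_towns` is a hypothetical helper name, not part of Grab.

```python
import json
import re

# Shortened stand-in for the page body (assumption: the real page has the same shape).
page = (
    'Layout.displayServerTime();\n'
    'ITowns.initialize({"groups":null,"towns":[{"id":694,"name":"55 Seganapa-7 F"},'
    '{"id":57457,"name":"55 \\u0421\\u043f\\u0430\\u0440\\u0442\\u0430"}],'
    '"tmpl":"<div class=\\"box top left\\"><\\/div>"});'
)


def extract_towns(html):
    """Return the "towns" list as Python objects, or None if the pattern is absent."""
    # Non-greedy match so the capture stops at the first ,"tmpl" that follows.
    match = re.search(r'"towns":(.+?),"tmpl"', html, re.S)
    if match is None:
        return None
    return json.loads(match.group(1))


towns = extract_towns(page)
print([town["id"] for town in towns])
```

Returning None instead of raising makes it easy to tell "the page did not contain the block" apart from "the block was there but was not valid JSON", which would raise inside `json.loads`.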
Edited by Seganapa (Aug. 23, 2012 09:36:37)
Have you considered using the json module?
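One way to lean on the json module more heavily, sketched under the assumption that the page contains the literal text `ITowns.initialize(` immediately followed by a valid JSON object: `json.JSONDecoder.raw_decode` parses a single JSON value starting at a given index and ignores whatever comes after it, so nothing depends on which key happens to follow `towns`. The helper name `decode_after` and the sample string are made up for illustration.

```python
import json


def decode_after(text, marker):
    """Decode the JSON value that starts right after `marker` in `text`."""
    start = text.find(marker)
    if start == -1:
        raise ValueError("marker not found: %r" % marker)
    # raw_decode returns (object, end_index) and tolerates trailing non-JSON text.
    obj, _end = json.JSONDecoder().raw_decode(text, start + len(marker))
    return obj


# Made-up sample shaped like the fragment quoted in the question.
sample = (
    'Layout.displayServerTime();\n'
    'ITowns.initialize({"groups":null,"towns":[{"id":694,"favor":500}]});\n'
)
data = decode_after(sample, "ITowns.initialize(")
print(data["towns"][0]["id"])
```

With Grab, the `text` argument would be the response body; how to obtain it depends on the Grab version, so that part is left out here.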
I do use the json module, after I get this data…
How can this be??? Here is a code fragment:
for k in city:
    g.setup(headers={'X-Requested-With': 'XMLHttpRequest'}, referer='http://ru8.grepolis.com/game/index?login=1')
    g.go('http://ru8.grepolis.com/game/index?action=switch_town&town_id=%s&h=%s&json={town_id:"%s","nlreq_id":%s}&_=%s' % (k['id'], token, k['id'], nlreq, timef))
    resource = g.rex_text('resources":(.*),"storage')
    print resource
pydev debugger: starting
{"wood":25500,"stone":25500,"iron":25500}
{"wood":25500,"stone":19244,"iron":25500}
{"wood":25500,"stone":25500,"iron":25500}
{"wood":25500,"stone":25500,"iron":25500}
{"wood":25500,"stone":25500,"iron":25500}
{"wood":25500,"stone":23435,"iron":25500}
{"wood":25500,"stone":23435,"iron":25500}
{"wood":25500,"stone":25500,"iron":25500}
{"wood":25500,"stone":25500,"iron":25500}
Traceback (most recent call last):
File "C:\Python\eclipse-SDK-4.2-win32\eclipse\plugins\org.python.pydev_2.6.0.2012062818\pysrc\pydevd.py", line 1392, in <module>
debugger.run(setup['file'], None, None)
File "C:\Python\eclipse-SDK-4.2-win32\eclipse\plugins\org.python.pydev_2.6.0.2012062818\pysrc\pydevd.py", line 1085, in run
pydev_imports.execfile(file, globals, locals) #execute the script
File "C:\Users\Noutbook\workspace\proba\Seganapa3RU8.py", line 106, in <module>
resource = g.rex_text('resources":(.*),"storage')
File "C:\Python27\Lib\site-packages\grab-0.4.5-py2.7.egg\grab\ext\rex.py", line 28, in rex_text
raise DataNotFound('Regexp not found')
grab.error.DataNotFound: Regexp not found
pydev debugger: starting
{"wood":25500,"stone":25500,"iron":25500}
{"wood":25500,"stone":19586,"iron":25500}
Traceback (most recent call last):
File "C:\Python\eclipse-SDK-4.2-win32\eclipse\plugins\org.python.pydev_2.6.0.2012062818\pysrc\pydevd.py", line 1392, in <module>
debugger.run(setup['file'], None, None)
File "C:\Python\eclipse-SDK-4.2-win32\eclipse\plugins\org.python.pydev_2.6.0.2012062818\pysrc\pydevd.py", line 1085, in run
pydev_imports.execfile(file, globals, locals) #execute the script
File "C:\Users\Noutbook\workspace\proba\Seganapa3RU8.py", line 106, in <module>
resource = g.rex_text('resources":(.*),"storage')
File "C:\Python27\Lib\site-packages\grab-0.4.5-py2.7.egg\grab\ext\rex.py", line 28, in rex_text
raise DataNotFound('Regexp not found')
grab.error.DataNotFound: Regexp not found
Edited by Seganapa (Aug. 23, 2012 13:15:28)
Using log_dir, I found out that when several requests are sent in a row, at some point an empty page comes back… What could this be?
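If the empty response is intermittent (an assumption: the cause is not established here, and it could just as well be rate limiting or an expired session on the server side), a small retry wrapper with a pause between attempts is one way to keep the loop from dying on the first bad page. In this sketch `fetch` is a hypothetical zero-argument callable that performs the request and returns the body as text; the function name, attempt count, delay, and the sample response are all made up.

```python
import re
import time


def fetch_resources(fetch, attempts=3, delay=1.0):
    """Call fetch() until the resources block is found or attempts run out.

    fetch is any zero-argument callable returning the response body as text
    (hypothetical; for example a wrapper around the g.go(...) call).
    """
    pattern = re.compile(r'"resources":(.*?),"storage"', re.S)
    for attempt in range(attempts):
        body = fetch()
        match = pattern.search(body or "")
        if match:
            return match.group(1)
        # Empty or unexpected page: wait a little longer before each retry.
        if attempt < attempts - 1:
            time.sleep(delay * (attempt + 1))
    return None


# Fake fetcher: the first call returns an empty page, the second a valid one.
responses = iter([
    "",
    '{"resources":{"wood":25500,"stone":19244,"iron":25500},"storage":1}',
])
result = fetch_resources(lambda: next(responses), delay=0)
print(result)
```

Returning None after the last attempt leaves it to the caller to decide whether to skip that town or stop the run.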
Edited by Seganapa (Aug. 23, 2012 14:06:08)
Seganapa
Ask the admin of the site you are parsing …