Telegram chat of the Scrapy group

Is there a way to persist a session across two particular Requests?

I'm trying to replicate the code from the answer to this question:
https://stackoverflow.com/questions/38511444/python-download-files-from-google-drive-using-url

It downloads a file from a given Google Drive URL.

Our requirements involve Google Drive links. We can resolve the link for small files. For large files, however, Google sends us to a "no virus scan" warning page. The Stack Overflow thread helps resolve that.

But how can I achieve the same session persistence with Scrapy?
If anyone has encountered a similar scenario, any insights would be helpful. Thanks.

14:42 #7

i in Scrapy

You can perform the "login" in an upper-level callback; the session is then propagated to the callbacks that follow. Make sure you have not disabled cookies. By default, a spider uses a single cookie jar.
Example: https://docs.scrapy.org/en/latest/topics/request-response.html?highlight=login#using-formrequest-from-response-to-simulate-a-user-login

14:55 #8

Harsh in Scrapy

Thanks for the response.

Yes, cookies are enabled. The request is a GET. I'll check it.

14:57 #9

Harsh in Scrapy

I have replicated the code that gets the confirm token using scrapy.Request.

If the download request could be sent with the same session as the previous request, I think it would succeed.

15:00 #10

i in Scrapy

By "login" I mean that you can simply pass cookies and headers to the request. In some cases these can even be obtained from a short Selenium session (not recommended, though):

yield response.follow(self.start_urls[0], callback=self.after_cookies_inserted, cookies=cookies_selenium, headers=after_login_headers, dont_filter=True)

15:07 #11

Harsh in Scrapy

response.follow seems to chain requests. I'll try that. Thanks.

15:09 #12

Harsh in Scrapy

Yes, I can't use selenium.

15:09 #13

Andrey Rahmatullin in Scrapy

response.follow doesn't do anything magical btw

15:29 #14

Harsh in Scrapy

mp4.mp4 (88.28 KB)

16:34 #15

Harsh in Scrapy

Sorry if GIFs are not allowed.

16:34 #16

rink0 in Scrapy

Guys, can bs4 work with JSON?
I parsed out a <script></script> tag,
and inside that script there is JSON.
I tried calling .text and got AttributeError: 'NoneType' object has no attribute 'text',
and I couldn't find anything about JSON in the docs.

18:27 #17

Andrey Rahmatullin in Scrapy

So do you need the contents of the tag, or do you want it returned as a dict? It sounds like the former, and that doesn't require JSON support.

18:29 #18

rink0 in Scrapy

I need exactly the contents of the tag

18:29 #19

rink0 in Scrapy

but I like the second option too
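Editor's sketch covering both options discussed above, assuming a made-up HTML snippet and an `id="data"` attribute on the script tag. The `AttributeError` quoted earlier usually means `find()` returned `None` (the tag was not matched), so the sketch checks for that first. The tag contents come from `.string`; turning them into a dict is done by the standard `json` module rather than by bs4.

```python
import json

from bs4 import BeautifulSoup

# Made-up page, for illustration only.
html = """
<html><body>
<script type="application/json" id="data">{"title": "Example", "price": 10}</script>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
tag = soup.find("script", id="data")

if tag is None:
    # find() returns None when nothing matches; calling .text on it
    # is what raises the AttributeError.
    raise SystemExit("script tag not found")

raw = tag.string        # option 1: the contents of the tag, as text
data = json.loads(raw)  # option 2: the same contents parsed into a dict

print(raw)
print(data["title"])    # Example
```

If the script contains JavaScript around the JSON (for example, a variable assignment), the JSON part would have to be cut out of `raw` before calling `json.loads`.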

18:29 #20