USERNAME.pythonanywhere.com didn't send any data

I have a Django app that runs a Scrapy spider depending on a name given in the web app. I can see in the server logs that the spider starts properly, but after some time I get this error (in the browser, not in the logs):

This page isn't working

AZServices.pythonanywhere.com didn't send any data

ERR_EMPTY_RESPONSE

I read something about whitelisted sites and free accounts (I'm using a free one :)), could that be the problem? I am trying to scrape ra.co.

I don't think that would be the cause of this specific error (though with a free account you won't be able to scrape sites that aren't on the allowlist). It sounds like your code just isn't returning a response. What code do you have for this view?

This is my views.py

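import os
from pathlib import Path

from django.http import FileResponse
from django.shortcuts import render
from scrapy.crawler import CrawlerRunner
from scrapy.utils.project import get_project_settings
from twisted.internet import reactor

# PostsSpider comes from the project's spiders module (import not shown in the post)


def index(request):

    THIS_FOLDER = Path(__file__).parent.parent.resolve()
    print(f'THIS_FOLDER is: {THIS_FOLDER}')

    if request.method == "POST":

        artist = request.POST.get("artist")
        print(f'artist is: {artist}')
        try:
            # configure_logging({'LOG_FORMAT': '%(levelname)s: %(message)s'})
            crawler = CrawlerRunner(get_project_settings())
            q = crawler.crawl(PostsSpider, artist=artist)
            # Stop the reactor once the crawl finishes, whether it succeeds or fails
            q.addBoth(lambda _: reactor.stop())

            # Blocks until reactor.stop() is called
            reactor.run()

            # Download the XLSX file
            file_path = os.path.join(THIS_FOLDER, 'output.xlsx')
            print(f'file path is: {file_path}')
            response = FileResponse(open(file_path, 'rb'), as_attachment=True, filename='output.xlsx')
            print('hallo, ik ben zelfs hier')
            return response

        except Exception as e:
            print('Error', str(e))

    # GET requests, and POSTs that raised an exception, fall through to the form
    return render(request, 'index.html')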

Any help would be very much appreciated!! ; )

PS: Through some very advanced commenting-out of code and print statements, I am confident that the code gets stuck at reactor.run().

If that gets stuck for longer than 5 minutes, the process handling the request will be killed. If it does get past that call, you could check the contents of the response (around the print of 'hallo, ik ben zelfs hier') and see whether it is empty. You could also check whether file_path points to a real file with some contents.
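As a rough illustration of that last check, here is a minimal sketch of a helper that reports whether a path is a real file and how many bytes it holds. The name output_file_status is made up for this example; it is not part of Django or Scrapy.

```python
from pathlib import Path


def output_file_status(file_path):
    """Return (exists, size_in_bytes) for file_path.

    Hypothetical helper for debugging only: a missing path, or a path
    that is a directory, is reported as (False, 0).
    """
    path = Path(file_path)
    if not path.is_file():
        return (False, 0)
    return (True, path.stat().st_size)
```

In the view you could call print(output_file_status(file_path)) just before building the FileResponse. A result of (True, 0) would mean the file exists but is empty.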

Hi, it does not take 5 minutes, more like 30 seconds or so. I put a print statement right underneath reactor.run(), but it doesn't get triggered. How can I check the contents of the response if it is empty before reactor.run(), and the check won't get triggered if I put it after? I printed the file path and it leads to an existing file with contents in it. If I just comment out reactor.run(), it downloads the existing file (which should have been overwritten).

So it never gets to the response? 'hallo, ik ben zelfs hier' is not printed in the log? How can you tell that reactor.run() takes about 30 seconds, then?

My apologies for the haziness. No, it never gets to the print statement after reactor.run(). However, it did get to the print statement I put before reactor.run(). After that statement was printed, it took about 30 seconds for the browser to display the aforementioned error.

If that code starts any threads, it will not work in a web app. See https://help.pythonanywhere.com/pages/AsyncInWebApps/ for how to move that sort of code out of your web app.
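To give a rough idea of what moving the code out of the web app can look like, here is a minimal sketch of a job queue kept in SQLite. The view would only call enqueue() and return straight away, and a separate script running outside the web app (for example from a scheduled or always-on task) would call process_next() in a loop. Everything named here is an assumption for illustration: the jobs.sqlite3 path, the table layout, the helper names, and the spider name "posts" passed to the scrapy crawl command.

```python
import sqlite3
import subprocess

DB_PATH = "jobs.sqlite3"  # hypothetical location of the queue database


def connect(db_path=DB_PATH):
    """Open the queue database and create the jobs table if it is missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "artist TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()
    return conn


def enqueue(conn, artist):
    """Called from the Django view: record the request and return its job id."""
    cur = conn.execute("INSERT INTO jobs (artist) VALUES (?)", (artist,))
    conn.commit()
    return cur.lastrowid


def run_spider(artist):
    """Run the crawl in its own process, so the reactor never runs in the web app.

    Assumption: the spider is registered under the name "posts" and this is
    run from the Scrapy project directory.
    """
    subprocess.run(["scrapy", "crawl", "posts", "-a", f"artist={artist}"], check=True)


def process_next(conn, runner=run_spider):
    """Called from the worker script: run the oldest pending job, if any.

    Returns (job_id, status), or None when there is nothing to do.
    """
    row = conn.execute(
        "SELECT id, artist FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    job_id, artist = row
    conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (job_id,))
    conn.commit()
    try:
        runner(artist)
        status = "done"
    except Exception:
        status = "failed"
    conn.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
    conn.commit()
    return job_id, status
```

The web page can then look up the job's status on a later request and offer output.xlsx for download once the status is 'done', so no request ever has to wait for the crawl itself.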