Selenium File Downloads not working on PythonAnywhere

Hi,

I have a Selenium-based script that works perfectly on my local machine and on another cloud provider's service. Unfortunately, when I run it on PythonAnywhere everything works up until the moment a file should be downloaded. No error is raised; the file simply never appears in the intended directory. I've taken screenshots and confirmed the automation is occurring as expected, and I've tried various settings to work around the issue. My setup is as follows:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

downloadpath = '/home/exampledirectory/downloads/'
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--safebrowsing-disable-download-protection")
chrome_options.add_argument("window-size=1440,900")
chrome_options.add_argument('--disable-software-rasterizer')
chrome_options.add_argument('--disable-dev-shm-usage')

chrome_options.add_experimental_option("prefs", {
    "download.default_directory": downloadpath,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing_for_trusted_sources_enabled": False,
    "safebrowsing.enabled": False
})

browser = webdriver.Chrome(options=chrome_options)

Any ideas why this setup works elsewhere but not on PythonAnywhere?

Try adding code to take a snapshot of the page at the point where you expect the download to happen; perhaps the site will tell you what the problem is there.

browser.get_screenshot_as_file(filename)

...where filename is something like "mysnapshot.png".

If you read my post you will see I've already done that

Are you taking the screenshot repeatedly during the download? If not, perhaps there's an error in the page at some point that you're not seeing.
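For example, something like this (a minimal sketch; the helper name and filename pattern are illustrative, not from the thread) takes a screenshot every few seconds while you wait, so a late error page still gets captured:

```python
import time

def snapshot_series(browser, count=5, interval=2, prefix="snap"):
    """Take `count` screenshots, `interval` seconds apart,
    saved as snap_0.png, snap_1.png, etc."""
    names = []
    for i in range(count):
        name = f"{prefix}_{i}.png"
        browser.get_screenshot_as_file(name)
        names.append(name)
        time.sleep(interval)
    return names
```

You'd call it right after clicking the download link, then inspect the series of images to see whether the page changes unexpectedly.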

Yes, I've taken screenshots at various points throughout the process, in loops.

I will remind you that the script works perfectly on local machine and another cloud machine.

One other possibility: your script might be ending (or calling quit() on the webdriver) before the download has completed. Have you made sure that you're waiting long enough for the download to complete, or that you have some code that confirms the download has finished before continuing?
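A common way to check that (a sketch; the helper name and timeout values are illustrative, not from the thread) is to poll the download directory until the expected file exists and Chrome's partial-download marker (*.crdownload) is gone:

```python
import os
import time

def wait_for_download(directory, filename, timeout=120, poll=1):
    """Poll `directory` until `filename` exists and no Chrome
    .crdownload partial files remain, or until `timeout` seconds pass."""
    deadline = time.time() + timeout
    target = os.path.join(directory, filename)
    while time.time() < deadline:
        partials = [f for f in os.listdir(directory)
                    if f.endswith(".crdownload")]
        if os.path.exists(target) and not partials:
            return True
        time.sleep(poll)
    return False
```

Calling this instead of a fixed sleep also tells you definitively whether the file ever arrived.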

I've tried long sleeps (several minutes) with no success either; the download takes seconds on my local machine. Here is the smallest example script I can make that runs locally but not on PythonAnywhere:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By

downloadpath = '/home/tsgamingllc/tsgamingdjango/project-python-django-webapp-master/downloads/'

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--safebrowsing-disable-download-protection")
chrome_options.add_argument("window-size=1440,900")

chrome_options.add_experimental_option("prefs", {
        "download.default_directory": downloadpath
})

browser = webdriver.Chrome(options=chrome_options)

try:
    browser.get('https://fastest.fish/test-files')
    print(browser.title)
    # find_element_by_xpath was removed in Selenium 4.3; use find_element instead
    node = browser.find_element(By.XPATH, "//a[@href='https://www.dundeecity.gov.uk/sites/default/files/publications/civic_renewal_forms.zip']")
    node.click()
    time.sleep(100)

finally:
    browser.quit()

I tried your code (with a different downloadpath, obviously) and it seemed to work fine -- the civic_renewal_forms.zip file downloaded almost immediately. Maybe add an except clause and print the exception; as it stands, if anything goes wrong you wouldn't know about it.

Which version of selenium and python are you using?

I'm on fishnchips btw

Right, fair point! Should've mentioned that I was testing on the most recent system image ("haggis") with Python 3.10 and selenium 4.1.5 initially, then upgraded to the most recent version, 4.5.0.

On "fishnchips" (recommended selenium 4.1.3, Python 3.8) I got this error:

Traceback (most recent call last):
  File "tsgamingllc_test.py", line 24, in <module>
    node.click()
AttributeError: 'dict' object has no attribute 'click'

Hi, did you manage to fix this? I'm not sure why my Selenium script only fails on PythonAnywhere. I'm on the Glastonbury system image. The code never runs to completion, but it works perfectly on my local machine. I've moved from Python 3.6 to 3.9, and from Selenium 3.1 (which was in my previous virtual environment) to 4.8.2 -- same problem.

this line particularly fails

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

paginators = WebDriverWait(driver, timeout=100).until(
    EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.fl"))
)

Then I tried this too

paginators = driver.find_elements(By.CLASS_NAME, "fl")

Neither worked on PythonAnywhere; paginators comes back as an empty list.

BTW, the code above attempts to fetch the paginator links from a Google SERP; it works perfectly locally.

I've tried multiple wait times (minutes) and it still fails.

It sounds to me like you might be getting a different page when you run your code on PythonAnywhere from the one you get locally; some sites send back different results depending on the IP address of the requesting browser. The best way to debug that is to use driver.get_screenshot_as_file(filename) to get a screenshot and see what the page looks like.
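One way to compare the two environments (a sketch; the helper name is mine, not from the thread) is to save both a screenshot and the raw HTML at the point of failure, then diff the HTML dumps from the local and PythonAnywhere runs:

```python
def dump_page(driver, prefix):
    """Save a screenshot and the raw HTML so local and
    remote runs of the same script can be diffed."""
    driver.get_screenshot_as_file(f"{prefix}.png")
    with open(f"{prefix}.html", "w", encoding="utf-8") as f:
        f.write(driver.page_source)
    return [f"{prefix}.png", f"{prefix}.html"]
```

A screenshot can look fine while the HTML differs in ways that matter to your selectors (different class names, consent walls in hidden elements, etc.), so diffing page_source is often more revealing.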

No, this is not the case. I took screenshots at different stages of the execution and everything is in order. This is exactly the same issue @tsgamingllc reported. The question is why it works locally and on other cloud providers but not on PythonAnywhere?

Does the screenshot of the page with the paginators show the paginators?

I'll chime in and say that ultimately I couldn't get it to work with my setup. I was using an older system image and couldn't upgrade to haggis because I have a bunch of legacy app dependencies I'm too lazy to migrate.

I did start a second account as a test, and the script seemed to work on haggis with the most up-to-date Selenium. So if you're in a position to upgrade your system image, it might be worth a try.

@Glenn, yes the screenshot shows the paginators. It shows a typical Google SERP, and that comes with paginators at the bottom of the page. @tsgamingllc many thanks for your suggestion, I'll try this out.