Flask w/ Selenium

Hi guys, I have a problem with running selenium on PA.

This is my code, which works completely fine on my local machine, but here on PA I don't get any error, any content, nothing. I am using selenium 4.1.3, just like you mentioned in other posts. I tried the demo from the "Getting started with selenium" example with google.com, which gets me the title, but for any other website where I try to extract a paragraph it just won't work.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def get_extracted_content(url):
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--disable-gpu")
    driver = webdriver.Chrome(options=chrome_options)

    try:
        # Navigate to the URL
        driver.get(url)

        # Wait for the page to load
        timeout = 10
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.TAG_NAME, "p")))

        # Extract relevant content from the page
        content = ''
        # find_elements_by_css_selector is deprecated in Selenium 4; use find_elements with By
        for element in driver.find_elements(By.CSS_SELECTOR, 'h1, h2, h3, h4, h5, h6, p, span'):
            content += element.text + ' '
    except Exception:
        content = ''
    finally:
        # Close the driver
        driver.quit()

    # Return or raise outside the finally block so it cannot mask other exceptions
    if len(content):
        return ' '.join(content.split())
    else:
        raise Exception("Connection to website could not be made.")

If you're using a free account on PythonAnywhere, you can only access external websites if they are on our allow-list of official public APIs; this is to protect other sites on the Internet from being attacked by hackers using our service as a way of anonymising themselves.

Paid accounts (because we can tie them to a real person via the payment details) have unrestricted Internet access.
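
One way to see whether the allow-list restriction is what you're hitting is to request the page with plain Python before involving Selenium. The sketch below is only an illustration: `check_reachable` and its `opener` parameter are hypothetical names, and it assumes a blocked request fails with an HTTP error (such as 403) or a connection error rather than returning an empty page.

```python
import urllib.error
import urllib.request


def check_reachable(url, timeout=10, opener=urllib.request.urlopen):
    """Return (ok, detail) describing whether `url` could be fetched.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server (or a proxy in between) answered with an error status.
        return False, f"HTTP {exc.code} {exc.reason}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return False, f"connection failed: {exc}"
```

Calling `check_reachable("https://www.google.com")` and then the same function with the site you want to scrape, from a console on the same account, should show whether the two are treated differently. If the second call reports an error while the first succeeds, the empty result from Selenium is probably the access restriction and not a problem in the scraping code.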