I'm trying to run a process that scrapes simple pages at scale, but I can't make it reliable. It sometimes works fine, but often, especially when I run it a few times in a row, it inexplicably fails even though nothing has changed.
Here's the error I get:
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: crashed
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/chromium-browser is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
And here's the code where it errors (at the driver = webdriver.Chrome step):
import os
from selenium import webdriver

def get_chromedriver():
    path = os.path.dirname(os.path.abspath(__file__))
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--disable-gpu")
    # Don't load images, to keep page fetches fast
    prefs = {"profile.managed_default_content_settings.images": 2}
    chrome_options.add_experimental_option("prefs", prefs)
    driver = webdriver.Chrome(
        os.path.join(path, 'chromedriver'),
        chrome_options=chrome_options)
    return driver
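For context, each scrape creates and releases its own driver, roughly like this (a simplified sketch, not my exact code):

def fetch(url):
    driver = get_chromedriver()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # quit() rather than close() so the chromium process itself exits;
        # otherwise stale browsers pile up between runs
        driver.quit()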
Any ideas? Anything I can adjust to make this reliable?
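From searching around, this DevToolsActivePort error seems to come up when Chrome can't start its sandbox or runs out of shared memory, so I'm considering adding something like the flags below. This is just a guess I haven't confirmed, and the per-run profile directory is my own idea for keeping consecutive runs from contending over one profile:

import tempfile

chrome_options.add_argument("--no-sandbox")             # commonly suggested for container/root environments
chrome_options.add_argument("--disable-dev-shm-usage")  # use /tmp for shared memory instead of a small /dev/shm
# Fresh profile per run so consecutive runs don't share one user-data-dir
chrome_options.add_argument(f"--user-data-dir={tempfile.mkdtemp()}")

Would that be the right direction, or is something else going on?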