Potential problem with PA servers: Temporary failure in name resolution

Hi PA team,

Is everything OK on your side? As of 20 minutes ago, multiple processes have started failing to connect to any external APIs, and I am receiving messages like:

httplib2.ServerNotFoundError: Unable to find the server at www.googleapis.com

Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f297eec5610>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

This is happening for multiple different external APIs, which suggests the problem might be on our side.

Let me know if you can help.

I'm seeing the following frequent but intermittent errors starting at 5:14am US-MST. This is an always-on task that's been unchanged for years. For error #2, I have no problems connecting to the remote server from my desktop.

  1. "Can't connect to MySQL server on '#.mysql.pythonanywhere-services.com:3306' (-2 Name or service not known)"
  2. "HTTPSConnectionPool(host='##', port=443): Max retries exceeded with url: /### (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x####>: Failed to establish a new connection: [Errno -2] Name or service not known',))"

Is this ongoing right now, and if so, for which task or tasks? We've been investigating after a report of a problem like this on Twitter, and it does look like one of the always-on task servers had a brief network issue (which would manifest as a name resolution problem, because any outbound network connection will start off by doing a DNS lookup to determine the IP address of the destination server), but that appears to have cleared at 15:28:36 UTC. We'll be looking into why our systems didn't page us about the issue, and of course what the underlying problem might be.
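
For anyone who wants to check whether the failure is in the DNS lookup step rather than in the connection itself, here is a minimal sketch that runs only the lookup. The helper name `can_resolve` is made up for illustration and is not part of any PythonAnywhere tooling; it just wraps the standard library's `socket.getaddrinfo`.

```python
import socket


def can_resolve(hostname, port=443):
    """Run only the DNS lookup step for hostname.

    Returns (True, sorted list of IP addresses) on success, or
    (False, error text) if name resolution fails.
    can_resolve is a hypothetical helper name, not an existing API.
    """
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # Errno -3 is "Temporary failure in name resolution" and
        # -2 is "Name or service not known" on Linux.
        return False, f"[Errno {exc.errno}] {exc.strerror}"
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses


if __name__ == "__main__":
    print(can_resolve("www.googleapis.com"))
```

If this returns `False` from inside a task but `True` from a console at the same moment, that would be consistent with the problem being specific to the task servers.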

Giles, yes ongoing. I should only have one task running - that's the one.

I can confirm the issues are ongoing on our end, also for all of my tasks. Based on the logs it appears to be intermittent (occasionally the task runs fine), but it's definitely still ongoing.

I'm experiencing similar problems right now. It seems only my tasks are affected; the same code runs fine when I run it from Files.

Same here, I am facing similar problems too. Only from tasks.

I can confirm that as of 6:15pm UTC (10 mins ago), my tasks are now all working normally. Thank you for fixing this on a Sunday, much appreciated.

Thanks, everyone. We're investigating. The initial indications pointed to an issue with one always-on server, which we replaced, but we're still seeing issues there after that, and there are very occasional network glitches on some of the others, so it's clearly a higher-level problem.
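
While glitches like this are intermittent, one possible stopgap in a task is to retry the name lookup with a short exponential backoff before giving up. The sketch below is only an illustration, not an official recommendation: `resolve_with_retry` and its parameters are hypothetical names, and the retry counts and delays are arbitrary assumptions to tune for your own task.

```python
import socket
import time


def resolve_with_retry(hostname, port=443, attempts=4, base_delay=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Retry a DNS lookup when it fails with a name resolution error.

    Waits base_delay, then 2x, then 4x, ... seconds between attempts.
    resolver and sleep are injectable so the function can be tested
    without real network access. All names here are hypothetical.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return resolver(hostname, port)
        except socket.gaierror as exc:
            last_error = exc
            # Do not sleep after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

This only papers over short outages; if every attempt fails, the original `socket.gaierror` is re-raised so the task's normal error handling and logging still see it.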