Cannot make external requests to two hosts my app requires

Hello!

I've just set up my web app (Django 3.1.7). It uses the requests module to get data from two external sources: https://microsoft.com/en-ca (it scrapes game store pages) and https://www.giantbomb.com/api (the official Giant Bomb API).

I've seen other posts here asking similar things, and it seems these hosts would need to be added to a whitelist?

Here is the full traceback for one of my functions that seeds the DB:

Traceback (most recent call last):
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 696, in urlopen
    self._prepare_proxy(conn)
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 964, in _prepare_proxy
    conn.connect()
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/urllib3/connection.py", line 366, in connect
    self._tunnel()
  File "/usr/lib/python3.8/http/client.py", line 898, in _tunnel
    raise OSError("Tunnel connection failed: %d %s" % (code,
OSError: Tunnel connection failed: 403 Forbidden
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.microsoft.com', port=443): Max retries exceeded with url: /en-ca/p/immortals-fenyx-rising/c07kjzrh0l7s?activetab=pivot:overviewtab (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/tromik/final/wishlist/xbox/management/commands/seed_games.py", line 21, in handle
    xbox_store_page = scrape_xbox_store_game_page(url)
  File "/home/tromik/final/wishlist/xbox/util.py", line 262, in scrape_xbox_store_game_page
    game_page = get_game_page(url)
  File "/home/tromik/final/wishlist/xbox/util.py", line 50, in get_game_page
    response = requests.get(url)
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/tromik/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/requests/adapters.py", line 510, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='www.microsoft.com', port=443): Max retries exceeded with url: /en-ca/p/immortals-fenyx-rising/c07kjzrh0l7s?activetab=pivot:overviewtab (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))

We can only whitelist sites that provide an official public API on the specific hostname that you're trying to access; I think it's unlikely that www.microsoft.com does that, but if it does, we can add it to the list if you give us a link to the docs for that API.
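
For anyone reading the traceback above: the "Tunnel connection failed: 403 Forbidden" message means the proxy refused to open a tunnel to the host, so the request never reached www.microsoft.com. Below is a minimal sketch of how a script could recognise that case and report it clearly. It assumes only the message text shown in the traceback, and `is_proxy_block` is a hypothetical helper name, not part of requests or urllib3.

```python
def is_proxy_block(exc: BaseException) -> bool:
    """Return True if exc, or any exception chained to it, reports a refused proxy tunnel.

    Hypothetical helper: it matches on the message text seen in the traceback,
    which is an assumption and could change between library versions.
    """
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against cycles in the exception chain
        if "Tunnel connection failed: 403" in str(current):
            return True
        # Follow explicit (raise ... from ...) and implicit chaining.
        current = current.__cause__ or current.__context__
    return False


# Simulate the chained failure from the traceback without touching the network.
try:
    try:
        raise OSError("Tunnel connection failed: 403 Forbidden")
    except OSError as inner:
        raise RuntimeError("Cannot connect to proxy.") from inner
except RuntimeError as outer:
    if is_proxy_block(outer):
        print("Blocked by the proxy: the host is probably not on the whitelist.")
```

In the real code, the same check could wrap the `requests.get(url)` call in `get_game_page`, so the seed command fails with a readable message instead of three nested tracebacks.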

Thanks for the info. Just to clarify, I'm not using a Microsoft API; I'm only requesting public web pages in order to scrape them, in case that changes anything. For example: https://www.microsoft.com/en-ca/p/wasteland-3-xbox-one/BQ9T0JF0D3L4?activetab=pivot:overviewtab

How about the Giant Bomb API? It requires a login but registration is free.

We can't whitelist the Microsoft domain, unfortunately.

The Giant Bomb API does sound like a possible candidate for the whitelist, though -- is the documentation publicly visible? If not, perhaps you could email us credentials (at support@pythonanywhere.com) that would allow us to see the docs?