Hi, I googled and googled but couldn't come up with a solution, so I wanted to ask you for help. I think it's related to the Scrapy module that I am not allowed to pip uninstall.
Background information
I am running a Flask web app at kulturdata.pythonanywhere.com/audit where I use the great pip package "advertools" to run some web scraping tasks.
If I run the web scraping file on its own in my virtualenv, it works just fine. But when I import that same module from my Flask server.py file and try to run a scrape, it raises an error.
import flask
from flask import request, render_template, redirect, url_for

app = flask.Flask(__name__, instance_relative_config=True,
                  template_folder='../frontend', static_folder="../frontend")

@app.route('/audit', methods=['GET', 'POST'])
def audit():
    import advertools as adv
    adv.crawl("https://www.muenchenmusik.de", "try.jl")
The traceback says:
2021-05-03 11:40:34 2021-05-03 11:40:34 [scrapy.utils.log] INFO: Scrapy 1.8.0 started (bot: scrapybot)
2021-05-03 11:40:34 2021-05-03 11:40:34 [scrapy.utils.log] INFO: Versions: lxml 4.2.3.0, libxml2 2.9.8, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 19.7.0, Python 2.7.12 (default, Oct 8 2019, 14:14:10) - [GCC 5.4.0 20160609], pyOpenSSL 19.0.0 (OpenSSL 1.1.1d 10 Sep 2019), cryptography 2.8, Platform Linux-5.4.0-1029-aws-x86_64-with-Ubuntu-16.04-xenial
2021-05-03 11:40:34 Traceback (most recent call last):
2021-05-03 11:40:34   File "/usr/local/bin/scrapy", line 8, in <module>
2021-05-03 11:40:34     sys.exit(execute())
2021-05-03 11:40:34   File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 146, in execute
2021-05-03 11:40:34     _run_print_help(parser, _run_command, cmd, args, opts)
2021-05-03 11:40:34   File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 100, in _run_print_help
2021-05-03 11:40:34     func(*a, **kw)
2021-05-03 11:40:34   File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 154, in _run_command
2021-05-03 11:40:34     cmd.run(args, opts)
2021-05-03 11:40:34   File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/runspider.py", line 79, in run
2021-05-03 11:40:34     module = _import_file(filename)
2021-05-03 11:40:34   File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/runspider.py", line 21, in _import_file
2021-05-03 11:40:34     module = import_module(fname)
2021-05-03 11:40:34   File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
2021-05-03 11:40:34     __import__(name)
2021-05-03 11:40:34   File "/home/KulturData/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/advertools/spider.py", line 4
2021-05-03 11:40:34 SyntaxError: Non-ASCII character '\xf0' in file /home/KulturData/.virtualenvs/myvirtualenv/lib/python3.8/site-packages/advertools/spider.py on line 5, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
I already worked around this non-ASCII (emoji) problem, but it is not the underlying issue: the interpreter can't parse f-strings either (because it is Python 2.7, I think).
I think it's running the wrong version of Scrapy (/usr/local/bin/scrapy), although I have installed it via pip in my virtualenv.
I tried to uninstall it, but that wasn't allowed. I don't know why the wrong scrapy path is chosen only when I run it via the server file. This is the path in my virtualenv:
(myvirtualenv) 11:50 ~/website (main)$ which scrapy
/home/KulturData/.virtualenvs/myvirtualenv/bin/scrapy
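To compare this with what the web app process sees, a small check like the one below could be called from inside the Flask view. It is only a sketch: `describe_runtime` is a made-up helper name, and it assumes the `scrapy` executable is looked up through `PATH`, which is what the `/usr/local/bin/scrapy` entry in the traceback suggests.

```python
import os
import shutil
import sys


def describe_runtime(command="scrapy"):
    """Report the running interpreter and the executable that PATH resolves for `command`."""
    return {
        # Interpreter that is executing this code (should be the virtualenv's Python).
        "python": sys.executable,
        "version": sys.version.split()[0],
        # First match for `command` on PATH, or None if nothing is found.
        "resolved": shutil.which(command),
        # First few PATH entries, to see whether the virtualenv's bin directory comes first.
        "path_head": os.environ.get("PATH", "").split(os.pathsep)[:3],
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

If `resolved` differs between the virtualenv console and the web app, the two environments have different `PATH` values, even though both run the same Python 3.8 interpreter.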
And that is my WSGI configuration file:
import sys
path = '/home/KulturData/website'
if path not in sys.path:
    sys.path.append(path)
#
from backend.server import app as application
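One possible direction, sketched here and not verified on PythonAnywhere: if the `scrapy` command is resolved through `PATH`, putting the virtualenv's `bin` directory at the front of `PATH` before the app is imported might make the virtualenv's Scrapy win. `prepend_to_path` is a hypothetical helper, and the directory is the one from the `which scrapy` output above.

```python
import os


def prepend_to_path(bin_dir, environ=os.environ):
    """Move bin_dir to the front of PATH so its executables are found first."""
    # Drop empty entries and any existing copy of bin_dir to avoid duplicates.
    parts = [p for p in environ.get("PATH", "").split(os.pathsep) if p and p != bin_dir]
    environ["PATH"] = os.pathsep.join([bin_dir] + parts)
    return environ["PATH"]


# Assumption: this is the virtualenv's bin directory, as shown by `which scrapy`.
prepend_to_path("/home/KulturData/.virtualenvs/myvirtualenv/bin")
```

In the WSGI file this call would have to come before the `from backend.server import app as application` line, so that any subprocess started later inherits the changed `PATH`.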
I would appreciate any help you can give me.