Forums

importing custom module

I am trying to deploy my webapp, which takes a name and passes it to my Scrapy spider. The webapp itself loads, but when the spider is actually triggered, this error occurs:

2023-04-18 10:11:45   Traceback (most recent call last):
2023-04-18 10:11:45   File "/home/AZServices/.virtualenvs/venv/bin/scrapy", line 8, in <module>
2023-04-18 10:11:45     sys.exit(execute())
2023-04-18 10:11:45   File "/home/AZServices/.virtualenvs/venv/lib/python3.10/site-packages/scrapy/cmdline.py", line 125, in execute
2023-04-18 10:11:45     settings = get_project_settings()
2023-04-18 10:11:45   File "/home/AZServices/.virtualenvs/venv/lib/python3.10/site-packages/scrapy/utils/project.py", line 71, in get_project_settings
2023-04-18 10:11:45     settings.setmodule(settings_module_path, priority="project")
2023-04-18 10:11:45   File "/home/AZServices/.virtualenvs/venv/lib/python3.10/site-packages/scrapy/settings/__init__.py", line 316, in setmodule
2023-04-18 10:11:45     module = import_module(module)
2023-04-18 10:11:45   File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
2023-04-18 10:11:45     return _bootstrap._gcd_import(name[level:], package, level)
2023-04-18 10:11:45   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
2023-04-18 10:11:45   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
2023-04-18 10:11:45   File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
2023-04-18 10:11:45   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
2023-04-18 10:11:45   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
2023-04-18 10:11:45   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
2023-04-18 10:11:45   File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
2023-04-18 10:11:45 ModuleNotFoundError: No module named 'scraperra'

My project directory is as follows:

AZServices/
    .cache/
    .local/
    .virtualenvs/
    webapp_bullit/
        .git/
        .idea/
        mysite/
            mysite/
                settings.py
                wsgi.py
                morefiles...
            scraperra/
                spiders/
            website/
        dbsqlite3
        manage.py
        output.xlsx
        virt/
    .bashrc
    morefiles....

I haven't included every file since... well... that would take ages, but I think these are the relevant ones. In my WSGI file I have the following:

import os
import sys

path = '/home/AZServices/webapp_bullit/mysite'
if path not in sys.path:
    sys.path.insert(0, path)

os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'
os.environ["PATH"] = "/home/AZServices/.virtualenvs/venv/bin/" + os.pathsep + os.environ["PATH"]

The last line is there to fix an earlier error where it couldn't find the scrapy module.

Can anybody help me import this module 'scraperra'?

See our help page that helps you to understand how Python searches for the modules that you are trying to import: https://help.pythonanywhere.com/pages/DebuggingImportError/

I have read this multiple times, but I do not see how to solve my error using this information. I have tried the sections 'Can you run the wsgi file itself?', 'Can you run the files it's trying to import?' and 'Shadowing'. None of them turned up errors, weird paths, etc.

Here are some results I got when following the debugging page; maybe you can see if anything is going wrong:

First debugging tip:

  • print(scraperra.__file__)

  • /home/AZServices/webapp_bullit/mysite/scraperra/__init__.py

Second debugging tip:

  • print('\n'.join(sys.path))

  • /home/AZServices/webapp_bullit/mysite
  • /var/www
  • /usr/local/lib/python310.zip
  • /usr/local/lib/python3.10
  • /usr/local/lib/python3.10/lib-dynload
  • /usr/local/lib/python3.10/site-packages
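As a side note, a quick way to check whether Python can actually find a module from the current sys.path is importlib.util.find_spec, which returns None instead of raising, so it's handy for exactly this kind of debugging. A minimal sketch (the missing module name below is just illustrative):

```python
import importlib.util

# find_spec returns a ModuleSpec if the module is importable
# from the current sys.path, or None if it is not.
print(importlib.util.find_spec("json") is not None)  # True: stdlib module

# A name that does not exist anywhere on sys.path:
print(importlib.util.find_spec("scraperra_missing_demo") is None)  # True
```

Running this from a console in the same virtualenv (and same working directory) as the failing process shows whether 'scraperra' is reachable there.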

On Stackoverflow, someone advised the following:

You appear to be calling scrapy as an external program. You will need to configure the python path for that program independently of the web app itself.

I do not know how to implement this advice, but it does seem to be on the right track, as I call the scraper with

process = subprocess.Popen(['scrapy', 'crawl', spider_name, '-a', 'artist=%s' % artist])

Maybe this helps?
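That advice makes sense because of a general Python behaviour (not specific to this project): changes made to sys.path inside the web app's process are not inherited by a subprocess; only environment variables such as PYTHONPATH carry over. A minimal sketch:

```python
import subprocess
import sys

# Modify sys.path in the parent process only.
sys.path.insert(0, "/definitely/not/inherited")

# The child process builds its own sys.path from scratch
# (standard locations plus whatever PYTHONPATH contains).
out = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.path)"],
    capture_output=True, text=True,
)
print("/definitely/not/inherited" in out.stdout)  # False
```

So the scrapy child process never sees the path that the WSGI file added, which is why it cannot import 'scraperra' even though the web app can.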

That may be something that the scrapy forums can help with.

It works when I run the site on localhost, so I doubt the problem is Scrapy-related. As the error says, it struggles to find the module, and it's an error I only get when running on PythonAnywhere. Do you have any suggestions as to why it doesn't find the module?

Where is the scraperra module that you're trying to load actually located?

this sounds way more sassy than I mean it to, but does my project structure above not answer your question? Please let me know if I misunderstood your question!

No, you're totally right! I somehow missed that when reading through the previous posts, my apologies.

OK, so as you're using subprocess.Popen to run scrapy, the best way to make sure that it has the right stuff on the Python path is to set the PYTHONPATH environment variable appropriately. Note that passing env= replaces the child's entire environment, so merge it with the existing one to preserve PATH (and the fix in your WSGI file). So try this:

process = subprocess.Popen(['scrapy', 'crawl', spider_name, '-a', 'artist=%s' % artist], env={**os.environ, "PYTHONPATH": "/home/AZServices/webapp_bullit/mysite"})

...and see if that helps.
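To sanity-check that approach in isolation, here's a self-contained sketch (using a throwaway package name, not the real project) showing that a directory passed via PYTHONPATH in env= becomes importable in the child process; note the merge with os.environ so the rest of the environment survives:

```python
import os
import pathlib
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Create a throwaway package: <tmp>/demo_pkg/__init__.py
    pkg = pathlib.Path(d) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("VALUE = 42\n")

    # Merge PYTHONPATH into the existing environment rather than
    # replacing it, so PATH etc. are preserved for the child.
    env = {**os.environ, "PYTHONPATH": d}
    out = subprocess.run(
        [sys.executable, "-c", "import demo_pkg; print(demo_pkg.VALUE)"],
        env=env, capture_output=True, text=True,
    )
    print(out.stdout.strip())  # 42
```

The same pattern applied to the Popen call above should let the scrapy subprocess import 'scraperra' from /home/AZServices/webapp_bullit/mysite.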