
Tesseract is so slow on PythonAnywhere

My account is a Custom Account:

-------------------------------------------------------------------------------------------------------
All paid plans support custom domains for all of your web apps.
CPU time per day: 2000 seconds
Number of web apps: 1
Number of web workers: 2
Number of always-on tasks: 2
Disk space: 4 GB
Postgres server:
Postgres disk space: 1 GB
Price for this plan: $6.75 / month 
-----------------------------------------------------------------------------------------------------------

I use this code to detect digits in images:

import re
import time

from PIL import Image
from tesserocr import PyTessBaseAPI

# gray_images and _results are defined earlier (not shown)
with PyTessBaseAPI(path=r'/home/iamcaominhtien/dev/code/research/my-research-2022-2023/tools/tessdata') as api:
    for _img in gray_images:
        pil_image = Image.fromarray(_img)
        api.SetImage(pil_image)
        start_time = time.time()  # only the recognition call is timed
        text = ''.join(re.findall(r'\d+', api.GetUTF8Text()))
        print('get_cells_2-ocr: ', time.time() - start_time)
        _results.append(text)

I ran it both locally and on PythonAnywhere. It is very slow on PythonAnywhere and I don't know why:

Locally:

get_cells_2-ocr:  0.009741067886352539
get_cells_2-ocr:  0.007970333099365234
get_cells_2-ocr:  0.0022356510162353516
get_cells_2-ocr:  0.00189208984375
get_cells_2-ocr:  0.25850939750671387
get_cells_2-ocr:  0.16674566268920898
get_cells_2-ocr:  1.3970351219177246
get_cells_2-ocr:  0.06349921226501465
get_cells_2-ocr:  0.05964350700378418
get_cells_2-ocr:  0.05002236366271973
get_cells_2-ocr:  0.06497740745544434
get_cells_2-ocr:  0.018608570098876953

On PythonAnywhere:

2023-07-10 04:03:57 get_cells_2-ocr:  3.670355796813965
2023-07-10 04:04:01 get_cells_2-ocr:  3.228891134262085
2023-07-10 04:04:06 get_cells_2-ocr:  5.227738380432129
2023-07-10 04:04:13 get_cells_2-ocr:  7.223729372024536
2023-07-10 04:04:17 get_cells_2-ocr:  3.716923475265503
2023-07-10 04:04:22 get_cells_2-ocr:  4.74852442741394
2023-07-10 04:04:29 get_cells_2-ocr:  7.429743528366089
2023-07-10 04:04:32 get_cells_2-ocr:  2.571870803833008
2023-07-10 04:04:35 get_cells_2-ocr:  3.774812936782837
2023-07-10 04:04:40 get_cells_2-ocr:  4.6428773403167725

[edit by admin: formatting]


On PythonAnywhere you are sharing a server with other users, so in general code will run slower than it does locally. Additionally, if your code is able to use a GPU when it's running on your machine, it will run even faster there, as the PythonAnywhere servers do not have GPUs.
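One way to check whether the slowdown comes from sharing the server is to log CPU time next to wall-clock time for each OCR call. The sketch below is only an illustration using the standard library; the helper name `timed` and the `records` list are made up for this example and are not part of tesserocr or PythonAnywhere. If wall time is much larger than CPU time, the process was mostly waiting (for its share of the CPU or for I/O); if both are large, the work itself is expensive on that machine. Note that `time.process_time()` adds up CPU time across all threads in the process, so with a multithreaded library it can exceed wall time.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, records=None):
    """Print wall-clock and CPU time spent inside the with-block.

    Hypothetical helper for illustration; `records`, if given, collects
    (label, wall_seconds, cpu_seconds) tuples for later inspection.
    """
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time used by this process
    try:
        yield
    finally:
        wall = time.perf_counter() - wall_start
        cpu = time.process_time() - cpu_start
        if records is not None:
            records.append((label, wall, cpu))
        print(f"{label}: wall={wall:.4f}s cpu={cpu:.4f}s")


if __name__ == "__main__":
    # Sleeping uses almost no CPU, so wall time is far larger than CPU time.
    with timed("sleep"):
        time.sleep(0.2)
    # A busy loop uses CPU for nearly all of its wall time.
    with timed("busy"):
        sum(i * i for i in range(200_000))
```

In the loop from the question, the call `api.GetUTF8Text()` could be wrapped as `with timed('get_cells_2-ocr'): ...` in place of the manual `time.time()` arithmetic.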