Forums

Non-blocking call to subprocess, but response still delayed... Please explain!

My server runs a Bottle application. One request (/start in following code) starts a subprocess with Popen. The call is non-blocking (other requests can be served during the execution of this subprocess), however, the response to the original request is only received when the subprocess is over (if no other request was received, if another request has been served, the response to the original request is never sent). I don't understand.

Here is my test code :

@route('/start', method=['GET'])
def start():
    dateStart = datetime.strftime(datetime.today(), "%Y-%m-%d %H:%M:%S")
    cmd = ['python2.7', 'testdate.py']
    process = subprocess.Popen(cmd)
    return {'success':True, 'dateStart':dateStart, 'dateStop':datetime.strftime(datetime.today(), "%Y-%m-%d %H:%M:%S")}

Result:

  1. testdate.py writes the current date every second 10 times in a file. I get : 13:58:04 13:58:05 .......... 13:58:13
  2. The response is received 10s after the request was sent with : {"dateStart": "13:58:03", "dateStop": "13:58:03", "success": true}
  3. So the response is prepared as soon as the subprocess is started but not sent???? Can anyone explain this to me? Thank you!

That's very odd. Is there anything in the server or the error logs? Especially in the case where another request is served and causes the original request to not get handled.

Thanks for answering so fast. There is nothing in the server log or error log. Though, I may not have a correct configuration. On the server log I can read :

  • Python threads support is disabled. You can enable it with --enable-threads

  • Operational MODE: single process

Hmm. One thing that may be relevant -- your free account only supports one worker process. So if one request is being processed, other incoming ones will be queued until it's done before they're served.

I don't think that explains what you're seeing, but thought it would be useful to make clear just in case it wasn't already!

So, originally you wrote:

if another request has been served, the response to the original request is never sent

Just to make sure I understand this correctly -- if you kick off a request to /start/ in one browser tab, it takes ten seconds to respond. If, however, you hit the web app (on another URL?) in the meantime, the request to /start doesn't respond.

What happens if you do those two hits in different browser tabs? What do you see in the one where you triggered the first request?

That was an interesting test : If the 2 requests are sent from 2 different tabs, the response to the first request is received once the process is finished. <br> My actual purpose is to launch a simulation (instead of testdate in my example), and be able to pause and resume it. <br>

  • /simulate starts the simulation with subprocess.Popen (the process is stored in a global variable)
  • /pause sends a SIGSTOP to the process
  • /resume sends a SIGCONT to the process.

If I call /simulate in tab1, and then /pause in tab 2, I immediately get the response from tab2, as tab1 is waiting. <br> Then, when I call /resume in tab 2, I also get an immediate response. <br> I finally get a response in tab1 when the process is over (simulation or testdate) <br> The global behavior is actually OK for me, I can deal with the late or missing response. I just don't understand why the response is not sent as the Popen call is non-blocking.<br> Thanks for your interest in my situation!

I assume that /simulate is pretty much the same as the /start view you were showing earlier, right?

Perhaps you could try putting a print to sys.stderr after the subprocess.Popen and keep an eye on your error logs?

I am using : process = subprocess.Popen(cmd, stdout=fout, stderr=subprocess.STDOUT) So I guess the errors should be in fout file, and there are none... As I told you, the application fulfills my needs as it is. I was just wondering why the response was delayed till the end of the subprocess execution (it is not when I call /resume). If you are interested, I can make a clean example illustrating my situation.

Sure, I understand that the system is OK for you right now -- just wondering about what's causing it myself! If you put a the print to stderr in the code that's calling the subprocess.Popen (rather than in the code that is being called by it) then it should wind up in the error log, so that might be interesting to see.

I am not sure that I understand what you would like me to do. My error log is not empty, I can see the unhandled exceptions in it whenever they occur but none for the sequence of interest.

Here is my code for the subprocess : testdate.py :

from datetime import datetime
dateLast = datetime.today()
n = 0
f = open('testdate.dat', 'w')
while (n<10):
    dateNow = datetime.today()
    delay = (dateNow - dateLast).seconds
    if (delay >= 1):
        dateNowString = datetime.strftime(dateNow, "%Y-%m-%d %H:%M:%S")
        dateLast = dateNow
        n += 1
        f.write(dateNowString+'\n')
f.close()

And the code of the test app :

from bottle import default_app, route, debug
import subprocess
import signal
from datetime import datetime

running_process = {'p':None}

@route('/start', method=['GET'])
def start():
    dateStart = datetime.strftime(datetime.today(), "%Y-%m-%d %H:%M:%S")
    cmd = ['python2.7', '/home/celinekessler/testdate.py']
    running_process['p'] = subprocess.Popen(cmd)
    return {'success':True, 'dateStart':dateStart, 'dateStop':datetime.strftime(datetime.today(), "%Y-%m-%d %H:%M:%S")}

@route('/pause', method=['GET'])
def pause():
    running_process['p'].send_signal(signal.SIGSTOP)
    return "Simulation paused"

@route('/resume', method=['OPTIONS', 'GET'])
def resume():
    running_process['p'].send_signal(signal.SIGCONT)
    return "Simulation resumed"

debug(True)
application = default_app()

if __name__ == "__main__":
    from paste import httpserver
    httpserver.serve(application, host="", port=None)

Hope it can help to understand what happens. Thank you again.

When you use Popen, the child process inherits the file descriptors from the parent, and that will include the web app connection the the WSGI server, so the process needs to run and finish before the connection is released and the server notices that the request is finished. Try passing close_fds=True to Popen and see if that helps.

Problem solved!!!! Thank you!