
Mercurial throwing 'thread.error' on any network access, as of 2021-01-04

New today: many Mercurial commands (seemingly ones that access the network, e.g. incoming) die with this error:

thread.error: can't start new thread

Full traceback: https://dpaste.com/D7Q7L5F8W

Is this something I have the power to fix? I've changed nothing in my config since yesterday (when it worked fine).

I can't pull updates to my site with this broken.

Update: hopefully this means somebody is working on fixing this (if so, please confirm here!) but the error output is now:

abort: Resource temporarily unavailable

or

remote: /bin/sh: 1: Cannot fork

abort: no suitable response from remote hg!

Also: Not that I need to highlight this, but why is Python 2 being used at all (and not even the final patch level)?

Likewise the Mercurial version is over a year old -- current is 5.6.x

You can upgrade your hg with pip install mercurial -U --user

Thanks for the tip, but that doesn't work.

Fails with CalledProcessError: Command '('lsb_release', '-a')' returned non-zero exit status 1 (full output: https://dpaste.com/6BC8D8968)

I also tried with pip3 (which is at /usr/bin/pip3 on that machine) -- it makes a long attempt at installing but ultimately fails with: RuntimeError: can't start new thread

There's now an additional problem, or a worsening of whatever's causing the problem I started with: when I ssh to the box, either it fails with shell request failed on channel 0 or "succeeds" with a shell that is unusable (bash: fork: retry: No child processes).

Logging into a Bash console via pythonanywhere.com/user/paulbissex/consoles does succeed, and in that shell the Mercurial errors do not occur.

Just to clarify -- were you connecting over SSH when you saw the errors you mentioned above? If so, could you check and see if they're cleared now?

Yes, I normally connect via terminal SSH.

Errors are no longer occurring. Thanks for the fix. I've upgraded Mercurial as well.

By the way, Giles -- if it's possible, I'd really appreciate some explanation of what happened, inasmuch as you know. Being stopped dead by an opaque failure like this does temper my enthusiasm for the platform; transparency about the problem and the fix can mitigate that. A gesture of respect toward the paying customer.

Excellent, thanks for confirming that the problem is fixed now! It was quite an obscure one and I wasn't sure that what we did would have sorted it out, which is why I didn't speculate as to the cause in my last post.

It appears that the server that handles your SSH logins wasn't allowing you to create new processes. In-browser consoles are handled by a different set of servers, which is why you were able to run commands there. I'm not 100% sure why the SSH server was having that problem; there are two possibilities:

  • The server had been up and running for quite some time, and might have reached some resource limit that we do not track. That was my initial assumption yesterday, so I rebooted it, and that was what cleared the issue.
  • However, while thinking about this overnight, it occurred to me that it might have reached your own per-user process limit -- you can only have 128 simultaneous running processes (to prevent fork bombs). That would have the same visible behaviour to you, and might also explain why no-one else on the same server was reporting issues. Rebooting the server would have killed all of your processes, so that would explain the fix.
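For anyone wanting to see what process limit their own session reports, here is a minimal Python sketch. It assumes the cap is applied as an ordinary rlimit (RLIMIT_NPROC); if the host enforces the 128-process cap some other way, this will not show it. The function name describe_process_limit is made up for illustration.

```python
import resource


def describe_process_limit():
    """Return the soft and hard RLIMIT_NPROC values as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    def fmt(value):
        # RLIM_INFINITY means no rlimit is set at this level.
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return fmt(soft), fmt(hard)


soft, hard = describe_process_limit()
print(f"max user processes: soft={soft}, hard={hard}")
```

Hitting this kind of limit is what typically produces "Cannot fork", "Resource temporarily unavailable" and "can't start new thread" all at once, since on Linux threads count against the same limit as processes.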

If you get the same problem in the future, could you check whether you have lots of processes running? You can see processes started over SSH, along with those started from consoles, in the "Running processes" table on the "Consoles" page. Perhaps some command that you run is leaving processes around after it's exiting, leading to a slow resource leak.
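If the "Running processes" table is awkward to scan, the same check can be roughed out from a shell. This is only a sketch: it assumes a ps that accepts -u and -o comm= (the usual Linux procps one does), and the helper names count_by_command and my_process_counts are hypothetical. Since it has to start ps itself, it will fail in a session that has already hit the limit, so it is better run from a console that still works.

```python
import getpass
import subprocess
from collections import Counter


def count_by_command(ps_lines):
    """Count processes per command name from `ps -o comm=` output lines."""
    return Counter(line.strip() for line in ps_lines if line.strip())


def my_process_counts():
    """Run ps for the current user and tally processes by command name."""
    user = getpass.getuser()
    out = subprocess.run(
        ["ps", "-u", user, "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_by_command(out.splitlines())


# Example use: show the ten most common commands.
#   for command, n in my_process_counts().most_common(10):
#       print(f"{n:4d}  {command}")
```

A command that shows up dozens of times is the likely leak.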

To follow up on this, since I just had it happen again -- thanks for the note about "lots of processes". The culprit turned out to be scads of ssh-agent -s processes.

Nothing that I can see in my dotfiles is explicitly starting the ssh-agent. I do use ssh to connect to my project's repo. In any case I'm not sure why it would be left hanging -- but that's not a puzzle I have to solve today.
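In case it helps anyone else who lands here with the same pile-up, below is a rough sketch of clearing out stray agents. It assumes input lines in the form produced by ps -u USER -o pid=,comm= and that every process named ssh-agent other than the ones you pass in keep is safe to stop, which may not be true on your account. The names agent_pids and terminate are made up for this example.

```python
import os
import signal


def agent_pids(ps_lines, keep=()):
    """Extract PIDs of ssh-agent processes from `ps -o pid=,comm=` lines."""
    pids = []
    for line in ps_lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # not a "pid command" line
        pid_text, command = parts
        if command.strip() == "ssh-agent" and pid_text.isdigit():
            pid = int(pid_text)
            if pid not in keep:
                pids.append(pid)
    return pids


def terminate(pids):
    """Send SIGTERM to each PID, skipping ones that have already exited."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass
```

To spare the agent the current shell is using, pass its PID in keep; ssh-agent -s exports it as SSH_AGENT_PID, so something like keep={int(os.environ["SSH_AGENT_PID"])} works when that variable is set.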

oh nice! thanks for letting us know!