Forums

bad performance with sqlite

It appears to me that the PA system is not optimised for using sqlite as the webapp database. The performance is very bad. Is it just me or do others have the same experience? Are MySQL databases faster (not that I would change my application)?

SQLite is generally pretty slow in a networked environment -- and I think that in general, MySQL is always going to be faster. I'd definitely recommend using MySQL for any production web application, regardless of the platform. As far as I know, SQLite is mostly used for throwaway apps that you're not planning to use for any length of time, or as an embedded database in client-side software.

By "networked environment", you mean accessing the file over a network filesystem? If so, then yes, sqlite has problems with that. But we're not talking about something like that here. For me, I like the simplicity of sqlite, and the fact that it is included in the standard python library. Also, my webapps do not expect lots of traffic so sqlite should work fine. But it seems that the PA system has very slow disk access on the webapp filesystem.

you mean accessing the file over a network filesystem? If so, then yes, sqlite has problems with that. But we're not talking about something like that here.

Yes we are -- we store your files on a remote fileserver, to enable different consoles, web servers, scheduled task servers to all access the same files.

So when a webapp accesses a file that is on the same filesystem as the webapp files, does it go through the network?

Anything in /home, /var/www, /var/log and /tmp is on a networked filesystem.

Moreover, we ourselves are running in a virtualised environment, on AWS, using EBS storage, which is a networked filesystem. So, ultimately, all the other filesytem access is networked too...

One day we'll be able to buy some of our own servers with shiny SSDs! But for now, it's networks all the way down...

HP

So there is no place that I can move my sqlite databases to so that when the webapp needs them it won't have to go through the network?

That's correct. The only thing I can think of is that you could make it an in-memory database. But then it would only be available to that one web worker, and only for its lifetime. But maybe you could write some code to sync it back to disk every so often!

Or, try using MySQL. Let me know if you need any help migrating data -- if you're using Django, a manage.py dumpdata would do it. If it's something else, we'd probably have to figure out some alternative kind of serialisation... there's some tips here: http://stackoverflow.com/questions/18671/quick-easy-way-to-migrate-sqlite3-to-mysql

Okay thanks but I don't really feel like converting everything to MySQL, too much trouble.

Still, it would be nice if PA had a way of providing better performance for sqlite, maybe by moving the database files to a special directory for which non-network access could be guaranteed. Contrary to what many believe, it is possible to run websites of considerable size using sqlite.

That would be possible -- the problem is that apps run that way could only run on one physical machine. Right now we tend to move them around to maximise machine usage, and we want to start splitting them between machines for reliability in the medium term too.

I love the idea of a tiered SQLite in memory DB and sync writes to another on disk SQLite DB in the background approach. May not show up on any recommended design lists, but for some reason it sounds like a fun experiment.

Can you elaborate?

I think a2j is talking about my suggestion to use an in-memory sqlite, with some kind of process that periodically dumps it to disk for archiving. As long as you're prepared to handle the occasional bit of data loss, it might work well...

Hi, out of interest has the performance of SQlite on PythonAnywhere improved at all over the last 3+ years?

We don't really track the relative performance of sqlite over time, but there have been substantial changes to our infrastructure over the past few years, so it may have been improved. However, the main issue is that our file infrastructure is, by necessity, network-based, and that hasn't changed.

I posted a nasty-but-working solution for a small, read-only Sqlite database in that massively speeds up my Django app.

Here: https://www.pythonanywhere.com/forums/topic/12281/#id_post_46808

[edit by admin: make link clickable]