gh-143198: fix some crashes in sqlite3.executemany with concurrent mutations#143210
gh-143198: fix some crashes in sqlite3.executemany with concurrent mutations#143210picnixz wants to merge 27 commits into
sqlite3.executemany with concurrent mutations#143210Conversation
…nt parameter iterator
sqlite3.execute[many] with re-entrant parameter iteratorsqlite3.execute[many] with concurrent mutations
sqlite3.execute[many] with concurrent mutationssqlite3.executemany with concurrent mutations
|
There were more UAFs that I could find... |
sqlite3.executemany with concurrent mutationssqlite3.executemany with concurrent mutations
This is not always possible and it's more prone to errors in the future. The connection may be closed but we would check it way later. I tried to only add checks when the code in question never uses the sqlite3 API (e.g., I can't make the diff smaller though. It's the minimal diff on the C code I can do. |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Thank you for minimizing the diff, @picnixz.
It seems that most new checks are not covered by tests. At least tests were passed when I commented out them one by one. Maybe more tests are needed?
It seems that neither bind_param(), nor bind_parameters() use self->connection->db, so the check can be added only after the call of bind_parameters(). Maybe much later -- immediately before sqlite3_changes() and sqlite3_last_insert_rowid().
I wonder, what happens when we close the connection in one of numerous callbacks? Then self->connection->db can be set NULL after calling some SQLite C API, not only after calling some Python C API.
| num_params = PySequence_Size(parameters); | ||
| if (num_params == -1) { | ||
| return; | ||
| if (num_params == -1 && PyErr_Occurred()) { |
There was a problem hiding this comment.
Isn't PyErr_Occurred redundant?
There was a problem hiding this comment.
This is what I thought but, in release mode the check is not performed. You can have a __len__ that returns -1 without setting an exception:
PySequenceMethods *m = Py_TYPE(s)->tp_as_sequence;
if (m && m->sq_length) {
Py_ssize_t len = m->sq_length(s);
assert(_Py_CheckSlotResult(s, "__len__", len >= 0));
return len;
}There was a problem hiding this comment.
Actually, you're right. I thought we directly called __len__ but it's warpped as follows:
PyObject *res = vectorcall_method(&_Py_ID(__len__), stack, 1);
Py_ssize_t len;
if (res == NULL)
return -1;
Py_SETREF(res, _PyNumber_Index(res));
if (res == NULL)
return -1;So returning -1 always implies that there is an exception set here. I'll remove that check.
There was a problem hiding this comment.
We can still get -1 for the extension class. But this is a bug in the extension.
There was a problem hiding this comment.
Do you want me to re-add the check though?
There was a problem hiding this comment.
Because on DEBUG builds there's some assert that would be crashing if the extension class does something bad
|
@serhiy-storchaka I've extensively added test cases for the adpater's part. I didn't check all |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Thank you for adding more tests.
I think that all checks in bind_parameters() can be replaced with a single check after bind_parameters(). All tests will pass except the one with broken __getitem__(), which is expected and not related to this issue. It will pass after fixing __getitem__().
| if (current_param == NULL) { | ||
| return -1; | ||
| } | ||
| else if (!pysqlite_check_connection(conn)) { |
There was a problem hiding this comment.
Tests are passed after deleting this check.
| return; | ||
| return -1; | ||
| } | ||
| else if (!pysqlite_check_connection(conn)) { |
|
|
||
| rc = bind_param(state, self, i + 1, adapted); | ||
| Py_DECREF(adapted); | ||
| if (!pysqlite_check_connection(conn)) { |
There was a problem hiding this comment.
Because this check covers them. And it can be moved out of the loop, and even out of bind_parameters().
|
The minimal fix: diff --git a/Modules/_sqlite/cursor.c b/Modules/_sqlite/cursor.c
index 4611c9e5e3e..9a7e3d7164e 100644
--- a/Modules/_sqlite/cursor.c
+++ b/Modules/_sqlite/cursor.c
@@ -881,6 +881,9 @@ _pysqlite_query_execute(pysqlite_Cursor* self, int multiple, PyObject* operation
}
}
+ if (!pysqlite_check_connection(self->connection)) {
+ goto error;
+ }
PyObject *stmt = get_statement_from_cache(self, operation);
Py_XSETREF(self->statement, (pysqlite_Statement *)stmt);
if (!self->statement) {
@@ -930,6 +933,9 @@ _pysqlite_query_execute(pysqlite_Cursor* self, int multiple, PyObject* operation
if (PyErr_Occurred()) {
goto error;
}
+ if (!pysqlite_check_connection(self->connection)) {
+ goto error;
+ }
rc = stmt_step(self->statement->st);
if (rc != SQLITE_DONE && rc != SQLITE_ROW) { |
|
I think I'll need to check with more than 1 parameter as well. I cannot say for sure that checks are all redundant because my table has a single parameter to bind. But if binding doesn't require the connection at all, it could be possible to be safe and reduce the amount of checks. |
|
Ok, I'm now back to business with CPython. I will likely work on this one later today or tomorrow. I have some other shorter PRs that I want to address before this one. |
|
This PR is stale because it has been open for 30 days with no activity. |
FTR, the reason why
executeis actually not affected is that the iterator we are taking is built from a list we create ourselves so it's not possible to have "bad" side effects._sqlitecursor cache via re-entrant parameter iterator #143198