Sorting out Autocommit
The PostgreSQL server does not have a feature called "autocomit", but the most popular database library for Python, psycopg2 has it enabled by default. I say "enabled", but to disable this behavior we have to turn something on
conn.autocommit = True
The psycopg2 documentation has a warning about the default behavior
Warning: By default, any query execution, including a simple SELECT will start a transaction
Support for Autocommit in MySQL
MySQL does have a feature called autocommit, and it can be enabled globally, or per session.
SET autocommit=0;
This means "start a transaction an leave it open after the first statement I run". Hence, this will begin a transaction
SELECT 1;
This is the inverse of garbage collection! I didn't start a transaction, but it is holding resources and I am responsible to clean up. Furthermore, MySQL/MariaDB never log the fact that the transaction was started
SET GLOBAL log_output = 'FILE'; SET GLOBAL general_log_file = 'my_logs.txt'; SET GLOBAL general_log = 'ON';
$ tail -f /var/lib/mysql/my_logs.txt 190313 14:30:21 3 Query select 1
While psycopg2 claims to follow the Python DBAPI specification. It's behavior is different because MySQL doesn't handle transactions in the same way. In PostgreSQL,
- Schema changes are transactional
- Errors within a transaction are fatal
- The current time is pinned to the transaction start time
None of these are true for MySQL. Instead, all schema changes applied immediately, errors do not stop the evaluation of subsequent statements, and system resources such as the current time keep changing.
Psycopg2 Resource Management
A
with
clause in Python
is usually used to obtain and release resources. If
conn.autocommit
is set to
True
then psycopg2 does not implement
any behavior for
cursor.__enter__
and
cursor.__exit__
.
The following code appears to do transaction management, but does not
with conn: with conn.cursor() as cur: cur.execute('SELECT 1;')
The Psycopg2 docs also assert that with autocommit "all the commands executed will be immediately committed and no rollback is possible."
I don't know why they say this, when transactions within a session are evaluated as expected
cur.execute("BEGIN") cur.execute("...") cur.execute("ABORT")
Mitigations
People will not normally start transactions and leave them open, but the default behavior of autocommit compells the programmer to clean up resources they never allocted! One way to detect this behavior si to add limits to PostgreSQL that will raise an error
SET idle_in_transaction_session_timeout = '15s';
This can be set per session or globally in
postgresq.conf
.
Autocommit in SQLite3
The standard SQLite library for Python also has an autocomit feature , but it tries to be samrt by only implicitly starting transactions if you execute a statement that modifies data. Consider the following
import sqlite3 con = sqlite3.connect("test.db") persons = [ ("Hugo", "Boss"), ("Calvin", "Klein") ] con.executemany("INSERT INTO person(firstname, lastname) values (?, ?)", persons) # Select using a CTE for row in con.execute("WITH p AS ( select firstname from person ) SELECT * FROM p"): print row
The
INSERT
statement causes an implicit
BEGIN
,
and the transaction is aborted if the program ends since there is no commit.
This is not what happens! With Python2.7
WITH p AS ...
causes an implicit commit.