I recently had fun troubleshooting an issue on a Zabbix server, where the logs would start to fill up with reports such as these.
... 23966:20120412:234746.094 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+1 where nodeid=0 and table_name='events' and field_name='eventid'] 23942:20120412:234750.128 slow query: 50.982490 sec, "update ids set nextid=nextid+1 where nodeid=0 and table_name='events' and field_name='eventid'" zabbix_server [23942]: ERROR [file:db.c,line:1582] Something impossible has just happened. ... 23969:20120412:234752.153 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+1 where nodeid=0 and table_name='events' and field_name='eventid'] 23969:20120412:234752.153 slow query: 50.206625 sec, "update ids set nextid=nextid+1 where nodeid=0 and table_name='events' and field_name='eventid'" zabbix_server [23969]: ERROR [file:db.c,line:1582] Something impossible has just happened.
The problem itself is the “Lock wait timeout exceeded; try restarting transaction", which is actually a MySQL problem in this case. It’s caused in part by InnoDB row-level locks on the very same row that the DB Syncers in Zabbix are trying to reach.
The query in particular, “update ids set nextid=nextid+1 where nodeid=0 and table_name='events' and field_name='eventid'
”, is Zabbix’s way of insuring a unique ID for each event. Instead of using a custom method per database backend (sequences for PostgreSQL or auto_increment in MySQL), this implementation was chosing. In race conditions where a lot of events are fired at once, this causes quite long row-level locks on the very same row (‘events’) making the DBSyncer processes or others re-try the transaction over and over again, never completing.
A sort of workaround was to transform our database from a (relatively) slow innodb to a (relatively) faster MyISAM for the ‘ids’ table. Seeing as this table usually only contains around 40 rows, that are frequently updated, this seemed like a good solution.
mysql> ALTER TABLE ids ENGINE = MyISAM; Query OK, 44 rows affected (0.06 sec) Records: 44 Duplicates: 0 Warnings: 0
I believe a long-term solution would be for Zabbix Server to have knowledge of which database backend is being used (MySQL/PostgreSQL/…) and use the ID-incrementer provided by that database itself, instead of using a generic workaround such as this.