Hi
I recently asked in the IRC channel about RabbitMQ connection reset errors in celeryd logs.
I think there are two issues:
1) The example systemd unit file (mediagoblin-celeryd.service) from
https://mediagoblin.readthedocs.io/en/stable/siteadmin/deploying.html does not specify that celeryd must start after RabbitMQ, so celeryd is sometimes started first and fails because RabbitMQ is not running yet.
2) mediagoblin/mediagoblin/init/celery/__init__.py sets celery_settings['BROKER_HEARTBEAT'] = 1. On slower systems or under heavy load, if the worker takes longer than 1 second to respond, it misses the heartbeat; after a few missed heartbeats the connection is considered dead and is reset.
I'm not sure what the purpose of setting BROKER_HEARTBEAT to 1 is, but the Celery docs recommend against using such a small value. On my install I changed it to 20 and I no longer see any connection problems.
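To make the heartbeat change concrete, here is a minimal sketch of what I have in mind. It is not the actual MediaGoblin code: the helper name apply_heartbeat and the constant DEFAULT_BROKER_HEARTBEAT are hypothetical, and using setdefault (so that a value already configured by the admin is kept) is my own assumption about what would be desirable. The value 20 is simply what worked on my install.

```python
# Hypothetical sketch, not MediaGoblin's real code.
# 20 seconds is the value that worked on my install; it is not an official recommendation.
DEFAULT_BROKER_HEARTBEAT = 20


def apply_heartbeat(celery_settings, heartbeat=DEFAULT_BROKER_HEARTBEAT):
    """Set BROKER_HEARTBEAT only if it has not been configured already."""
    # setdefault leaves an existing value untouched (assumed to be the desired behaviour).
    celery_settings.setdefault('BROKER_HEARTBEAT', heartbeat)
    return celery_settings


celery_settings = apply_heartbeat({})
```

With this, an empty settings dict ends up with a 20 second heartbeat, while a dict that already contains BROKER_HEARTBEAT is left as it is.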
Are you willing to accept a patch for mediagoblin/docs/source/siteadmin/deployment.rst and mediagoblin/mediagoblin/init/celery/__init__.py to fix those two problems?
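For the first issue, the change would be roughly the following addition to the [Unit] section of mediagoblin-celeryd.service. The unit name rabbitmq-server.service is an assumption on my part (it is the name I believe the Debian and Fedora packages use) and may differ on other systems; this also assumes RabbitMQ runs on the same host as celeryd.

```ini
[Unit]
# Assumption: RabbitMQ is installed locally as rabbitmq-server.service.
# After= only orders startup; Wants= also asks systemd to start RabbitMQ
# when celeryd is started, without failing celeryd if RabbitMQ is absent.
After=rabbitmq-server.service
Wants=rabbitmq-server.service
```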
Thanks
Fernando