We have a rancher-server with 10 hosts running around 250 containers. We had a lot of problems with the UI caused by heap dumps, and saw in a post that the heap dump issue is fixed in version 1.0.1, so we upgraded rancher-server to 1.0.1. Performance has definitely improved, but we are now having problems with the load balancer.
We have a load balancer that runs one container on each host and uses hostname-based routing.
After the upgrade, when we add a new host it spins up agent v0.8.1, but our existing hosts are still on version 0.8.0. Below is the error I see while spinning up the load balancer:
Degraded (Waiting for [instance:Default_rancher-lb_1]. Instance status: 500 Server Error: Internal Server Error ("Cannot start container 58fd1a41f3121143d3e5ce41a9706b15ee820f55c233980101af47f9eb93192e: [9] System error: argument list too long"))
Type:
time="2016-05-04T15:28:57Z" level="info" msg="Processing event: &docker.APIEvents{Status:\"start\", ID:\"57e4efc4f995535e7743f55e3c863f3e2473e3b5536002638c88bd68a2694714\", From:\"rancher/agent-instance:v0.8.1\", Time:1462375737}"
time="2016-05-04T15:28:57Z" level="info" msg="Processing event: &docker.APIEvents{Status:\"die\", ID:\"57e4efc4f995535e7743f55e3c863f3e2473e3b5536002638c88bd68a2694714\", From:\"rancher/agent-instance:v0.8.1\", Time:1462375737}"
2016-05-04 15:28:57,476 ERROR agent [139765886887568] [event.py:112] Error in request : 66fbd568-c411-4555-a0da-1b827fd8492e
Traceback (most recent call last):
  File "/var/lib/cattle/pyagent/cattle/agent/event.py", line 95, in _worker_main
    resp = agent.execute(req)
  File "/var/lib/cattle/pyagent/cattle/agent/__init__.py", line 15, in execute
    return self._router.route(req)
  File "/var/lib/cattle/pyagent/cattle/plugins/core/event_router.py", line 13, in route
    resp = handler.execute(req)
  File "/var/lib/cattle/pyagent/cattle/agent/handler.py", line 34, in execute
    return method(req=req, **req.data.__dict__)
  File "/var/lib/cattle/pyagent/cattle/plugins/docker/compute.py", line 529, in instance_activate
    self._do_instance_activate(instance, host, progress)
  File "/var/lib/cattle/pyagent/cattle/plugins/docker/compute.py", line 608, in _do_instance_activate
    client.start(container_id)
  File "/var/lib/cattle/pyagent/dist/docker/utils/decorators.py", line 21, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/var/lib/cattle/pyagent/dist/docker/api/container.py", line 363, in start
    self._raise_for_status(res)
  File "/var/lib/cattle/pyagent/dist/docker/client.py", line 146, in _raise_for_status
    raise errors.APIError(e, response, explanation=explanation)
APIError: 500 Server Error: Internal Server Error ("rpc error: code = 2 desc = "oci runtime error: argument list too long"")
time="2016-05-04T15:28:57Z" level="info" msg="Container [57e4efc4f995535e7743f55e3c863f3e2473e3b5536002638c88bd68a2694714] not running. Can't assign IP [10.42.250.37/16]."
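The logs above do not say why the container fails to start. "Argument list too long" is the standard message for E2BIG, which exec returns when the combined command-line arguments and environment of a process are too large. A rough way to check whether a given command and environment could hit that limit is sketched below. The function names are hypothetical, the size estimate is only approximate (it ignores pointer overhead and per-string limits), and using `SC_ARG_MAX` as the threshold is an assumption, not something taken from Rancher or Docker.

```python
import os


def exec_payload_size(argv, env):
    """Approximate bytes passed to exec: each string plus its NUL terminator.

    Hypothetical helper for illustration only; it does not count the
    pointer arrays the kernel also needs.
    """
    arg_bytes = sum(len(a.encode("utf-8")) + 1 for a in argv)
    env_bytes = sum(len(f"{k}={v}".encode("utf-8")) + 1 for k, v in env.items())
    return arg_bytes + env_bytes


def exceeds_limit(argv, env, limit=None):
    """Return True if the estimated payload is larger than the limit.

    Assumption: when no limit is given, fall back to the system's
    SC_ARG_MAX value (available on POSIX systems).
    """
    if limit is None:
        limit = os.sysconf("SC_ARG_MAX")
    return exec_payload_size(argv, env) > limit
```

To try it against a real container, the argument list and environment would have to be filled in from that container's configuration, for example from the output of `docker inspect`.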
