I detected a strange bug which occurs at least in MarketManagerService.java of the sample agent, but should be reproducible everywhere.
In activate() it calls timeslotRepo.enabledTimeslots() to get the enabled timeslots. These are future timeslots, so their serial numbers should always be larger than that of the current timeslot.
However, after about 180 timeslots in every simulation, the enabled timeslots are smaller than the current timeslot!
I believe this is related to issue #492. I also believe that it will happen sooner on slower machines or machines with fewer processors, because it is fundamentally a performance issue. In other words, it will not happen if there is enough time for both the server and the broker to finish their work in the 5-second timeslot interval. If someone can show that this is not the case, then I will dig deeper.
I'm still using the 0.1 server; I don't know if this matters.
Probably it does not matter, but the 0.2 server is much more stable than the 0.1 server.
I believe I know how to fix this, and I hope to have a fix done and tested within 2-3 days. I expect the fix will be on the broker side, and will not affect the server.
This evening I pushed what I believe is a comprehensive fix for issue #492. It affects common, server-main, and sample-broker, and to get it you will (for now) have to use the 0.5.0-SNAPSHOT development branch. To make it a little easier, I have also deployed a new common:0.5.0-SNAPSHOT to the snapshot repo. Note that at this point only server-master and common are deployed, so you need to pull source from GitHub for the rest of the server modules. We could potentially update the 0.2.0 release with this fix if necessary, but retrofitting the 0.1.0 release would be impractical.
Here's how it now works:
There is a new TimeslotComplete message sent as the last outgoing message in each timeslot. It carries the timeslot serial number, so the broker can check whether it is in sync with the server.
The broker extracts the current timeslot serial number from the TimeslotUpdate message, and compares that with the serial number in the TimeslotComplete message in order to make sure the timeslot has not advanced. It does this by checking at a couple of points during its internal process to make sure it's still in the same timeslot.
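The sync check described above can be sketched roughly as follows. This is only an illustration under assumptions: the class TimeslotSyncTracker and its methods are hypothetical names, not part of the Power TAC API, and the serial numbers are passed in as plain ints rather than read from the real TimeslotUpdate and TimeslotComplete messages.

```java
/** Hypothetical helper for illustration; not part of the Power TAC API. */
public class TimeslotSyncTracker {
    // Serial number taken from the most recent TimeslotUpdate message.
    private volatile int currentTimeslot = -1;

    /** Called when a TimeslotUpdate arrives (assumed to run on a messaging thread). */
    public void onTimeslotUpdate(int serial) {
        currentTimeslot = serial;
    }

    /** True if the timeslot has not advanced past the one reported as complete. */
    public boolean inSync(int completedSerial) {
        return currentTimeslot == completedSerial;
    }

    /**
     * Runs the given processing steps in order, checking before each one
     * that the broker is still in the completed timeslot.
     * Returns the number of steps that actually ran.
     */
    public int runSteps(int completedSerial, Runnable... steps) {
        int done = 0;
        for (Runnable step : steps) {
            if (!inSync(completedSerial)) {
                break; // the server has moved on; abandon stale work
            }
            step.run();
            done++;
        }
        return done;
    }
}
```

The point of checking between steps, rather than only once at the start, is that a new TimeslotUpdate can arrive while the broker is still deliberating.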
The broker runs its internal processing (portfolio management and market interactions) in a "worker" thread, to allow the JMS threads to return as soon as their payloads are delivered. This avoids the problem of incoming messages (potentially) being blocked during broker deliberation, which would put the broker out of sync.
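A minimal sketch of this hand-off, assuming a single-threaded executor stands in for the worker thread. The class BrokerWorker and its method names are hypothetical and are not taken from the sample broker source; the actual implementation may be structured differently.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

/** Hypothetical sketch; not the sample broker's actual class. */
public class BrokerWorker {
    // One dedicated thread, so deliberation for successive timeslots never overlaps.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "broker-worker");
        t.setDaemon(true);
        return t;
    });

    /**
     * Called on the messaging thread when TimeslotComplete arrives.
     * Queues the deliberation and returns immediately, so the
     * messaging thread is free to deliver further messages.
     */
    public void onTimeslotComplete(int serial, IntConsumer deliberation) {
        worker.execute(() -> deliberation.accept(serial));
    }

    /** Stops accepting work and waits for queued deliberation to finish. */
    public boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```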
The server now has an additional configuration property server.simulationClockControl.minAgentWindow that sets the minimum time that must elapse between sending the TimeslotComplete message at the end of timeslot n and sending the TimeslotUpdate message at the beginning of timeslot n+1. This value was previously hardcoded as 200 msec; it is now configurable and defaults to 2000 msec. This guarantees that the broker will have at least 2 seconds (less message delivery latency) to respond after receiving the last server message in each timeslot, before the next market clearing starts. This seems to be necessary because the server spends much more time per timeslot with brokers attached than it does in bootstrap mode. We are looking into JMS tuning issues, but in the meantime this setting should keep brokers from being choked off, regardless of how long the server takes for its per-timeslot processing.
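The timing rule behind minAgentWindow can be illustrated with a small calculation. This is a sketch of the arithmetic only, not the server's actual clock-control code; the class AgentWindow and the method nextUpdateTime are hypothetical names, and all times are assumed to be in milliseconds.

```java
/** Hypothetical illustration of the minAgentWindow rule; not server code. */
public class AgentWindow {
    /**
     * Earliest time at which TimeslotUpdate for timeslot n+1 may be sent.
     *
     * @param completeSentAt  when TimeslotComplete for timeslot n was sent
     * @param scheduledStart  when timeslot n+1 would normally begin
     * @param minAgentWindow  minimum gap the broker must be given
     */
    public static long nextUpdateTime(long completeSentAt,
                                      long scheduledStart,
                                      long minAgentWindow) {
        // If the server finished late, push the next timeslot back
        // so the broker still gets its full window.
        return Math.max(scheduledStart, completeSentAt + minAgentWindow);
    }
}
```

For example, with a 5000 msec timeslot and the default 2000 msec window, a server that sends TimeslotComplete at 4900 msec would not send the next TimeslotUpdate before 6900 msec, while one that finishes at 1000 msec would start the next timeslot on schedule at 5000 msec.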
Please let me know if this does not fix your problem.