Faster Simulation

Andyccs
Hi,

I am looking for a way to run simulations faster. If we can do that, a lot of research could be done with a shorter iteration cycle.

If we look at the way we run bootstrap mode today, the DefaultBroker and the simulation server run in the same process. As a result, bootstrap mode completes within a few seconds, since neither ActiveMQ nor an internet connection is involved.

My first idea was to compile and run my broker inside the simulation server. However, this approach would probably only work for brokers whose source code is available. From the simulation server's point of view, putting broker code on the server is definitely not desirable.

My second idea is to keep running all brokers and the server as we do today, but tweak the simulation clock: https://github.com/powertac/powertac-server/wiki/Time-management.

I am not sure whether I am heading in the right direction. Let me know if anyone has any ideas about making the simulation faster.

Thank you.

Re: Faster Simulation

grampajohn
Administrator
Andyccs wrote:
> Hi,
>
> I am looking for a way to run simulations faster. If we can do that, a lot of research could be done with a shorter iteration cycle.
>
> If we look at the way we run bootstrap mode today, the DefaultBroker and the simulation server run in the same process. As a result, bootstrap mode completes within a few seconds, since neither ActiveMQ nor an internet connection is involved.

The overhead for remote brokers includes both the XML serialization and deserialization, as well as ActiveMQ. The ActiveMQ overhead is much lower between entities on the same machine. If you have enough cores and memory, you can run several agents on a single machine. I have run the server and three agents on my high-end desktop without noticeable slowdown.

> My first idea was to compile and run my broker inside the simulation server. However, this approach would probably only work for brokers whose source code is available. From the simulation server's point of view, putting broker code on the server is definitely not desirable.

The default broker is just a broker (it's special in the sense that it's the source of the default tariffs, but otherwise it's a real broker). To run in the server, it runs under control of DefaultBrokerService, which is a subtype of InitializationService. When the server starts up, it looks in its classpath for Spring services that implement InitializationService, loads them into the process, and calls their initialize() methods with a couple of arguments. There's no reason you could not build your own service that would start up your broker and run it in the server's process. Your broker would also need to set the inherited instance variable 'local' to true so its traffic is not serialized and sent through JMS/MQ.

So yes, I think you would need the source to make this work. Partly for this reason, but mostly because we did not want to be responsible for running code developed by competitors, we decided early on that brokers would be remote. Because of that, several brokers are no longer simple Java processes. I know of two at least that run a Python process in parallel doing machine learning tasks. These side processes are not synchronous with the server; instead they essentially just update various parameters in the broker periodically. I know of one that communicates between processes with an in-memory database. There is really no way you would want to even try to run something like this inside the server.

> My second idea is to keep running all brokers and the server as we do today, but tweak the simulation clock: https://github.com/powertac/powertac-server/wiki/Time-management.

The overall clock rate is controlled by a config variable competition.simulationRate. The default value is 720 (3600/5); you can certainly change it.
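
As a minimal sketch of that arithmetic, assuming each timeslot covers one simulated hour (3600 s), which is what the 3600/5 in the default suggests. The function names are hypothetical and are not part of the server:

```python
# Assumption: one timeslot represents 3600 simulated seconds (one hour).
SIM_SECONDS_PER_TIMESLOT = 3600


def simulation_rate(timeslot_wall_seconds):
    """Simulated seconds that pass per wall-clock second."""
    return SIM_SECONDS_PER_TIMESLOT / timeslot_wall_seconds


def timeslot_wall_seconds(rate):
    """Wall-clock seconds one timeslot takes at a given rate."""
    return SIM_SECONDS_PER_TIMESLOT / rate


print(simulation_rate(5))          # 720.0, the default
print(simulation_rate(3))          # 1200.0
print(timeslot_wall_seconds(720))  # 5.0
```

Under this reading, a 3-second timeslot corresponds to a rate of 1200.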

The server guarantees that there will be at least a minimum period between the timeslot-complete message and the start of the next timeslot. That, minus network latency, is the time brokers have to do their synchronous work. This is controlled by another config variable server.simulationClockControl.minAgentWindow. The value in server-main/server.properties is 2000 (2 seconds), although it should probably be a bit larger. You can override it in server-distribution/config/server/properties. If you speed up the server by changing the simulation rate, or if you increase the minAgentWindow, the server will pause when it needs to in order to satisfy the minAgentWindow. You could probably get the timeslot period down to 3 seconds without too much problem.
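
The interaction between the two settings can be pictured with a deliberately simplified model. It assumes the server spends some fixed processing time per timeslot and then must leave at least minAgentWindow before the next timeslot starts; the 800 ms processing time is an invented figure for illustration, and the real clock control may behave differently in detail:

```python
def effective_timeslot_ms(nominal_ms, server_busy_ms, min_agent_window_ms):
    """Wall-clock length of one timeslot under the assumed model.

    The timeslot lasts its nominal length unless server processing plus
    the guaranteed agent window does not fit, in which case the clock
    is paused until the window has been satisfied.
    """
    return max(nominal_ms, server_busy_ms + min_agent_window_ms)


# Hypothetical server processing time of 800 ms per timeslot.
print(effective_timeslot_ms(5000, 800, 2000))  # 5000: default, no pause
print(effective_timeslot_ms(3000, 800, 2000))  # 3000: still fits
print(effective_timeslot_ms(1000, 800, 2000))  # 2800: server pauses
```

In this model, shortening the nominal timeslot below the sum of the server's processing time and minAgentWindow gains nothing, because the pause makes up the difference.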

> I am not sure whether I am heading in the right direction. Let me know if anyone has any ideas about making the simulation faster.
>
> Thank you.

Of course there is another way to speed up experiments, and that is to run multiple servers in parallel. It takes quite a few (virtual) machines to do this, but it's quite effective. I was involved in a project 10-12 years ago where we did that; we would log into a set of virtual machines, along with a number of lab and office desktop machines, and run our processes. Once in a while we got complaints, but mostly folks could not tell we were running processes on their machines. At one point we had 22 machines in our collection.

I believe we are very close to releasing an experiment manager that pretty much works this way, built on the framework developed by Govert Buijs at Rotterdam for the tournament scheduler. More to come soon, I hope.

Cheers -

John
Re: Faster Simulation

Andyccs
Thanks for your reply John! I added the following configurations:

```
server.simulationClockControl.minAgentWindow = 500
common.competition.simulationTimeslotSeconds = 3
```

Now the simulation completes in 1 hour 12 minutes instead of the original 2 hours.
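
These two run times are consistent with a simple back-of-the-envelope check. The sketch below assumes the total run time is dominated by the timeslot clock and infers the timeslot count from the original 2-hour run at the default 5 seconds per timeslot; the count is not a measured value:

```python
def run_minutes(timeslot_count, timeslot_wall_seconds):
    """Total wall-clock run time in minutes, ignoring startup overhead."""
    return timeslot_count * timeslot_wall_seconds / 60


# Inferred, not measured: 2 hours at 5 s per timeslot.
timeslots = 2 * 3600 // 5

print(timeslots)                  # 1440
print(run_minutes(timeslots, 5))  # 120.0
print(run_minutes(timeslots, 3))  # 72.0, i.e. 1 hour 12 minutes
```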
Re: Faster Simulation

grampajohn
Administrator
Andyccs wrote:
> Thanks for your reply John! I added the following configurations:
>
> ```
> server.simulationClockControl.minAgentWindow = 500
> common.competition.simulationTimeslotSeconds = 3
> ```
>
> Now the simulation completes in 1 hour 12 minutes instead of the original 2 hours.

I'm glad it's working for you. Keep in mind that 500 msec may not be enough for some agents to respond. If you are on a reasonably fast machine, I suspect you would see about the same total time with minAgentWindow at 2000.

John