Multi-agent reinforcement learning using simulated quantum annealing