autoplay_move() scores the legal columns for a game state and returns the
recommended column without changing the state. The default strategy is a
depth-limited lookahead search. With future_mode = "visible", planning plays
fairly: it uses only the visible queue and draws unknown future tiles from
independent planning seeds. With future_mode = "rng", simulations may instead
use the state's internal deterministic random-number stream.
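
The difference between the two future modes can be illustrated with a small, self-contained Python sketch. It is not the package's implementation: the toy state layout, the tile set, and the helper names `make_state`, `planned_tiles`, and `true_tiles` are all assumptions made for illustration. The sketch only shows the idea that "visible" planning guesses unknown tiles from an independent seed, while "rng" planning reads a copy of the engine's own stream.

```python
import random

TILE_VALUES = [2, 4, 8]  # hypothetical tile set for this toy example


def make_state(seed, visible=2):
    """Toy state: a visible queue plus the engine's own deterministic RNG."""
    rng = random.Random(seed)
    queue = [rng.choice(TILE_VALUES) for _ in range(visible)]
    return {"queue": queue, "rng": rng}


def _copy_rng(rng):
    """Clone an RNG so reading from the clone never advances the original."""
    clone = random.Random()
    clone.setstate(rng.getstate())
    return clone


def planned_tiles(state, n, future_mode="visible", seed=None):
    """Return the n tiles a planner would assume, without mutating state."""
    tiles = list(state["queue"][:n])
    if future_mode == "rng":
        # The planner reads a copy of the engine stream: it sees the true future.
        source = _copy_rng(state["rng"])
    elif future_mode == "visible":
        # Independent planning seed: unknown tiles are guesses, not peeks.
        source = random.Random(seed)
    else:
        raise ValueError(f"unknown future_mode: {future_mode!r}")
    while len(tiles) < n:
        tiles.append(source.choice(TILE_VALUES))
    return tiles


def true_tiles(state, n):
    """Tiles the toy engine would actually deal next (read from an RNG copy)."""
    source = _copy_rng(state["rng"])
    tiles = list(state["queue"][:n])
    while len(tiles) < n:
        tiles.append(source.choice(TILE_VALUES))
    return tiles
```

In this sketch, "rng" planning always agrees with `true_tiles()`, whereas "visible" planning agrees only on the visible queue and is reproducible for a fixed planning seed. Neither mode advances the state's own RNG.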
Arguments
- state
A game-state list returned by new_game() or drop_tile().
- strategy
Move-selection strategy: "lookahead", "growth_lookahead", "monte_carlo", or "greedy".
- depth
Lookahead depth for the "lookahead" and "growth_lookahead" strategies.
- simulations
Number of rollout simulations for "monte_carlo".
- horizon
Maximum rollout length for "monte_carlo".
- beam_width
Number of states retained per lookahead layer.
- future_mode
Either "visible" (planning uses only visible future information) or "rng" (planning may use the state's deterministic RNG stream).
- seed
Optional positive integer seed for visible-future planning.
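
How depth and beam_width interact can be sketched on a toy column-drop game. The Python code below is an illustration only and does not reproduce the package's search: the merge rule (equal adjacent tiles in a column combine and score their sum), the height limit, and the names `legal_columns`, `drop`, and `lookahead_move` are assumptions. Each layer expands every retained state over its legal columns, then keeps only the best beam_width states.

```python
def legal_columns(board, max_height=4):
    """Columns that can still accept a tile (assumed height limit)."""
    return [c for c, col in enumerate(board) if len(col) < max_height]


def drop(board, col, tile):
    """Return (new_board, gained) after dropping a tile; equal tiles merge."""
    new_board = [list(c) for c in board]  # copy so the input is untouched
    stack = new_board[col]
    stack.append(tile)
    gained = 0
    # Chain merges: while the top two tiles match, combine them.
    while len(stack) >= 2 and stack[-1] == stack[-2]:
        merged = stack.pop() * 2
        stack[-1] = merged
        gained += merged
    return new_board, gained


def lookahead_move(board, tiles, depth=3, beam_width=4):
    """Pick a column by depth-limited search with a beam of retained states."""
    # Each beam entry is (total_gain, first_column, board).
    beam = [(0, None, board)]
    for tile in tiles[:depth]:
        layer = []
        for gain, first, b in beam:
            for col in legal_columns(b):
                nb, g = drop(b, col, tile)
                layer.append((gain + g, col if first is None else first, nb))
        if not layer:
            break  # no legal moves left in any retained state
        # Stable sort: ties keep lower-numbered columns first.
        layer.sort(key=lambda entry: entry[0], reverse=True)
        beam = layer[:beam_width]
    return beam[0][1]
```

For example, with `board = [[8], [4]]` and upcoming tiles `[2, 2]`, a depth of 1 sees no gain anywhere and falls back to column 0, while a depth of 2 finds the chain merge in column 1. With `beam_width=1` the column-1 branch is pruned after the first layer, so the search returns column 0 again, which shows how a narrow beam trades accuracy for speed.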
Reinforcement-learning extensions
Natural next steps include tabular Q-learning on handcrafted board features, approximate Q-learning using the current heuristic features, Deep Q-Networks with legal-action masks, PPO or actor-critic policy gradients, MCTS/UCT rollouts over the deterministic engine, and evolution strategies for tuning heuristic weights before introducing neural models.
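
As a starting point for the first of these extensions, the following Python sketch shows one tabular Q-learning update in which both action selection and the bootstrap target are restricted to legal columns. It is a generic illustration under assumed names (`masked_argmax`, `epsilon_greedy`, `q_update`) and assumed hyperparameters. How board features are turned into state keys is left open, since the source does not specify it.

```python
import random
from collections import defaultdict


def make_q_table():
    """Q-table mapping state_key -> action -> value, defaulting to 0.0."""
    return defaultdict(lambda: defaultdict(float))


def masked_argmax(q_row, legal):
    """Best action among the legal ones; illegal actions are never considered."""
    return max(legal, key=lambda a: q_row[a])


def epsilon_greedy(q, state_key, legal, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the masked argmax."""
    if rng.random() < epsilon:
        return rng.choice(legal)
    return masked_argmax(q[state_key], legal)


def q_update(q, s, a, reward, s_next, legal_next, alpha=0.1, gamma=0.95):
    """One Q-learning step; the bootstrap max runs over legal moves only."""
    # An empty legal_next means s_next is terminal, so nothing is bootstrapped.
    best_next = max((q[s_next][a2] for a2 in legal_next), default=0.0)
    target = reward + gamma * best_next
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]
```

Masking matters here because a full column is not a valid move: without the mask, a high value stored for an illegal action could leak into the target of its predecessor state.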