autoplay_move() scores each legal column for a game state and returns the recommended column without modifying the state. The default strategy, "lookahead", is a depth-limited lookahead search. With future_mode = "visible", planning plays fairly: it uses only the visible tile queue and draws unknown future tiles from independent planning seeds. With future_mode = "rng", simulations may use the state's internal deterministic random-number stream.

Usage

autoplay_move(
  state,
  strategy = c("lookahead", "growth_lookahead", "monte_carlo", "greedy"),
  depth = 3L,
  simulations = 100L,
  horizon = 30L,
  beam_width = 10L,
  future_mode = c("visible", "rng"),
  seed = NULL
)
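
The strategy and future_mode defaults are written as character vectors of choices. A minimal sketch of how such defaults conventionally resolve in R, assuming the function uses match.arg() internally (the source does not confirm this); resolve_choices() is a hypothetical stand-in, not part of the package:

```r
# Hypothetical stand-in, not the package's implementation: it only mirrors
# the way the choice arguments are declared in the usage above.
resolve_choices <- function(
  strategy = c("lookahead", "growth_lookahead", "monte_carlo", "greedy"),
  future_mode = c("visible", "rng")
) {
  # match.arg() returns the first choice when the argument is missing
  # and raises an error for a value outside the declared choices.
  list(
    strategy = match.arg(strategy),
    future_mode = match.arg(future_mode)
  )
}

defaults <- resolve_choices()
custom <- resolve_choices(strategy = "greedy", future_mode = "rng")

# An unknown strategy name is rejected instead of silently ignored.
rejected <- inherits(
  try(resolve_choices(strategy = "random"), silent = TRUE),
  "try-error"
)
```

Under that assumption, calling the function without strategy or future_mode selects "lookahead" and "visible", which matches the default described above.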

Arguments

state

A game-state list returned by new_game() or drop_tile().

strategy

Move-selection strategy: one of "lookahead" (the default), "growth_lookahead", "monte_carlo", or "greedy".

depth

Lookahead depth for the "lookahead" and "growth_lookahead" strategies.

simulations

Number of rollout simulations for "monte_carlo".

horizon

Maximum rollout length for "monte_carlo".

beam_width

Number of states retained per lookahead layer.

future_mode

Whether planning uses only visible future information ("visible", the default) or the state's deterministic RNG stream ("rng").

seed

Optional positive integer seed for visible-future planning.
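
How depth and beam_width interact can be illustrated with a generic beam-limited lookahead. The sketch below is not the package's search: the toy rules (a state is a vector of column heights, flatter boards score higher) and the names legal_columns(), drop_in(), evaluate(), and beam_lookahead() are all assumptions made for illustration.

```r
# Toy rules (hypothetical, not the package's engine): a state is a vector of
# column heights, a move adds one tile to a column, flatter boards score higher.
max_height <- 4L
legal_columns <- function(heights) which(heights < max_height)
drop_in <- function(heights, column) {
  heights[column] <- heights[column] + 1L
  heights
}
evaluate <- function(heights) as.numeric(min(heights) - max(heights))

beam_lookahead <- function(heights, depth = 3L, beam_width = 10L) {
  score_of <- function(node) evaluate(node$heights)
  # Keep only the beam_width best-scoring nodes of a layer, best first.
  prune <- function(nodes) {
    scores <- vapply(nodes, score_of, numeric(1))
    nodes[utils::head(order(scores, decreasing = TRUE), beam_width)]
  }
  # First layer: one node per legal move, tagged with that first move.
  beam <- lapply(legal_columns(heights), function(column) {
    list(first = column, heights = drop_in(heights, column))
  })
  if (length(beam) == 0L) return(NA_integer_)  # no legal move
  beam <- prune(beam)
  # Remaining layers: expand every retained node, then prune again.
  for (layer in seq_len(depth - 1L)) {
    children <- list()
    for (node in beam) {
      for (column in legal_columns(node$heights)) {
        children[[length(children) + 1L]] <-
          list(first = node$first, heights = drop_in(node$heights, column))
      }
    }
    if (length(children) == 0L) break  # board filled before reaching depth
    beam <- prune(children)
  }
  # The beam is sorted, so the first node holds the best final score.
  beam[[1L]]$first
}

recommended <- beam_lookahead(c(3L, 0L, 3L, 3L), depth = 3L, beam_width = 2L)
```

In this sketch a larger depth looks further ahead, while a smaller beam_width discards more states per layer, trading accuracy for speed.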

Value

A list with column, strategy, future_mode, score_estimate, and a candidates data frame.
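
A sketch of how the returned list might be inspected. The object below is a hand-built mock, not package output: the five top-level names come from the description above, while the columns of candidates (column and score) and every number are assumptions.

```r
# Mock result for illustration only. The top-level names follow the Value
# description; the candidates columns and all values are assumed.
result <- list(
  column = 3L,
  strategy = "lookahead",
  future_mode = "visible",
  score_estimate = 12.5,
  candidates = data.frame(column = 1:4, score = c(4.0, 9.5, 12.5, 7.0))
)

# Rank candidate columns from best to worst by the assumed score column.
ranked <- result$candidates[order(result$candidates$score, decreasing = TRUE), ]
best_column <- ranked$column[1]
```

Inspecting the real candidates data frame with str() shows which columns it actually provides.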

Reinforcement-learning extensions

Natural next steps include tabular Q-learning on handcrafted board features, approximate Q-learning using the current heuristic features, Deep Q-Networks with legal-action masks, PPO or actor-critic policy gradients, MCTS/UCT rollouts over the deterministic engine, and evolution strategies for tuning heuristic weights before introducing neural models.
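
As one illustration of the first item, a tabular Q-learning update with a legal-action mask could look like the sketch below. Nothing here is part of the package: q_update(), the state ids (imagined as buckets of handcrafted board features), and the learning-rate and discount values are hypothetical.

```r
# Q is a matrix: rows are hypothetical feature-bucket state ids,
# columns are board columns (actions).
q_update <- function(Q, s, a, reward, s_next, legal_next,
                     alpha = 0.1, gamma = 0.95) {
  # Bootstrap only from actions that are legal in the next state;
  # a state with no legal action is terminal and contributes 0.
  future <- if (length(legal_next) > 0L) max(Q[s_next, legal_next]) else 0
  Q[s, a] <- Q[s, a] + alpha * (reward + gamma * future - Q[s, a])
  Q
}

Q <- matrix(0, nrow = 3, ncol = 4)
Q[2, ] <- c(1, 5, 2, 9)  # column 4 looks best but is illegal below

# Column 4 is full in the next state, so its value of 9 is masked out.
Q <- q_update(Q, s = 1, a = 2, reward = 1, s_next = 2,
              legal_next = c(1L, 2L, 3L))
```

Masking matters because bootstrapping from a full column would propagate value from a move the engine can never make.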