# Ball Catcher Game in Python

Combining game development with reinforcement learning to make a program play a game on its own is not as difficult as it sounds. In this article, we are going to develop a simple ball catcher game in Python and use reinforcement learning to make our program “intelligent”. But before that, make sure you understand the basics of reinforcement learning, and more specifically, Q-learning.

In our game, a ball drops continuously from top to bottom, and a rectangular catcher tries to catch it. If the catcher succeeds, we score a point; otherwise, we count a miss. There are four steps to this article, and by the end, you will have an agent playing the ball catcher game for you. Also, make sure you have the following two libraries installed:

• Pygame
• NumPy
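
If either library is missing, both can usually be installed with pip (assuming a working Python 3 setup):

```
pip install pygame numpy
```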

## Step 1: Initializing classes

We start by defining a Circle class for our ball and a State class that captures the positions of the catcher and the ball together.

```
class Circle:
    def __init__(self, circleX, circleY):
        # X and Y coordinates of the circle (ball) with respect to the window
        self.circleX = circleX
        self.circleY = circleY


class State:
    def __init__(self, rect, circle):
        # Current rectangle (catcher) and circle (ball) making up one game state
        self.rect = rect
        self.circle = circle
```
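
As a quick illustration of how these two classes fit together (the coordinates below are arbitrary), a state is nothing more than a catcher rectangle paired with a ball position:

```
import pygame as pg

# A catcher at the bottom-left of an 800x400 window, and a ball near the top
s = State(pg.Rect(0, 350, 200, 50), Circle(400, 50))
print(s.rect.left, s.circle.circleY)  # 0 50
```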

## Step 2: Initializing window, ball, and catcher

We define the dimensions of the window and the RGB color tuples used for drawing.

```
import numpy as np

windowWidth = 800
windowHeight = 400

RED = (255, 0, 0)
GREEN = (0, 255, 0)
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
```

Similarly, we initialize the sizes of the ball and the catcher, and how fast the ball falls from the top. We also define the ball’s radius, crclRadius, here, since the scoring and drawing code needs it later.

```
# Initial position of the ball with respect to the window
crclCentreX = 400
crclCentreY = 50
crclRadius = 20  # radius of the ball, used for scoring and drawing

crclYStepFalling = windowHeight // 10  # the ball falls 40 pixels per step

# Initial position and size of the catcher with respect to the window
rctLeft = 400
rctTop = 350
rctWidth = 200
rctHeight = 50
```

We initialize the Q-learning table and use a dictionary, QIDic, to map each state to a row index of the table. Each row holds the estimated values of the three possible actions (stay, left, right) for one state.

```
QIDic = {}

# Upper bound on the number of states:
# (windowWidth / 8) * (windowHeight / crclYStepFalling) * (windowWidth / rctWidth)
Q = np.zeros([5000, 3])  # one row per state, one column per action
```
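
To make the indexing concrete, here is a minimal sketch of how one row of the table is read and updated; the row index 0 and the value 0.5 are made up for illustration:

```
state_index = 0  # hypothetical row; real indices are assigned via QIDic

# Each row holds the estimated values of the 3 actions (stay, left, right)
print(Q[state_index, :])             # [0. 0. 0.] before any learning

Q[state_index, 2] = 0.5              # pretend moving right has earned some value
print(np.argmax(Q[state_index, :]))  # 2 -> "right" is now the best action

Q[state_index, 2] = 0                # undo the demo write before training
```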

## Step 3: Defining functions for each case of the ball catcher game

First, we write a function that returns the new state of the game after an action is taken; a new state means new positions for the ball and the catcher. We use pygame’s Rect class to represent the catcher (rectangle). The function takes the current state and an action as its arguments.

```
import pygame as pg


def new_state_after_action(s, act):
    # Actions: 0 == stay, 1 == move left, 2 == move right
    rct = None
    if act == 2:  # action is right
        if s.rect.right + s.rect.width > windowWidth:
            rct = s.rect  # would leave the window, so stay put
        else:
            rct = pg.Rect(s.rect.left + s.rect.width, s.rect.top,
                          s.rect.width, s.rect.height)  # Rect(left, top, width, height)
    elif act == 1:  # action is left
        if s.rect.left - s.rect.width < 0:
            rct = s.rect  # would leave the window, so stay put
        else:
            rct = pg.Rect(s.rect.left - s.rect.width, s.rect.top,
                          s.rect.width, s.rect.height)  # Rect(left, top, width, height)
    else:  # action is 0: stay where it is
        rct = s.rect

    # The ball falls one step regardless of the catcher's move
    newCircle = Circle(s.circle.circleX, s.circle.circleY + crclYStepFalling)

    return State(rct, newCircle)
```
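
As a quick sanity check (using the constants defined in Step 2), a single transition moves the catcher one full rectangle-width and drops the ball by crclYStepFalling, and the catcher refuses to leave the window:

```
s = State(pg.Rect(400, 350, 200, 50), Circle(400, 50))
s1 = new_state_after_action(s, 2)        # try to move right
print(s1.rect.left, s1.circle.circleY)   # 600 90 -> catcher moved, ball fell

s2 = new_state_after_action(s1, 2)       # moving right again would exit the window
print(s2.rect.left)                      # 600 -> the catcher stays put
```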

We define another function that moves the catcher while keeping it within the bounds of the window. Its arguments are a rectangle and an action.

```
def new_rect_after_action(rect, act):
    # Same movement rules as above, but acting on a bare rectangle
    if act == 2:
        if rect.right + rect.width > windowWidth:
            return rect
        else:
            return pg.Rect(rect.left + rect.width, rect.top, rect.width, rect.height)
    elif act == 1:
        if rect.left - rect.width < 0:
            return rect
        else:
            return pg.Rect(rect.left - rect.width, rect.top, rect.width, rect.height)
    else:
        return rect
```
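
The left edge is clamped the same way; a quick check with an assumed rectangle sitting at the left wall:

```
r = pg.Rect(0, 350, 200, 50)
print(new_rect_after_action(r, 1).left)  # 0 -> already at the left wall, stays
print(new_rect_after_action(r, 2).left)  # 200 -> free to move right
```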

The remaining helper functions are:

• circle_falling(crclRadius) – randomly pick the x-axis position of the ball for each new drop
• calculate_score(rectangle, circle) – keep the score tally of the agent
• state_to_number(state) – map a state object to an integer row index stored in QIDic
• get_best_action(state) – retrieve the best known action for the agent
```
import random


def circle_falling(crclRadius):
    # Pick a random x coordinate for the next drop;
    # with crclRadius = 20 this yields a position between 80 and 640
    newx = 100 - crclRadius
    multiplier = random.randint(1, 8)
    newx *= multiplier
    return newx


def calculate_score(rect, circle):
    # +1 if the ball lands within the catcher's span, -1 otherwise
    if rect.left <= circle.circleX <= rect.right:
        return 1
    else:
        return -1


def state_to_number(s):
    # Concatenate the coordinates into a string key and map it to a Q-table row
    r = s.rect.left
    c = s.circle.circleY
    n = str(r) + str(c) + str(s.circle.circleX)

    if n in QIDic:
        return QIDic[n]
    else:
        if len(QIDic):
            maximum = max(QIDic, key=QIDic.get)
            QIDic[n] = QIDic[maximum] + 1
        else:
            QIDic[n] = 1
    return QIDic[n]


def get_best_action(s):
    # The action with the highest estimated value in the current state
    return np.argmax(Q[state_to_number(s), :])
```
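
Since QIDic is keyed on the concatenated coordinates, the same state always resolves to the same Q-table row. A quick check (the actual index returned depends on which states have been seen before):

```
a = State(pg.Rect(0, 350, 200, 50), Circle(80, 90))
b = State(pg.Rect(0, 350, 200, 50), Circle(80, 90))
print(state_to_number(a) == state_to_number(b))  # True -> one shared Q row
```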

## Step 4: Let’s set up the learning rate of our agent and play the game!

Let’s initialize pygame and set up the FPS clock, window, and rectangle objects.

```
import sys
from pygame.locals import *

# Initializing frames per second
FPS = 20
fpsClock = pg.time.Clock()

# Initializing the game
pg.init()

# Window and Rectangle objects
window = pg.display.set_mode((windowWidth, windowHeight))
pg.display.set_caption("Catch the Ball")

rct = pg.Rect(rctLeft, rctTop, rctWidth, rctHeight)
```

Next, we set up some variables used in the game logic, along with the learning rate and the discount factor. Try tuning them to understand the algorithm’s behavior.

```
# Initializing variables, the learning rate, and the discount factor
action = 1

score, missed, reward = 0, 0, 0
font = pg.font.Font(None, 30)

lr = .93  # learning rate
y = .99   # discount factor
i = 0     # iteration counter
```
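
For intuition, here is the update rule from the loop below applied once by hand, with made-up values: a reward of 1 and a best next-state value of 0.5.

```
q_old = 0.0      # current estimate Q[s, act]
next_best = 0.5  # assumed max over Q[s1, :]
r0 = 1           # reward observed in state s

q_new = q_old + lr * (r0 + y * next_best - q_old)
print(q_new)     # about 1.39: the estimate is pulled toward r0 + y * next_best
```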

Finally, let’s teach the agent the rules of the game and check its performance. We provide the reward conditions, the Q-learning update, and, finally, the score display.

```
# Executing the game rules and Q-learning logic
while True:
    for event in pg.event.get():
        if event.type == QUIT:
            pg.quit()
            sys.exit()

    window.fill(BLACK)

    # The ball has reached the catcher's level: score the catch and re-drop the ball
    if crclCentreY >= windowHeight - rctHeight - crclRadius:
        reward = calculate_score(rct, Circle(crclCentreX, crclCentreY))  # +1 or -1
        crclCentreX = circle_falling(crclRadius)  # next drop from a random column
        crclCentreY = 50
    else:
        reward = 0
        crclCentreY += crclYStepFalling

    s = State(rct, Circle(crclCentreX, crclCentreY))
    act = get_best_action(s)
    r0 = calculate_score(s.rect, s.circle)
    s1 = new_state_after_action(s, act)

    # Q-learning update: nudge Q[s, act] toward r0 + y * max(Q[s1, :])
    Q[state_to_number(s), act] += lr * (r0 + y * np.max(Q[state_to_number(s1), :])
                                        - Q[state_to_number(s), act])

    rct = new_rect_after_action(s.rect, act)
    crclCentreX = s.circle.circleX
    crclCentreY = int(s.circle.circleY)

    pg.draw.circle(window, RED, (crclCentreX, crclCentreY), crclRadius)
    pg.draw.rect(window, GREEN, rct)

    if reward == 1:
        score += reward
    elif reward == -1:
        missed += 1

    text = font.render("Score: " + str(score), True, (238, 58, 140))
    text1 = font.render("Missed: " + str(missed), True, (238, 58, 140))
    window.blit(text, (windowWidth - 120, 10))
    window.blit(text1, (windowWidth - 280, 10))

    pg.display.update()
    fpsClock.tick(FPS)

    if i == 10000:
        break
    else:
        i += 1
```
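
As an optional extra that is not part of the original script: since Q is a plain NumPy array, you can persist what the agent has learned with np.save and reload it later with np.load. Note that QIDic would also need to be saved (for example with the pickle module) for the stored rows to remain meaningful.

```
np.save("q_table.npy", Q)    # save the learned action values
Q = np.load("q_table.npy")   # restore them in a later session
```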

When you run the script, you will see the green catcher shuffling left and right under the falling red ball, with the Score and Missed tallies drawn near the top-right corner of the window. As the Q-table fills in, the score should climb much faster than the misses.

Q-learning is a simple yet powerful algorithm for making an agent behave intelligently, and reinforcement learning techniques like it are heavily used in robotics.


If you find any difficulties in following the article, do let us know in the comments.