This article covers the general principles and main concepts of SLAM methods and computer vision. It also describes how such methods are used for path planning and decision making in videogames.
Nowadays, AI is becoming more and more human-like in many spheres of life. Many technologies can teach AI to behave and think like a human, and one of them is computer vision.
We will look at some examples of working with computer vision, learn about SLAM methods, and analyze their role in videogames.
A perception module based on computer vision helps a vehicle collect information about its surroundings, process that data with various algorithms and methods, and finally position itself in space. Similar methods are used in games for path planning inside levels.
Computer vision is not the only technique for processing this information; there are several others. Its advantage is that each step can be analyzed in a straightforward way.
To navigate successfully through the environment, a vehicle needs detailed information about everything around it. For this reason, most computer vision systems rely on image sensors, which detect electromagnetic radiation, typically in the form of visible or infrared light.
Next, the AI needs to detect different objects on the road. For example, suppose we want to detect other vehicles. The steps are as follows:
1. Analysis of the data
2. HOG (histogram of oriented gradients) feature extraction
3. Training a support vector machine classifier
4. Implementing a sliding window and using the classifier to search for vehicles
5. Generating a heatmap of the detections and drawing bounding boxes around vehicles
In the first step, we need to teach the algorithm to detect a car in a picture. The usual way is to train the algorithm on a large number of images labeled “cars” and “non-cars”.
The HOG extractor allows us to extract the meaningful features of an image. It captures the general shape of the picture rather than its specific details. It divides the image into several cells and, for each cell, calculates the gradient in a given number of orientations. In the final step, the algorithm builds a histogram of gradient orientations, performs block normalization, and returns a one-dimensional array of data to be fed into the classifier.
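As a rough illustration of the extraction step, here is a minimal sketch in plain Python. The function name `hog_features` and the parameters `cell_size` and `n_bins` are assumptions made for this example. For brevity each cell is normalized on its own instead of in overlapping blocks, so this is a simplification of the full HOG pipeline and not a reference implementation.

```python
import math


def hog_features(image, cell_size=4, n_bins=9):
    """Return a flat feature vector for a grayscale image.

    `image` is a list of rows of pixel intensities. Each cell contributes
    one histogram of gradient orientations, weighted by gradient magnitude.
    Simplification: every cell is L2-normalized individually.
    """
    height, width = len(image), len(image[0])
    bin_width = 180.0 / n_bins
    features = []
    for top in range(0, height - cell_size + 1, cell_size):
        for left in range(0, width - cell_size + 1, cell_size):
            hist = [0.0] * n_bins
            for y in range(top, top + cell_size):
                for x in range(left, left + cell_size):
                    # Central differences, clamped at the image border.
                    gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
                    gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
                    magnitude = math.hypot(gx, gy)
                    # Unsigned orientation in [0, 180) degrees.
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[int(angle // bin_width) % n_bins] += magnitude
            norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
            features.extend(v / norm for v in hist)
    return features
```

For an 8x8 image and the default parameters, the result has 4 cells times 9 bins, that is 36 values, which can be passed to a classifier as one sample.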
The next step is to train a classifier. It receives the car and non-car data transformed by the HOG extractor and returns whether or not a sample is a car.
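The training step can be sketched as a small linear support vector machine trained with stochastic sub-gradient descent on the hinge loss. This is only an illustration under assumptions: the function names, the learning rate, and the regularization strength are chosen for the example, and in practice a library implementation would normally be used instead.

```python
import random


def train_linear_svm(samples, labels, epochs=200, learning_rate=0.01,
                     reg=0.01, seed=0):
    """Fit weights w and bias b by minimizing the regularized hinge loss.

    `labels` must contain +1 (car) or -1 (non-car).
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Weight decay that comes from the L2 regularizer.
            w = [wj * (1.0 - learning_rate * reg) for wj in w]
            if margin < 1.0:
                # The sample is misclassified or inside the margin.
                w = [wj + learning_rate * y * xj for wj, xj in zip(w, x)]
                b += learning_rate * y
    return w, b


def predict(w, b, x):
    """Return +1 for "car" and -1 for "non-car"."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1
```

Here each sample would be a HOG feature vector; the two-dimensional toy points used to try the sketch out are stand-ins for such vectors.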
We can then feed an image to the classifier and get its verdict: car or not a car.
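The last two steps of the list, the sliding window and the heatmap with a bounding box, can be sketched as follows. The names `sliding_windows`, `build_heatmap`, `bounding_box`, and `detect` are hypothetical, and `classifier` is assumed to be any callable that takes an image patch and returns True for a car. In a full pipeline it would wrap the HOG extraction and the trained classifier. The sketch also draws a single box around all hot pixels, which is enough for one vehicle but not for several.

```python
def sliding_windows(width, height, window, step):
    """Yield (left, top, right, bottom) boxes that cover the image."""
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            yield (left, top, left + window, top + window)


def build_heatmap(width, height, detections):
    """Count, for every pixel, how many positive windows cover it."""
    heat = [[0] * width for _ in range(height)]
    for left, top, right, bottom in detections:
        for y in range(top, bottom):
            for x in range(left, right):
                heat[y][x] += 1
    return heat


def bounding_box(heat, threshold):
    """Return one box around all pixels with heat >= threshold, or None."""
    hot = [(x, y) for y, row in enumerate(heat)
           for x, value in enumerate(row) if value >= threshold]
    if not hot:
        return None
    xs = [x for x, _ in hot]
    ys = [y for _, y in hot]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def detect(image, classifier, window=4, step=2, threshold=2):
    """Run the classifier on every window and box the hottest region."""
    height, width = len(image), len(image[0])
    hits = []
    for box in sliding_windows(width, height, window, step):
        left, top, right, bottom = box
        patch = [row[left:right] for row in image[top:bottom]]
        if classifier(patch):
            hits.append(box)
    return bounding_box(build_heatmap(width, height, hits), threshold)
```

The threshold on the heatmap is a design choice: requiring several overlapping detections suppresses isolated false positives at the cost of missing objects that only one window catches.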
One of the most important components of computer vision is the SLAM (Simultaneous Localization and Mapping) module.
Imagine a robot with a set of sensors. It has no information about its surroundings; all it has are sensor readings and the ability to remember changes that occurred in the past. Relying on this equipment, the robot can locate itself relative to its previous location. However, because of many factors, such as measurement errors, the global map of the robot’s positions will be full of distortions.
SLAM uses the estimate of the robot's pose to improve the estimates of the map landmark positions. It can be described as a concept for solving two problems:
1. Producing an accurate map
2. Producing the trajectory of the robot
Given a series of sensor observations $o_t$ over discrete time steps $t$, the SLAM problem is to compute an estimate of the agent’s location $x_t$ and a map of the environment $m_t$. All quantities are usually probabilistic, so the objective is to compute:
$P(m_t, x_t \mid o_{1:t})$
A SLAM system can be divided into two parts: a frontend and a backend. Together they perform the following steps.
Data association. In this step, the main features and landmarks are extracted from the sensor data. Features are characteristics that can help the robot navigate in space later. It is important that they are static. Examples of features are corners and straight lines.
Local Motion Estimation. By comparing the features observed at the current moment with the features saved in the robot’s memory, the robot can determine its offset on the map. Based on this information, the camera coordinates can be expressed through a system of linear equations.
Features Integration. The structure that stores the location history is updated; each state in it represents the robot’s global position and the relative positions of the surveyed features over a certain period of time.
Comparison of local shift data and features. In this step, the robot has two different shift estimates, caused by disturbances during the scanning of the environment and by inaccuracies in the algorithms. The robot should store this difference and take it into account in further calculations.
Global Optimization. Based on the newly obtained data, the current view of the map is refined. The state estimate, the features, and their probabilities are recalculated.
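The local motion estimation step above can be illustrated with a very small sketch. It assumes that the robot only translates (no rotation), that features are already matched by an identifier, and that their positions are measured in the robot's own frame. Under these assumptions the least-squares solution of the linear system is simply the average shift of the matched features. The function name `estimate_offset` and the dictionary layout are hypothetical.

```python
def estimate_offset(previous, current):
    """Estimate the robot's translation from matched static features.

    `previous` and `current` map a feature id to its (x, y) position as
    measured in the robot frame before and after the move.
    """
    shared = sorted(set(previous) & set(current))
    if not shared:
        raise ValueError("no matched features")
    # A static feature appears to move opposite to the robot, so the
    # robot's shift is the previous position minus the current one.
    dx = sum(previous[k][0] - current[k][0] for k in shared) / len(shared)
    dy = sum(previous[k][1] - current[k][1] for k in shared) / len(shared)
    return dx, dy
```

Features seen in only one of the two frames are ignored here; averaging over many matches is what reduces the effect of noise in any single measurement.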
There are several SLAM methods, including:
1. EKF SLAM
2. FastSLAM 2.0
6. Parallel Tracking and Mapping
EKF (Extended Kalman Filter) SLAM is one of the most popular methods. Its working process follows the steps described above.
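To give a feel for the predict-and-correct cycle behind the filter, here is a minimal one-dimensional sketch. It is a deliberate simplification: the motion and measurement models are assumed to be linear, in which case the extended filter reduces to the ordinary Kalman filter, and only the robot's position is tracked, not the landmarks. The function and parameter names are assumptions.

```python
def kalman_step(mean, variance, motion, motion_noise,
                measurement, measurement_noise):
    """One predict/update cycle for a robot moving along a line."""
    # Predict: apply the motion command; uncertainty grows.
    mean = mean + motion
    variance = variance + motion_noise
    # Update: blend in the measurement according to the Kalman gain.
    gain = variance / (variance + measurement_noise)
    mean = mean + gain * (measurement - mean)
    variance = (1.0 - gain) * variance
    return mean, variance
```

A full EKF SLAM implementation keeps the landmark positions in the same state vector as the robot pose and uses Jacobians of the nonlinear models in place of the scalar terms above.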
Path planning is an important primitive for autonomous robots: it lets a robot find the optimal path between two points.
Many different algorithms solve this problem.
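The text does not single out one algorithm, so as an example here is a sketch of A*, one common choice, on a small grid such as a game level. The grid encoding (0 for a free cell, 1 for a wall), the four-connected movement, and the function name `a_star` are assumptions made for this illustration.

```python
import heapq


def a_star(grid, start, goal):
    """Return the shortest path as a list of (row, col), or None."""

    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # Stale queue entry; a cheaper route was found.
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0:
                new_cost = g + 1
                if new_cost < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    priority = new_cost + heuristic((nr, nc))
                    heapq.heappush(frontier, (priority, new_cost, (nr, nc)))
    return None
```

In a game, the grid would typically be derived from the level geometry, and the returned list of cells would be handed to the character's movement code.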
DECISION MAKING IN FIRST-PERSON SHOOTER
Many people think that AI in videogames is based on machine learning, and that the system gets stronger and stronger as it is fed more data. Some experimental games do try to build AI with learning capability, but in practice that is not what most players want.
In the modern game industry, one of the most widely used techniques is the FSM (Finite State Machine). In an FSM, a designer enumerates all the situations that an AI could encounter and then programs a specific reaction for each one. For example, in shooter games, the AI attacks the human player as soon as the player appears in its line of sight.
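A finite state machine of this kind can be sketched as a table of transitions. The states and events below (`patrol`, `attack`, `search`, `flee`, and so on) are invented for the illustration and are not taken from any particular game.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("patrol", "enemy_in_sight"): "attack",
    ("attack", "enemy_lost"): "search",
    ("attack", "low_health"): "flee",
    ("search", "enemy_in_sight"): "attack",
    ("search", "timeout"): "patrol",
    ("flee", "healed"): "patrol",
}


class EnemyAI:
    """A bot whose behavior is fully determined by its current state."""

    def __init__(self, state="patrol"):
        self.state = state

    def handle(self, event):
        """Move to the next state; unknown events leave it unchanged."""
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
```

Keeping the transitions in a plain table makes the behavior easy for a designer to read and extend, which is one reason this approach is predictable for players.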
There is another, more advanced method that is usually used to enhance the personalized game experience. It is called MCTS (Monte Carlo Tree Search). The idea of this algorithm is to use random trials to solve a problem. For each point in the game, the AI first considers all the possible moves of the human player, then all the possible actions that can be taken in response, and so on. After many repetitions, the AI calculates the payoff of each branch and then decides which branch is best to follow.
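To make the idea concrete, here is a compact sketch of Monte Carlo Tree Search with the common UCB1 selection rule. A shooter is too complex for a short example, so the sketch uses a toy game chosen purely for illustration: two players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The class and function names, the iteration count, and the exploration constant are all assumptions.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left when this node is reached
        self.parent = parent
        self.move = move          # move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who made `move`


def rollout(stones, rng):
    """Play random moves; return True if the player to move first wins."""
    first_to_move = True
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return first_to_move
        first_to_move = not first_to_move


def mcts_best_move(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: follow UCB1 while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda c: c.wins / c.visits
                       + math.sqrt(2 * math.log(node.visits) / c.visits))
        # 2. Expansion: add one child for a move not tried yet.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        if node.stones == 0:
            mover_won = True      # the mover took the last stone
        else:
            mover_won = not rollout(node.stones, rng)
        # 4. Backpropagation: the winner alternates at every level.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_won else 0.0
            mover_won = not mover_won
            node = node.parent
    # The most visited move is the most robust choice.
    return max(root.children, key=lambda c: c.visits).move
```

Calling `mcts_best_move(5)` returns the number of stones the search recommends taking from a pile of five. In a real game the playouts would be replaced by simulations of the actual game rules, and the iteration budget would be tuned to the time available per decision.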