CHAPTER 2

LITERATURE REVIEW

2.1 Introduction to Prediction

Prediction is a scientific process for obtaining knowledge systematically from physical data. The more accurate the data and the method used, the better the prediction result that can be obtained. Prediction itself is never 100% accurate, and some predictions fail to meet expectations.

According to Herdianto (2013), prediction is a systematic process of estimating what is most likely to happen in the future, based on past and present information, so that the error (the difference between what actually occurs and the predicted result) can be minimized. Prediction does not have to deliver the exact answer to events that will happen; rather, it strives to find an answer as close as possible to what will happen.

Nowadays, traffic congestion in big cities has become a major problem. It can occur at almost any place and at any time, but congestion usually follows a pattern that can be learned, based on the time or the place. Traffic congestion can therefore be predicted, to analyze which places are likely to become congested.

2.2 Introduction to Artificial Neural Network (ANN)

An Artificial Neural Network (ANN) is a group of small processing units connected in a network, modeled on the human nervous system. An ANN is an adaptive system that can change its structure based on the external and internal information flowing through the network in order to solve a problem.

According to Ramadhani (2016), an ANN is a network architecture modeled on the workings of the human nervous system (the brain) when it is carrying out specific tasks. The model imitates the brain's ability to organize its constituent cells (neurons) to carry out certain tasks, which gives the network a very high effectiveness in pattern recognition.

The structure of an Artificial Neural Network (ANN) is parallel, and the network is able to adapt and learn so that it gives the expected output correctly even for inputs it has not been trained on. This ability to learn and produce correct results makes an ANN independent.

An Artificial Neural Network (ANN) is characterized by three things:

1. The structure of the connections between neurons (network architecture)

2. The method for determining the connection weights (training or learning algorithm)

3. The activation function

Figure 2.1 An example of a feedforward network with multiple layers, Da Silva (2016)

As the figure above shows, an Artificial Neural Network (ANN) has three layers:

1. Input layer

This layer receives information (data) from the external environment. The inputs are usually normalized within the limit values of the activation functions, which gives better numerical precision for the mathematical operations performed by the network.

2. Hidden layer

This layer is responsible for extracting the patterns associated with the process or system being analyzed.

3. Output layer

This layer produces and presents the final network outputs, which result from the processing performed by the previous layers.
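The three-layer structure described above can be sketched as a minimal forward pass. This is an illustrative sketch only, not the implementation used in this study; the layer sizes and weight values below are arbitrary assumptions.

```python
import math

def sigmoid(x):
    # Binary sigmoid squashes each value into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, v, w):
    """One forward pass through input -> hidden -> output.
    v: hidden-layer weights, w: output-layer weights (biases omitted for brevity)."""
    hidden = [sigmoid(sum(xi * vij for xi, vij in zip(x, row))) for row in v]
    output = [sigmoid(sum(hj * wjk for hj, wjk in zip(hidden, row))) for row in w]
    return output

# Toy example: 2 inputs, 2 hidden units, 1 output unit
x = [0.5, 0.8]
v = [[0.1, 0.4], [-0.3, 0.2]]   # one row per hidden unit
w = [[0.2, -0.5]]               # one row per output unit
print(forward(x, v, w))
```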

2.2.1 Backpropagation

According to Werbos (1990), backpropagation is the most widely used tool in the field of artificial neural networks. At its core, backpropagation is a method for calculating derivatives exactly and efficiently in any large system made up of elementary subsystems or calculations that are represented by known, differentiable functions; thus, backpropagation has many applications that do not involve neural networks as such.

An example of a backpropagation network architecture can be seen in Figure 2.2.

Figure 2.2 General structure of backpropagation, Gunawan (2009)

2.2.1.1 Backpropagation Algorithm

Three different Artificial Neural Network (ANN) training algorithms are used in the present study: Levenberg-Marquardt, conjugate gradient, and resilient backpropagation. This is done to see which algorithm produces better results and trains faster for the application under construction (Kisi, 2005).

1. Training Algorithm

The aim of a training algorithm is to reduce the global error E, which can be defined as:

E = (1/P) Σp Ep (1.1)

where P is the total number of training patterns and Ep is the error for training pattern p, given by:

Ep = (1/2) Σi (oi - ti)^2 (1.2)

where N is the number of output nodes (the sum runs over i = 1, ..., N), oi is the network output at node i, and ti is the target output at node i. In the training algorithm, each iteration attempts to reduce this global error by adjusting the weights and biases.
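Equations (1.1) and (1.2) can be checked with a short sketch. The output and target values below are made-up toy numbers, not data from this study.

```python
def pattern_error(outputs, targets):
    # Ep = 1/2 * sum over output nodes of (oi - ti)^2   (Eq. 1.2)
    return 0.5 * sum((o - t) ** 2 for o, t in zip(outputs, targets))

def global_error(all_outputs, all_targets):
    # E = (1/P) * sum over training patterns of Ep      (Eq. 1.1)
    P = len(all_outputs)
    return sum(pattern_error(o, t)
               for o, t in zip(all_outputs, all_targets)) / P

# Two toy training patterns with one output node each
outputs = [[0.9], [0.2]]
targets = [[1.0], [0.0]]
print(global_error(outputs, targets))  # -> 0.0125
```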

2. Levenberg-Marquardt Algorithm

The Levenberg-Marquardt algorithm is designed to approach second-order training speed without computing the Hessian matrix. When the performance function has the form of a sum of squares, the Hessian matrix can be approximated as:

H = J^T J (1.3)

and the gradient can be computed as:

g = J^T e (1.4)

where J is the Jacobian matrix, which contains the first derivatives of the network errors with respect to the weights and biases, and e is the vector of network errors. The Levenberg-Marquardt algorithm uses this approximation of the Hessian matrix in a Newton-like update:

w(k+1) = w(k) - (J^T J + µI)^-1 J^T e (1.5)

When µ is large, this becomes gradient descent with a small step size. µ is decreased after each successful step and increased only when a tentative step would increase the performance function. As a result, the Levenberg-Marquardt algorithm is generally faster than conventional gradient descent techniques.
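The update in equation (1.5) can be sketched for a single parameter, where the Jacobian reduces to a scalar derivative. This is a hypothetical one-dimensional illustration with made-up values, not the multi-dimensional implementation, and the fixed µ below ignores the adaptive µ schedule described above.

```python
def lm_step(w, jac, err, mu):
    """One Levenberg-Marquardt update for a single weight.
    H is approximated as J^T J (Eq. 1.3) and g as J^T e (Eq. 1.4);
    the step is w - (J^T J + mu)^-1 * J^T e (Eq. 1.5)."""
    h = jac * jac            # scalar Hessian approximation
    g = jac * err            # scalar gradient
    return w - g / (h + mu)

# Fitting y = w * x to the single point (x, y) = (2.0, 4.0), starting from w = 1.0
w, x, y = 1.0, 2.0, 4.0
for _ in range(20):
    err = w * x - y          # residual e
    jac = x                  # d(err)/dw
    w = lm_step(w, jac, err, mu=1.0)
print(round(w, 4))           # converges toward w = 2.0
```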

3. Conjugate Gradient Algorithm

The conjugate gradient algorithm performs its search along conjugate directions, which generally gives faster convergence than the steepest descent direction. The algorithm begins by searching in the steepest descent direction on the first iteration; a line search is then used to determine the optimal distance to move along the current search direction. The next search direction is then determined so that it is conjugate to the previous search directions: the new search direction combines the new steepest descent direction with the previous search direction. The various versions of the conjugate gradient algorithm differ in how the constant βk in this combination is computed.

4. Resilient Backpropagation Algorithm

The main purpose of the resilient backpropagation algorithm is to eliminate the harmful effect of the magnitude of the partial derivatives. Only the sign of the derivative is used to determine the direction of the weight update; the magnitude of the derivative has no effect on the size of the weight update. The size of the weight change is determined by a separate update value. The update value for each weight and bias is increased by a factor whenever the derivative of the performance function with respect to that weight has the same sign for two successive iterations, and decreased by a factor whenever the derivative changes sign from the previous iteration. If the derivative is zero, the update value remains the same.
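The sign-based rule described above can be sketched for a single weight. This is a simplified illustration; the increase/decrease factors (1.2 and 0.5) are values commonly used for resilient backpropagation and are assumptions here, not parameters taken from this study.

```python
def rprop_update(weight, delta, grad, prev_grad,
                 eta_plus=1.2, eta_minus=0.5):
    """One resilient-backpropagation step for a single weight.
    Only the sign of the gradient is used; `delta` is the separate
    update value that sets the step size."""
    if grad * prev_grad > 0:       # same sign for two successive iterations
        delta *= eta_plus          # grow the update value
    elif grad * prev_grad < 0:     # derivative changed sign
        delta *= eta_minus         # shrink the update value
    # if the derivative is zero, delta stays the same
    if grad > 0:
        weight -= delta            # move against the gradient
    elif grad < 0:
        weight += delta
    return weight, delta

w, d = rprop_update(0.5, 0.1, grad=0.3, prev_grad=0.2)
print(w, d)  # the step size grew and the weight moved down
```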

2.2.1.2 Activation Function

There are several choices of activation function that can be used in the backpropagation method, such as the binary sigmoid function, the bipolar sigmoid function, and the hyperbolic tangent. An activation function for backpropagation must be continuous, easily differentiable, and non-decreasing (Nyura, 2016). Some of the activation functions usually used in the backpropagation method are:

1. Binary Sigmoid Function

The binary sigmoid function is used for neural networks trained with the backpropagation method. It has a range of (0, 1), so it is usually used for networks whose output values lie in the interval from 0 to 1. The binary sigmoid function can be stated as:

f(x) = 1 / (1 + e^-x) (1.6)

with derivative:

f'(x) = f(x) (1 - f(x)) (1.7)

2. Bipolar Sigmoid Function

The bipolar sigmoid function is similar to the binary sigmoid function, but its output range is (-1, 1).

3. Linear Function

For the linear function, the output of the function is the same as its input.
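The three activation functions above can be written as a short sketch, following equations (1.6) and (1.7) for the binary sigmoid and the textual descriptions for the other two:

```python
import math

def binary_sigmoid(x):
    # f(x) = 1 / (1 + e^-x), range (0, 1)        (Eq. 1.6)
    return 1.0 / (1.0 + math.exp(-x))

def binary_sigmoid_derivative(x):
    # f'(x) = f(x) * (1 - f(x))                  (Eq. 1.7)
    fx = binary_sigmoid(x)
    return fx * (1.0 - fx)

def bipolar_sigmoid(x):
    # Similar shape to the binary sigmoid, but range (-1, 1)
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def linear(x):
    # Output equals input
    return x

print(binary_sigmoid(0.0))   # -> 0.5
print(bipolar_sigmoid(0.0))  # -> 0.0
print(linear(3.0))           # -> 3.0
```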

2.2.2 Backpropagation Training Algorithm

The steps of the standard backpropagation algorithm are as follows:

2.2.2.1 Training Algorithm

A. Initialization of the system

1. Initialize the weights, and set Epoch = 1 and MSE = 1.

2. Decide the maximum number of epochs, the learning rate (α), and the error target.

3. Perform step 4 to step 12 as long as Epoch is less than the maximum epoch and MSE is greater than the error target.

4. Epoch = Epoch + 1

B. Feed Forward

5. Each input unit xi sends its input signal to all units in the hidden layer.

6. Each unit zj in the hidden layer sums its weighted input signals xi:

z_in_j = v0j + Σi xi vij (1.8)

The hidden layer signal is then calculated by applying the activation function:

zj = f(z_in_j) (1.9)

and this signal is sent to the output layer.

7. Each unit yk in the output layer likewise sums its weighted input signals zj:

y_in_k = w0k + Σj zj wjk (1.10)

and then applies the activation function to calculate its output signal:

yk = f(y_in_k) (1.11)

C. Backpropagation of Error

8. Each output unit yk receives a target tk corresponding to the input training pattern. The error information term in the output layer is then calculated:

δk = (tk - yk) f'(y_in_k) (1.12)

δk is sent back to the hidden layer and is used to calculate the weight and bias corrections between the hidden and output layers:

Δwjk = α δk zj (1.13)

Δw0k = α δk (1.14)

9. Each hidden layer unit zj sums the error information coming from the units in the output layer:

δj = f'(z_in_j) Σk δk wjk (1.15)

δj is then used to calculate the weight and bias corrections between the input and hidden layers:

Δvij = α δj xi (1.16)

Δv0j = α δj (1.17)

D. Updating the connection weights and biases

10. Each output unit yk updates its bias and weights, so that the old bias and weights become the new bias and weights:

wjk(new) = wjk(old) + Δwjk (1.18)

11. Likewise, each of the p units in the hidden layer updates its bias and weights:

vij(new) = vij(old) + Δvij (1.19)

12. Once this is complete, the stopping condition is tested.
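Steps 1-12 above can be combined into a compact sketch for a network with one hidden layer. The network size, learning rate, and the toy OR dataset below are arbitrary assumptions for illustration; a real study would use its own data, sizes, and stopping criteria.

```python
import math, random

random.seed(1)

def f(x):
    # Binary sigmoid activation (Eq. 1.6)
    return 1.0 / (1.0 + math.exp(-x))

def train(data, n_in, n_hid, max_epoch=2000, alpha=0.5, target_err=1e-3):
    # Steps 1-2: initialize weights (bias stored in slot 0), epoch counter, MSE
    v = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
    w = [random.uniform(-0.5, 0.5) for _ in range(n_hid + 1)]
    mse = 1.0
    # Steps 3-4: repeat while epoch < max epoch and MSE > error target
    for epoch in range(max_epoch):
        if mse <= target_err:
            break
        sq_err = 0.0
        for x, t in data:
            # Steps 5-7: feedforward (Eqs. 1.8-1.11)
            z_in = [vj[0] + sum(xi * vij for xi, vij in zip(x, vj[1:])) for vj in v]
            z = [f(s) for s in z_in]
            y = f(w[0] + sum(zj * wj for zj, wj in zip(z, w[1:])))
            # Step 8: output error term (Eq. 1.12), using f'(y_in) = y(1 - y)
            delta_k = (t - y) * y * (1.0 - y)
            # Step 9: hidden error terms (Eq. 1.15)
            delta_j = [zj * (1.0 - zj) * delta_k * wj for zj, wj in zip(z, w[1:])]
            # Steps 10-11: weight and bias updates (Eqs. 1.13-1.14, 1.16-1.19)
            w[0] += alpha * delta_k
            for j in range(n_hid):
                w[j + 1] += alpha * delta_k * z[j]
                v[j][0] += alpha * delta_j[j]
                for i in range(n_in):
                    v[j][i + 1] += alpha * delta_j[j] * x[i]
            sq_err += (t - y) ** 2
        mse = sq_err / len(data)
    # Step 12: stopping condition reached
    return v, w, mse

# Toy dataset: the logical OR function (a simple, linearly separable target)
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
v, w, mse = train(data, n_in=2, n_hid=2)
print(mse)  # final mean squared error after training
```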

2.2.3 Normalization and Denormalization of Data

Data normalization is the process of converting the original data into data whose values range from 0 to 1. The goal of normalizing the data in an Artificial Neural Network (ANN) is to make the pattern of the data easier to recognize. The formula used for data normalization is:

x' = (x - a) / (b - a) (1.20)

where:

x' = normalized data

x = input data

a = smallest value in the input data

b = largest value in the input data

After the training and testing process, the output values from the Artificial Neural Network (ANN) need to be returned to their original scale. This process is known as data denormalization. The formula used for data denormalization is:

x = x'(b - a) + a (1.21)

where:

x = the original value

x' = the normalized (test) value

a = smallest value of all data

b = largest value of all data
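Equations (1.20) and (1.21) form a pair of inverse mappings and can be sketched as follows; the sample data values are arbitrary.

```python
def normalize(x, a, b):
    # x' = (x - a) / (b - a)   (Eq. 1.20), maps [a, b] onto [0, 1]
    return (x - a) / (b - a)

def denormalize(x_norm, a, b):
    # x = x'(b - a) + a        (Eq. 1.21), the inverse of Eq. 1.20
    return x_norm * (b - a) + a

data = [10.0, 25.0, 40.0]
a, b = min(data), max(data)
scaled = [normalize(x, a, b) for x in data]
print(scaled)                                  # -> [0.0, 0.5, 1.0]
print([denormalize(s, a, b) for s in scaled])  # -> [10.0, 25.0, 40.0]
```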

2.3 Previous Research About Traffic Congestion

Nowadays, many kinds of methods have been used to predict traffic congestion in big cities. One method that has been used for traffic congestion prediction is the backpropagation method from Artificial Neural Network (ANN). Research by Xiaojian (2009) used backpropagation for traffic flow, building a model of a crossroad/intersection to maximize the flow of crossroad traffic and minimize the waiting time. The outcome showed that the ANN is stable after 300 training cycles, with no clear further change in accuracy. However, the training time grows with the training index; with a prediction accuracy of 85%, the model can help traffic flow and optimize waiting time.

Another study on traffic congestion was done by More (2016), using a Jordan neural network to predict and control traffic congestion. That research demonstrates the significance of the Jordan sequential network for predicting future values from the current value and aggregated past values, and reports traffic flow prediction accuracy of about 92-98%. The results are quite satisfying; from the analysis over the different parameters of the Jordan neural network, certain points are concluded:

1. Accuracy is maximal at a learning rate of 0.5.

2. The structure of the neural network determines the output; the hidden layer should contain the square of the number of input neurons.

3. Using the error as the stopping criterion, instead of a fixed number of iterations, gives almost 98% accuracy on most of the datasets.

Thus, the overall objective of accurate road traffic prediction is met using the Jordan neural network with a proper structure.

2.4 Summary

This chapter has briefly covered the prediction of traffic congestion and Artificial Neural Networks. For a problem like traffic congestion, a prediction application is able to give good estimates of congestion with highly efficient results; both backpropagation and the Jordan neural network meet this target and satisfy the analyses over the various parameters. The chapter has also examined and explained the basic concepts of the Artificial Neural Network (ANN), with backpropagation as the method being used: concepts such as the ANN architecture and the activation functions were explained, and data normalization and denormalization were shown to make the pattern of the data easier to recognize. The decisions made in this chapter will be used to plan the development of the system based on the methodology that will be discussed in the next chapter.