Let me start with an example. Suppose you have built an ML model using supervised learning that can detect the faces of your family members. You simply feed all the family photos to your model, and at the end you would like to know how much your model learned from the training process. After training, you end up with a function that maps an input (image) to output labels (family members). If we represent your ML model with a function 'f', we can write this process as:

  y = f(x),
where x = input  (your family photos)
      y = output (model prediction: whether the photo is your father, mother or sister)



Suppose you use 50 images for your training process and then use the same images to see whether the model can predict the labels correctly or not. If you find that 10 out of 50 images are misclassified, you can say the training error is 10/50 = 20%. If you think about the intuition behind this calculation, it is saying that 'the model is not doing well even on the training data itself'. That means it underfits the data: it fails to capture the information available in your data.
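The arithmetic above can be sketched in a couple of lines (the counts are the ones from the example; the helper name training_error is made up for illustration):

```python
def training_error(num_misclassified, num_samples):
    """Fraction of training samples the model gets wrong."""
    return num_misclassified / num_samples

# 10 misclassified photos out of 50 training photos
error = training_error(10, 50)
print('training error: %.0f%%' % (error * 100))  # training error: 20%
```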

*NOTE: If your model 'f' is a linear model, then it will be really hard for it to classify the photos, since a linear function is too simple for this task.

Sijan Bhandari on
# -*- coding: utf-8 -*-
# @Author: Sijan
# @Date:   2019-04-01

from random import randint


def step_function(result):
    """
    Simple step function which activates (returns 1) if the value is greater than 0.
    """
    if result > 0:
        return 1
    return 0


class Perceptron:
    """
    Perceptron class defines a neuron with attributes: weights, bias and learning rate.
    """

    def __init__(self, input_size):
        self.learning_rate = 0.5
        self.bias = randint(0, 1)
        self.weights = [randint(0, 1) for _ in range(input_size)]


def feedforward(perceptron, node_input):
    """
    Implements the product between input and weights (plus bias),
    passed through the step function.
    """
    node_sum = perceptron.bias
    for index, item in enumerate(node_input):
        node_sum += item * perceptron.weights[index]
    return step_function(node_sum)


def adjust(perceptron, node_input, error):
    """
    Adjusts weights and bias based on error. It simply scales input values
    in the right direction.
    """
    for index, item in enumerate(node_input):
        perceptron.weights[index] += item * error * perceptron.learning_rate
    perceptron.bias += error * perceptron.learning_rate


def train(perceptron, inputs, outputs):
    """
    Trains perceptron for given inputs.
    """
    for training_input, training_output in zip(inputs, outputs):
        actual_output = feedforward(perceptron, training_input)
        error = training_output - actual_output
        adjust(perceptron, training_input, error)


def predict(perceptron, test_input, test_output):
    """
    Predicts new inputs.
    """
    prediction = feedforward(perceptron, test_input)
    print('input :%s gives output :%s' % (test_input, prediction))
    print('input :%s has true output :%s' % (test_input, test_output))


if __name__ == '__main__':

    train_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    train_outputs = [0, 0, 0, 1]

    # train perceptron
    perceptron = Perceptron(2)
    epochs = 10

    for _ in range(epochs):
        train(perceptron, train_inputs, train_outputs)

    # test perceptron
    test_input = (1, 1)
    test_output = 1
    print('...................................')
    print('...................................')
    predict(perceptron, test_input, test_output)

...................................
...................................
input :(1, 1) gives output :1
input :(1, 1) has true output :1




Randomness is not a property of a phenomenon itself. It is simply the unpredictability of the occurrence of events around you, and it shows up in many scenarios of our lives. For example, while roaming around the street, you find a coin; now you would certainly look for another coin around that spot. But there is no certainty, and no pattern, in finding one. Other examples are tossing a coin or a die, and fluctuating market prices for common goods.

In the field of mathematics and probability, we assign a numerical value to each of these random outcomes, i.e. we use probability to quantify randomness. And the probability of a certain event can be estimated by the relative frequency of that event in an experiment.

In probability, the current occurrence/selection you make in your experiment is an event. For example, the result of flipping a coin is an event, and the act of tossing the coin is called an independent trial. If you perform a number of trials, it is called an experiment, and all the possible outcomes of an experiment form the sample space. So we can say that an event is a subset of the sample space.

Another example: suppose you need to choose a point from the interval (10, 100). Your selection E = (12, 34) is an event.
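The idea of probability as relative frequency can be sketched with a short simulation (a minimal sketch using Python's random module; the fairness of the simulated coin and the number of trials are assumptions made for illustration):

```python
from random import random, seed

seed(42)  # make the experiment repeatable

trials = 10000
# the event 'heads' occurs when the uniform draw falls below 0.5
heads = sum(1 for _ in range(trials) if random() < 0.5)

# the relative frequency of 'heads' approximates its probability (0.5)
print('relative frequency of heads:', heads / trials)
```

As the number of trials grows, the relative frequency settles closer and closer to the true probability.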


In this post, we are going to devise a measurement tool (perceptron model) in order to classify whether a person is infected by a disease or not.

In binary terms, the output will be

       output = {
           1   if infected
           0   if not infected
       }



To build inputs for our neural network, we take readings from the patients and treat the readings as follows:

  body temperature = {
      1    if body temperature > 99°F
     -1    if body temperature <= 99°F
  }

  heart rate = {
      1    if heart rate is outside the 60 to 100 range
     -1    if heart rate is within the 60 to 100 range
  }

  blood pressure = {
      1    if blood pressure > 120/80
     -1    if blood pressure <= 120/80
  }



So, the input from each patient will be represented as a three-dimensional vector:

  input = (body temperature, heart rate, blood pressure)


So, a person can now be represented as:

  (1, -1, 1)
  i.e. (body temperature > 99°F, heart rate within 60 to 100, blood pressure > 120/80)

Let us create two inputs with desired output values:

      x1 = (1, 1, 1),    d1 = 1 (infected)
      x2 = (-1, -1, -1), d2 = 0 (not infected)


Let us take initial values for the weights and bias: weights, w0 = (-1, 0.5, 0); bias, b0 = 0

And, the activation function:

         A(S) = {
             1   if S >= 0
             0   otherwise
         }

##### STEP 1

Feed x1 = (1, 1, 1) into the network.

weighted_sum:

S = w0 * x1^T + b0
  = (-1, 0.5, 0) * (1, 1, 1)^T + 0
  = -1 + 0.5 + 0 + 0
  = -0.5



When passed through the activation function, A(-0.5) = 0 = y1. We passed an infected input vector, but our perceptron classified it as not infected. Let's calculate the error term:

             e = d1 - y1 = 1 - 0 = 1


Update weight as:

             w1 = w0 + e * x1 = (-1, 0.5, 0) + 1 * (1, 1, 1) = (0, 1.5, 1)


And, update bias as:

             b1 = b0 + e = 0 + 1 = 1
##### STEP 2

Now, we feed second input (-1, -1, -1) into our network.

weighted_sum :

S = w1 * x2^T + b1
  = (0, 1.5, 1) * (-1, -1, -1)^T + 1
  = 0 - 1.5 - 1 + 1
  = -1.5


When passed through the activation function, A(-1.5) = 0 = y2. We passed a not-infected input vector, and our perceptron successfully classified it as not infected.

##### STEP 3

Since our first input was misclassified in Step 1, we feed it through the network again with the updated weights.

weighted_sum :

S = w1 * x1^T + b1
  = (0, 1.5, 1) * (1, 1, 1)^T + 1
  = 0 + 1.5 + 1 + 1
  = 3.5


When passed through the activation function, A(3.5) = 1 = y3. We passed an infected input vector, and our perceptron successfully classified it as infected.

Here, both input vectors are correctly classified, i.e. the algorithm has converged to a solution.
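The three steps above can be replayed in code (a minimal sketch; the weights, bias and update rule are exactly the ones used in this worked example):

```python
def activation(s):
    # A(S) = 1 if S >= 0, else 0
    return 1 if s >= 0 else 0

def step(weights, bias, x, d):
    # one perceptron step: classify x, then update on the error e = d - y
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    y = activation(s)
    e = d - y
    weights = [w + e * xi for w, xi in zip(weights, x)]
    bias = bias + e
    return weights, bias, y

w, b = [-1, 0.5, 0], 0
x1, d1 = (1, 1, 1), 1
x2, d2 = (-1, -1, -1), 0

w, b, y1 = step(w, b, x1, d1)   # STEP 1: y1 = 0 (misclassified), w -> [0, 1.5, 1], b -> 1
w, b, y2 = step(w, b, x2, d2)   # STEP 2: y2 = 0 (correct, no update)
w, b, y3 = step(w, b, x1, d1)   # STEP 3: y3 = 1 (correct, no update)

print(w, b)  # [0, 1.5, 1] 1
```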




A perceptron is simply an artificial neuron capable of solving linear classification problems. It is made up of a single-layer feed-forward neural network.

A perceptron only takes binary input values and signals a binary output for decision making. The output decision (either 0 or 1) is based on the value of the weighted sum of inputs and weights.

Mathematically perceptron can be defined as :

output O(n) = {
    0   if ∑wixi + $\theta$ <= 0
    1   if ∑wixi + $\theta$ > 0
}
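This definition translates directly into code (a minimal sketch; the weights and threshold below are made-up values for illustration):

```python
def perceptron_output(weights, inputs, theta):
    """O(n) = 1 if sum(w_i * x_i) + theta > 0, else 0."""
    s = sum(w * x for w, x in zip(weights, inputs)) + theta
    return 1 if s > 0 else 0

# made-up weights and threshold for illustration
print(perceptron_output([0.5, 0.5], [1, 1], -0.7))  # 1
print(perceptron_output([0.5, 0.5], [1, 0], -0.7))  # 0
```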

$\theta$ = threshold / bias

Deep learning, in a simpler sense, is a learning mechanism for neural networks. And neural networks are computational models mimicking the human nervous system, capable of learning. Like the interconnected neurons in the human brain, a neural network is also connected by different nodes. It receives signals as a set of inputs, performs calculations, and signals an output based on some activation value. Here is a list of problems that deep learning can solve:

1. Classification : object and speech recognition, classifying sentiment from text
2. Clustering : fraud detection