A group of researchers[1] at the University of Tübingen, Germany, developed an algorithm that can morph an image to resemble a painting in the style of the great masters. A couple of months back, an app developed by Alexey Moiseenkov and his team took the world by storm, with over 7.5 million downloads in one month. This interesting facet of machine learning has been made possible by neural networks or, more precisely, convolutional neural networks. The applications of deep learning are being refined with every passing day, which means that machines can now not only read addresses in a post office or handwritten cheques in banks, but also paint pictures like the great masters or read content aloud to help blind people use Facebook.

This is fascinating, all the more so because functions like vision and pattern recognition, whose complexity was hitherto considered beyond the purview of artificial intelligence, are being peeled open, quite literally layer by layer, to demonstrate that machines can master them. The very notion that differentiates the inanimate from the living, our sensory perception of stimuli, is being challenged. Having said that, we are still far from creating a machine that can emulate the eye or the brain, but the progress being made is stupendous.

So, how exactly do machines read handwritten digits, or recognise a style of painting and apply it to a given picture? Our best shot at achieving any of this lies in understanding how humans would do it, although not all humans are gifted enough to imitate a Picasso or a Da Vinci.

To provide a very basic level of understanding, I will use the analogy used in a tutorial from www.neuralnetworksanddeeplearning.com. Simulating, mathematically or logically through programming, what the 140 million neurons of the primary visual cortex, coupled with millions of ganglion cells and tens of billions of connections between them, actually do is by no means a mean feat. This, nevertheless, is not our endeavour, because even writing code to recognise a set of handwritten digits poses problems of its own, such as separating adjacent digits, understanding the strokes, and then predicting the number, all of which complicate the process. Our best bet, hence, is to simulate our own thinking process and help the machine learn on its own: provide it with a huge training set of tens of thousands of samples, and let it get closer to reading the numbers simply by minimising the error function. To put this in perspective, let us understand two types of artificial neurons: the perceptron and the sigmoid neuron.

To simplify it, let us assume that we have to decide whether or not to go and watch a movie this Saturday, which is a binary YES/NO response. The factors we are willing to consider are: the proximity of the theatre to our home, the weather, whether our partner is willing to accompany us, and the show timing. Let us now assume that each condition is itself binary: if the theatre is less than 10 km away, I will go, else I won’t; if my partner accompanies me, I will go, else I won’t; if the show is between 4 and 8 PM, I will go, else I won’t; if it rains, I won’t go, else I will. Now suppose I attribute a separate weight to each input. For instance, if the theatre is less than 10 km away I may be willing to go no matter what the other conditions are, or if the show is not between 4 and 8 PM I most likely won’t go even if all the other conditions are satisfied. The decision then becomes a binary function of the inputs: if the weighted sum of the inputs (each input multiplied by its weight, then added together) is above a certain threshold value, I will go, else I won’t. This is how a simple perceptron works.
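The movie decision above can be sketched in a few lines of Python. This is only an illustration: the weights and the threshold are hypothetical numbers picked so that the show timing dominates the decision, and they do not come from the tutorial.

```python
def perceptron(inputs, weights, threshold):
    """Return 1 (go) if the weighted sum of the binary inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0


# Input order: [theatre within 10 km, partner comes along, show between 4 and 8 PM, no rain]
# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = [2, 2, 6, 1]
THRESHOLD = 7

# Everything is favourable except the timing: 2 + 2 + 0 + 1 = 5, which is below the threshold.
everything_but_timing = perceptron([1, 1, 0, 1], WEIGHTS, THRESHOLD)

# Theatre is close and the timing is right: 2 + 0 + 6 + 0 = 8, which is above the threshold.
close_and_on_time = perceptron([1, 0, 1, 0], WEIGHTS, THRESHOLD)

print(everything_but_timing, close_and_on_time)
```

Changing a weight or the threshold changes which combinations of conditions lead to a YES, which is exactly the knob a learning algorithm turns.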

Now twist this scenario a bit, so that each condition can take any continuous value between 0 and 1. For example, if the distance is 1 km, the likelihood of your going is 0.9; it is 0.8 if the distance is 2 km, and it keeps reducing until it is 0.1 at 20 km. The input is now a continuous variable, and this is, loosely, the idea behind the sigmoid neuron, whose output is a smooth function of its inputs rather than a hard YES/NO. Because the curve is smooth, a small change in an input or a weight results in a small change in the output, the advantage of which is that we can adjust the weights little by little until the output is as close as possible to the desired one.
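A rough sketch of a sigmoid neuron in Python shows the "small change in, small change out" property. The inputs, weights and threshold below are hypothetical values for the movie example, and the threshold is subtracted from the weighted sum in place of the bias term used in the tutorial.

```python
import math


def sigmoid(z):
    """Smooth squashing function that maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_neuron(inputs, weights, threshold):
    """Output a value between 0 and 1 instead of a hard YES/NO."""
    z = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return sigmoid(z)


# Hypothetical continuous inputs in [0, 1]: [proximity, partner, timing, weather]
inputs = [0.9, 1.0, 0.0, 1.0]
threshold = 7

before = sigmoid_neuron(inputs, [2.0, 2.0, 6.0, 1.0], threshold)
# Nudge the first weight slightly; the output moves slightly, it does not flip.
nudged = sigmoid_neuron(inputs, [2.1, 2.0, 6.0, 1.0], threshold)

print(before, nudged, nudged - before)
```

With a perceptron the same nudge would either do nothing or flip the answer outright, which makes gradual tuning of the weights much harder.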

As can be seen from the graph, a sigmoid neuron behaves like a perceptron when the weighted input becomes a large positive number (the output tends to 1) or a large negative number (the output tends to 0), and hence becomes effectively binary. There can be several layers between the input and the output, each of which helps in deciding the heuristics of the process. I have not introduced the concept of bias, and would suggest reading through the Neural Networks tutorial for a deeper appreciation of this method.
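The limiting behaviour described above can be written out as a short worked formula. The symbols are assumptions introduced here for illustration: $x_i$ are the inputs, $w_i$ their weights, $t$ the threshold, and $z$ the weighted input.

```latex
\begin{align*}
z &= \sum_i w_i x_i - t, &
\sigma(z) &= \frac{1}{1 + e^{-z}}, \\
\lim_{z \to +\infty} \sigma(z) &= 1, &
\lim_{z \to -\infty} \sigma(z) &= 0, \qquad \sigma(0) = \tfrac{1}{2}.
\end{align*}
```

Far from $z = 0$ the output is pinned close to 0 or 1, like a perceptron; it is only near $z = 0$ that the neuron responds gradually.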

Next, we define an error function, which is the squared difference between the output received from the neural net and the actual output as per the training set. The objective is then to minimise this error function in order to increase accuracy. In general, the larger the training set, the lower the error and the better the accuracy.
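As a minimal sketch, the squared-error idea could be coded as below. The function names are hypothetical, and averaging over the training samples is one common convention rather than something fixed by the text above.

```python
def squared_error(predicted, actual):
    """Sum of squared differences between one network output and its target."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def mean_squared_error(outputs, targets):
    """Average the squared error over every sample in the training set."""
    total = sum(squared_error(o, t) for o, t in zip(outputs, targets))
    return total / len(outputs)


# One training sample whose target is [1, 0]; the network is undecided at [0.5, 0.5].
undecided = mean_squared_error([[0.5, 0.5]], [[1.0, 0.0]])

# A perfect prediction gives zero error.
perfect = mean_squared_error([[1.0, 0.0]], [[1.0, 0.0]])

print(undecided, perfect)
```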

In most such problems, where we have a training set that specifies the desired output, the machine learns by systematically applying gradient descent. In other words, gradient descent is a way to minimise an objective function $J(\theta)$, parameterised by a model’s parameters $\theta \in \mathbb{R}^d$, by updating the parameters in the opposite direction of the gradient of the objective function, $\nabla_\theta J(\theta)$, with respect to the parameters. This is done by defining a cost function, taking its first derivative to find its slope, and repeatedly stepping downhill along that slope towards a minimum, ideally the global minimum.
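To make the update rule concrete, here is a small sketch of gradient descent on a toy one-parameter cost function. The cost J(theta) = (theta - 3)^2, the learning rate and the step count are all assumptions chosen for illustration; a real network has many parameters and a far more complicated cost surface.

```python
def gradient_descent(gradient, theta, learning_rate=0.1, steps=100):
    """Repeatedly move theta a small step against the gradient of the cost."""
    for _ in range(steps):
        theta = theta - learning_rate * gradient(theta)
    return theta


def cost(theta):
    """Toy cost function with its minimum at theta = 3."""
    return (theta - 3.0) ** 2


def cost_gradient(theta):
    """First derivative (slope) of the toy cost function."""
    return 2.0 * (theta - 3.0)


theta_min = gradient_descent(cost_gradient, theta=0.0)
print(theta_min, cost(theta_min))
```

Starting from theta = 0, each step shrinks the distance to the minimum, so the result ends up very close to 3. On a cost surface with several valleys the same procedure may settle in a local minimum instead.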

This concept, coupled with convolutional neural networks and the feed-forward algorithm, is used to identify the patterns (the style) of the sample art piece that is to be emulated. A picture is identified by breaking it down into sub-structures or features, which are either already taught to the system through supervised learning or are learnt layer by layer through deep learning. Convolutional neural networks, which draw inspiration from research on the cat’s visual cortex, exploit spatially local correlation by enforcing a local connectivity pattern between adjacent layers. Spatially contiguous regions of the visual field are used as the first-level inputs to the hidden layers, which confines the learnt filters to local patterns.
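The local connectivity idea can be illustrated with a tiny hand-written convolution. In this sketch the image and the filter are made-up values; in a real convolutional network the filter weights are learnt rather than written by hand. As in most deep learning libraries, the filter is slid over the image without being flipped.

```python
def convolve2d(image, kernel):
    """Slide a small filter over the image; each output value only sees a local patch."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum over the patch whose top-left corner is at (i, j).
            patch_sum = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kernel_h)
                for dj in range(kernel_w)
            )
            row.append(patch_sum)
        output.append(row)
    return output


# Made-up 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written 2x2 filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The filter responds only where the edge is, and the same small set of weights is reused at every position in the image.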

The structure of the original photo is also recognised through a feed-forward convolutional neural net, and eventually the two are combined to produce an image that has the structure of the photo with the style of the master artist.
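The paper cited as [1] measures style through correlations between feature maps (a Gram matrix) and combines a content term and a style term into one weighted loss. The sketch below is a heavily simplified illustration of that idea: the feature maps are plain hand-written lists standing in for the activations of a pre-trained network, the normalisation constants are left out, and the names `alpha` and `beta` for the two weights are assumptions.

```python
def gram_matrix(feature_maps):
    """Correlation between every pair of flattened feature maps (one list per filter)."""
    return [
        [sum(a * b for a, b in zip(map_i, map_j)) for map_j in feature_maps]
        for map_i in feature_maps
    ]


def squared_difference(a, b):
    """Sum of squared differences between two equally shaped lists of lists."""
    return sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def total_loss(generated, content, style, alpha=1.0, beta=1.0):
    """Weighted sum of a content term (raw features) and a style term (Gram matrices)."""
    content_loss = squared_difference(generated, content)
    style_loss = squared_difference(gram_matrix(generated), gram_matrix(style))
    return alpha * content_loss + beta * style_loss


# Hypothetical feature maps: two filters, two spatial positions each.
photo_features = [[1.0, 2.0], [3.0, 4.0]]
painting_features = [[2.0, 0.0], [0.0, 2.0]]

print(gram_matrix(photo_features))
print(total_loss(photo_features, photo_features, painting_features))
```

The generated image would then be adjusted by gradient descent to reduce this loss, with the ratio of the two weights deciding how much structure and how much style survives.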

Before concluding, I want to throw in my two cents on how this can be used in the Human Capital Management space.

Onboarding: A lot of onboarding tasks are extremely transactional and boring, but vital. A future application could perhaps read important details such as the PAN, TFN or social security number directly from the submitted documents and enter them into the ERP system without any human intervention, thus eliminating human fatigue and errors. In other words, it could eliminate form filling.

Engagement Analysis: Sentiment analysis and psychological predictions using deep learning can help group employees (unsupervised learning) as engaged or not, based on various input parameters describing how they interact digitally.

Although it might seem intrusive at first, deep learning can be used to understand patterns in managers’ behaviour (aggressive emails or a pushy nature), approval frequencies, motivational or appreciation mails etc., together with the pattern of attrition, in order to provide tips to managers on motivating and leading their resources better.

Most of the other applications of machine learning highlighted in my previous post can also be used in tandem with deep learning in the fields of hiring, performance appraisals, social collaboration, learning management etc.

[1] Leon A. Gatys, Alexander S. Ecker, Matthias Bethge. A Neural Algorithm of Artistic Style.

One Comment

  1. Prisma, an app available on both the Android and iOS platforms, is one example of using AI and deep learning to replicate a pattern and convert a normal picture into one that looks as if it were made by an artist of some era.
