The Algorithms

 

Entheos uses many different algorithms to create art. Here I describe the algorithm I used to generate the early and well-known original Entheos pieces. I’m going to do my best to explain how that algorithm works and how it compares to other popular algorithms like Google’s Deep Dream, Style Transfer, and Filter Activation. At a high level there are many similarities between all of these image enhancement algorithms. At their core they all leverage gradient ascent to gradually change each individual pixel in an image until, seemingly, another image emerges. Although they are similar in many ways, there are also important differences, primarily in which subset of the network each algorithm uses to extract features, or in our case enhance features, and in how that subset is chosen and structured.
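To make that shared core concrete, here is a minimal sketch of gradient ascent on pixels, assuming PyTorch; `score_fn` is just a placeholder for whatever activation a given algorithm is trying to maximize, not a real function from any of these systems.

```python
import torch

def gradient_ascent(image, score_fn, steps=100, lr=0.05):
    """Nudge every pixel so that score_fn(image) keeps increasing."""
    image = image.clone().requires_grad_(True)
    for _ in range(steps):
        score = score_fn(image)   # how strongly the chosen features respond
        score.backward()          # gradient of that score with respect to each pixel
        with torch.no_grad():
            image += lr * image.grad / (image.grad.norm() + 1e-8)
            image.grad.zero_()
    return image.detach()
```

Everything that follows is really just a matter of deciding what `score_fn` should measure.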

In order to understand how this works you must first understand the architecture of the network.

Diagram showing the architecture of a standard convolutional neural network commonly used for image classification.

 

This is the basic architecture of a CNN or a convolutional neural network. Don’t let the big words scare you! It’s actually not that scary at all. And, if you read through this a little further, you might even be able to impress your friends or family at a dinner party and explain how an AI network is able to classify images and seemingly know the difference between a dog and a cat.

A CNN is basically made up of layers. An input image is passed through each layer and processed until the resulting vector gets ultimately sent to a set of nodes, one for each of the objects the CNN is trained to classify. Whichever node or class gets the highest score is the winner and that is the prediction the model makes.
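In code, that final step is just an argmax over the class scores. A minimal sketch, assuming PyTorch, torchvision’s pretrained VGG16 (a common image classifier, not necessarily the one behind these pieces), and a hypothetical `dog.jpg`:

```python
import torch
from PIL import Image
from torchvision import models

weights = models.VGG16_Weights.DEFAULT
model = models.vgg16(weights=weights).eval()
preprocess = weights.transforms()                        # resize, crop, and normalize

img = preprocess(Image.open("dog.jpg")).unsqueeze(0)     # shape: 1 x 3 x 224 x 224
with torch.no_grad():
    scores = model(img)                                  # one score per class (1 x 1000)
prediction = weights.meta["categories"][scores.argmax().item()]  # highest score wins
print(prediction)
```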

Inside each layer, there are what we call feature detectors or filters. The model learns to create these feature detectors on its own. At the lower levels of the model’s architecture, these filters detect basic things like edges, colors, or shapes, but as we move up the layers of the network, the features the filters can detect become increasingly complex.

The process by which a filter passes over an image and detects features is called a convolution, and each unique filter is meant to detect a specific type of feature (although we may not fully understand exactly what the model has learned to detect in order to make its predictions).
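To make “convolution” concrete: a filter is just a small grid of weights that slides across the image and responds strongly wherever the pixels underneath match its pattern. A toy sketch, assuming PyTorch, using a hand-made vertical-edge filter rather than a learned one:

```python
import torch
import torch.nn.functional as F

# A 3x3 filter that "lights up" where a dark region meets a bright one (a vertical edge).
edge_filter = torch.tensor([[-1., 0., 1.],
                            [-1., 0., 1.],
                            [-1., 0., 1.]]).reshape(1, 1, 3, 3)

image = torch.rand(1, 1, 64, 64)          # one grayscale 64x64 image
response = F.conv2d(image, edge_filter)   # slide the filter over every pixel region
print(response.shape)                     # 1 x 1 x 62 x 62 map of feature responses
```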


Diagram illustrating the passing of a convolutional filter over a pixel region of an image.

 

Each layer has many filters, and as we move up the layers of the network’s architecture, the number of filters generally increases. You could say that when a filter passes over an image and detects something it “likes”, it lights up and passes a higher score for that filter up the network’s architecture, along a path toward the actual classifier. Generally speaking, neural nets were modeled loosely after the way the human mind functions. In the brain, sets of neurons fire together in webs of connections to form thoughts. Although AI models are vastly different from the human mind, in much the same way, webs of scores get passed along to the higher layers of the model and give the network the ability to make classifications.
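You can see that growth by simply listing the convolutional layers of a pretrained network. A quick sketch, again assuming torchvision’s VGG16 as a stand-in:

```python
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
for i, layer in enumerate(model.features):
    if isinstance(layer, nn.Conv2d):
        print(f"layer {i}: {layer.out_channels} filters")
# The count climbs from 64 filters near the input to 512 filters near the top.
```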

Ok, so what does all this have to do with generating art?

Well, let's take a closer look at these filters, and at what some of the features they are meant to detect might possibly be....

Filter Activation is an algorithm that allows us to gain intuition about what a particular filter may be trying to detect.

It works somewhat in reverse from the way the CNN’s architecture was intended to be used for classification. Instead of taking an image of a cat or a dog and sending it through the network, let’s take an image of white noise (static), manually light up an individual filter, and ask the model to enhance the image based on the activation of just that one filter or feature detector.
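Here is a sketch of that reverse process, assuming PyTorch and torchvision’s pretrained VGG16 as a stand-in for the actual network; the layer and filter indices are illustrative choices, not the ones behind any particular piece.

```python
import torch
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
for p in model.parameters():
    p.requires_grad_(False)

def activate_filter(layer_index=14, filter_index=0, steps=200, lr=0.05):
    # Start from white noise and nudge the pixels toward whatever this one filter "likes".
    image = torch.rand(1, 3, 224, 224, requires_grad=True)
    for _ in range(steps):
        x = image
        for i, layer in enumerate(model):   # run the image up to the chosen layer
            x = layer(x)
            if i == layer_index:
                break
        score = x[0, filter_index].mean()   # the average activation of just this filter
        score.backward()
        with torch.no_grad():
            image += lr * image.grad / (image.grad.norm() + 1e-8)
            image.grad.zero_()
    return image.detach()
```

Rendering the result is then just a matter of clamping the tensor back into a valid pixel range and saving it as an image.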


This is an image of randomly generated pixel values; the seed image for this filter activation algorithm.

The images above are generated from one of the first few layers of the network. As you can see, from activating filters at this lower level of the architecture, the model appears to only be detecting basic edges, shapes and colors, maybe some simple patterns.

As we move up the architecture the patterns become more complex.

From some of the filters in layer 14 we start to see some really interesting stuff. It’s almost as if textures start to appear.

These images were generated from filters in layer 19 of this model.

And, at the highest levels of the architecture, the images the filters can produce can get downright bizarre.

The idea is that as each one of these filters passes over the image, if the image correlates with the feature it was meant to detect, the filter passes a higher score on down the path to the classifier. As an AI artist we can leverage this technique to make artwork.

So why do some of the images like those produced in Google’s Deep Dream algorithm appear to have animals showing up in them? Well, Google’s Deep Dream algorithm works in a similar way to filter activation, except instead of activating an individual filter, they activate an entire layer of filters.

So if a filter is meant to detect just a single feature, even a high-level one like an eye, a nose, or a mouth, and a layer is the collective sum of all of those scores, then maybe layers are better suited for summing those features together into structures like faces. In other words, Google’s Deep Dream algorithm activates all the filters in a layer at once. Google’s algorithm, although there are many variations, inputs an image and then selects a layer from the architecture to use for image enhancement; the higher the layer, the more complex the features it will hallucinate. And that’s just it: these models were trained to classify animals like dogs, birds, and cats, and everyday objects like buildings. That’s why, when they "hallucinate" to form an image, they tend to see dogs and cats and birds.
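Under the same assumptions as the earlier sketch (PyTorch, torchvision’s VGG16 standing in for the real model), the only thing that really changes is the objective: score the whole layer instead of one filter. This is a simplified sketch of the Deep-Dream-style idea, not Google’s actual implementation, which adds extra tricks such as running at multiple image scales.

```python
import torch
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
for p in model.parameters():
    p.requires_grad_(False)

def dream_step(image, layer_index=28, lr=0.05):
    """One gradient-ascent step that excites every filter in a layer at once."""
    image = image.clone().requires_grad_(True)
    x = image
    for i, layer in enumerate(model):
        x = layer(x)
        if i == layer_index:
            break
    score = x.pow(2).mean()    # the whole layer's activations, not a single filter's
    score.backward()
    with torch.no_grad():
        image += lr * image.grad / (image.grad.norm() + 1e-8)
    return image.detach()
```

Because the input here is a real photograph rather than noise, the layer’s favorite features get painted over whatever structure is already in the picture.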

So how does the algorithm I use work?

My algorithm is what I call a Hybrid Filter Activation Algorithm. Instead of activating an individual filter, or for that matter an entire layer, I activate a chosen subset of filters, selected according to a particular preference, with the intention of generating art. At a high level, that is the basic idea: the filters become like a palette. Of course there are many subtle differences that would be too hard to explain briefly here, but the idea is that the art is generated with a subset of filters chosen in specific ways to produce beautiful results.
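Since the full recipe isn’t spelled out here, the following is only a rough, hypothetical sketch of the “palette of filters” idea under the same assumptions as before (PyTorch, VGG16 as a stand-in); the specific layer indices, filter indices, and weights are made up for illustration and are not the actual Entheos settings.

```python
import torch
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
for p in model.parameters():
    p.requires_grad_(False)

# A hypothetical "palette": (layer index, filter index, weight) triples chosen by the artist.
palette = [(14, 7, 1.0), (19, 42, 0.6), (19, 101, 0.3)]

def palette_score(image):
    """Weighted sum of activations from a hand-picked subset of filters."""
    score, x = 0.0, image
    deepest = max(layer for layer, _, _ in palette)
    for i, layer in enumerate(model):
        x = layer(x)
        for layer_index, filter_index, weight in palette:
            if layer_index == i:
                score = score + weight * x[0, filter_index].mean()
        if i == deepest:
            break
    return score

image = torch.rand(1, 3, 224, 224, requires_grad=True)   # or a chosen seed image
for _ in range(200):
    palette_score(image).backward()
    with torch.no_grad():
        image += 0.05 * image.grad / (image.grad.norm() + 1e-8)
        image.grad.zero_()
```

Changing which filters sit in the palette, and how heavily each one is weighted, changes the whole character of the resulting image, much like mixing colors before painting.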

This is where the fun of interacting with the algorithm comes into play.

Using some unique seeding and creative iterations through different feature sets, we can generate beautiful pieces of artwork! This is a dynamic process and the algorithms we use are always changing. Stay tuned to this channel for updates on the process, new algorithms, and the latest generations of artwork, soon to be released!