Illustration of GAN: Generative Adversarial Network for Machine Learning

2017-4-18
This article was translated with a machine translator and may not perfectly capture the nuances of the original text. I appreciate your understanding.

Artificial intelligence and machine learning are broad fields with many subtopics. Today we discuss only Generative Adversarial Networks (GANs), which have made a huge contribution to automatic content generation in less than three years.

First, what can GAN do now?

  1. Doodling

Just doodle, and the scene you have in mind is computed automatically. For example, draw two strokes of green and a grassland appears. [^IED]

We then add the outline of the mountains and the white of the snowy peaks.

The same approach can be used to design items such as shoes and bags.

The input is still a doodle, but the resulting image is in the style of Van Gogh [^NDD]

You can even use doodling for a quick make-up change [^PED]

You can also color line art with nothing more than a simple doodle. The left picture is the input and the right picture is the program's output [^CLA]

  2. Generation

When looking at the generated images below, keep a few points in mind:

  • Every image starts from random numbers, so at the beginning of generation you generally see images of uniform noise;
  • All training is unsupervised, and not even class labels for the training data are provided;
  • The original paper on the algorithm was published in 2014, less than three years ago.
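As a rough illustration of the first point, the sketch below draws random latent vectors and pushes them through a generator whose weights are random rather than trained. The function name `untrained_generator`, the latent size of 100, and the 28x28 output are assumptions chosen for illustration, not details of any project cited here. With untrained weights the output is plain noise, which is the starting point that training gradually turns into recognizable images.

```python
import numpy as np

def untrained_generator(z, out_shape=(28, 28), seed=0):
    """Map latent vectors to images using random (untrained) weights."""
    rng = np.random.default_rng(seed)
    n_pixels = out_shape[0] * out_shape[1]
    # One random linear layer; a real generator has deeper, trained layers.
    weights = rng.normal(0.0, 1.0, size=(z.shape[1], n_pixels))
    # tanh squashes every pixel value into the range [-1, 1].
    pixels = np.tanh(z @ weights / np.sqrt(z.shape[1]))
    return pixels.reshape(z.shape[0], *out_shape)

rng = np.random.default_rng(42)
z = rng.standard_normal((4, 100))   # four random latent vectors
images = untrained_generator(z)
print(images.shape)                 # (4, 28, 28)
```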

GANs can generate flowers [^FLR]

They can generate girls' faces [^GLF]

They can generate floor plans of houses [^LOT]

They can generate pictures of Pokemon [^PKM]

They can even generate an image that matches a text description you enter [^TIS]

  3. Completion

Missing parts of a picture can be generated automatically. As shown below, the face region in the middle is deleted and the missing part is regenerated by the algorithm [^ICT]
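The final blending step of such a completion can be sketched as follows, assuming a suitable generated image is already available. The helper names `center_mask` and `complete` are hypothetical, the constant-valued arrays are stand-ins for real images, and the search for a generated image that matches the known pixels is omitted entirely.

```python
import numpy as np

def center_mask(height, width, hole):
    """Return a mask that is 1 where pixels are known and 0 in a central hole."""
    mask = np.ones((height, width))
    top = (height - hole) // 2
    left = (width - hole) // 2
    mask[top:top + hole, left:left + hole] = 0.0
    return mask

def complete(original, generated, mask):
    """Keep known pixels from the original and fill the hole from the generated image."""
    return mask * original + (1.0 - mask) * generated

original = np.full((8, 8), 0.2)    # stand-in for the damaged photo
generated = np.full((8, 8), 0.9)   # stand-in for a generator output
mask = center_mask(8, 8, hole=4)
result = complete(original, generated, mask)
```

Only the masked region comes from the generator, so the untouched parts of the picture stay exactly as they were.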

  4. Optimization example

A typical optimization example is shown here. Because the algorithm behind this example has not been officially published, it is listed only to show that such algorithms can handle problems of this kind.

It is said to be the world's first race car designed by machine learning.

In the first step, the engineers placed sensors on various parts of an existing, mature race car frame, as shown below: [^HKR]

In the second step, artificial intelligence computed the frame structure below, one that a human would be very unlikely to design, making it as light and as strong as possible. Of course, building such a frame depends on 3D printing, another cutting-edge technology.

In the final step, the designers created a cool body shell for the car.

Second, the basic principle

Here I only briefly describe the generation process of generative adversarial networks.

The figure below shows a simple system with a single point. [^SGA]

The green dot represents the target feature obtained through training, and the dark dot is a randomly generated starting position. As the computation proceeds, the parameters in the network are adjusted continuously to fit the target feature until the dot finally meets it.

With this simple example understood, let's look at a slightly more complex one: [^MGA]

The blue dots represent the group of target features obtained through training, and the red dots are randomly generated. As the computation progresses, the parameters in the network are constantly adjusted to fit the target features and to find the positions with the smallest standard deviation. You may ask why the dots do not coincide exactly. This keeps randomness and diversity in the generated images: if the dots coincided exactly, every image would be the same.
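The process described above can be sketched as a toy one-dimensional GAN in NumPy. This is a minimal sketch under stated assumptions, not the code behind the cited videos: the targets are samples from a normal distribution with mean 2.0, the generator `G(z) = z + b` learns only a shift `b`, the discriminator is a logistic function `D(x) = sigmoid(w*x + c)`, and the learning rate, batch size, and step count are arbitrary choices. The gradients are written out by hand for the standard discriminator loss and the non-saturating generator loss.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_toy_gan(target_mean=2.0, steps=5000, batch=128, lr=0.05, seed=0):
    """Train a 1-D toy GAN and return the learned generator shift b."""
    rng = np.random.default_rng(seed)
    w, c = 0.0, 0.0   # discriminator parameters: D(x) = sigmoid(w*x + c)
    b = 0.0           # generator parameter: G(z) = z + b (starting position)
    for _ in range(steps):
        real = rng.normal(target_mean, 1.0, batch)   # target points
        fake = rng.normal(0.0, 1.0, batch) + b       # generated points

        # Discriminator step: minimize -log D(real) - log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = np.mean((d_real - 1.0) * real) + np.mean(d_fake * fake)
        grad_c = np.mean(d_real - 1.0) + np.mean(d_fake)
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimize -log D(fake) (non-saturating loss).
        d_fake = sigmoid(w * fake + c)
        grad_b = np.mean((d_fake - 1.0) * w)
        b -= lr * grad_b
    return float(b)

learned_shift = train_toy_gan()
print(f"generator shift after training: {learned_shift:.2f} (target mean 2.0)")
```

The shift should drift from its starting value toward the target mean, but GAN training tends to oscillate, so the exact final value depends on the seed and settings. The generated points keep their own spread around the target, which mirrors the remark above that the dots do not coincide exactly.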

Third, what kind of investment is required to implement GAN?

Artificial intelligence has developed so rapidly in recent years because of fast progress in hardware and improvements in software platforms. NVIDIA's CUDA platform has matured, which makes development simpler and more controllable, and so have basic platforms such as the open-source neural network frameworks TensorFlow and Torch, so that latecomers can stand on the shoulders of giants and keep moving forward.

For individuals, the simplest setup for trying out generative adversarial networks is a modern computer with a recent NVIDIA graphics card. As for speed, here is one data point for reference:

Training on a set of 50,000 images with 8 Titans (NVIDIA's highest-end graphics card, priced at over 10,000 RMB on a certain online store) takes 5 days [^SOG]

Note: speed and accuracy vary widely between specific algorithms, so the figure above only shows that one such algorithm currently exists. Anyone working on machine learning will sooner or later need to reproduce an algorithm like this one, so the speed is still a useful reference.
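To relate that data point to other hardware, the arithmetic can be written out as below. The single-GPU estimate assumes perfectly linear scaling across graphics cards, which is a simplification made here for illustration and not a claim from the cited source.

```python
gpus = 8
days = 5
images = 50_000

gpu_hours = gpus * days * 24                         # 960 GPU-hours in total
gpu_seconds_per_image = gpu_hours * 3600 / images    # about 69 GPU-seconds per training image
days_on_one_gpu = gpus * days                        # 40 days, if scaling were perfectly linear

print(gpu_hours, round(gpu_seconds_per_image, 2), days_on_one_gpu)
```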

In terms of software, the most convenient setup is to install the Linux operating system directly on the machine. Virtual machines on Windows generally share the graphics card through a graphics interface layer, which for now makes it impossible to call CUDA from inside them. Only a Linux system running natively, outside a virtual machine, can access the graphics card directly and execute CUDA commands. Kernel-level virtualization under Linux has no such limitation, so Docker can be used.
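One quick way to check whether an environment can see an NVIDIA graphics card is to ask the `nvidia-smi` tool that ships with the NVIDIA driver. The helper name `cuda_gpu_names` is hypothetical, and the sketch assumes only that `nvidia-smi` is on the `PATH` when a working driver is installed. An empty list means no card was reported, which is what you would expect inside a virtual machine without direct access to the graphics card.

```python
import shutil
import subprocess

def cuda_gpu_names():
    """Return the GPU names reported by nvidia-smi, or an empty list."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []   # driver tools not installed or not on PATH
    try:
        out = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if out.returncode != 0:
        return []   # tool is present but cannot talk to a GPU
    return [line.strip() for line in out.stdout.splitlines() if line.strip()]

gpu_names = cuda_gpu_names()
print(gpu_names if gpu_names else "No NVIDIA GPU visible from this environment")
```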

Potential applications of machine learning in the industrial field

So far, no clear application of generative machine learning networks in the industrial field has been published online, but we can imagine using this technology to do the following:

  • Automatically generate plant and pipeline design drawings
  • Automatically generate configuration logic

What is our industry still missing?

  • Basic talent and hardware environments for artificial intelligence and machine learning
  • Feasible plant and pipeline design drawings and configuration logic, summarized from past experience
  • Proper validation logic

All of this takes a company with the relevant experience, data experts who are good at analysis, and engineers who are good at organizing data and logical thinking.

References

[^LOT]: DCGAN Japanese floor plan https://www.youtube.com/watch?v=SwvM7f5CUjs
[^PKM]: DCGAN with Pokemon GO https://www.youtube.com/watch?v=rs3aI7bACGc
[^GLF]: Generating faces by DCGAN https://www.youtube.com/watch?v=Svk0SxyNdr8
[^MGA]: Generative Adversarial Example https://www.youtube.com/watch?v=CILzNj2MP3s
[^HKR]: Hack Rod https://www.youtube.com/watch?v=0ebsf2BMYm8&t=12s
[^IED]: Image Editing with Generative Adversarial Networks https://www.youtube.com/watch?v=pqkpIfu36Os&t=126s
[^SGA]: Simple Generative Adversarial Network (GAN) https://www.youtube.com/watch?v=ebMei6bYeWw
[^SOG]: https://www.zhihu.com/question/54414709
[^FLR]: WGAN flowers https://www.youtube.com/watch?v=e50WBRManWU
[^PED]: Neural Photo Editing with Introspective Adversarial Networks https://www.youtube.com/watch?v=FDELBFSeqQs
[^CLA]: Automatic coloring and shading of manga-style lineart http://kvfrans.com/coloring-and-shading-line-art-automatically-through-conditional-gans/
[^ICT]: Image Completion with Deep Learning in TensorFlow https://github.com/bamos/dcgan-completion.tensorflow
[^TIS]: Generative Adversarial Text to Image Synthesis https://arxiv.org/pdf/1605.05396v2.pdf
[^NDD]: Neural doodle https://github.com/alexjc/neural-doodle