This document discusses multi-modal embeddings and generative models. It begins by covering common generative architectures such as VAEs, DBNs, RNNs, and CNNs. It then discusses specific applications: text generation with RNNs, image generation with techniques like Deep Dream and style transfer, and audio generation with LSTMs and mixture density networks. The document advocates for creative AI as a "brush" for rapid experimentation in human-machine collaboration.
12. Alex Graves (2014) Generating Sequences With Recurrent Neural Networks
Wanna Play?
Handwriting Prediction
(we’re skipping the mixture density network details for now)
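As a hedged sketch of the idea being skipped: a mixture density network outputs mixture weights, means, and standard deviations, and generation draws a sample from that mixture at each step. Below is a minimal 1-D version (Graves' handwriting model actually uses bivariate Gaussians with correlation terms; all numbers here are made-up stand-ins for network outputs):

```python
import random

def sample_mdn(weights, means, stds, rng):
    """Sample one value from a 1-D Gaussian mixture:
    pick a component by its mixture weight, then draw from that Gaussian."""
    r = rng.random()
    acc = 0.0
    for w, mu, sigma in zip(weights, means, stds):
        acc += w
        if r <= acc:
            return rng.gauss(mu, sigma)
    # floating-point fallback: use the last component
    return rng.gauss(means[-1], stds[-1])

rng = random.Random(0)
# Hypothetical network outputs for one time step: 3 mixture components.
weights = [0.2, 0.5, 0.3]   # softmax-normalised mixture weights
means   = [-1.0, 0.0, 2.0]  # predicted pen-offset means
stds    = [0.1, 0.3, 0.5]   # predicted standard deviations

samples = [sample_mdn(weights, means, stds, rng) for _ in range(5000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))  # should be near the mixture mean, 0.4
```

The multi-modal output is the point: a single Gaussian would average the three pen-offset modes into an implausible middle stroke.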
14.
15. Wanna Play?
Text generation
Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
17.
18.
19. Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
20. Andrej Karpathy, Justin Johnson, Li Fei-Fei (2015) Visualizing and Understanding Recurrent Networks
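Karpathy's char-rnn trains an RNN to predict the next character, then samples from it autoregressively. A minimal sketch of that sampling loop, assuming a tiny vanilla tanh RNN with untrained random weights (so the output is gibberish; all sizes and names here are illustrative, not char-rnn's actual code):

```python
import math
import random

rng = random.Random(42)
vocab = sorted(set("hello world"))   # character vocabulary
V, H = len(vocab), 8                 # vocab size, hidden size

# Random (untrained) parameters of a vanilla tanh RNN.
Wxh = [[rng.uniform(-0.1, 0.1) for _ in range(V)] for _ in range(H)]
Whh = [[rng.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(H)]
Why = [[rng.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(V)]

def step(h, x_idx):
    """One RNN step: h' = tanh(Wxh x + Whh h); logits y = Why h'.
    x is a one-hot character, so Wxh x is just column x_idx."""
    h_new = [math.tanh(Wxh[i][x_idx] + sum(Whh[i][j] * h[j] for j in range(H)))
             for i in range(H)]
    logits = [sum(Why[k][i] * h_new[i] for i in range(H)) for k in range(V)]
    return h_new, logits

def sample(seed_char, n):
    """Autoregressive sampling: feed each sampled char back in as input."""
    h = [0.0] * H
    idx = vocab.index(seed_char)
    out = [seed_char]
    for _ in range(n):
        h, logits = step(h, idx)
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]   # stable softmax
        r, acc = rng.random() * sum(exps), 0.0
        idx = V - 1                                # fp fallback
        for k, e in enumerate(exps):
            acc += e
            if r <= acc:
                idx = k
                break
        out.append(vocab[idx])
    return "".join(out)

text = sample("h", 20)
print(text)
```

Training (backprop through time on a text corpus) is what turns this gibberish generator into Karpathy's Shakespeare/LaTeX/C-code mimic; the sampling loop itself is unchanged.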
25. Turn the ConvNet Around: “Deep Dream”
Image -> NN -> what do you (think you) see -> what’s the (text) label
Image -> NN -> what do you (think you) see -> feed back the activations -> optimize the image to “fit” the ConvNet’s “hallucination” (iteratively)
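The iterative loop above can be sketched in miniature: fix one "feature detector", then run gradient ascent on the input so the detector's response grows. This is a toy 1-D stand-in for Deep Dream's Inception-based implementation; the filter and image values are invented:

```python
# Toy "Deep Dream": gradient ascent on the INPUT, not the weights.
# One fixed 1-D filter stands in for a ConvNet feature detector.
filt = [1.0, -1.0, 1.0]            # hypothetical edge-like filter
image = [0.1, 0.2, 0.1, 0.2, 0.1]  # the "image" we will modify

def activation(img):
    """Sum of squared filter responses (the quantity Deep Dream boosts)."""
    acts = []
    for i in range(len(img) - len(filt) + 1):
        acts.append(sum(f * img[i + j] for j, f in enumerate(filt)))
    return sum(a * a for a in acts)

def grad(img):
    """Analytic gradient of the activation w.r.t. each pixel:
    d/dx[i+j] of sum_i a_i^2 is sum over windows of 2 * a_i * f_j."""
    g = [0.0] * len(img)
    for i in range(len(img) - len(filt) + 1):
        a = sum(f * img[i + j] for j, f in enumerate(filt))
        for j, f in enumerate(filt):
            g[i + j] += 2 * a * f
    return g

before = activation(image)
for _ in range(50):                    # the iterative "hallucination"
    g = grad(image)
    image = [p + 0.05 * gp for p, gp in zip(image, g)]
after = activation(image)
print(before, after)
```

The real thing adds jitter, multi-scale "octaves", and gradient normalisation, but the core is exactly this: hold the weights fixed and optimise the pixels.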
26. Google, Inceptionism: Going Deeper into Neural Networks
Turn Convnet Around: “Deep Dream”
see also: www.csc.kth.se/~roelof/deepdream/
30. Multifaceted Feature Visualization
Anh Nguyen, Jason Yosinski, Jeff Clune (2016)
Multifaceted Feature Visualization: Uncovering the
Different Types of Features Learned By Each Neuron in
Deep Neural Networks
33. Preferred stimuli generation
Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, and Jeff Clune (2016) AI Neuroscience: Understanding Deep
Neural Networks by Synthetically Generating the Preferred Stimuli for Each of Their Neurons
34.
35. Inter-modal: Style Transfer (“Style Net” 2015)
Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, 2015. A Neural Algorithm of Artistic Style (GitXiv)
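Gatys et al. represent style by Gram matrices of ConvNet feature maps and minimise the distance between the Gram matrices of the style image and the generated image. A minimal sketch of that style loss, using invented toy feature maps in place of real VGG activations:

```python
def gram(features):
    """Gram matrix G[i][j] = <F_i, F_j> of flattened feature maps.
    It records which features co-occur, discarding spatial layout."""
    n = len(features)
    return [[sum(a * b for a, b in zip(features[i], features[j]))
             for j in range(n)] for i in range(n)]

def style_loss(f_style, f_gen):
    """Squared Frobenius distance between the two Gram matrices,
    normalised by 1 / (4 N^2 M^2) as in Gatys et al."""
    gs, gg = gram(f_style), gram(f_gen)
    n, m = len(f_style), len(f_style[0])
    sq = sum((gs[i][j] - gg[i][j]) ** 2
             for i in range(n) for j in range(n))
    return sq / (4.0 * n * n * m * m)

# Two hypothetical layers with 2 feature maps of 4 activations each.
style_feats = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
gen_feats   = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]

print(style_loss(style_feats, style_feats))  # identical features -> 0.0
print(style_loss(style_feats, gen_feats))
```

The full algorithm sums this loss over several VGG layers, adds a content loss on one deeper layer, and optimises the generated image's pixels by gradient descent, exactly the "optimise the input" trick from the Deep Dream slides.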
36.
37. Inter-modal: Image Analogies (2001)
A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin (2001) Image Analogies, SIGGRAPH 2001 Conference Proceedings.
A. Hertzmann (2001) Algorithms for Rendering in Artistic Styles. Ph.D. thesis, New York University, May 2001.
48. Inter-modal: Style Transfer+MRF (2016)
Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis, 2016, Chuan Li, Michael Wand
50. Inter-modal: Pretrained Style Transfer (2016)
Texture Networks: Feed-forward Synthesis of Textures and Stylized Images, 2016, Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky
~500x speedup! (reaching the average minimum loss: from ~10 s down to ~20 ms)
51. Inter-modal: Pretrained Style Transfer #2 (2016)
Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, 2016, Chuan Li, Michael Wand
a similar ~500x speedup
52. Inter-modal: Perceptual Loss ST (2016)
Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
59. Synthesise textures (random weights)
Experiment: which activation function should I use? pooling? minimum number of units?
https://nucl.ai/blog/extreme-style-machines/
Random weights (kinda like “Extreme Learning Machines”)
60. Synthesise textures (random weights)
totally randomly initialised weights:
https://nucl.ai/blog/extreme-style-machines/
70. “Style Transfer” papers
• Image Analogies, 2001, A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin
• A Neural Algorithm of Artistic Style, 2015, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge
• Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis, 2016, Chuan Li, Michael Wand
• Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks, 2016, Alex J. Champandard
• Texture Networks: Feed-forward Synthesis of Textures and Stylized Images, 2016, Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky
• Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
• Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, 2016, Chuan Li, Michael Wand
• @DeepForger
71. Caption -> Image generation
“A stop sign is flying in blue skies.”
“A herd of elephants flying in the blue skies.”
Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov, 2015. Generating Images from Captions with Attention (arXiv) (examples)
80. Audio Generation: Raw
Gated Recurrent Unit (GRU), Stanford CS224d project
Aran Nayebi, Matt Vitelli (2015) GRUV: Algorithmic Music Generation using Recurrent Neural Networks
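GRUV models raw audio with GRUs. One GRU step, sketched with untrained random weights and toy sizes (the gate equations follow the standard Cho et al. formulation, not GRUV's exact code; the "audio" frames are invented):

```python
import math
import random

rng = random.Random(1)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

H, X = 4, 2   # hidden size, input size (e.g. a chunk of raw samples)

def mat(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def mv(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Untrained weights for the update (z), reset (r), and candidate gates.
Wz, Uz = mat(H, X), mat(H, H)
Wr, Ur = mat(H, X), mat(H, H)
Wh, Uh = mat(H, X), mat(H, H)

def gru_step(h, x):
    """One GRU step:
    z  = sigmoid(Wz x + Uz h)        update gate
    r  = sigmoid(Wr x + Ur h)        reset gate
    hc = tanh(Wh x + Uh (r * h))     candidate state
    h' = (1 - z) * h + z * hc        gated interpolation"""
    z = [sigmoid(a + b) for a, b in zip(mv(Wz, x), mv(Uz, h))]
    r = [sigmoid(a + b) for a, b in zip(mv(Wr, x), mv(Ur, h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    hc = [math.tanh(a + b) for a, b in zip(mv(Wh, x), mv(Uh, rh))]
    return [(1 - zi) * hi + zi * hci for zi, hi, hci in zip(z, h, hc)]

h = [0.0] * H
for t in range(10):                    # feed 10 toy "audio" frames
    x = [math.sin(0.1 * t), math.cos(0.1 * t)]
    h = gru_step(h, x)
print(h)
```

Because the new state is a convex combination of the old state and a tanh candidate, the hidden values stay bounded, one reason GRUs train more stably than a vanilla RNN on long raw-audio sequences.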
81. LSTM improvements
• Recurrent Batch Normalization http://gitxiv.com/posts/MwSDm6A4wPG7TcuPZ/recurrent-batch-normalization
• also normalises the hidden-to-hidden transition (earlier work applied batch normalization only to the input-to-hidden transformation of RNNs)
• faster convergence and improved generalization
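The normalisation itself can be sketched as follows: for each unit, the hidden-to-hidden preactivations are standardised across the batch and rescaled by a learned gain (the paper recommends a small initial gamma such as 0.1; the shift beta and the per-timestep running statistics are omitted in this sketch, and the preactivation values are invented):

```python
import math

def batch_norm(batch, gamma=0.1, eps=1e-5):
    """Normalise a batch of preactivations to zero mean / unit variance,
    then scale by gamma (small gamma keeps tanh/sigmoid out of saturation)."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) for x in batch]

# Hypothetical hidden-to-hidden preactivations of one unit across a batch.
pre = [2.0, -1.0, 0.5, 3.5]
normed = batch_norm(pre)
m = sum(normed) / len(normed)
v = sum((x - m) ** 2 for x in normed) / len(normed)
print(round(m, 6), round(v, 4))  # mean ~0, variance ~gamma^2 = 0.01
```

In the recurrent case this is applied separately to the input-to-hidden and hidden-to-hidden terms inside the LSTM gates at every time step, which is what distinguishes it from the earlier input-only variant mentioned above.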
92. Python has a wide range of deep learning-related libraries available
Deep Learning with Python
Low level:
deeplearning.net/software/theano
caffe.berkeleyvision.org
tensorflow.org/
High level:
lasagne.readthedocs.org/en/latest
and of course:
keras.io
95. Questions?
love letters? existential dilemmas? academic questions? gifts?
find me at:
www.csc.kth.se/~roelof/
roelof@kth.se
@graphific
Oh, and soon we’re looking for Creative AI enthusiasts!
- job
- internship
- thesis work
in
AI (Deep Learning)
&
Creativity
97. Creative AI > a “brush” > rapid experimentation
human-machine collaboration
101. Generative Adversarial Nets
Emily Denton, Soumith Chintala, Arthur Szlam, Rob Fergus, 2015. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks (GitXiv)
102. Generative Adversarial Nets
Alec Radford, Luke Metz, Soumith Chintala, 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (GitXiv)
104. Generative Adversarial Nets
Alec Radford, Luke Metz, Soumith Chintala, 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (GitXiv)
a “turn” vector created from four averaged samples of faces looking left vs. looking right
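The "turn" vector arithmetic can be sketched directly: average the latent codes of each group, subtract, and add the result to a new code before decoding it with the generator. The vectors below are invented 3-D stand-ins (DCGAN's z is 100-dimensional, and the codes would come from samples whose generated faces look left or right):

```python
def average(vectors):
    """Element-wise average of a list of latent vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Hypothetical latent codes: four per group, as on the slide.
looking_left  = [[0.9, 0.1, 0.2], [1.1, -0.1, 0.3],
                 [1.0, 0.0, 0.1], [1.0, 0.0, 0.2]]
looking_right = [[-1.0, 0.1, 0.2], [-0.9, -0.1, 0.2],
                 [-1.1, 0.0, 0.2], [-1.0, 0.0, 0.2]]

# The "turn" vector is the difference of the two group averages.
left_avg  = average(looking_left)
right_avg = average(looking_right)
turn = [l - r for l, r in zip(left_avg, right_avg)]

# Adding it to a new right-looking code pushes it towards "looking left";
# feeding z_turned through the trained generator would render the turn.
z = [-1.0, 0.05, 0.2]
z_turned = [zi + ti for zi, ti in zip(z, turn)]
print(turn)
print(z_turned)
```

Averaging several samples per group is what makes this work: it cancels the idiosyncrasies of individual faces so the difference vector captures mostly the pose attribute.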