The document appears to be slides from DeNA TechCon 2020. It discusses several topics relating to AI and computer vision, including DRIVE CHART which is DeNA's AI platform, work on object detection, optical flow estimation, and the use of neural networks like CNNs and RNNs in areas like computer vision. It also references prior work on generative adversarial networks and improving discriminators and generators in these networks.
3. DeNA TechCon 2020
#denatechcon
(CV: Computer Vision)
2.2. Prior art
Much of the work on GAN architectures has focused on improving the discriminator by, e.g., using multiple discriminators [18, 47, 11], multiresolution discrimination [60, 55], or self-attention [63]. The work on the generator side has mostly focused on the exact distribution in the input latent space [5] or shaping the input latent space via Gaussian mixture models [4], clustering [48], or encouraging convexity [52].
Recent conditional generators feed the class identifier through a separate embedding network to a large number of layers in the generator [46], while the latent is still provided through the input layer. A few authors have considered feeding parts of the latent code to multiple generator layers [9, 5]. In parallel work, Chen et al. [6] "self modulate" the generator using AdaINs, similarly to our work, but do not consider an intermediate latent space or noise inputs.
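The AdaIN ("adaptive instance normalization") operation referenced above can be sketched minimally as follows; the function name and NumPy formulation are illustrative assumptions, not the implementation from any of the cited papers:

```python
import numpy as np

def adain(x, style_scale, style_bias, eps=1e-5):
    """Adaptive instance normalization: normalize each channel of each
    instance's feature map to zero mean and unit variance, then apply a
    style-dependent scale and bias (e.g. predicted from a latent code).

    x: feature maps of shape (N, C, H, W)
    style_scale, style_bias: per-instance, per-channel parameters of shape (N, C)
    """
    mean = x.mean(axis=(2, 3), keepdims=True)   # per-instance, per-channel mean
    std = x.std(axis=(2, 3), keepdims=True)     # per-instance, per-channel std
    normalized = (x - mean) / (std + eps)
    return style_scale[:, :, None, None] * normalized + style_bias[:, :, None, None]
```

After this operation, each channel's statistics are controlled entirely by the style parameters, which is what lets a style/latent code "modulate" the generator at every layer it feeds.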
*Karras et al., "A Style-Based Generator Architecture for Generative Adversarial Networks," in Proc. of CVPR, 2019.
Figure 1: System configuration for data collection (screen, camera, and head coordinate systems).
LCD monitor, and these cameras capture images in a synchronized manner via a software trigger controlled by the host computer. Intrinsic and extrinsic camera parameters are calibrated beforehand, and the 3D position of the monitor
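Calibrated intrinsic and extrinsic parameters are what let the system map 3D points into each camera's image. A minimal pinhole-projection sketch (illustrative only; the paper's actual calibration pipeline is not shown here):

```python
import numpy as np

def project(point_3d, K, R, t):
    """Project a 3D world point into pixel coordinates using an intrinsic
    matrix K and extrinsic rotation/translation (R, t), as in a calibrated
    multi-camera rig."""
    p_cam = R @ np.asarray(point_3d, dtype=float) + t  # world -> camera frame
    uvw = K @ p_cam                                    # homogeneous pixel coords
    return uvw[:2] / uvw[2]                            # perspective divide

# Example: a camera at the world origin looking down +Z,
# focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
```

A point on the optical axis, e.g. (0, 0, 2), projects to the principal point (320, 240) under these parameters.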
Figure 2: Definition of head pose. The head coordinate system is defined based on a triangle connecting three midpoints of the eyes and mouth (midpoints of 3D facial landmarks).
poses of the subjects. As illustrated in Fig. 2, the head coordinate
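Constructing an orthonormal head coordinate frame from the triangle of three facial midpoints can be sketched as follows. The point names and axis conventions below are assumptions for illustration; the paper defines the frame from this triangle but its exact axis choices are not reproduced here:

```python
import numpy as np

def head_coordinate_frame(p_right_eye, p_left_eye, p_mouth):
    """Build a right-handed orthonormal frame from three 3D midpoints
    (right eye, left eye, mouth). Returns a 3x3 rotation matrix whose
    columns are the x, y, z axes, plus the triangle centroid as origin."""
    p_r, p_l, p_m = (np.asarray(p, dtype=float)
                     for p in (p_right_eye, p_left_eye, p_mouth))
    x = p_l - p_r                       # x-axis: from right eye to left eye
    x /= np.linalg.norm(x)
    z = np.cross(x, p_m - p_r)          # z-axis: normal of the triangle plane
    z /= np.linalg.norm(z)
    y = np.cross(z, x)                  # y-axis completes the right-handed frame
    origin = (p_r + p_l + p_m) / 3.0
    return np.stack([x, y, z], axis=1), origin
```

The resulting matrix is a proper rotation (orthonormal columns, determinant +1), so it can directly serve as the head pose rotation relative to the camera coordinate system.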
Y. Sugano et al., "Learning-by-Synthesis for Appearance-Based 3D Gaze Estimation," in Proc. of CVPR, 2014.
A. Bulling et al., "Wearable EOG goggles: Eye-based interaction in everyday environments," in Proc. of CHI, 2009.