concentrated around 1, then we can prevent the exploding/vanishing gradients. [Pennington, Schoenholz, Ganguli, AISTATS2018, Benoit Collins & TH, CIMP2022] If we set the initialization of parameters to be Haar orthgonal and choose appropriate activation function, then we can make the DNN to achieve the dynamical isometry. 19