
FWNetAE: Spatial Representation Learning for Full Waveform Data Using Deep Learning

Artificial Intelligence for 3D Big Spatial Data Processing (AI3D 2019), Co-located with IEEE ISM 2019, San Diego, California, USA, December 9-11, 2019

teddy

December 09, 2019

Transcript

  1. FWNetAE: Spatial Representation Learning for Full Waveform Data Using Deep Learning
     Takayuki Shinohara, Haoyi Xiu and Masashi Matsuoka, Tokyo Institute of Technology
     2019.12.09, Wyndham San Diego Bay Side, San Diego, California, USA
     Second International Workshop on Artificial Intelligence for 3D Big Spatial Data Processing (AI3D 2019)
  2. Outline
     1. Background and Objective
     2. Related Study
     3. Proposed Method
     4. Experimental Results
     5. Conclusion and Future Study
  3. Airborne laser scanning (ALS)
     Applications:
     • Digital Terrain Models (DTMs), Digital Surface Models (DSMs)
     • Urban planning
     • Natural disaster management
     • Forestry
     • Facility monitoring
     Laser scanners:
     • Scanners recording only the first and last pulse
     • Multi-pulse laser scanners
     • Full-waveform laser scanners
  4. Full-waveform laser scanners
     Advantages:
     • Record the entire reflected signal (the "waveform") as discrete samples.
     • Provide not only 3D point clouds but also additional information about target properties: the shape and power of the backscattered waveform are related to the geometry and reflection characteristics of the surface.
     (Figure cited from "Urban land cover classification using airborne LiDAR data: A review")
  5. Automatic analysis methods
     Manual processing is costly and time consuming:
     • ALS presents significant advantages for large-area observation.
     • Manually extracting spatial information from point clouds and their waveforms is costly and time consuming.
     ⇒ Automatic data analysis methods are necessary.
     Automatic analysis methods for full-waveform data fall into two groups:
     ‣ 3D point clouds with hand-crafted waveform features
     ‣ Raw waveforms
     In this study, we investigate raw waveform analysis.
  6. Raw waveform analysis
     Self-organizing map (E. Maset et al., 2015):
     • The first data-driven feature extraction method.
     Deep learning (S. Zorzi et al., 2019):
     • The first method combining raw waveforms with a 2D grid.
     The main limitation:
     • Each waveform is learned individually.
     ‣ It is difficult to deal with spatially irregular data.
     ‣ Most deep learning methods are developed for regular data such as images, audio, and language.
     Research question: Can deep neural networks learn spatially irregular full-waveform LiDAR data?
  7. To address this question
     Dealing with spatially irregular data:
     • PointNet (Qi et al., 2017)
     ‣ A deep learning method for spatially irregular data.
     Investigating the power of deep learning:
     • Autoencoder (AE)
     ‣ A representation learning architecture.
     ‣ A data-driven feature extraction method.
     Objective: to build a new representation learning method for spatially irregular raw full-waveform data using deep learning.
  8. Representation learning
     What is representation learning?
     • Discovering representations from raw data automatically.
     Autoencoder (AE):
     • An architecture that learns to reconstruct its input.
     • The input is reconstructed via a latent vector.
     • The latent vector holds low-dimensional, essential information.
     If training is successful, the bottleneck layer provides a low-dimensional latent vector for each input.
     [Figure: input → encode → latent vector → decode → reconstructed input]
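A minimal sketch of this autoencoder idea in PyTorch; the layer sizes here are illustrative, not those of FWNetAE:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Minimal autoencoder: compress the input to a latent vector, then reconstruct."""
    def __init__(self, in_dim=62, latent_dim=16):
        super().__init__()
        # Encoder: squeeze the input down to a low-dimensional latent vector.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, latent_dim),
        )
        # Decoder: expand the latent vector back to the input dimension.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)          # latent vector (bottleneck)
        return self.decoder(z), z    # reconstruction and latent code

# Training minimizes the reconstruction error, e.g. MSE:
# loss = ((x_hat - x) ** 2).mean()
```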
  9. Full-waveform data analysis
     Point cloud data with hand-crafted features:
     • Rule-based methods.
     • Hand-crafted features with classic machine learning.
     ⇒ Full-waveform laser scanners are highly advantageous for point cloud classification, but these approaches rely on hand-crafted features fed to statistical classifiers or simple machine-learning algorithms.
     Raw waveform data:
     • Unsupervised classification
     ‣ Self-organizing map
     • Supervised classification
     ‣ Deep learning-based approaches
  10. Deep learning-based approach
     Two-stage classification (S. Zorzi et al., 2019):
     • 1st stage: waveform analysis by a 1D CNN
     ‣ Features are extracted for each waveform individually.
     ⁃ High misclassification rate.
     • 2nd stage: spatial analysis by a 2D CNN
     ‣ Raw classification results are converted to grid data with height information.
     ‣ The class of each grid cell is predicted.
     Individual waveform learning is ineffective; grid-based spatial learning is effective.
     ⇒ A spatial learning method for raw waveform data is needed.
  11. Spatial deep learning method
     PointNet (Qi et al., 2017):
     • The first method for raw point cloud data (sets of x, y, z).
     • PointNet can deal with spatially irregular data.
     We extended it to spatially irregular raw full-waveform data analysis.
  12. Problem definition
     Reconstruction of the input data using an autoencoder:
     • Encoder: maps full-waveform data to a latent vector z.
     • Decoder: transforms the latent vector z back into the input data.
     [Figure: input data, where each (x, y) location has a waveform → encoder → latent vector z → decoder → reconstructed data. The goal is to reconstruct the spatial distribution and its waveforms.]
  13. Proposed network (FWNetAE)
     An autoencoder for full-waveform data:
     • Encoder: PointNet-based
     • Decoder: simple multilayer perceptron (MLP)
     [Architecture figure: input full-waveform LiDAR data (2,048 × 62) → 1D conv feature layers with T-nets → max pool → latent vector → MLP → output full-waveform LiDAR data (2,048 × 62)]
  14. PointNet-based encoder 1/3
     The first block:
     • Computes local features for each input point.
     • Built from several 1D convolution layers, as sketched below.
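A hedged sketch of such a block: pointwise (kernel size 1) 1D convolutions that lift each of the 2,048 input vectors (62 values: x, y, and the waveform samples) to a higher-dimensional local feature. The channel sizes are assumptions, not the paper's exact values.

```python
import torch
import torch.nn as nn

# Pointwise 1D convolutions: each of the N=2,048 inputs is processed
# independently, so the block is insensitive to point ordering.
local_features = nn.Sequential(
    nn.Conv1d(62, 64, kernel_size=1), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Conv1d(64, 128, kernel_size=1), nn.BatchNorm1d(128), nn.ReLU(),
)

x = torch.randn(8, 62, 2048)   # (batch, channels = x, y, waveform; points)
f = local_features(x)          # (8, 128, 2048): one feature vector per point
```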
  15. PointNet-based encoder 2/3
     The second block:
     • T-nets make the points independent of rigid transformations.
     • This yields features that are robust to object orientation (a sketch follows below).
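A sketch of a T-net in the spirit of PointNet: a small network regresses a K×K transformation matrix that is applied to the point features, biased toward the identity. The dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TNet(nn.Module):
    """Predicts a KxK transform that aligns the input features."""
    def __init__(self, k=64):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv1d(k, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, k * k),
        )

    def forward(self, x):                      # x: (batch, k, n_points)
        g = self.mlp(x).max(dim=2).values      # global summary, (batch, 1024)
        m = self.fc(g).view(-1, self.k, self.k)
        # Bias toward the identity so early training leaves features unchanged.
        m = m + torch.eye(self.k, device=x.device)
        return torch.bmm(m, x)                 # transformed features
```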
  16. PointNet-based encoder 3/3
     The third block:
     • Computes a global feature (the latent vector) over all the points with a max pooling layer used as a symmetric function, as illustrated below.
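Max pooling over the point axis is what makes the encoder order-invariant: any permutation of the 2,048 points yields the same global vector. A minimal illustration (shapes are illustrative):

```python
import torch

f = torch.randn(8, 1024, 2048)   # per-point features (batch, dim, points)
latent = f.max(dim=2).values     # (8, 1024): global feature / latent vector

# Permuting the points leaves the pooled vector unchanged:
perm = torch.randperm(2048)
assert torch.equal(latent, f[:, :, perm].max(dim=2).values)
```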
  17. Decoder
     MLP-based architecture:
     • Fully connected layers produce reconstructed data with the same dimensions as the input (a sketch follows below).
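A sketch of such a decoder: fully connected layers that expand the latent vector back to the full 2,048 × 62 output. The hidden sizes are assumptions.

```python
import torch
import torch.nn as nn

class MLPDecoder(nn.Module):
    """Expands the latent vector back to n_points x 62 values (x, y, waveform)."""
    def __init__(self, latent_dim=1024, n_points=2048, point_dim=62):
        super().__init__()
        self.n_points, self.point_dim = n_points, point_dim
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 2048), nn.ReLU(),
            nn.Linear(2048, 4096), nn.ReLU(),
            nn.Linear(4096, n_points * point_dim),
        )

    def forward(self, z):                     # z: (batch, latent_dim)
        out = self.net(z)
        return out.view(-1, self.n_points, self.point_dim)
```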
  18. Loss function
     Loss function for reconstruction:
     • FWNetAE aims to reconstruct the target data, given the latent vector produced by encoding the input.
     • Spatial matching loss:
       \( L_{\text{spatial}} = \frac{1}{2N} \sum_{i=1}^{N} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \)
     • Waveform reconstruction loss:
       \( L_{\text{waveform}} = \frac{1}{2N} \sum_{i=1}^{N} \sum_{j=1}^{T} (w_{i,j} - \hat{w}_{i,j})^2 \)
     where N is the number of points and T is the number of waveform samples per point. The optimization process minimizes these functions.
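In code, the two terms reduce to mean squared errors over the coordinate channels and the waveform channels. This is a sketch; the column layout [x, y, w_1..w_60] and the summed combination of the two terms are assumptions:

```python
import torch

def fwnetae_loss(pred, target):
    """pred, target: (batch, n_points, 62) with columns [x, y, w_1..w_60]."""
    # Spatial matching term: squared error on the x, y coordinates.
    spatial = 0.5 * ((pred[..., :2] - target[..., :2]) ** 2).sum(-1).mean()
    # Waveform reconstruction term: squared error on the waveform samples.
    waveform = 0.5 * ((pred[..., 2:] - target[..., 2:]) ** 2).sum(-1).mean()
    return spatial + waveform
```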
  19. Dataset
     Dublin City Dataset:
     • Published by NYU.
     • Point density: about 300 points/m².
     • Used area: one of the flight paths.
     Training data:
     • Sample sizes: train / val / test = 300,000 / 100,000 / 100,000.
     • Input dimension: 2,048 × 62 (x, y, and the waveform samples).
     ‣ We selected data containing 60 returns to simplify the problem.
     [Figure: the data used, observed from one flight path]
  20. Reconstruction results
     Spatial reconstruction:
     • A matching shape was observed.
     • The mean error over all test data was 0.051 (normalized value).
     [Figure includes a failure case.]
  21. Reconstruction results
     Waveform reconstruction:
     • A matching shape was observed.
     • The mean error over all test data was 0.29.
     [Figure includes a failure case.]
  22. Latent space visualization
     Comparison of methods: PCA (without learning), a nonspatial AE (learnable), and the proposed method (learnable).
     ⇒ Spatial learning is effective for feature extraction.
  23. Conclusion and future study
     Conclusion:
     • This paper presented FWNetAE, a novel representation learning method for spatially distributed full-waveform data observed by ALS, based on an AE architecture.
     • The results demonstrated generalization to unseen test data.
     • Moreover, FWNetAE encoded a meaningful latent vector, and the decoder reconstructed both the spatial geometry and the waveform values from it.
     • However, the PointNet-based encoder cannot handle varying input dimensions or extract features at multiple resolutions.
     Future study:
     • Modern hierarchical learning: PointNet++, Dynamic Graph CNN.
     • Application to supervised learning.
  24. Making input data
     k-NN:
     • K depends on GPU memory.
     ‣ In this study, K is 2,048.
     ‣ With a larger K we can consider a larger context.
     ⁃ Context is very important.
     A sketch of this neighborhood construction follows below.
     [Figure: random sample vs. near (k-NN) sample]
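A sketch of building one input sample by taking the K nearest neighbors of a seed point in the x-y plane. Pure NumPy; the function name and column layout are illustrative assumptions:

```python
import numpy as np

def knn_patch(points, seed_idx, k=2048):
    """points: (M, 62) array with columns [x, y, waveform...].
    Returns the k points spatially nearest to the seed as one input sample."""
    # Squared planar distance from the seed to every point (seed included).
    d2 = ((points[:, :2] - points[seed_idx, :2]) ** 2).sum(axis=1)
    nearest = np.argpartition(d2, k)[:k]   # indices of the k closest points
    return points[nearest]                 # (k, 62) patch around the seed
```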
  25. Spatial deep learning methods
     Euclidean data:
     • Images
     • Audio signals
     • Natural language
     Non-Euclidean data:
     • Graphs
     • Point clouds
     Full-waveform LiDAR data is one of the non-Euclidean data types.
  26. T-Net
     Transform:
     • If the same object is input at different orientations, rotation is difficult to deal with.
     ⇒ The T-net can provide rotation-invariant features.