FWNetAE: Spatial Representation Learning for Full Waveform Data Using Deep Learning

Takayuki Shinohara, Haoyi Xiu and Masashi Matsuoka
Tokyo Institute of Technology
2019.12.09, Wyndham San Diego Bay Side, San Diego, California, USA
Second International Workshop on Artificial Intelligence for 3D Big Spatial Data Processing (AI3D 2019)

Outline
1. Background and Objective
2. Related Study
3. Proposed Method
4. Experimental Results
5. Conclusion and Future Study

Background and Objective

Airborne laser scanning (ALS)
■ Applications
  • Digital Terrain Models (DTMs), Digital Surface Models (DSMs)
  • Urban planning
  • Natural disaster management
  • Forestry
  • Facility monitoring
■ Laser scanners
  • First-and-last-pulse laser scanner
  • Multi-pulse laser scanner
  • Full-waveform laser scanner

Full-waveform laser scanners
■ Advantages
  • Record the entire reflected signal (the “waveform”) as discrete samples.
  • Provide not only 3D point clouds but also additional information about the target properties.
The shape and power of the backscattered waveform are related to the geometry and reflection characteristics of the surface.
[Figure cited from “Urban land cover classification using airborne LiDAR data: A review”]

Automatic analysis method
■ Manual processing is costly and time consuming
  • ALS has significant advantages for large-area observation.
  • Manually extracting spatial information from point clouds and their waveforms is costly and time consuming.
  => Automatic data analysis methods are necessary.
■ Automatic analysis methods for full-waveform data
  • Divided into two approaches
    ‣ 3D point clouds with handcrafted waveform features
    ‣ Raw waveforms
In this study, we investigate raw waveform analysis.

Raw waveform analysis
■ Self-organizing map (E. Maset et al., 2015)
  • The first data-driven feature extraction method.
■ Deep learning (Zorzi et al., 2019)
  • The first method to combine raw waveforms with a 2D grid.
The main limitation:
  • Each waveform is learned individually.
    ‣ It is difficult to deal with spatially irregular data.
    ‣ Most deep learning methods are developed for regular data such as images, audio and language.
Research Question: Can deep neural networks learn from spatially irregular full-waveform LiDAR data?

To address the question
■ Dealing with spatially irregular data
  • PointNet (Qi et al., 2017)
    ‣ A deep learning method for spatially irregular data.
■ Investigating the power of deep learning methods
  • Autoencoder
    ‣ A representation learning method.
    ‣ A data-driven feature extraction method.
Objective: To develop a new deep-learning-based representation learning method for spatially irregular raw full-waveform data.

Related Study

Representation learning
■ What is representation learning?
  • Automatically discovering representations from raw data.
■ Autoencoder (AE)
  • An AE is an architecture that learns to reconstruct its input.
  • The input is reconstructed via a latent vector.
  • The latent vector holds the essential information in a low dimension.
If training is successful, the bottleneck layer provides a low-dimensional latent vector for each input.
[Figure: Input → Encode → Latent Vector → Decode → Reconstructed Input]
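As a minimal formal sketch of the autoencoder idea on this slide, the encoder and decoder can be written as two functions trained jointly to reproduce the input. The symbols f, g, θ, φ and the squared-error objective are illustrative assumptions and do not come from the slides.

```latex
z = f_{\theta}(x), \qquad \hat{x} = g_{\phi}(z), \qquad
\min_{\theta,\phi} \; \frac{1}{N} \sum_{i=1}^{N}
  \bigl\lVert x_i - g_{\phi}\bigl(f_{\theta}(x_i)\bigr) \bigr\rVert^{2},
\qquad \dim(z) \ll \dim(x)
```

The bottleneck condition dim(z) ≪ dim(x) is what forces z to keep only the information needed for reconstruction.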

Full-waveform data analysis
■ Point cloud data with handcrafted features
  • Rule-based
  • Handcrafted features and classic machine learning
  => Full-waveform laser scanners are highly advantageous for point cloud classification. These methods rely on handcrafted features that are fed to statistical classifiers or simple machine-learning algorithms.
■ Raw waveform data
  • Unsupervised classification
    ‣ Self-organizing map
  • Supervised classification
    ‣ Deep learning-based approach

Deep learning-based approach
■ Two-stage classification (S. Zorzi et al., 2019)
  • 1st stage: waveform analysis by 1D CNN
    ‣ Features are extracted from each waveform individually.
      ⁃ Many misclassifications.
  • 2nd stage: spatial analysis by 2D CNN
    ‣ The raw classification results are converted to grid data with height information.
    ‣ The class of each grid cell is predicted.
Individual waveform learning is ineffective; grid-based spatial learning is effective.
=> A spatial learning method for raw waveform data is needed.
[Figure panels: Waveform, Spatial]

Spatial deep learning method
■ PointNet (Qi et al., 2017)
  • The first deep learning method for raw point cloud (set of x, y, z) data.
  • PointNet can deal with spatially irregular data.
We extend it to the analysis of spatially irregular raw full-waveform data.
[Figure: PointNet (Qi et al., 2017)]

Proposed Method

Problem definition
■ Reconstruction of input data using an autoencoder
  • Encoder: transforms the full-waveform data into a latent vector z
  • Decoder: transforms the latent vector z back into the input data
Reconstruct the spatial distribution and its waveforms.
[Figure: Input data (each x, y location has an input waveform) → Encoder → Latent vector z → Decoder → Reconstructed data (a reconstructed waveform at each x, y)]

Proposed network (FWNetAE)
■ Autoencoder for full-waveform data
  • Encoder: PointNet based
  • Decoder: simple multilayer perceptron (MLP)
[Figure: FWNetAE architecture. Input full-waveform LiDAR data (2,048 × 62) → PointNet-based encoder (T-nets, 1D Conv, features, Max Pool) → Latent vector → Decoder (MLP) → Output full-waveform LiDAR data (2,048 × 62)]

PointNet-based encoder 1/3
■ The first block
  • Computes local features for each data point.
  • Consists of several 1D convolution layers.
[Figure: FWNetAE architecture]

PointNet-based encoder 2/3
■ The second block
  • T-nets make the points invariant to rigid transformations.
  • This yields features that are robust to the object angle.
[Figure: FWNetAE architecture]

PointNet-based encoder 3/3
■ The third block
  • Computes global features (the latent vector) over all the data, using a max pooling layer as a symmetric function.
[Figure: FWNetAE architecture]
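The role of max pooling as a symmetric function can be illustrated with a small sketch. It assumes the per-point layers act like a shared linear map followed by ReLU (a 1D convolution with kernel size 1 behaves this way; the kernel size used in FWNetAE is not stated on the slides). All names and toy sizes are hypothetical.

```python
import random


def per_point_features(point, weights):
    """Apply the same linear map + ReLU to one point (weights shared by all points)."""
    return [max(0.0, sum(w * v for w, v in zip(row, point))) for row in weights]


def encode(points, weights):
    """Per-point features followed by channel-wise max pooling."""
    features = [per_point_features(p, weights) for p in points]
    # zip(*features) groups values by feature channel; max ignores point order
    return [max(channel) for channel in zip(*features)]


random.seed(0)
# Toy sizes (assumptions): 5 points with 4 values each, 3 feature channels
points = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
weights = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]

latent = encode(points, weights)
shuffled = random.sample(points, k=len(points))  # same points in a new order
assert encode(shuffled, weights) == latent       # the global feature is unchanged
```

Because the pooled vector does not depend on the order of the points, the encoder can consume an unordered, spatially irregular set.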

Decoder
■ MLP-based architecture
  • Fully connected layers produce reconstructed data with the same shape as the input data.
[Figure: FWNetAE architecture]
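A minimal sketch of an MLP-style decoder follows: fully connected layers expand a latent vector into a flat vector that is then split into one row per point. The layer count, hidden width, initialization and toy sizes are assumptions for illustration; the slides only state that the output matches the 2,048 × 62 input.

```python
import random


def dense(vector, weights, biases, relu=True):
    """One fully connected layer; weights holds one row per output unit."""
    out = [sum(w * v for w, v in zip(row, vector)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def decode(latent, layers, n_points, n_values):
    """Map a latent vector to an n_points x n_values array."""
    h = latent
    for i, (w, b) in enumerate(layers):
        h = dense(h, w, b, relu=(i < len(layers) - 1))  # last layer stays linear
    return [h[p * n_values:(p + 1) * n_values] for p in range(n_points)]


rng = random.Random(0)
LATENT, HIDDEN, N_POINTS, N_VALUES = 8, 16, 4, 6  # toy sizes, not the real ones
layers = [init_layer(LATENT, HIDDEN, rng),
          init_layer(HIDDEN, N_POINTS * N_VALUES, rng)]
latent = [rng.uniform(-1, 1) for _ in range(LATENT)]
reconstruction = decode(latent, layers, N_POINTS, N_VALUES)
```

Each row of `reconstruction` plays the role of one point (x, y, waveform values) in the reconstructed data.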

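Loss function
■ Loss function for reconstruction
  • FWNetAE aims to reconstruct the data $\hat{X}$ from the latent vector $z$ produced by encoding the input data $X$.
  • Spatial matching loss
    $$L_{space} = \frac{1}{2N} \sum_{i=1}^{N} \left( (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right) \qquad (1)$$
  • Waveform reconstruction loss
    $$L_{waveform} = \frac{1}{2N} \sum_{i=1}^{N} \sum_{j=1}^{M} (w_{i,j} - \hat{w}_{i,j})^2 \qquad (2)$$
  Here $N$ is the number of points, $(x_i, y_i)$ the location of point $i$, $w_{i,j}$ the $j$-th waveform sample of point $i$, and $M$ the number of waveform samples per point; hats denote reconstructed values.
Optimization process: minimize these functions.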
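The two reconstruction losses on this slide can be sketched in a few lines, assuming each is a sum of squared errors scaled by 1/(2N), where N is the number of points. The function names and toy values are hypothetical.

```python
def spatial_loss(xy, xy_hat):
    """1/(2N) times the summed squared x and y errors over N points."""
    n = len(xy)
    return sum((x - xh) ** 2 + (y - yh) ** 2
               for (x, y), (xh, yh) in zip(xy, xy_hat)) / (2 * n)


def waveform_loss(waves, waves_hat):
    """1/(2N) times the summed squared errors over all waveform samples."""
    n = len(waves)
    return sum((w - wh) ** 2
               for row, row_hat in zip(waves, waves_hat)
               for w, wh in zip(row, row_hat)) / (2 * n)


# Two points; only the first point's y value and third waveform sample differ
xy = [(0.0, 0.0), (1.0, 1.0)]
xy_hat = [(0.0, 1.0), (1.0, 1.0)]
waves = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
waves_hat = [[1.0, 2.0, 1.0], [0.0, 0.0, 0.0]]

print(spatial_loss(xy, xy_hat))         # 0.25
print(waveform_loss(waves, waves_hat))  # 1.0
```

How the two terms are weighted or combined during optimization is not stated on the slide, so the sketch keeps them separate.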

Experimental Results

Dataset
■ Dublin City Dataset
  • Published by NYU
  • Point density
    ‣ About 300 points/m²
  • Used area
    ‣ One of the flight paths
■ Training data
  • Sample size
    ‣ Train, Val, Test: 300,000, 100,000, 100,000
  • Input dimension: 2,048 × 62 (x, y, waveform)
    ‣ To simplify the problem, we selected data with 60 returns.
[Figure: Data observed from the flight path used]

Reconstruction results
■ Spatial reconstruction
  • A matching shape was observed.
  • The mean error over all test data was 0.051 (normalized value).
[Figure: spatial reconstruction examples, including a failure case]

Reconstruction results
■ Waveform reconstruction
  • A matching shape was observed.
  • The mean error over all test data was 0.29.
[Figure: waveform reconstruction examples, including a failure case]

Latent space visualization
■ Comparison of several methods
[Figure: latent-space plots for PCA, Nonspatial AE and the proposed method; labels: Without learning, Learnable function]
A spatial learning method is effective for feature extraction.

Conclusion and Future Study

Conclusion and future study
■ Conclusion
  • This paper presents a novel representation learning method for spatially distributed full-waveform data observed from an ALS, using an AE-based architecture called FWNetAE.
  • The results demonstrate the generalization error on unseen test data.
  • Moreover, FWNetAE encoded a meaningful latent vector, and the decoder reconstructed the spatial geometry and its waveform values from the encoded latent vector.
  • However, the PointNet-based encoder cannot deal with varying input dimensions or extract features at multiple resolutions.
■ Future Study
  • Modern hierarchical learning: PointNet++, Dynamic Graph CNN
  • Application to supervised learning

Supplemental

Making input data
■ k-NN
  • K depends on GPU memory.
    ‣ In this study, K is 2,048.
    ‣ With a large K, a larger context can be considered.
      ⁃ Context is very important.
[Figure labels: Random sample, Near sample]
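One plausible reading of this slide is that each input patch is built by picking a random point and gathering its K nearest neighbours in the x, y plane. The sketch below follows that reading with a brute-force search; the function name, the toy cloud and the small K are assumptions (the slides use K = 2,048).

```python
import math
import random


def knn_sample(points, center_index, k):
    """Return the k points nearest (in x, y) to points[center_index]."""
    cx, cy = points[center_index][:2]
    ranked = sorted(points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    return ranked[:k]  # the centre itself comes first (distance 0)


rng = random.Random(0)
# Toy cloud (assumption): 100 points, each [x, y, w1, w2, w3]
cloud = [[rng.random(), rng.random()] + [rng.random() for _ in range(3)]
         for _ in range(100)]

center = rng.randrange(len(cloud))      # the "random sample"
patch = knn_sample(cloud, center, k=8)  # its "near samples"
```

Sorting the whole cloud for every query is only practical for small examples; for a real ALS strip a spatial index such as a k-d tree would likely be preferable.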

Spatial deep learning method
■ Euclidean data
  • Image
  • Audio signal
  • Natural language
■ Non-Euclidean data
  • Graph
  • Point cloud
Full-waveform LiDAR data are one kind of non-Euclidean data.

T-Net
■ Transform
  • Even when the same object is input, it is difficult to deal with rotation.
  => T-net can provide rotation-invariant features.
