Data rectangling in R: a journey from JSON to CSV

Muhammad Aswan Syahputra

August 16, 2019

Transcript

  1. Data rectangling: a journey from JSON to CSV

    Muhammad Aswan Syahputra
  2. About me

    Sensory scientist @ Sensolution.ID
    Instructor @ R Academy Telkom University
    Initiator of Komunitas R Indonesia (est. 13 August 2016)
    sensehubr, nusandata, bandungjuara, prakiraan, etc.
    sensehub, thermostats, aquastats, bcrp, bandungjuara, etc.
    aswansyahputra, aswansyahputra_, aswansyahputra, aswansyahputra
  3. Know your neighbours!

    Who are you? What do you do with data? How do you describe your experience with R?

    03:00
  5. Let's play with some basics!

        (x1 <- "useR! Yogyakarta")
        #> [1] "useR! Yogyakarta"
        (x2 <- TRUE)
        #> [1] TRUE
        (x3 <- 1.43)
        #> [1] 1.43
        (x4 <- 1L:5L)
        #> [1] 1 2 3 4 5

    Can you guess the type of x1, x2, x3, and x4? How about their length?

        typeof(x1)
        #> [1] "character"
        length(x1)
        #> [1] 1
        typeof(x2)
        #> [1] "logical"
        length(x2)
        #> [1] 1
        typeof(x3)
        #> [1] "double"
        length(x3)
        #> [1] 1

    What about x4?
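The open question about x4 can be checked the same way as the other three. A minimal sketch, re-creating x4 so the snippet runs on its own (the helper names type_x4 and length_x4 are only for illustration):

```r
# Re-create x4 exactly as it is defined on the slide.
x4 <- 1L:5L

type_x4 <- typeof(x4)    # storage type of the vector
length_x4 <- length(x4)  # number of elements

type_x4
#> [1] "integer"
length_x4
#> [1] 5
```

Unlike x1, x2, and x3, this one is longer than 1, which is what makes the next step interesting.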
  7. The use of c()

    How to combine x1, x2, x3, and x4 without losing their properties?

        (xs_c <- c(x1, x2, x3, x4))
        #> [1] "useR! Yogyakarta" "TRUE"             "1.43"
        #> [4] "1"                "2"                "3"
        #> [7] "4"                "5"

    It seems off, doesn't it? Can you explain? Let's check it!

        length(xs_c)
        #> [1] 8
        typeof(xs_c)
        #> [1] "character"

    length ❌, type ❓ What's happening?
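What's happening is implicit coercion: c() builds an atomic vector, and an atomic vector can hold only one type. A small sketch of the rule, using throwaway names (t1, t2, t3, n) that are not part of the slides:

```r
# R coerces to the most flexible type present:
# logical -> integer -> double -> character.
t1 <- typeof(c(TRUE, 1L))                 # logical + integer
t2 <- typeof(c(TRUE, 1L, 1.43))           # ... + double
t3 <- typeof(c(TRUE, 1L, 1.43, "useR!"))  # ... + character

t1
#> [1] "integer"
t2
#> [1] "double"
t3
#> [1] "character"

# c() also flattens: the five integers of x4 are spliced in one by one,
# so the result has 1 + 1 + 1 + 5 = 8 elements.
xs_c <- c("useR! Yogyakarta", TRUE, 1.43, 1L:5L)
n <- length(xs_c)
n
#> [1] 8
```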
  9. The use of list()

        (xs_list <- list(x1, x2, x3, x4))
        #> [[1]]
        #> [1] "useR! Yogyakarta"
        #>
        #> [[2]]
        #> [1] TRUE
        #>
        #> [[3]]
        #> [1] 1.43
        #>
        #> [[4]]
        #> [1] 1 2 3 4 5

    Hmm, not so familiar, but it seems to be what we wanted, right? Let's also check it!

        length(xs_list)
        #> [1] 4
        typeof(xs_list)
        #> [1] "list"

    length ✅, type ❓ What is a list?
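One way to answer "what is a list?": it is still a vector, just not an atomic one, so each element can keep its own type and length. A rough sketch (the names is_vec, is_atom, sub_single, and sub_double are made up for this example):

```r
xs_list <- list("useR! Yogyakarta", TRUE, 1.43, 1L:5L)

is_vec  <- is.vector(xs_list)  # a list counts as a vector
is_atom <- is.atomic(xs_list)  # but it is not an atomic vector

# Single brackets keep the list wrapper; double brackets unwrap the element.
sub_single <- xs_list[4]
sub_double <- xs_list[[4]]

typeof(sub_single)
#> [1] "list"
typeof(sub_double)
#> [1] "integer"
length(sub_double)
#> [1] 5
```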
  11. Hold on!

    How to check if the type and length of each element are preserved?
  12. The good old for loop

        types_xs_c <- vector("character", length = length(xs_c))
        for (i in seq_along(xs_c)) {
          types_xs_c[[i]] <- typeof(xs_c[[i]])
        }
        types_xs_c
        #> [1] "character" "character" "character" "character" "character" "character"
        #> [7] "character" "character"

        lengths_xs_c <- vector("integer", length = length(xs_c))
        for (i in seq_along(xs_c)) {
          lengths_xs_c[[i]] <- length(xs_c[[i]])
        }
        lengths_xs_c
        #> [1] 1 1 1 1 1 1 1 1

    How would you perform the same procedure for xs_list? Save your results as types_xs_list and lengths_xs_list!
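One possible answer to the exercise, following the same pattern as the loops for xs_c. This is a sketch rather than the official solution; xs_list is re-created here so the snippet runs on its own:

```r
xs_list <- list("useR! Yogyakarta", TRUE, 1.43, 1L:5L)

# Pre-allocate the output, then fill it element by element.
types_xs_list <- vector("character", length = length(xs_list))
for (i in seq_along(xs_list)) {
  types_xs_list[[i]] <- typeof(xs_list[[i]])
}
types_xs_list
#> [1] "character" "logical"   "double"    "integer"

lengths_xs_list <- vector("integer", length = length(xs_list))
for (i in seq_along(xs_list)) {
  lengths_xs_list[[i]] <- length(xs_list[[i]])
}
lengths_xs_list
#> [1] 1 1 1 5
```

This time the types and lengths of the original objects survive.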
  15. Let me introduce you to functionals

        vapply(xs_c, typeof, character(1), USE.NAMES = FALSE)
        #> [1] "character" "character" "character" "character" "character" "character"
        #> [7] "character" "character"
        vapply(xs_c, length, integer(1), USE.NAMES = FALSE)
        #> [1] 1 1 1 1 1 1 1 1

        vapply(xs_list, typeof, character(1), USE.NAMES = FALSE)
        #> [1] "character" "logical"   "double"    "integer"
        vapply(xs_list, length, integer(1), USE.NAMES = FALSE)
        #> [1] 1 1 1 5

    OK, it surely looks simpler, but still...

        library(purrr)
        map_chr(xs_list, typeof)
        #> [1] "character" "logical"   "double"    "integer"
        map_int(xs_list, length)
        #> [1] 1 1 1 5

    So much simpler and better, isn't it?
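Why bother with the third argument of vapply() at all? It is a template for each result, and it makes the call fail loudly instead of silently returning something unexpected. A base R sketch of that behaviour (the names empty_sapply, empty_vapply, and mismatch are only for illustration):

```r
xs_list <- list("useR! Yogyakarta", TRUE, 1.43, 1L:5L)

# sapply() guesses the output type; on empty input it falls back to a list.
empty_sapply <- sapply(list(), length)
typeof(empty_sapply)
#> [1] "list"

# vapply() always returns the type promised by its template.
empty_vapply <- vapply(list(), length, integer(1))
typeof(empty_vapply)
#> [1] "integer"

# A wrong template is an error, not a silent conversion:
# length() returns integers, but character(1) promises strings.
mismatch <- tryCatch(
  vapply(xs_list, length, character(1)),
  error = function(e) "error caught"
)
mismatch
#> [1] "error caught"
```

The typed map_*() variants in purrr follow the same idea: the suffix states the output type up front.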
  20. list resembles JSON very much!

    Have a look at the following comparison using a subset of the billionaires data.
  21. Raw JSON file

        cd data raw
        cat billionaires_small.json
        #> [
        #>   {
        #>     "wealth": {
        #>       "worth in billions": [3.6],
        #>       "how": {
        #>         "category": ["Traded Sectors"],
        #>         "from emerging": [true],
        #>         "industry": ["Consumer"],
        #>         "was political": [false],
        #>         "inherited": [true],
        #>         "was founder": [true]
        #>       },
        #>       "type": ["founder non finance"]
        #>     },
        #>     "company": {
        #>       "sector": ["agricultural products"],
        #>       "founded": [1929],
        #>       "type": ["new"],
        #>       "name": ["J.R. Simplot Company"],
        #>       "relationship": ["founder"]
        #>     },

    When imported to R

        str(billionaires_small, max.level = 3)
        #> List of 3
        #>  $ :List of 7
        #>   ..$ wealth      :List of 3
        #>   .. ..$ worth in billions: num 3.6
        #>   .. ..$ how              :List of 6
        #>   .. ..$ type             : chr "founder
        #>   ..$ company     :List of 5
        #>   .. ..$ sector      : chr "agricultural
        #>   .. ..$ founded     : int 1929
        #>   .. ..$ type        : chr "new"
        #>   .. ..$ name        : chr "J.R. Simplot
        #>   .. ..$ relationship: chr "founder"
        #>   ..$ rank        : int 115
        #>   ..$ location    :List of 4
        #>   .. ..$ gdp         : num 1.06e+13
        #>   .. ..$ region      : chr "North America
        #>   .. ..$ citizenship : chr "United States
        #>   .. ..$ country code: chr "USA"
        #>   ..$ year        : int 2001
        #>   ..$ demographics:List of 2
        #>   .. ..$ gender: chr "male"
        #>   .. ..$ age   : int 92
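The slides do not show how the JSON file was read in. One way to get a nested list like the one above is jsonlite::fromJSON() with simplifyDataFrame = FALSE. That choice is an assumption, not necessarily what was used in the talk, and the JSON string below is a trimmed, hypothetical stand-in for billionaires_small.json:

```r
library(jsonlite)

# Hypothetical one-record stand-in for the real file (assumed field layout).
json_txt <- '[
  {
    "name": ["John Simplot"],
    "rank": [115],
    "wealth": {"worth in billions": [3.6]}
  }
]'

# simplifyDataFrame = FALSE keeps the array of records as a list of named
# lists, while one-element arrays such as [3.6] still collapse to vectors.
billionaires_small <- fromJSON(json_txt, simplifyDataFrame = FALSE)

str(billionaires_small)
```

For a file on disk, the same call accepts a path in place of the JSON string.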
  23. From a billionaire, extract info

    How to extract the element(s) of a list?

        library(purrr)
        pluck(billionaires_small, 1) # you can also
        #> $wealth
        #> $wealth$`worth in billions`
        #> [1] 3.6
        #>
        #> $wealth$how
        #> $wealth$how$category
        #> [1] "Traded Sectors"
        #>
        #> $wealth$how$`from emerging`
        #> [1] TRUE
        #>
        #> $wealth$how$industry
        #> [1] "Consumer"
        #>
        #> $wealth$how$`was political`
        #> [1] FALSE
        #>
        #> $wealth$how$inherited
        #> [1] TRUE

        pluck(billionaires_small, 1, "name") # you c
        #> [1] "John Simplot"
        pluck(billionaires_small, 1, "rank")
        #> [1] 115
        pluck(billionaires_small, 1, "wealth", "wort
        #> [1] 3.6
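For comparison, here is how pluck() relates to plain [[ indexing. This is a sketch on a hand-built, one-record list that only mimics the structure shown above; the object name billionaires and the missing "age" field are made up for the example:

```r
library(purrr)

# Hand-built stand-in for one record of billionaires_small.
billionaires <- list(
  list(
    name = "John Simplot",
    rank = 115L,
    wealth = list(`worth in billions` = 3.6)
  )
)

# pluck() walks the nesting one accessor at a time, like chained [[ ]].
via_pluck    <- pluck(billionaires, 1, "wealth", "worth in billions")
via_brackets <- billionaires[[1]][["wealth"]][["worth in billions"]]
identical(via_pluck, via_brackets)
#> [1] TRUE

# Missing pieces give NULL (or a chosen default) instead of an error,
# whereas billionaires[[2]] would fail with "subscript out of bounds".
missing_record  <- pluck(billionaires, 2, "name")
missing_default <- pluck(billionaires, 1, "age", .default = NA)
```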
  25. From some billionaires, extract info

        map(billionaires_small, pluck, "name")
        #> [[1]]
        #> [1] "John Simplot"
        #>
        #> [[2]]
        #> [1] "Banyong Lamsam"
        #>
        #> [[3]]
        #> [1] "Richard Farmer"

        map(billionaires_small, pluck, "wealth", "wo
        #> [[1]]
        #> [1] 3.6
        #>
        #> [[2]]
        #> [1] 2.5
        #>
        #> [[3]]
        #> [1] 1.8

    Awesome, map() provides a shortcut! Bye pluck()~

        (billionaire_names <- map_chr(billionaires_s
        #> [1] "John Simplot" "Banyong Lamsam" "Ri
        (billionaire_ranks <- map_int(billionaires_s
        #> [1] 115 143 272
        (billionaire_worth <- map_dbl(billionaires_s
        #> [1] 3.6 2.5 1.8
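The calls above are cut off on the slide. A sketch of what the shortcut looks like in full, on a hand-built list with the three names, ranks, and worths shown in the output (the object name billionaires and its exact structure are assumptions):

```r
library(purrr)

# Hand-built stand-in for billionaires_small (only the fields used here).
billionaires <- list(
  list(name = "John Simplot",   rank = 115L, wealth = list(`worth in billions` = 3.6)),
  list(name = "Banyong Lamsam", rank = 143L, wealth = list(`worth in billions` = 2.5)),
  list(name = "Richard Farmer", rank = 272L, wealth = list(`worth in billions` = 1.8))
)

# A string (or a list of strings for nested access) in place of a function
# tells map_*() to extract that element from every record.
billionaire_names <- map_chr(billionaires, "name")
billionaire_ranks <- map_int(billionaires, "rank")
billionaire_worth <- map_dbl(billionaires, list("wealth", "worth in billions"))

billionaire_names
#> [1] "John Simplot"   "Banyong Lamsam" "Richard Farmer"
billionaire_ranks
#> [1] 115 143 272
billionaire_worth
#> [1] 3.6 2.5 1.8
```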
  27. Of course you can combine them later using data.frame() or tibble(), but...

    Yay, we can extract some info! But now it is scattered.

        data.frame(
          name = billionaire_names,
          rank = billionaire_ranks,
          worth_in_billions = billionaire_worth,
          stringsAsFactors = FALSE
        )
        #>             name rank worth_in_billions
        #> 1   John Simplot  115               3.6
        #> 2 Banyong Lamsam  143               2.5
        #> 3 Richard Farmer  272               1.8

        library(tibble)
        tibble(
          name = billionaire_names,
          rank = billionaire_ranks,
          worth_in_billions = billionaire_worth
        )
        #> # A tibble: 3 x 3
        #>   name            rank worth_in_billions
        #>   <chr>          <int>             <dbl>
        #> 1 John Simplot     115               3.6
        #> 2 Banyong Lamsam   143               2.5
        #> 3 Richard Farmer   272               1.8
  29. Let's embrace list column

    Why don't we contain the list in dataframe/tibble in the first place?

        library(tibble)
        billionaires_small_df <- billionaires_small %>% enframe()
        billionaires_small_df
        #> # A tibble: 3 x 2
        #>    name value
        #>   <int> <list>
        #> 1     1 <named list [7]>
        #> 2     2 <named list [7]>
        #> 3     3 <named list [7]>

    Now we can make use of dplyr, ain't it cool?

        library(dplyr)
        billionaires_small_df %>%
          mutate(
            name = map_chr(value, "name"),
            rank = map_int(value, "rank"),
            worth_in_billions = map_dbl(value, list("wealth", "worth in billions"))
          ) %>%
          select(-value)
        #> # A tibble: 3 x 3
        #>   name            rank worth_in_billions
        #>   <chr>          <int>             <dbl>
        #> 1 John Simplot     115               3.6
        #> 2 Banyong Lamsam   143               2.5
        #> 3 Richard Farmer   272               1.8
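The title promises CSV, but the slides stop at the rectangular table. A possible last step, sketched in base R on a hand-built stand-in list. The names billionaires, billionaires_df, csv_path, and round_trip are made up here, and a temporary file is used so nothing is written into your project:

```r
# Hand-built stand-in for billionaires_small (only the fields used here).
billionaires <- list(
  list(name = "John Simplot",   rank = 115L, wealth = list(`worth in billions` = 3.6)),
  list(name = "Banyong Lamsam", rank = 143L, wealth = list(`worth in billions` = 2.5)),
  list(name = "Richard Farmer", rank = 272L, wealth = list(`worth in billions` = 1.8))
)

# Rectangle: one row per billionaire, one atomic column per field.
billionaires_df <- data.frame(
  name = vapply(billionaires, function(b) b[["name"]], character(1)),
  rank = vapply(billionaires, function(b) b[["rank"]], integer(1)),
  worth_in_billions = vapply(
    billionaires,
    function(b) b[["wealth"]][["worth in billions"]],
    numeric(1)
  ),
  stringsAsFactors = FALSE
)

# Write the rectangle to CSV, then read it back to check the round trip.
csv_path <- tempfile(fileext = ".csv")
write.csv(billionaires_df, csv_path, row.names = FALSE)
round_trip <- read.csv(csv_path, stringsAsFactors = FALSE)
```

Note that list columns cannot go into a CSV directly, so the value column has to be dropped or unpacked before writing.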
  31. Let's practice!

    Open your RStudio, then install the usethis package.
    Once that succeeds, run usethis::use_course("aswansyahputra/kpdr_jogja").
    Follow the instructions and a new RStudio session will be opened automatically.
    Please open hands on.Rmd and read the instructions thoroughly.
  32. Thank you!

    R Indonesia
    www.r-indonesia.id
    t.me/GNURIndonesia
    r-indonesia.id
    [email protected]