TokyoR109.pdf

kilometer
October 07, 2023

Slides from my talk at the 109th Tokyo.R meetup.


Transcript

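  1. Working in RStudio

    Script pane: (1) write R code here; (2) select it and run it (⌘↩). Script → save it under a name.
    Console: (3) the results are displayed here.
    Environment pane: variables in the environment and so on are displayed.
    Files / Plots / Help and other panes.
    Changes are reflected automatically.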
  2. R basics

    a <- 5
    x <- 1:5
    y <- a * x
    Labels on the slide: assignment operator (<-), object, code.
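  3. Getting started with R: recommended packages

    ・ tidyverse (an all-round data science toolkit)
    ・ data.table (if you handle large data)
    ・ cmdstanr (if you do Bayesian statistics)
    ・ patchwork (if you do data visualization)
    You do not need to install everything from the start!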
  4. Vector

    x <- c(5:10)
    ## x
    ## [1]  5  6  7  8  9 10
    ##
    ## x[3]
    ## [1] 7
    ##
    ## x[c(2, 5, 1)]
    ## [1] 6 9 5
  5. List

    list(
      c(1:3),
      letters[1:3],
      seq(3, 5, by = 1))
    ## [[1]]
    ## [1] 1 2 3
    ##
    ## [[2]]
    ## [1] "a" "b" "c"
    ##
    ## [[3]]
    ## [1] 3 4 5
  6. Named list

    list(
      x = c(1:3),
      y = letters[1:3],
      z = seq(3, 5, by = 1))
    ## $x
    ## [1] 1 2 3
    ##
    ## $y
    ## [1] "a" "b" "c"
    ##
    ## $z
    ## [1] 3 4 5
  7. Data frame (data.frame)

    data.frame(
      x = c(1:3),
      y = letters[1:3],
      z = seq(3, 5, by = 1))
    ##   x y z
    ## 1 1 a 3
    ## 2 2 b 4
    ## 3 3 c 5
  8. Data frame (data.frame)

    data.frame(
      x = c(1:3),
      y = letters[1:3],
      z = seq(3, 5, by = 1))
    ##   x y z
    ## 1 1 a 3
    ## 2 2 b 4
    ## 3 3 c 5
    Labels on the slide: each row is an observation, each column is a variable.
  9. Pipe algebra: %>% from {magrittr}

    X %>% f          is  f(X)
    X %>% f(y)       is  f(X, y)
    X %>% f %>% g    is  g(f(X))
    X %>% f(y, .)    is  f(y, X)
    Source: "dplyr再入門(基本編)" (Reintroduction to dplyr: Basics), yutannihilation, https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian
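The four equivalences on this slide can be checked directly. This is a minimal sketch, assuming the magrittr package is installed. The functions `f` and `g` and the values `X` and `y` are hypothetical stand-ins, not objects from the talk. `f` is made non-commutative so that argument order matters.

```r
library(magrittr)

# Hypothetical functions for illustration only
f <- function(a, b = 1) a - b   # order of arguments matters
g <- function(a) a * 10

X <- 5
y <- 2

# Left-hand sides of the slide, written with the pipe
pipe_results <- c(
  X %>% f,
  X %>% f(y),
  X %>% f %>% g,
  X %>% f(y, .)   # the dot marks where X is inserted
)

# Right-hand sides of the slide, written as ordinary calls
call_results <- c(
  f(X),
  f(X, y),
  g(f(X)),
  f(y, X)
)

identical(pipe_results, call_results)
```

Without a dot, the left-hand side becomes the first argument; with a dot, it goes wherever the dot is placed.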
  10. Wide format data

    .cols <- c("name", "height", "mass", "birth_year")
    dat_wide <- starwars[1:4, .cols]
    ## > dat_wide
    ## # A tibble: 4 × 4
    ##   name           height  mass birth_year
    ##   <chr>           <int> <dbl>      <dbl>
    ## 1 Luke Skywalker    172    77       19
    ## 2 C-3PO             167    75      112
    ## 3 R2-D2              96    32       33
    ## 4 Darth Vader       202   136       41.9
  11. Long format data

    ## > dat_long
    ## # A tibble: 12 × 3
    ##    name           parameter  value
    ##    <chr>          <chr>      <dbl>
    ##  1 Luke Skywalker height     172
    ##  2 Luke Skywalker mass        77
    ##  3 Luke Skywalker birth_year  19
    ##  4 C-3PO          height     167
    ##  5 C-3PO          mass        75
    ##  6 C-3PO          birth_year 112
    ##  7 R2-D2          height      96
    ##  8 R2-D2          mass        32
    ##  9 R2-D2          birth_year  33
    ## 10 Darth Vader    height     202
    ## 11 Darth Vader    mass       136
    ## 12 Darth Vader    birth_year  41.9
  12. Converting between wide and long data

    Wide → long:
    tidyr::pivot_longer(
      data = dat_wide,
      cols = !name,
      names_to = "parameter",
      values_to = "value"
    )
    Long → wide:
    tidyr::pivot_wider(
      data = dat_long,
      names_from = "parameter",
      values_from = "value"
    )
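The two conversions above are inverses of each other, which a round trip can illustrate. This is a minimal sketch, assuming dplyr (for the `starwars` data) and tidyr are installed. The name `dat_back` is introduced here for illustration and is not in the slides.

```r
library(dplyr)  # provides the starwars data set
library(tidyr)

.cols <- c("name", "height", "mass", "birth_year")
dat_wide <- starwars[1:4, .cols]

# Wide -> long: every column except name becomes a (parameter, value) pair
dat_long <- pivot_longer(
  data = dat_wide,
  cols = !name,
  names_to = "parameter",
  values_to = "value"
)

# Long -> wide: spread the parameter names back out into columns
dat_back <- pivot_wider(
  data = dat_long,
  names_from = "parameter",
  values_from = "value"
)

dim(dat_long)    # 4 characters x 3 parameters = 12 rows, 3 columns
names(dat_back)  # same columns as dat_wide
```

One detail to keep in mind: `height` is an integer column in `dat_wide`, but it passes through the shared double `value` column, so it comes back as a double in `dat_back`. The values themselves are unchanged.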
  13. Suture pattern formation in ammonites and the unknown rear mantle structure

    Inoue, S., Kondo, S., Scientific Reports, (6) 33689 (2016), DOI: 10.1038/srep33689.
  14. Suture pattern formation in ammonites and the unknown rear mantle structure

    Inoue, S., Kondo, S., Scientific Reports, (6) 33689 (2016), DOI: 10.1038/srep33689.
    Image credit: common octopus (マダコ), from Wikipedia JP, CC BY-SA 3.0 DEED.
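  15. Mapping

    Set X (source set, domain) with element x; set Y (target set, codomain) with element y.
    Mapping f: X → Y, or f: x ⟼ y.
    [Mapping] A rule that associates each element of one set with exactly one element of another set.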
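  16. From reality to information

    Reality is carried into information by mappings, one mapping per target space:
    map space: [latitude, longitude]
    species-name space: Homo sapiens
    name space: coffee
    monetary-value space (yen): ¥290
    monetary-value space (dollars): $2.53
    [Mapping] A rule that associates each element of one set with exactly one element of another set.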
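  17. Reality → mapping (observation) → data → mapping (data visualization) → graph

    [Figure: data values x1, x2 and y1, y2 placed on the X and Y axes of a plot]
    Data visualization is a mapping of data.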
  18. Data visualization as a mapping

    [Figure: data values x1, x2 and y1, y2 placed on the X and Y axes of a plot]
    Aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha, …
  19. Plotting with ggplot2

    [Figure: data values x1, x2 and y1, y2 placed on the X and Y axes of a plot]
    Data visualization is a mapping onto aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha, …
    ggplot(data = my_data) +
      aes(x = X, y = Y) +
      geom_point()
  20. Your first ggplot2 plot

    dat <- data.frame(
      tag = rep(c("a", "b"), each = 2),
      X = c(1, 3, 5, 7),
      Y = c(3, 9, 4, 2)
    )
    ggplot2::ggplot() +
      ggplot2::geom_point(
        data = dat,
        mapping = ggplot2::aes(x = X, y = Y)
      )
  21. Your first ggplot2 plot

    dat <- data.frame(
      tag = rep(c("a", "b"), each = 2),
      X = c(1, 3, 5, 7),
      Y = c(3, 9, 4, 2)
    )
    ggplot2::ggplot() +
      ggplot2::geom_point(
        data = dat,
        mapping = ggplot2::aes(x = X, y = Y)
      )
    Annotations on the slide: declare the start of drawing; connect the pieces with the + symbol; various aesthetic channels can be specified.
  22. The ggplot2 plot after your first

    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    ggplot() +
      geom_point(data = dat, mapping = aes(x = X, y = Y)) +
      geom_path(data = dat, mapping = aes(x = X, y = Y))
  23. Various ways to write ggplot2 code

    ggplot() +
      geom_point(data = dat, mapping = aes(x = X, y = Y)) +
      geom_path(data = dat, mapping = aes(x = X, y = Y))

    Shared settings can be given inside ggplot() and then omitted in the layers that follow:
    ggplot(data = dat, mapping = aes(x = X, y = Y)) +
      geom_point() +
      geom_path()

    The aes() call that holds the mapping can also be placed outside ggplot():
    ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_point() +
      geom_path()
  24. Various ways to write ggplot2 code

    Specifying a mapping for the point color:
    ggplot() +
      geom_point(data = dat, mapping = aes(x = X, y = Y, color = tag)) +
      geom_path(data = dat, mapping = aes(x = X, y = Y))

    ggplot(data = dat) +
      aes(x = X, y = Y) +  # factor out only what the layers share
      geom_point(mapping = aes(color = tag)) +
      geom_path()
  25. Various ways to write ggplot2 code

    ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_point(aes(color = tag)) +
      geom_path()

    ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(aes(color = tag))

    Elements added later with + are drawn in front.
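The drawing order described on this slide follows the order in which layers are stored in the plot object. This is a minimal sketch, assuming ggplot2 is installed and that the plot's layers can be read through `g$layers`. The helper `layer_order()` and the two plot names are hypothetical, introduced only for this illustration.

```r
library(ggplot2)

dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                  X = c(1, 3, 5, 7),
                  Y = c(3, 9, 4, 2))

# Points added first, path added last: the path is drawn over the points
g_path_on_top <- ggplot(data = dat) +
  aes(x = X, y = Y) +
  geom_point(aes(color = tag)) +
  geom_path()

# Path added first, points added last: the points are drawn over the path
g_points_on_top <- ggplot(data = dat) +
  aes(x = X, y = Y) +
  geom_path() +
  geom_point(aes(color = tag))

# Hypothetical helper: the geom of each layer, in the order layers are drawn
layer_order <- function(g) {
  unname(vapply(g$layers, function(layer) class(layer$geom)[1], character(1)))
}

layer_order(g_path_on_top)
layer_order(g_points_on_top)
```

The last entry in each result is the layer that ends up in front.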
  26. Saving a ggplot2 image

    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    g <- ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(mapping = aes(color = tag))
    ggsave(filename = "fig/demo01.png",
           plot = g,
           width = 4, height = 3, dpi = 150)
  27. Saving a ggplot2 image

    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    g <- ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(mapping = aes(color = tag))
    ggsave(filename = "fig/demo01.png",
           plot = g,
           width = 4, height = 3, dpi = 150)
    By default, the size is specified in inches.
  28. Saving a ggplot2 image

    library(tidyverse)
    dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                      X = c(1, 3, 5, 7),
                      Y = c(3, 9, 4, 2))
    g <- ggplot(data = dat) +
      aes(x = X, y = Y) +
      geom_path() +
      geom_point(mapping = aes(color = tag))
    ggsave(filename = "fig/demo01.png",
           plot = g,
           width = 10, height = 7.5, dpi = 150,
           units = "cm")  # "cm", "mm", or "in" can be specified
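To see how `width`, `height`, `units`, and `dpi` combine, it can help to work out the pixel size they imply. This is a minimal sketch, assuming ggplot2 and a PNG graphics device are available. Writing to `tempdir()` instead of the slide's `fig/` folder is an assumption made here so that the example does not depend on an existing directory. `out_file` and `approx_px` are names introduced for illustration.

```r
library(ggplot2)

dat <- data.frame(tag = rep(c("a", "b"), each = 2),
                  X = c(1, 3, 5, 7),
                  Y = c(3, 9, 4, 2))

g <- ggplot(data = dat) +
  aes(x = X, y = Y) +
  geom_path() +
  geom_point(mapping = aes(color = tag))

# Assumption: save into a temporary directory rather than "fig/"
out_file <- file.path(tempdir(), "demo01.png")

ggsave(filename = out_file, plot = g,
       width = 10, height = 7.5, units = "cm", dpi = 150)

# Implied pixel size: centimetres -> inches (divide by 2.54) -> pixels (times dpi)
approx_px <- c(width = 10, height = 7.5) / 2.54 * 150
approx_px
```

With these settings the image comes out at roughly 590 by 443 pixels; the exact rounding is left to the graphics device.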
  29. Plotting multiple series

    > head(anscombe)
      x1 x2 x3 x4    y1   y2    y3   y4
    1 10 10 10  8  8.04 9.14  7.46 6.58
    2  8  8  8  8  6.95 8.14  6.77 5.76
    3 13 13 13  8  7.58 8.74 12.74 7.71
    4  9  9  9  8  8.81 8.77  7.11 8.84
    5 11 11 11  8  8.33 9.26  7.81 8.47
    6 14 14 14  8  9.96 8.10  8.84 7.04

    ggplot(data = anscombe) +
      geom_point(aes(x = x1, y = y1)) +
      geom_point(aes(x = x2, y = y2), color = "Red") +
      geom_point(aes(x = x3, y = y3), color = "Blue") +
      geom_point(aes(x = x4, y = y4), color = "Green")

    Pushing through with only what we have covered so far, this is what you end up with.
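  30. Data visualization with ggplot2

    Reality → mapping (observation) → data → mapping (data visualization) → graph
    raw data → transform → data in a format suited to visualization → mapping → aesthetic channels
    Each aesthetic channel of the figure corresponds to a variable in the data.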
  31. Reshape the data so that its variables match the aesthetic channels (x axis, y axis, color)

    > head(anscombe)
      x1 x2 x3 x4    y1   y2    y3   y4
    1 10 10 10  8  8.04 9.14  7.46 6.58
    2  8  8  8  8  6.95 8.14  6.77 5.76
    3 13 13 13  8  7.58 8.74 12.74 7.71
    4  9  9  9  8  8.81 8.77  7.11 8.84
    5 11 11 11  8  8.33 9.26  7.81 8.47
    6 14 14 14  8  9.96 8.10  8.84 7.04

    > head(anscombe_long)
      key  x    y
    1   1 10 8.04
    2   2 10 9.14
    3   3 10 7.46
    4   4  8 6.58
    5   1  8 6.95
    6   2  8 8.14

    ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point()

    The visualization becomes simple and easy to follow.
  32. Reshape the data so that its variables match the aesthetic channels (x axis, y axis, color)

    Wide data:
    > head(anscombe)
      x1 x2 x3 x4    y1   y2    y3   y4
    1 10 10 10  8  8.04 9.14  7.46 6.58
    2  8  8  8  8  6.95 8.14  6.77 5.76
    3 13 13 13  8  7.58 8.74 12.74 7.71
    4  9  9  9  8  8.81 8.77  7.11 8.84
    5 11 11 11  8  8.33 9.26  7.81 8.47
    6 14 14 14  8  9.96 8.10  8.84 7.04

    Long data:
    > head(anscombe_long)
      key  x    y
    1   1 10 8.04
    2   2 10 9.14
    3   3 10 7.46
    4   4  8 6.58
    5   1  8 6.95
    6   2  8 8.14

    anscombe_long <-
      pivot_longer(data = anscombe,
                   cols = everything(),
                   names_to = c(".value", "key"),
                   names_pattern = "(.)(.)")
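The `names_pattern = "(.)(.)"` argument splits each column name into two single characters: the first (`x` or `y`) goes to the special `.value` slot and becomes an output column, and the second (`1` to `4`) becomes `key`. This is a minimal sketch, assuming tidyr is installed. `anscombe` ships with base R. The name `first_row` is introduced here for illustration.

```r
library(tidyr)

# "x1" -> .value = "x", key = "1"; "y3" -> .value = "y", key = "3"; and so on
anscombe_long <- pivot_longer(
  data = anscombe,
  cols = everything(),
  names_to = c(".value", "key"),
  names_pattern = "(.)(.)"
)

dim(anscombe_long)    # 11 original rows x 4 series = 44 rows, 3 columns
names(anscombe_long)  # "key" "x" "y"

# First row pairs x1 and y1 from the first row of anscombe
first_row <- anscombe_long[1, ]
first_row
```

Because `key` is built from the column names, it is a character column, which is why mapping it to `color` gives one discrete color per series.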
  33. Splitting the figure by level

    ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point()

    ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point() +
      facet_wrap(facets = . ~ key, nrow = 1)
  34. Plotting with ggplot2

    [Figure: data values x1, x2 and y1, y2 placed on the X and Y axes of a plot]
    Data visualization is a mapping onto aesthetic channels: x axis, y axis, color, fill, shape, linetype, alpha, …
    ggplot(data = my_data) +
      aes(x = X, y = Y) +
      geom_point()
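  35. Data visualization with ggplot2

    Reality → mapping (observation) → data → mapping (data visualization) → graph
    raw data → transform → data in a format suited to visualization → mapping → aesthetic channels
    Each aesthetic channel of the figure corresponds to a variable in the data.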