TokyoR#103_DataProcessing

kilometer
January 21, 2023

Slides from my talk at the 103rd Tokyo.R meetup.

Transcript

  1. #103
    @kilometer00
    2023.01.21
    BeginneR Session
    Data processing &
    visualization

  2. Who!?
    Who?

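  3. Who!?
    Name: Mimura @kilometer
    Occupation: Postdoc (PhD in engineering)
    Fields: Behavioral neuroscience (primates),
    brain imaging,
    medical systems engineering
    R experience: about 10 years
    Current craze: papercraft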

  4. Plug!! (I took part in translating a book.)
    Now on sale!

  5. BeginneR Session

  6. BeginneR

  7. Before After
    BeginneR Session
    BeginneR BeginneR

  8. BeginneR Advanced Hoxo_m
    If I have seen further it is by standing on the
    shoulders of Giants.
    -- Sir Isaac Newton, 1676

  9. #103
    @kilometer00
    2023.01.21
    BeginneR Session
    Data processing &
    visualization

  10. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    Data Science


  11. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    preprocessing
    Data processing
    Data science
    Data Science

  12. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    preprocessing
    Data science
    Data
    Observation Hypothesis feedback
    Data processing
    Data Science

  13. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    preprocessing
    Data science
    Data
    Observation Hypothesis feedback
    Data processing
    Narrative of data

  14. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    preprocessing
    Data science
    Data
    Observation Hypothesis
    Narrative of data
    feedback
    Data processing

  15. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    Data processing
    Data Science

  16. read_csv()
    write_csv()
    Table Data
    Wide form Long form
    pivot_longer()
    Nested form
    pivot_wider()
    Plot
    group_nest() unnest()
    {ggplot2}
    {patchwork}
    Image Files
    ggsave()
    Data Processing

  17. data.frame
    tibble
    read_csv()
    write_csv()
    Table Data
    Wide form Long form
    pivot_longer()
    Nested form
    pivot_wider()
    Plot
    group_nest() unnest()
    {ggplot2}
    {patchwork}
    Image Files
    ggsave()
    Data Processing

  18. data.frame

  19. vector
    in Excel

  20. vector
    in R
    in Excel
    > pre
    [1] 1 2 3 4 5
    > post
    [1] 5 10 15 20 25
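
A minimal sketch of how two vectors like `pre` and `post` could be created. The slide shows only the printed values, so the definitions below (including deriving `post` by multiplication) are assumptions chosen to reproduce that output.

```r
# Assumed definition: combine five numbers into one vector.
pre <- c(1, 2, 3, 4, 5)

# Assumed definition: arithmetic on a vector is applied element by element.
post <- pre * 5

print(pre)
print(post)
```

Unlike a spreadsheet column, the whole vector is handled as one object, so no cell-by-cell formula is needed.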

  21. vector
    > vec1
    [1] 1 2 3 4 5
    > vec2
    [1] 1 2 3 4 5
    > vec3
    [1] 1 2 3 4 5

  22. vector
    > vec1
    [1] 1 2 3 4 5
    > vec2
    [1] 1 2 3 4 5

  23. > ?seq
    vector
    seq{base}
    Sequence Generation
    Description
    Generate regular sequences. seq is a standard
    generic with a default method. …
    Usage
    seq(...)
    ## Default S3 method:
    seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
    length.out = NULL, along.with = NULL, ...)

  24. vector
    > vec1
    [1] 1 2 3 1 2 3
    > vec2
    [1] 1 1 2 2 3 3
    > vec3
    [1] 1 1 2 2 3 3 1 1 2 2 3 3
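
The outputs on the preceding vector slides can be reproduced with base R. This is only a sketch: the original definitions are not visible in the transcript, so the calls to `c()`, `:`, `seq()` and `rep()` and the `rep_*` names are assumptions that happen to give the same printed values.

```r
# Three equivalent ways to build the sequence 1 2 3 4 5.
vec1 <- c(1, 2, 3, 4, 5)
vec2 <- 1:5
vec3 <- seq(from = 1, to = 5, by = 1)

# rep() repeats the whole vector (times) or each element (each).
rep_times <- rep(1:3, times = 2)            # 1 2 3 1 2 3
rep_each  <- rep(1:3, each = 2)             # 1 1 2 2 3 3
rep_both  <- rep(1:3, each = 2, times = 2)  # 1 1 2 2 3 3 1 1 2 2 3 3

print(rep_times)
print(rep_each)
print(rep_both)
```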

  25. vector
    > vec1
    [1] 11 12 13 14 15
    > vec1[1]
    [1] 11
    > vec1[3:5]
    [1] 13 14 15
    > vec1[c(1:2, 5)]
    [1] 11 12 15

  26. list
    > list1
    [[1]]
    [1] 1 2 3 4 5 6
    [[2]]
    [1] 11 12 13 14 15
    [[3]]
    [1] "a" "b" "c"

  27. list
    > list1[[1]]
    [1] 1 2 3 4 5 6
    > list1[[3]][2:3]
    [1] "b" "c"
    > list1[[2]] * 3
    [1] 33 36 39 42 45

  28. named list
    > list2
    $A
    [1] 1 2 3 4 5 6
    $B
    [1] 11 12 13 14 15
    $C
    [1] "a" "b" "c"

  29. named list
    > list2$A
    [1] 1 2 3 4 5 6
    > list2$C[2:3]
    [1] "b" "c"
    > list2$B * 3
    [1] 33 36 39 42 45

  30. list vs. named list
    > class(list1)
    [1] "list"
    > names(list1)
    NULL
    > class(list2)
    [1] "list"
    > names(list2)
    [1] "A" "B" "C"
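
As a sketch, `list1` and `list2` could be built as follows. The definitions are assumptions reconstructed from the printed output; the point is that a named list is an ordinary list that also carries a `names` attribute.

```r
# Assumed definitions, chosen to match the printed output on the slides.
list1 <- list(1:6, 11:15, c("a", "b", "c"))
list2 <- list(A = 1:6, B = 11:15, C = c("a", "b", "c"))

# Elements are reached by position with [[ ]] or by name with $.
by_position <- list1[[2]] * 3
by_name     <- list2$B * 3

# Names can also be attached after the fact.
list1_named <- list1
names(list1_named) <- c("A", "B", "C")

print(names(list1))        # NULL
print(names(list1_named))  # "A" "B" "C"
```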

  31. named list & data.frame
    > class(list3)
    [1] "list"
    > names(list3)
    [1] "A" "B"
    > class(df1)
    [1] "data.frame"
    > names(df1)
    [1] "A" "B"

  32. named list & data.frame
    > str(list3)
    List of 2
     $ A: int [1:3] 1 2 3
     $ B: int [1:3] 11 12 13
    > str(df1)
    'data.frame': 3 obs. of 2 variables:
     $ A: int 1 2 3
     $ B: int 11 12 13

  33. named list & data.frame
    > list3
    $A
    [1] 1 2 3
    $B
    [1] 11 12 13
    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13

  34. data.frame vs. matrix
    > df1
      A  B
    1 1 11
    2 2 12
    3 3 13
    > mat1
         [,1] [,2]
    [1,]    1   11
    [2,]    2   12
    [3,]    3   13
    > str(df1)
    'data.frame': 3 obs. of 2 variables:
     $ A: int 1 2 3
     $ B: int 11 12 13
    > str(mat1)
     int [1:3, 1:2] 1 2 3 11 12 13
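
The relationship between a named list, a data.frame and a matrix can be checked directly. A minimal sketch, assuming `list3`, `df1` and `mat1` were defined roughly as below (the transcript shows only their printed structure):

```r
# Assumed definitions matching the str() output on the slides.
list3 <- list(A = 1:3, B = 11:13)
df1   <- data.frame(A = 1:3, B = 11:13)
mat1  <- cbind(1:3, 11:13)

# A data.frame is a list whose elements (columns) all have the same length.
print(is.list(df1))
print(sapply(df1, length))

# A matrix is a single vector with a dim attribute, so it has one type only.
print(is.list(mat1))
print(typeof(mat1))
print(dim(mat1))

# The named list converts to the same table.
df_from_list <- as.data.frame(list3)
```

Because columns are separate list elements, a data.frame can mix types across columns, which a matrix cannot.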

  35. data.frame
    variables
    observation

  36. data.frame
    tibble
    read_csv()
    write_csv()
    Table Data
    Wide form Long form
    pivot_longer()
    Nested form
    pivot_wider()
    Plot
    group_nest() unnest()
    {ggplot2}
    {patchwork}
    Image Files
    ggsave()
    Data Processing

  37. > anscombe
    x1 x2 x3 x4 y1 y2 y3 y4
    1 10 10 10 8 8.04 9.14 7.46 6.58
    2 8 8 8 8 6.95 8.14 6.77 5.76
    3 13 13 13 8 7.58 8.74 12.74 7.71
    4 9 9 9 8 8.81 8.77 7.11 8.84
    5 11 11 11 8 8.33 9.26 7.81 8.47
    6 14 14 14 8 9.96 8.10 8.84 7.04
    7 6 6 6 8 7.24 6.13 6.08 5.25
    8 4 4 4 19 4.26 3.10 5.39 12.50
    9 12 12 12 8 10.84 9.13 8.15 5.56
    10 7 7 7 8 4.82 7.26 6.42 7.91
    11 5 5 5 8 5.68 4.74 5.73 6.89
    Wide form data

  38. Wide form data
    df <- rownames_to_column(
      anscombe,
      var = "tag"
    )
    > df
    tag x1 x2 x3 x4 y1 y2 y3 y4
    1 1 10 10 10 8 8.04 9.14 7.46 6.58
    2 2 8 8 8 8 6.95 8.14 6.77 5.76
    3 3 13 13 13 8 7.58 8.74 12.74 7.71
    4 4 9 9 9 8 8.81 8.77 7.11 8.84
    5 5 11 11 11 8 8.33 9.26 7.81 8.47
    6 6 14 14 14 8 9.96 8.10 8.84 7.04

  39. Wide form → Long form data
    df_long_1 <- pivot_longer(
      data = df,
      cols = !tag
    )
    df_long_2 <- pivot_longer(
      data = df,
      cols = !tag,
      names_to = c(".value", "key"),
      names_pattern = "(.)(.)"
    )

  40. Long form → Wide form data
    pivot_wider(
      data = df_long_1,
      values_from = value,
      names_from = name
    )
    # df_long_2 has no `name` column; its names are stored in `key`
    pivot_wider(
      data = df_long_2,
      values_from = c(x, y),
      names_from = key
    )
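
A round trip makes the wide/long relationship concrete. The sketch below uses the `.value` form of `pivot_longer()` and then widens the result again; `names_sep = ""` is an assumption added here so that the rebuilt column names are `x1`, `y1`, ... rather than `x_1`, `y_1`, ...

```r
library(tibble)
library(tidyr)

# Keep the row number as an explicit column, as on the earlier slide.
df <- rownames_to_column(anscombe, var = "tag")

# Wide -> long: "x1" is split into the value column "x" and key "1".
df_long <- pivot_longer(
  data = df,
  cols = !tag,
  names_to = c(".value", "key"),
  names_pattern = "(.)(.)"
)

# Long -> wide: glue the value name and the key back together.
df_wide <- pivot_wider(
  data = df_long,
  names_from = key,
  values_from = c(x, y),
  names_sep = ""
)

print(dim(df_long))
print(names(df_wide))
```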

  41. data.frame / tibble
    read_csv()
    write_csv()
    Table Data
    Wide form Long form
    pivot_longer()
    pivot_wider()
    Plot
    {ggplot2}
    Image Files
    ggsave()
    Data Processing

  42. data.frame / tibble
    read_csv()
    write_csv()
    Table Data
    Wide form Long form
    pivot_longer()
    pivot_wider()
    Plot
    {ggplot2}
    Image Files
    ggsave()
    Data Processing
    Long form (repeated at every step of the diagram)

  43. vignette("dplyr")

  44. It (dplyr) provides simple “verbs” to help
    you translate your thoughts into code.
    functions that correspond to the most
    common data manipulation tasks
    Introduction to dplyr
    https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
    verbs {dplyr}

  45. dplyr provides “verbs” for translating
    your thoughts into code.
    A set of functions that carry out the
    basics of data manipulation simply
    Introduction to dplyr
    https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
    verbs {dplyr}
    * A fairly loose translation

  46. 1. mutate()
    2. filter()
    3. select()
    4. group_by()
    5. summarize()
    6. left_join()
    7. arrange()
    Data.frame manipulation

  47. 1. mutate()
    2. filter()
    3. select()
    4. group_by()
    5. summarize()
    6. left_join()
    7. arrange()
    Data.frame manipulation
    0. %>%

  48. Pipe algebra
    X %>% f          # same as f(X)
    X %>% f(y)       # same as f(X, y)
    X %>% f %>% g    # same as g(f(X))
    X %>% f(y, .)    # same as f(y, X)
    %>% {magrittr}
    “dplyr Re-introduction (Basics)” by yutanihilation
    https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian
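
The four pipe identities can be verified with ordinary base R functions. In this sketch `sqrt`, `head`, `sum` and `paste` stand in for the abstract `f` and `g`, and `X` and `y` are made-up values.

```r
library(magrittr)

X <- c(4, 9, 16)
y <- 2

# X %>% f  is  f(X)
same_1 <- identical(X %>% sqrt, sqrt(X))

# X %>% f(y)  is  f(X, y): the left-hand side becomes the first argument.
same_2 <- identical(X %>% head(y), head(X, y))

# X %>% f %>% g  is  g(f(X))
same_3 <- identical(X %>% sqrt %>% sum, sum(sqrt(X)))

# X %>% f(y, .)  is  f(y, X): the dot marks where the left-hand side goes.
same_4 <- identical(X %>% paste(y, .), paste(y, X))

print(c(same_1, same_2, same_3, same_4))
```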





  49. lift
    take
    pour
    put
    Bring milk from the kitchen!


  50. Bring milk from the kitchen!
    lift
    lift(Robot, glass, table) -> Robot'
    take
    take(Robot', fridge, milk) -> Robot''

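  51. Bring milk from the kitchen!
    Robot'   <- lift(Robot, glass, table)    # ①
    Robot''  <- take(Robot', fridge, milk)   # ②
    Robot''' <- pour(Robot'', milk, glass)   # ③
    result   <- put(Robot''', glass, table)  # ④
    by using pipe,
    result <- Robot %>%
      lift(glass, table) %>%   # ①
      take(fridge, milk) %>%   # ②
      pour(milk, glass) %>%    # ③
      put(glass, table)        # ④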

  52. The tidyverse style guide
    https://style.tidyverse.org/syntax.html#object-names
    "There are only two hard things in Computer Science:
    cache invalidation and naming things"

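  53. Bring milk from the kitchen!
    Robot'   <- lift(Robot, glass, table)    # ①
    Robot''  <- take(Robot', fridge, milk)   # ②
    Robot''' <- pour(Robot'', milk, glass)   # ③
    result   <- put(Robot''', glass, table)  # ④
    by using pipe,
    result <- Robot %>%
      lift(glass, table) %>%   # ①
      take(fridge, milk) %>%   # ②
      pour(milk, glass) %>%    # ③
      put(glass, table)        # ④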

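  54. Bring milk from the kitchen!
    Robot'   <- lift(Robot, glass, table)    # ①
    Robot''  <- take(Robot', fridge, milk)   # ②
    Robot''' <- pour(Robot'', milk, glass)   # ③
    result   <- put(Robot''', glass, table)  # ④
    by using pipe,
    result <- Robot %>%
      lift(glass, table) %>%   # ①
      take(fridge, milk) %>%   # ②
      pour(milk, glass) %>%    # ③
      put(glass, table)        # ④
    Thinking Reading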
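
The robot example is pseudocode, but it can be turned into a runnable toy. In the sketch below everything is hypothetical: the robot is just a list with a `log`, and `make_action()` is an invented helper that builds `lift()`, `take()`, `pour()` and `put()` so that each returns the updated robot.

```r
library(magrittr)

# Hypothetical helper: builds an action that appends one line to the
# robot's log and returns the updated robot, so actions can be chained.
make_action <- function(verb) {
  function(robot, ...) {
    robot$log <- c(robot$log, paste(verb, ...))
    robot
  }
}

lift <- make_action("lift")
take <- make_action("take")
pour <- make_action("pour")
put  <- make_action("put")

Robot <- list(log = character(0))

# Without the pipe: every intermediate state needs its own name.
robot_1 <- lift(Robot, "glass", "table")
robot_2 <- take(robot_1, "fridge", "milk")
robot_3 <- pour(robot_2, "milk", "glass")
result_steps <- put(robot_3, "glass", "table")

# With the pipe: the steps read in the order they happen.
result_pipe <- Robot %>%
  lift("glass", "table") %>%
  take("fridge", "milk") %>%
  pour("milk", "glass") %>%
  put("glass", "table")

print(result_pipe$log)
```

Both versions produce the same result; the pipe version simply avoids inventing names for the intermediate robots.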

  55. Programming
    Write
    Run
    Read
    Think
    Write
    Run
    Read
    Think
    Communicate
    Share

  56. Pipe algebra
    X %>% f          # same as f(X)
    X %>% f(y)       # same as f(X, y)
    X %>% f %>% g    # same as g(f(X))
    X %>% f(y, .)    # same as f(y, X)
    %>% {magrittr}
    “dplyr Re-introduction (Basics)” by yutanihilation
    https://speakerdeck.com/yutannihilation/dplyrzai-ru-men-ji-ben-bian

  57. 1. mutate()
    2. filter()
    3. select()
    4. group_by()
    5. summarize()
    6. left_join()
    7. arrange()
    Data.frame manipulation
    0. %>%

  58. verbs {dplyr}
    mutate # add a column
    mutate(dat, C = fun(A, B))

  59. verbs {dplyr}
    mutate # add a column
    dat %>% mutate(C = fun(A, B))
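
A concrete version of `mutate()`, as a sketch. The data frame `dat` and the use of `A + B` in place of the generic `fun(A, B)` are made up for illustration.

```r
library(dplyr)

# Hypothetical data with the column names used on the slides.
dat <- data.frame(
  tag = 1:5,
  A = c(1, 2, 3, 4, 5),
  B = c(10, 20, 30, 40, 50)
)

# Add column C, computed row by row from A and B.
dat_with_c <- dat %>% mutate(C = A + B)

print(dat_with_c)
```

The original `dat` is not modified; `mutate()` returns a new data frame with the extra column.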

  60. verbs {dplyr}
    filter # keep only matching rows
    dat %>% filter(tag %in% c(1, 3, 5))

  61. Boolean operators (Boolean algebra)
    A == B     # equal to
    A != B     # not equal to
    A | B      # or
    A & B      # and
    A %in% B   # is A in B?
    George Boole
    1815 - 1864
    wikipedia

  62. Boolean operators (Boolean algebra)
    "a" != "b"
    # not equal to
    [1] TRUE
    1 %in% 10:100
    # is A in B?
    [1] FALSE
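
The Boolean operators are what `filter()` uses to decide which rows to keep. A minimal sketch with a made-up `dat`; the column `A` and its values are assumptions for illustration.

```r
library(dplyr)

# Hypothetical data.
dat <- data.frame(
  tag = 1:5,
  A = c("x", "y", "x", "y", "x")
)

# %in%: keep rows whose tag is one of the listed values.
rows_in <- dat %>% filter(tag %in% c(1, 3, 5))

# &: both conditions must hold.
rows_and <- dat %>% filter(tag > 1 & A == "x")

# |: at least one condition must hold.
rows_or <- dat %>% filter(tag == 1 | A == "y")

print(rows_in$tag)
print(rows_and$tag)
print(rows_or$tag)
```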

  63. George Boole
    1815 - 1864
    Mathematician & Philosopher
    A Class-Room Introduction to Logic
    https://niyamaklogic.wordpress.com/category/laws-of-thoughts/

  64. verbs {dplyr}
    select # choose columns
    dat %>% select(tag, B)

  65. verbs {dplyr}
    select # choose columns
    dat %>% select("tag", "B")

  66. verbs {dplyr}
    select # choose columns
    dat %>% select("tag", "B")
    dat %>% select(tag, B)

  67. verbs {dplyr}
    # Select helper functions
    starts_with("s") ends_with("s")
    contains("se") matches("^.e")
    one_of(c("tag", "B"))
    everything()
    https://kazutan.github.io/blog/2017/04/dplyr-select-memo/
    “Notes on usage examples of dplyr::select” by kazutan
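
A sketch of the select helpers on a made-up data frame; the column names are assumptions chosen so that each helper matches something. The slide lists `one_of()`; current tidyselect documentation marks it as superseded by `all_of()` and `any_of()`, so `all_of()` is used here instead.

```r
library(dplyr)

# Hypothetical data.
dat <- data.frame(
  tag = 1:3,
  score_a = c(1, 2, 3),
  score_b = c(4, 5, 6),
  note = c("p", "q", "r")
)

# Bare names and quoted names select the same columns.
sel_bare   <- names(dat %>% select(tag, note))
sel_quoted <- names(dat %>% select("tag", "note"))

sel_starts   <- names(dat %>% select(starts_with("score")))
sel_ends     <- names(dat %>% select(ends_with("b")))
sel_contains <- names(dat %>% select(contains("ot")))
sel_matches  <- names(dat %>% select(matches("^.a")))  # second letter is "a"
sel_all_of   <- names(dat %>% select(all_of(c("tag", "note"))))

# everything() picks up the remaining columns, which is handy for reordering.
sel_reorder <- names(dat %>% select(note, everything()))

print(sel_starts)
print(sel_reorder)
```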

  68. 1. mutate()
    2. filter()
    3. select()
    4. group_by()
    5. summarize()
    6. left_join()
    7. arrange()
    Data.frame manipulation
    0. %>%
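
The remaining verbs in the list (`group_by()`, `summarize()`, `left_join()`, `arrange()`) are not demonstrated in the deck. The following is only a sketch of how they might be chained; the `scores` and `labels` tables are invented for the example.

```r
library(dplyr)

# Hypothetical tables.
scores <- data.frame(
  group = c("a", "b", "a", "b"),
  value = c(1, 2, 3, 4)
)
labels <- data.frame(
  group = c("a", "b"),
  label = c("control", "treated")
)

result <- scores %>%
  group_by(group) %>%                       # split rows by group
  summarize(mean_value = mean(value)) %>%   # one row per group
  left_join(labels, by = "group") %>%       # attach the label of each group
  arrange(desc(mean_value))                 # largest mean first

print(result)
```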

  69. Grammar of data manipulation
    By constraining your options,
    it helps you think about your data
    manipulation challenges.
    Introduction to dplyr
    https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html

  70. Grammar of data manipulation
    By limiting your options,
    you can think about the steps of
    data analysis more simply.
    (a very loose translation)
    Introduction to dplyr
    https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
    * Truly a loose translation

  71. By imposing more constraints,
    one becomes freer from the shackles of the soul.
    Igor Stravinsky
    И́горь Ф Страви́нский
    The more constraints one imposes,
    the more one frees one's self of the
    chains that shackle the spirit.
    1882 - 1971
    * A somewhat loose translation

  72. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    Data Science


  73. Programming
    Write
    Run
    Read
    Think
    Write
    Run
    Read
    Think
    Communicate
    Share

  74. Text Image
    Information
    Intention
    Data
    decode
    encode
    Data analysis
    feedback

  75. Text
    Image
    First, A. Next, B.
    Then C. Finally D.
    time
    Intention
    encode
    "Frozen" structure
    A B C D time
    value
    α
    β

  76. Encode
    Apple
    (Real)
    Apple
    (Information)
    Decode

  77. Divergence
    Real
    Info.
    Data Apple = 1
    Encoding

  78. Apple
    Encode
    Fruit
    Red
    1
    (image)
    Real Information

  79. Apple
    Encode
    Fruit
    Red
    1
    (image)
    Real Information
    channel

  80. Mapping
    𝑓: 𝑋 → 𝑌
    𝑋 𝑌
    The process of associating each element of one set of information
    with exactly one element of another set of information

  81. Mapping
    𝑚𝑎𝑝𝑝𝑖𝑛𝑔: 𝐷𝑎𝑡𝑎 → 𝐼𝑚𝑎𝑔𝑒

  82. Visualization ⊂ Mapping
    [Figure: a mapping from set 𝑋 to set 𝑌 that sends 𝑥₁, 𝑥₂ to 𝑦₁, 𝑦₂]
    mapping

  83. Visualization ⊂ Mapping
    [Figure: a mapping from set 𝑋 to set 𝑌 that sends 𝑥₁, 𝑥₂ to 𝑦₁, 𝑦₂]
    mapping
    x axis, y axis, color, fill,
    shape, linetype, alpha…
    aesthetic channels

  84. Apple
    Encode
    Fruit
    Red
    1
    (image)
    Real Information
    channel

  85. ggplot2 package
    [Figure: a mapping from set 𝑋 to set 𝑌 that sends 𝑥₁, 𝑥₂ to 𝑦₁, 𝑦₂]
    mapping
    x axis, y axis, color, fill,
    shape, linetype, alpha…
    aesthetic channels
    data

  86. ggplot2
    # install.packages("tidyverse")
    library(tidyverse)  # Attach package
    # Simple example
    > dat
      a  b
    1 1  8
    2 2  9
    3 3 10

  87. ggplot2
    ggplot(data = dat) +
      geom_point(mapping = aes(x = a, y = b))
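
Put together, the simple example might look like the sketch below. The definition of `dat` is an assumption reconstructed from the printed table (columns `a` = 1, 2, 3 and `b` = 8, 9, 10), and `p` is an invented name for the plot object.

```r
library(ggplot2)

# Assumed definition, matching the printed data on the slide.
dat <- data.frame(a = 1:3, b = 8:10)

# data says where the values come from;
# aes() maps column a to the x channel and column b to the y channel;
# geom_point() draws one point per row.
p <- ggplot(data = dat) +
  geom_point(mapping = aes(x = a, y = b))

# In an interactive session, printing the object draws the plot:
# print(p)
```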

  88. ggplot2
    ggplot(data = dat) +
      geom_point(mapping = aes(x = a, y = b))
    [Figure: a mapping from set 𝑋 to set 𝑌 that sends 𝑥₁, 𝑥₂ to 𝑦₁, 𝑦₂]
    mapping
    x axis, y axis, color, fill,
    shape, linetype, alpha…
    aesthetic channels
    data

  89. ggplot(data = dat) +
      geom_point(mapping = aes(x = a, y = b)) +
      geom_path(mapping = aes(x = a, y = b))

  90. inheritance
    ggplot(data = dat,
           mapping = aes(x = a, y = b)) +
      geom_point() +
      geom_path()

  91. ggplot(data = dat) +
      aes(x = a, y = b) +
      geom_point() +
      geom_path()

  92. ggplot(data = dat) +
      aes(x = a, y = b) +
      geom_point() +
      geom_path()

  93. g <- ggplot(data = dat) +
      aes(x = a, y = b) +
      geom_point()
    g +
      geom_path()
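
The variants above all describe the same plot. The sketch below builds it two ways and compares what each path layer actually receives; `dat` is again the assumed three-row data frame, and `p_explicit` and `p_inherited` are invented names.

```r
library(ggplot2)

dat <- data.frame(a = 1:3, b = 8:10)  # assumed definition

# Mapping repeated in every layer.
p_explicit <- ggplot(data = dat) +
  geom_point(mapping = aes(x = a, y = b)) +
  geom_path(mapping = aes(x = a, y = b))

# Mapping declared once; both layers inherit it.
g <- ggplot(data = dat) +
  aes(x = a, y = b) +
  geom_point()

# A stored plot can be extended later by adding more layers.
p_inherited <- g + geom_path()

# layer_data() returns the data a layer draws after the mapping is applied.
path_explicit  <- layer_data(p_explicit, 2)
path_inherited <- layer_data(p_inherited, 2)
```

Declaring the mapping once keeps the layers in sync: changing `aes()` in one place updates both the points and the path.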

  94. Anscombe's quartet
    > anscombe
    x1 x2 x3 x4 y1 y2 y3 y4
    1 10 10 10 8 8.04 9.14 7.46 6.58
    2 8 8 8 8 6.95 8.14 6.77 5.76
    3 13 13 13 8 7.58 8.74 12.74 7.71
    4 9 9 9 8 8.81 8.77 7.11 8.84
    5 11 11 11 8 8.33 9.26 7.81 8.47
    6 14 14 14 8 9.96 8.10 8.84 7.04
    7 6 6 6 8 7.24 6.13 6.08 5.25
    8 4 4 4 19 4.26 3.10 5.39 12.50
    9 12 12 12 8 10.84 9.13 8.15 5.56
    10 7 7 7 8 4.82 7.26 6.42 7.91
    11 5 5 5 8 5.68 4.74 5.73 6.89

  95. ggplot(data = anscombe) +
    aes(x = x1, y = y1) +
    geom_point()
    Anscombe's quartet

  96. > anscombe
    x1 x2 x3 x4 y1 y2 y3 y4
    1 10 10 10 8 8.04 9.14 7.46 6.58
    2 8 8 8 8 6.95 8.14 6.77 5.76
    3 13 13 13 8 7.58 8.74 12.74 7.71
    4 9 9 9 8 8.81 8.77 7.11 8.84
    5 11 11 11 8 8.33 9.26 7.81 8.47
    6 14 14 14 8 9.96 8.10 8.84 7.04
    7 6 6 6 8 7.24 6.13 6.08 5.25
    8 4 4 4 19 4.26 3.10 5.39 12.50
    9 12 12 12 8 10.84 9.13 8.15 5.56
    10 7 7 7 8 4.82 7.26 6.42 7.91
    11 5 5 5 8 5.68 4.74 5.73 6.89
    x
    y
    mapping
    Anscombe's quartet

  97. > anscombe
    x1 x2 x3 x4 y1 y2 y3 y4
    1 10 10 10 8 8.04 9.14 7.46 6.58
    2 8 8 8 8 6.95 8.14 6.77 5.76
    3 13 13 13 8 7.58 8.74 12.74 7.71
    4 9 9 9 8 8.81 8.77 7.11 8.84
    5 11 11 11 8 8.33 9.26 7.81 8.47
    6 14 14 14 8 9.96 8.10 8.84 7.04
    7 6 6 6 8 7.24 6.13 6.08 5.25
    8 4 4 4 19 4.26 3.10 5.39 12.50
    9 12 12 12 8 10.84 9.13 8.15 5.56
    10 7 7 7 8 4.82 7.26 6.42 7.91
    11 5 5 5 8 5.68 4.74 5.73 6.89
    > anscombe_long
    # A tibble: 44 x 3
      key       x     y
      <chr> <dbl> <dbl>
    1 1        10  8.04
    2 2        10  9.14
    3 3        10  7.46
    4 4         8  6.58
    5 1         8  6.95
    6 2         8  8.14
    7 3         8  6.77
    8 4         8  5.76
    Wide form Long form

  98. > anscombe
    x1 x2 x3 x4 y1 y2 y3 y4
    1 10 10 10 8 8.04 9.14 7.46 6.58
    2 8 8 8 8 6.95 8.14 6.77 5.76
    3 13 13 13 8 7.58 8.74 12.74 7.71
    anscombe_long <- pivot_longer(data = anscombe,
                                  cols = everything(),
                                  names_pattern = "(.)(.)",
                                  names_to = c(".value", "key"))
    Wide -> Long form

  99. Anscombe's quartet
    anscombe_long <- pivot_longer(data = anscombe,
                                  cols = everything(),
                                  names_pattern = "(.)(.)",
                                  names_to = c(".value", "key"))
    > anscombe_long
    # A tibble: 44 x 3
      key       x     y
      <chr> <dbl> <dbl>
    1 1        10  8.04
    2 2        10  9.14
    3 3        10  7.46
    4 4         8  6.58
    5 1         8  6.95
    6 2         8  8.14
    7 3         8  6.77
    8 4         8  5.76

  100. Anscombe's quartet
    anscombe_long <- pivot_longer(data = anscombe,
                                  cols = everything(),
                                  names_pattern = "(.)(.)",
                                  names_to = c(".value", "key"))
    > anscombe_long
    # A tibble: 44 x 3
      key       x     y
      <chr> <dbl> <dbl>
    1 1        10  8.04
    2 2        10  9.14
    3 3        10  7.46
    4 4         8  6.58
    5 1         8  6.95
    6 2         8  8.14
    7 3         8  6.77
    8 4         8  5.76
    g_anscomb <- ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point()

  101. Anscombe's quartet
    anscombe_long <- pivot_longer(data = anscombe,
                                  cols = everything(),
                                  names_pattern = "(.)(.)",
                                  names_to = c(".value", "key"))
    g_anscomb <- ggplot(data = anscombe_long) +
      aes(x = x, y = y, color = key) +
      geom_point()
    > anscombe_long
    # A tibble: 44 x 3
      key       x     y
      <chr> <dbl> <dbl>
    1 1        10  8.04
    2 2        10  9.14
    3 3        10  7.46
    4 4         8  6.58
    5 1         8  6.95
    6 2         8  8.14
    7 3         8  6.77
    8 4         8  5.76

  102. g_anscomb +
    facet_wrap( ~ key)
    Anscombe's quartet

  103. g_anscomb +
    facet_wrap( ~ key) +
    theme(legend.position = "none")
    Anscombe's quartet
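
The whole Anscombe workflow, from wide table to saved figure, fits in one short script. This is a sketch rather than the speaker's exact code: the final `ggsave()` step, the temporary PDF path and the figure size are assumptions added to show the "Image Files" end of the pipeline.

```r
library(tidyr)
library(ggplot2)

# Wide -> long: "x1".."y4" become columns x and y plus a key "1".."4".
anscombe_long <- pivot_longer(
  data = anscombe,
  cols = everything(),
  names_pattern = "(.)(.)",
  names_to = c(".value", "key")
)

# One panel per data set; the legend is redundant with the panel titles.
g_anscomb <- ggplot(data = anscombe_long) +
  aes(x = x, y = y, color = key) +
  geom_point() +
  facet_wrap(~ key) +
  theme(legend.position = "none")

# Write the figure to a file (a temporary PDF here; size in inches).
out_file <- tempfile(fileext = ".pdf")
ggsave(filename = out_file, plot = g_anscomb, width = 6, height = 4)
```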

  104. Summary

  105. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    Data Science


  106. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    preprocessing
    Data science
    Data
    Observation Hypothesis
    Narrative of data
    feedback
    Data processing

  107. data.frame / tibble
    read_csv()
    write_csv()
    Table Data
    Wide form Long form
    pivot_longer()
    pivot_wider()
    Plot
    {ggplot2}
    Image Files
    ggsave()
    Data Processing

  108. data.frame / tibble
    read_csv()
    write_csv()
    Table Data
    Wide form Long form
    pivot_longer()
    pivot_wider()
    Plot
    {ggplot2}
    Image Files
    ggsave()
    Data Processing
    Long form (repeated at every step of the diagram)

  109. It (dplyr) provides simple “verbs” to help
    you translate your thoughts into code.
    functions that correspond to the most
    common data manipulation tasks
    Introduction to dplyr
    https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
    verbs {dplyr}

  110. 1. mutate()
    2. filter()
    3. select()
    4. group_by()
    5. summarize()
    6. left_join()
    7. arrange()
    Data.frame manipulation

  111. import Tidy Transform
    Visualize
    Model
    Communicate
    Modified from “R for Data Science”, H. Wickham, 2017
    Data Science


  112. ggplot2 package
    [Figure: a mapping from set 𝑋 to set 𝑌 that sends 𝑥₁, 𝑥₂ to 𝑦₁, 𝑦₂]
    mapping
    x axis, y axis, color, fill,
    shape, linetype, alpha…
    aesthetic channels
    data

  113. ggplot(data = dat) +
      aes(x = a, y = b) +
      geom_point() +
      geom_path()

  114. Enjoy!!
    KMT©
