Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
SappoRo.R_roundrobin
Search
kilometer
March 18, 2023
Programming
0
120
SappoRo.R_roundrobin
第10回Sapporo.Rで喋った際のスライドです。
kilometer
March 18, 2023
Tweet
Share
More Decks by kilometer
See All by kilometer
TokyoR#111_ANOVA
kilometer
2
830
TokyoR109.pdf
kilometer
1
440
TokyoR#108_NestedDataHandling
kilometer
0
750
TokyoR#107_R_GeoData
kilometer
0
390
TokyoR#104_DataProcessing
kilometer
1
660
TokyoR#103_DataProcessing
kilometer
0
850
TokyoR#102_RMarkdown
kilometer
1
610
TokyoR#101_RegressionAnalysis
kilometer
0
350
TokyoR#99_Divergence
kilometer
1
330
Other Decks in Programming
See All in Programming
TypeScript Graph でコードレビューの心理的障壁を乗り越える
ysk8hori
2
1.1k
macOS でできる リアルタイム動画像処理
biacco42
9
2.4k
Figma Dev Modeで変わる!Flutterの開発体験
watanave
0
140
見せてあげますよ、「本物のLaravel批判」ってやつを。
77web
7
7.8k
카카오페이는 어떻게 수천만 결제를 처리할까? 우아한 결제 분산락 노하우
kakao
PRO
0
110
CSC509 Lecture 12
javiergs
PRO
0
160
ふかぼれ!CSSセレクターモジュール / Fukabore! CSS Selectors Module
petamoriken
0
150
型付き API リクエストを実現するいくつかの手法とその選択 / Typed API Request
euxn23
8
2.2k
最新TCAキャッチアップ
0si43
0
190
PHP でアセンブリ言語のように書く技術
memory1994
PRO
1
170
ヤプリ新卒SREの オンボーディング
masaki12
0
130
Streams APIとTCPフロー制御 / Web Streams API and TCP flow control
tasshi
2
350
Featured
See All Featured
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
33
1.9k
Why Our Code Smells
bkeepers
PRO
334
57k
Teambox: Starting and Learning
jrom
133
8.8k
Measuring & Analyzing Core Web Vitals
bluesmoon
4
130
Understanding Cognitive Biases in Performance Measurement
bluesmoon
26
1.4k
Speed Design
sergeychernyshev
25
620
Become a Pro
speakerdeck
PRO
25
5k
No one is an island. Learnings from fostering a developers community.
thoeni
19
3k
A Philosophy of Restraint
colly
203
16k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
8
900
Building Applications with DynamoDB
mza
90
6.1k
The Pragmatic Product Professional
lauravandoore
31
6.3k
Transcript
SappoRo.R #10 @kilometer00 2023.03.18 らくらく総当たり組み合わせ
Who!? Who?
Who!? 名前: 三村 @kilometer 職業: ポスドク (こうがくはくし) 専⾨: ⾏動神経科学(霊⻑類) 脳イメージング
医療システム⼯学 R歴: ~ 10年ぐらい 流⾏: アンキロサウルス
宣伝!!(書籍の翻訳に参加しました。) 絶賛販売中!
宣伝2!! R⾔語の地域コミュニティ@東京です。 定期的にR⾔語に関する勉強会を開催しています。 次回は4⽉22⽇!! 初⼼者特集回です!!
総当たり組み合わせ Round-robin そう あ あ く
dat_nest <- palmerpenguins::penguins %>% dplyr::group_nest(species) データを畳み込む > dat_nest # A
tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] (息を吐くように)
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
base::expand.grid()関数 > dat_nest # A tibble: 3 × 2 species
data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] dat_nest$species
grid <- dat_nest$species %>% expand.grid(., .) base::expand.grid()関数 > grid Var1
Var2 1 Adelie Adelie 2 Chinstrap Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo
dplyr::left_join()関数 > grid Var1 Var2 1 Adelie Adelie 2 Chinstrap
Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo > dat_nest # A tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] ①対応づけて結合 ②対応づけて結合
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" )
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" ) ① ②
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) > dat_rr_rename # A tibble: 9 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) key <- "species" x <- stringr::str_c(key, ".x") y <- stringr::str_c(key, ".y") dat_rr_rename <- dat_rr %>% rename(!!x := Var1) %>% rename(!!y := Var2) 別解 {rlang}パッケージの演算⼦
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
grid <- dat_nest$species %>% expand.grid(., .) %>% subset(unclass(Var1) < unclass(Var2))
%>% tibble::as_tibble() base::subset()関数 > grid # A tibble: 3 × 2 Var1 Var2 <fct> <fct> 1 Adelie Chinstrap 2 Adelie Gentoo 3 Chinstrap Gentoo
という変換を パッケージにしました。 devtools::install_github( "kilometer0101/roundrobin" ) (4回⼿打ちしたら⾯倒臭くなったので)
roundrobin::roundrobin()関数 # A tibble: 9 × 4 species.x species.y data.x
data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species")
library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species", combination = TRUE) roundrobin::roundrobin()関数
# A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7]
使ってみますか。
library(tidyverse) library(palmerpenguins) library(roundrobin) dat <- palmerpenguins::penguins %>% na.omit() %>% #
NA除去 mutate_at( vars(c(contains("mm"), contains("g"))), ~ (. - mean(.)) / sd(.) # 標準化 ) %>% select(species, contains("mm"), contains("g")) 前処理
> dat # A tibble: 333 × 5 species bill_length_mm
bill_depth_mm flipper_length_mm body_mass_g <fct> <dbl> <dbl> <dbl> <dbl> 1 Adelie -0.895 0.780 -1.42 -0.568 2 Adelie -0.822 0.119 -1.07 -0.506 3 Adelie -0.675 0.424 -0.426 -1.19 4 Adelie -1.33 1.08 -0.568 -0.940 5 Adelie -0.858 1.74 -0.782 -0.692 6 Adelie -0.931 0.323 -1.42 -0.723 7 Adelie -0.876 1.24 -0.426 0.581 8 Adelie -0.529 0.221 -1.35 -1.25 9 Adelie -0.986 2.05 -0.711 -0.506 10 Adelie -1.72 2.00 -0.212 0.240 # … with 323 more rows # i Use `print(n = ...)` to see more rows 前処理
dat_long <- dat %>% rowid_to_column("id") %>% pivot_longer( cols = !species,
names_to = "parameter", values_to = "value" ) %>% group_by(parameter) %>% ungroup() .y <- dat_long %>% ungroup() %>% group_by(species) %>% summarise( mean_id = mean(id), min_id = min(id) ) dat_long %>% ggplot() + aes(parameter, id) + geom_tile(aes(fill = value)) + geom_hline( yintercept = max(dat_long$id) ) + geom_hline(data = .y, aes(yintercept = min_id)) + scale_y_continuous( breaks = .y$mean_id, labels = .y$species, expand = c(0, 0)) + theme( axis.title = element_blank(), axis.text.x = element_text( angle = 30, hjust = 1 ) ) 可視化コード (ちょちょいのちょい)
可視化
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,4]>> <list<tibble[,4]>> 1 Adelie Adelie [146 × 4] [146 × 4] 2 Chinstrap Adelie [68 × 4] [146 × 4] 3 Gentoo Adelie [119 × 4] [146 × 4] 4 Adelie Chinstrap [146 × 4] [68 × 4] 5 Chinstrap Chinstrap [68 × 4] [68 × 4] 6 Gentoo Chinstrap [119 × 4] [68 × 4] 7 Adelie Gentoo [146 × 4] [119 × 4] 8 Chinstrap Gentoo [68 × 4] [119 × 4] 9 Gentoo Gentoo [119 × 4] [119 × 4] 総当たり組み合わせ dat_rr <- dat %>% roundrobin(key = "species", rename = FALSE)
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) > dat_rr_mahaD # A tibble: 9 × 5 Var1 Var2 data.x data.y mahaD2 <fct> <chr> <list<tibble[,4]>> <list<tibble[,4]>> <list> 1 Adelie vs. Adelie [146 × 4] [146 × 4] <dbl [146]> 2 Chinstrap vs. Adelie [68 × 4] [146 × 4] <dbl [68]> 3 Gentoo vs. Adelie [119 × 4] [146 × 4] <dbl [119]> 4 Adelie vs. Chinstrap [146 × 4] [68 × 4] <dbl [146]> 5 Chinstrap vs. Chinstrap [68 × 4] [68 × 4] <dbl [68]> 6 Gentoo vs. Chinstrap [119 × 4] [68 × 4] <dbl [119]> 7 Adelie vs. Gentoo [146 × 4] [119 × 4] <dbl [146]> 8 Chinstrap vs. Gentoo [68 × 4] [119 × 4] <dbl [68]> 9 Gentoo vs. Gentoo [119 × 4] [119 × 4] <dbl [119]>
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) dat_mahaD <- dat_rr_mahaD %>% select(!data.y) %>% unnest(everything())
> dat_mahaD # A tibble: 999 × 7 Var1 Var2
bill_length_mm bill_…¹ flipp…² body_…³ mahaD2 <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> 1 Adelie vs. Adelie -0.895 0.780 -1.42 -0.568 2.84 2 Adelie vs. Adelie -0.822 0.119 -1.07 -0.506 1.95 3 Adelie vs. Adelie -0.675 0.424 -0.426 -1.19 4.26 4 Adelie vs. Adelie -1.33 1.08 -0.568 -0.940 3.32 5 Adelie vs. Adelie -0.858 1.74 -0.782 -0.692 5.57 6 Adelie vs. Adelie -0.931 0.323 -1.42 -0.723 2.47 7 Adelie vs. Adelie -0.876 1.24 -0.426 0.581 5.94 8 Adelie vs. Adelie -0.529 0.221 -1.35 -1.25 5.27 9 Adelie vs. Adelie -0.986 2.05 -0.711 -0.506 7.75 10 Adelie vs. Adelie -1.72 2.00 -0.212 0.240 15.2 # … with 989 more rows, and abbreviated variable names # ¹bill_depth_mm, ²flipper_length_mm, ³body_mass_g # ℹ Use `print(n = ...)` to see more rows 例えばマハラノビス距離
例えばマハラノビス距離 ggplot(dat_mahaD) + aes(mahaD2, color = Var1, fill = Var1)
+ geom_density(alpha = 0.5) + facet_wrap(~Var2)
総当たり組み合わせ Round-robin そう あ あ く devtools::install_github( "kilometer0101/roundrobin" )
Enjoy!