Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
SappoRo.R_roundrobin
Search
kilometer
March 18, 2023
Programming
0
150
SappoRo.R_roundrobin
第10回Sapporo.Rで喋った際のスライドです。
kilometer
March 18, 2023
Tweet
Share
More Decks by kilometer
See All by kilometer
TokyoR#111_ANOVA
kilometer
2
880
TokyoR109.pdf
kilometer
1
470
TokyoR#108_NestedDataHandling
kilometer
0
810
TokyoR#107_R_GeoData
kilometer
0
440
TokyoR#104_DataProcessing
kilometer
1
700
TokyoR#103_DataProcessing
kilometer
0
900
TokyoR#102_RMarkdown
kilometer
1
660
TokyoR#101_RegressionAnalysis
kilometer
0
410
TokyoR#99_Divergence
kilometer
1
390
Other Decks in Programming
See All in Programming
goにおける コネクションプールの仕組み を軽く掘って見た
aronokuyama
0
140
PHPer's Guide to Daemon Crafting Taming and Summoning
uzulla
2
1k
ローコードサービスの進化のためのモノレポ移行
taro28
1
330
Kubernetesで実現できるPlatform Engineering の現在地
nwiizo
2
1.7k
S3静的ホスティング+Next.js静的エクスポート で格安webアプリ構築
iharuoru
0
190
Modern Angular:Renovation for Your Applications @angularDays 2025 Munich
manfredsteyer
PRO
0
130
なぜselectはselectではないのか
taiyow
2
310
RailsでCQRS/ESをやってみたきづき
suzukimar
2
1.5k
Windows版PHPのビルド手順とPHP 8.4における変更点
matsuo_atsushi
0
370
プログラミング教育のコスパの話
superkinoko
0
120
AI Agentを利用したAndroid開発について
yuchan2215
0
210
私の愛したLaravel 〜レールを超えたその先へ〜
kentaroutakeda
12
3.5k
Featured
See All Featured
Art, The Web, and Tiny UX
lynnandtonic
298
20k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
118
51k
Build The Right Thing And Hit Your Dates
maggiecrowley
34
2.6k
Bash Introduction
62gerente
611
210k
Docker and Python
trallard
44
3.3k
Rails Girls Zürich Keynote
gr2m
94
13k
The MySQL Ecosystem @ GitHub 2015
samlambert
251
12k
GraphQLの誤解/rethinking-graphql
sonatard
70
10k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
8
700
Agile that works and the tools we love
rasmusluckow
328
21k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
46
2.4k
Adopting Sorbet at Scale
ufuk
75
9.3k
Transcript
SappoRo.R #10 @kilometer00 2023.03.18 らくらく総当たり組み合わせ
Who!? Who?
Who!? 名前: 三村 @kilometer 職業: ポスドク (こうがくはくし) 専⾨: ⾏動神経科学(霊⻑類) 脳イメージング
医療システム⼯学 R歴: ~ 10年ぐらい 流⾏: アンキロサウルス
宣伝!!(書籍の翻訳に参加しました。) 絶賛販売中!
宣伝2!! R⾔語の地域コミュニティ@東京です。 定期的にR⾔語に関する勉強会を開催しています。 次回は4⽉22⽇!! 初⼼者特集回です!!
総当たり組み合わせ Round-robin そう あ あ く
dat_nest <- palmerpenguins::penguins %>% dplyr::group_nest(species) データを畳み込む > dat_nest # A
tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] (息を吐くように)
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
base::expand.grid()関数 > dat_nest # A tibble: 3 × 2 species
data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] dat_nest$species
grid <- dat_nest$species %>% expand.grid(., .) base::expand.grid()関数 > grid Var1
Var2 1 Adelie Adelie 2 Chinstrap Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo
dplyr::left_join()関数 > grid Var1 Var2 1 Adelie Adelie 2 Chinstrap
Adelie 3 Gentoo Adelie 4 Adelie Chinstrap 5 Chinstrap Chinstrap 6 Gentoo Chinstrap 7 Adelie Gentoo 8 Chinstrap Gentoo 9 Gentoo Gentoo > dat_nest # A tibble: 3 × 2 species data <fct> <list<tibble[,7]>> 1 Adelie [152 × 7] 2 Chinstrap [68 × 7] 3 Gentoo [124 × 7] ①対応づけて結合 ②対応づけて結合
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" )
dplyr::left_join()関数 dat_rr <- grid %>% tibble::as_tibble() %>% dplyr::left_join( dat_nest %>%
dplyr::rename(Var1 = "species"), by = "Var1" ) %>% dplyr::left_join( dat_nest %>% dplyr::rename(Var2 = "species"), by = "Var2" ) ① ②
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) > dat_rr_rename # A tibble: 9 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7]
dplyr::rename()関数 dat_rr_rename <- dat_rr %>% rename(species.x = Var1) %>% rename(species.y
= Var2) key <- "species" x <- stringr::str_c(key, ".x") y <- stringr::str_c(key, ".y") dat_rr_rename <- dat_rr %>% rename(!!x := Var1) %>% rename(!!y := Var2) 別解 {rlang}パッケージの演算⼦
# A tibble: 9 × 4 species.x species.y data.x data.y
<fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] 総当たり組み合わせ # A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7] 組み合わせ(combination) (round-robin)
grid <- dat_nest$species %>% expand.grid(., .) %>% subset(unclass(Var1) < unclass(Var2))
%>% tibble::as_tibble() base::subset()関数 > grid # A tibble: 3 × 2 Var1 Var2 <fct> <fct> 1 Adelie Chinstrap 2 Adelie Gentoo 3 Chinstrap Gentoo
という変換を パッケージにしました。 devtools::install_github( "kilometer0101/roundrobin" ) (4回⼿打ちしたら⾯倒臭くなったので)
roundrobin::roundrobin()関数 # A tibble: 9 × 4 species.x species.y data.x
data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Adelie [152 × 7] [152 × 7] 2 Chinstrap Adelie [68 × 7] [152 × 7] 3 Gentoo Adelie [124 × 7] [152 × 7] 4 Adelie Chinstrap [152 × 7] [68 × 7] 5 Chinstrap Chinstrap [68 × 7] [68 × 7] 6 Gentoo Chinstrap [124 × 7] [68 × 7] 7 Adelie Gentoo [152 × 7] [124 × 7] 8 Chinstrap Gentoo [68 × 7] [124 × 7] 9 Gentoo Gentoo [124 × 7] [124 × 7] library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species")
library(roundrobin) palmerpenguins::penguins %>% roundrobin(key = "species", combination = TRUE) roundrobin::roundrobin()関数
# A tibble: 3 × 4 species.x species.y data.x data.y <fct> <fct> <list<tibble[,7]>> <list<tibble[,7]>> 1 Adelie Chinstrap [152 × 7] [68 × 7] 2 Adelie Gentoo [152 × 7] [124 × 7] 3 Chinstrap Gentoo [68 × 7] [124 × 7]
使ってみますか。
library(tidyverse) library(palmerpenguins) library(roundrobin) dat <- palmerpenguins::penguins %>% na.omit() %>% #
NA除去 mutate_at( vars(c(contains("mm"), contains("g"))), ~ (. - mean(.)) / sd(.) # 標準化 ) %>% select(species, contains("mm"), contains("g")) 前処理
> dat # A tibble: 333 × 5 species bill_length_mm
bill_depth_mm flipper_length_mm body_mass_g <fct> <dbl> <dbl> <dbl> <dbl> 1 Adelie -0.895 0.780 -1.42 -0.568 2 Adelie -0.822 0.119 -1.07 -0.506 3 Adelie -0.675 0.424 -0.426 -1.19 4 Adelie -1.33 1.08 -0.568 -0.940 5 Adelie -0.858 1.74 -0.782 -0.692 6 Adelie -0.931 0.323 -1.42 -0.723 7 Adelie -0.876 1.24 -0.426 0.581 8 Adelie -0.529 0.221 -1.35 -1.25 9 Adelie -0.986 2.05 -0.711 -0.506 10 Adelie -1.72 2.00 -0.212 0.240 # … with 323 more rows # i Use `print(n = ...)` to see more rows 前処理
dat_long <- dat %>% rowid_to_column("id") %>% pivot_longer( cols = !species,
names_to = "parameter", values_to = "value" ) %>% group_by(parameter) %>% ungroup() .y <- dat_long %>% ungroup() %>% group_by(species) %>% summarise( mean_id = mean(id), min_id = min(id) ) dat_long %>% ggplot() + aes(parameter, id) + geom_tile(aes(fill = value)) + geom_hline( yintercept = max(dat_long$id) ) + geom_hline(data = .y, aes(yintercept = min_id)) + scale_y_continuous( breaks = .y$mean_id, labels = .y$species, expand = c(0, 0)) + theme( axis.title = element_blank(), axis.text.x = element_text( angle = 30, hjust = 1 ) ) 可視化コード (ちょちょいのちょい)
可視化
> dat_rr # A tibble: 9 × 4 Var1 Var2
data.x data.y <fct> <fct> <list<tibble[,4]>> <list<tibble[,4]>> 1 Adelie Adelie [146 × 4] [146 × 4] 2 Chinstrap Adelie [68 × 4] [146 × 4] 3 Gentoo Adelie [119 × 4] [146 × 4] 4 Adelie Chinstrap [146 × 4] [68 × 4] 5 Chinstrap Chinstrap [68 × 4] [68 × 4] 6 Gentoo Chinstrap [119 × 4] [68 × 4] 7 Adelie Gentoo [146 × 4] [119 × 4] 8 Chinstrap Gentoo [68 × 4] [119 × 4] 9 Gentoo Gentoo [119 × 4] [119 × 4] 総当たり組み合わせ dat_rr <- dat %>% roundrobin(key = "species", rename = FALSE)
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) > dat_rr_mahaD # A tibble: 9 × 5 Var1 Var2 data.x data.y mahaD2 <fct> <chr> <list<tibble[,4]>> <list<tibble[,4]>> <list> 1 Adelie vs. Adelie [146 × 4] [146 × 4] <dbl [146]> 2 Chinstrap vs. Adelie [68 × 4] [146 × 4] <dbl [68]> 3 Gentoo vs. Adelie [119 × 4] [146 × 4] <dbl [119]> 4 Adelie vs. Chinstrap [146 × 4] [68 × 4] <dbl [146]> 5 Chinstrap vs. Chinstrap [68 × 4] [68 × 4] <dbl [68]> 6 Gentoo vs. Chinstrap [119 × 4] [68 × 4] <dbl [119]> 7 Adelie vs. Gentoo [146 × 4] [119 × 4] <dbl [146]> 8 Chinstrap vs. Gentoo [68 × 4] [119 × 4] <dbl [68]> 9 Gentoo vs. Gentoo [119 × 4] [119 × 4] <dbl [119]>
例えばマハラノビス距離 dat_rr_mahaD <- dat_rr %>% mutate(mahaD2 = map2( data.x, data.y,
# yに対するxの距離 ~ mahalanobis(.x, colMeans(.y), cov(.y)) )) %>% mutate(Var2 = str_c("vs. ", Var2)) dat_mahaD <- dat_rr_mahaD %>% select(!data.y) %>% unnest(everything())
> dat_mahaD # A tibble: 999 × 7 Var1 Var2
bill_length_mm bill_…¹ flipp…² body_…³ mahaD2 <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> 1 Adelie vs. Adelie -0.895 0.780 -1.42 -0.568 2.84 2 Adelie vs. Adelie -0.822 0.119 -1.07 -0.506 1.95 3 Adelie vs. Adelie -0.675 0.424 -0.426 -1.19 4.26 4 Adelie vs. Adelie -1.33 1.08 -0.568 -0.940 3.32 5 Adelie vs. Adelie -0.858 1.74 -0.782 -0.692 5.57 6 Adelie vs. Adelie -0.931 0.323 -1.42 -0.723 2.47 7 Adelie vs. Adelie -0.876 1.24 -0.426 0.581 5.94 8 Adelie vs. Adelie -0.529 0.221 -1.35 -1.25 5.27 9 Adelie vs. Adelie -0.986 2.05 -0.711 -0.506 7.75 10 Adelie vs. Adelie -1.72 2.00 -0.212 0.240 15.2 # … with 989 more rows, and abbreviated variable names # ¹bill_depth_mm, ²flipper_length_mm, ³body_mass_g # ℹ Use `print(n = ...)` to see more rows 例えばマハラノビス距離
例えばマハラノビス距離 ggplot(dat_mahaD) + aes(mahaD2, color = Var1, fill = Var1)
+ geom_density(alpha = 0.5) + facet_wrap(~Var2)
総当たり組み合わせ Round-robin そう あ あ く devtools::install_github( "kilometer0101/roundrobin" )
Enjoy!