DRL for Combinatorial Optimization (DRL 組み合わせ最適化)
newzy
November 24, 2021
Transcript
POMO: Policy Optimization with Multiple Optima for Reinforcement Learning
Kwon, Yeong-Dae, et al. NeurIPS, 2020, vol. 33
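Overview
• An end-to-end approximate solution method for combinatorial optimization that uses deep reinforcement learning.
• Compared with existing deep reinforcement learning methods, it greatly improves both computation time and accuracy.
• Validated on problems such as the traveling salesman problem.
Introduction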
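Combinatorial optimization problems
• Problems of finding an optimal combination, typified by the traveling salesman problem, the vehicle routing problem, and the knapsack problem.

                      Accuracy       Computation time
Exact methods         Optimal        Slow
Approximate methods   Near-optimal   Fast

https://onl.tw/vzkASMX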
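To make the trade-off in the table concrete, the sketch below contrasts an exact method (brute-force enumeration) with an approximate one (the nearest-neighbor heuristic) on a tiny traveling salesman instance. The instance `cities` and all function names are illustrative assumptions; nearest neighbor is just one simple approximate method and is not taken from the slides.

```python
import itertools
import math


def tour_length(points, order):
    """Total length of the closed tour that visits points in the given order."""
    n = len(order)
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n)
    )


def exact_tsp(points):
    """Exact method: enumerate every tour that starts at node 0 (O(n!) time)."""
    rest = range(1, len(points))
    best = min(
        ((0,) + perm for perm in itertools.permutations(rest)),
        key=lambda order: tour_length(points, order),
    )
    return list(best)


def nearest_neighbor_tsp(points, start=0):
    """Approximate method: always move to the closest unvisited node (O(n^2))."""
    order = [start]
    unvisited = set(range(len(points))) - {start}
    while unvisited:
        last = points[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


# Hypothetical 6-city instance inside the unit square.
cities = [(0.1, 0.1), (0.9, 0.2), (0.5, 0.5), (0.2, 0.8), (0.8, 0.9), (0.4, 0.1)]
exact_len = tour_length(cities, exact_tsp(cities))
approx_len = tour_length(cities, nearest_neighbor_tsp(cities))
print(f"exact: {exact_len:.3f}, nearest neighbor: {approx_len:.3f}")
```

The exact search is only feasible for very small instances because the number of tours grows factorially, while the heuristic stays cheap but carries no optimality guarantee.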
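Reinforcement Learning (RL)
• RL: a method for solving sequential decision-making problems. The goal is to find a policy that maximizes the cumulative reward.
• As the problem setup, a state set, an action set, and a reward function must be defined.
https://onl.tw/98fQVvW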
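As an illustration of that setup, here is a minimal sketch of how the traveling salesman problem could be cast as a sequential decision problem. The class `TSPEnv`, its method names, and the choice of reward (negative distance travelled) are assumptions made for this example, not definitions from the slides.

```python
import math


class TSPEnv:
    """Hypothetical minimal TSP environment (all names are illustrative)."""

    def __init__(self, points):
        self.points = points
        self.tour = []

    def reset(self):
        """State: the nodes visited so far, in order."""
        self.tour = []
        return tuple(self.tour)

    def actions(self):
        """Action set in the current state: every node not yet visited."""
        return [j for j in range(len(self.points)) if j not in self.tour]

    def step(self, action):
        """Visit `action`; the reward is the negative distance travelled."""
        if action not in self.actions():
            raise ValueError("node already visited")
        reward = 0.0
        if self.tour:
            reward -= math.dist(self.points[self.tour[-1]], self.points[action])
        self.tour.append(action)
        done = len(self.tour) == len(self.points)
        if done:  # close the tour by returning to the first node
            reward -= math.dist(self.points[action], self.points[self.tour[0]])
        return tuple(self.tour), reward, done


# Visiting the corners of a unit square in order gives a tour of length 4.
env = TSPEnv([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
env.reset()
total_reward = 0.0
for node in [0, 1, 2, 3]:
    state, reward, done = env.step(node)
    total_reward += reward
print(total_reward)
```

With this reward, maximizing the cumulative reward is the same as minimizing the tour length.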
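Policy-based REINFORCE
• Policy $\pi(s)$: a function that outputs the action $a$ to take in state $s$.
• $\pi_\theta$: a policy parameterized by the parameters $\theta$.
• Policy update rule, where $\alpha$ is the learning rate and $J(\pi_\theta)$ is the objective function:
$$\theta \leftarrow \theta + \alpha \nabla_\theta J(\pi_\theta)$$
• Policy gradient, where $\mathbb{E}$ is the expectation, $R_t$ is the return, and $b(s)$ is the baseline:
$$\nabla_\theta J(\pi_\theta) = \mathbb{E}_{\pi_\theta}\left[\nabla_\theta \log \pi_\theta \cdot \left(R_t - b(s)\right)\right]$$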
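The update rule and policy gradient above can be sketched on a toy problem. The example below assumes a single-state softmax policy over three actions (a bandit), uses the batch-mean return as the baseline $b(s)$, and estimates the expectation with a sampled batch. The reward function `bandit_reward` and the hyperparameters are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax."""
    shifted = np.exp(logits - logits.max())
    return shifted / shifted.sum()


def reinforce_step(theta, reward_fn, rng, alpha=0.5, batch=64):
    """One update theta <- theta + alpha * grad J for a softmax policy."""
    probs = softmax(theta)
    actions = rng.choice(len(theta), size=batch, p=probs)
    returns = reward_fn(actions, rng)
    baseline = returns.mean()  # b(s): assumed here to be the batch-mean return
    # For a softmax policy, grad_theta log pi_theta(a) = onehot(a) - probs.
    grad_log_pi = np.eye(len(theta))[actions] - probs
    grad = ((returns - baseline)[:, None] * grad_log_pi).mean(axis=0)
    return theta + alpha * grad


def bandit_reward(actions, rng):
    """Hypothetical noisy rewards; action 2 is best on average."""
    means = np.array([0.1, 0.5, 0.9])
    return means[actions] + 0.1 * rng.standard_normal(len(actions))


rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(300):
    theta = reinforce_step(theta, bandit_reward, rng)
print(softmax(theta))
```

Subtracting the baseline does not change which action the policy converges to; it only reduces the variance of the gradient estimate.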
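Prior Work
Pointer Networks
A network used for combinatorial optimization problems.
• Selects input elements without repetition and generates an output sequence.
• Consists of an encoder, which extracts features from the input, and a decoder, which uses the encoder output to produce the route that forms the answer.
• Both the encoder and the decoder use LSTMs.
Attention Model
An improved version of Pointer Networks.
• Like Pointer Networks, the model uses an encoder and a decoder.
• Drops the LSTM and adopts multi-head attention.
Method
Idea behind the paper's method
The first action strongly influences the agent's subsequent actions.
The method exploits the symmetry that is common in combinatorial optimization problems.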
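POMO
• REINFORCE with Baseline: uses a typical policy-gradient-based RL algorithm.
• Specifies multiple different starting actions and obtains multiple action sequences (trajectories).
• Does not use a <START> token.
[Figure: conventional method vs. POMO]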
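The multi-start idea in the bullets above can be sketched as follows: the first action of each trajectory is fixed to a different node, so an instance with N nodes yields N trajectories. The sampling policy here (a softmax over negative distances) is only a stand-in assumption for the paper's neural network, and `sample_trajectory`, `pomo_rollouts`, and `temperature` are hypothetical names.

```python
import numpy as np


def sample_trajectory(dist, start, rng, temperature=0.1):
    """Sample one tour whose first action is fixed to `start`.

    Stand-in policy (an assumption, not the paper's model): the next node is
    drawn from a softmax over negative distances to the unvisited nodes.
    """
    n = dist.shape[0]
    visited = np.zeros(n, dtype=bool)
    visited[start] = True
    tour = [start]
    for _ in range(n - 1):
        # Mask visited nodes so every node is selected exactly once.
        logits = np.where(visited, -np.inf, -dist[tour[-1]] / temperature)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        nxt = int(rng.choice(n, p=probs))
        visited[nxt] = True
        tour.append(nxt)
    return tour


def pomo_rollouts(points, rng):
    """One trajectory per start node: N nodes give N trajectories."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return [sample_trajectory(dist, s, rng) for s in range(len(points))]


rng = np.random.default_rng(0)
points = rng.random((6, 2))  # hypothetical 6-node instance
trajectories = pomo_rollouts(points, rng)
for tour in trajectories:
    print(tour)
```

Because the start node is chosen explicitly for each rollout, no special start token is needed to begin decoding.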
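POMO
$$\nabla_\theta J(\theta) \approx \frac{1}{N} \sum_{i=1}^{N} \left(R(\boldsymbol{\tau}^i) - b^i(s)\right) \nabla_\theta \log p_\theta(\boldsymbol{\tau}^i \mid s)$$
where
$$p_\theta(\boldsymbol{\tau}^i \mid s) \equiv \prod_{t=2}^{M} p_\theta(a_t^i \mid s, a_{1:t-1}^i)$$
Trajectories: $\boldsymbol{\tau}^i = (a_1^i, a_2^i, \ldots, a_M^i)$ for $i = 1, 2, \ldots, N$
Shared baseline: $b^i(s) = b_{\text{shared}}(s) = \frac{1}{N} \sum_{j=1}^{N} R(\boldsymbol{\tau}^j)$ for $i = 1, 2, \ldots, N$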
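A minimal numerical sketch of this estimator is shown below. It assumes that the per-trajectory returns and the gradients of the log-probabilities have already been computed elsewhere (in practice by automatic differentiation); the function name `pomo_gradient` and the example numbers are hypothetical.

```python
import numpy as np


def pomo_gradient(returns, grad_log_probs):
    """Estimate grad_theta J(theta) using the shared baseline.

    returns:        shape (N,), R(tau^i) for each of the N trajectories.
    grad_log_probs: shape (N, P), grad_theta log p_theta(tau^i | s).
    """
    returns = np.asarray(returns, dtype=float)
    grad_log_probs = np.asarray(grad_log_probs, dtype=float)
    baseline = returns.mean()        # b_shared(s) = (1/N) * sum_j R(tau^j)
    advantages = returns - baseline  # R(tau^i) - b^i(s)
    return (advantages[:, None] * grad_log_probs).mean(axis=0)


# Hypothetical values: three trajectories, two policy parameters.
returns = [-3.0, -2.0, -4.0]  # e.g. negative tour lengths
grad_log_probs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gradient = pomo_gradient(returns, grad_log_probs)
print(gradient)
```

Since the baseline is the mean return over the N trajectories of the same instance, the advantages always sum to zero: trajectories that beat the average are reinforced and the rest are discouraged.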
[Figure: pseudocode for the training procedure]
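Instance Augmentation: an inference technique
• Inspired by data augmentation in image processing.
• The coordinates used here lie in the unit square (first quadrant).
[Figure: the instance augmentations used here]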
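The figure listing the transformations did not survive in the transcript. The sketch below therefore assumes the eight coordinate flips and swaps of the unit square that the POMO paper describes for routing problems. Each transformed copy is an equivalent instance, so any tour has the same length in all of them. The function names are hypothetical.

```python
import numpy as np


def augment_unit_square(points):
    """Return 8 equivalent copies of an instance with coordinates in [0, 1]^2."""
    x, y = points[:, 0], points[:, 1]
    variants = [
        (x, y), (y, x),
        (x, 1 - y), (y, 1 - x),
        (1 - x, y), (1 - y, x),
        (1 - x, 1 - y), (1 - y, 1 - x),
    ]
    return np.stack([np.stack(v, axis=1) for v in variants])


def tour_length(points, order):
    """Length of the closed tour visiting `points` in the given order."""
    ordered = points[order]
    return np.linalg.norm(ordered - np.roll(ordered, -1, axis=0), axis=1).sum()


rng = np.random.default_rng(0)
points = rng.random((5, 2))  # hypothetical 5-node instance
augmented = augment_unit_square(points)
order = [0, 1, 2, 3, 4]
lengths = [tour_length(p, order) for p in augmented]
print(augmented.shape, np.round(lengths, 6))
```

At inference time, a model can be run on every copy and the best resulting tour kept, since the optimal solution is the same for all of them while the model's output may differ.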
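[Figure: pseudocode for the inference procedure]
Experiments
Experimental setup
• POMO was used to solve the following problems, and the results were compared with other representative methods:
  Traveling salesman problem (TSP)
  Capacitated vehicle routing problem (CVRP)
  Knapsack problem (KP)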
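[Figure: learning curves for the traveling salesman problem; panels: 50 nodes, 100 nodes]
[Figure: results for the traveling salesman problem (TSP), shown on two slides]
[Figure: results for the capacitated vehicle routing problem (CVRP)]
[Figure: results for the knapsack problem (KP)]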
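Summary of experiments
• For three combinatorial optimization problems with different settings, favorable results were obtained using the same training method and NN architecture.
• Both POMO, as a training and inference method, and Instance Augmentation, as an inference technique, were confirmed to be effective.
Conclusion
This paper presented a method that exploits symmetry in combinatorial optimization problems to improve the sample efficiency and accuracy of RL and to shorten inference time.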
References
Kwon, Yeong-Dae, et al. POMO: Policy Optimization with Multiple Optima for Reinforcement Learning. Advances in Neural Information Processing Systems, 2020, vol. 33.
Kool, Wouter, Herke van Hoof, and Max Welling. Attention, Learn to Solve Routing Problems! International Conference on Learning Representations.
Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. Pointer Networks. Advances in Neural Information Processing Systems.