Upgrade to Pro — share decks privately, control downloads, hide ads and more …

삼목을 정복하자

삼목을 정복하자

삼목이라 했지만 사실은 틱택토. 미니맥스와 몬테카를로 트리 검색으로 삼목을 정복해봅시다. MCTS(Monte-Carlo Tree Search), Minimax

Leonardo YongUk Kim

March 31, 2016
Tweet

More Decks by Leonardo YongUk Kim

Other Decks in Programming

Transcript

  1. Tic-tac-toe (also known as Noughts and crosses or Xs and

    Os) is a paper-and-pencil game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.
  2. ׮ܖ૑ ঋח Ѫ • ஶߥܖ࣊օ ׏ۡ ֎౟ਕ௼
 (CNN, Convolutional Neural

    Networks) • ঌ౵Ҋ (AlphaGO) • ӝ҅ ೟ण (Machine Learning) • ӝఋ ੋҕ૑מী ؀ೠ बച ղਊ • ҳӖ ஂস • ݫ੉௼স • ؂झ
  3. ਋ࢶ ਤఃೖ٣ই • Minimax (sometimes MinMax or MM[1]) is a

    decision rule used in decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case (maximum loss) scenario.
  4. ੼ࣻ • ੉ӝݶ +10 • ૑ݶ -10 • 3 ಕ੉ૉ

    ੉റ, ݒ ಕ੉ૉ ݃׮ ੼ࣻ 1੼ х੼ೞӝ.
  5. ౟ܻܳ ٮۄ ഐ୹೤द׮. Maxmize() Minimize() Minimize() Maxmize()Maxmize()Maxmize() Maxmize() Maxmize() Maxmize()

    Maxmize()Maxmize()Maxmize() Maxmize() Maxmize() Maxmize() Maxmize()Maxmize()Maxmize() Maxmize() Minimize() Minimize() Minimize() Minimize() Minimize() Minimize()
  6. ഒ੗ ఠ޷օ ֢٘ (҃ӝ ՘)ө૑ ೒ۨ੉.
 ೒ۨ੉ ب઺ী ߑޙೠ Ѫ਷

    ֢٘ ୶оೞ૑ ঋ਺.
 eg. ےؒೞѱ 1000౸݅ ف੗. दޛۨ੉࣌:
  7. ޙઁ੼ • ੹ۚ੄ ࠗ੤ • दޛۨ੉࣌ दр੄ ೙ਃ • ୭੸੄

    ׹੉ ইפ׮. • ഛܫ੸ਵ۽ दޛۨ੉࣌੉ ݆ই ૕ࣻ۾ Ӕࢎ೧૗.
  8. Epsilon greedy • ੐੄੄ εਸ о੿. • 1-ε ഛܫਸ ഝਊ.

    (weightо ֫਷ Ҕਸ ഛੋ) • ε੄ ഛܫ۽ ఐ೷. (౟ܻܳ և൨) • Ҋ੿੸ਵ۽ ఐ೷ਸ ೞח ࠺ਊ. • ୡӝী ఐ೷ਸ ੸ѱ ೣ.
  9. • ఐ೷җ ഝਊী Ӑഋ. • ୡӝী ఐ೷ೞҊ ੉റ ࠁ੿ೞח ध.

    • UCBо ֫਷ Ҕਸ ߑޙ. • UCB = Upper Confidence Bound