Machine translation systems typically rely on pre-segmented words as inputs. However, word segmentation is not always a trivial task (e.g., Japanese, which is written without spaces between words), and some arguments made in its favor may be less relevant for modern Neural Machine Translation (NMT). A recent paper presented interesting results for *character-level* translation using a novel "Bi-scale RNN" decoder, which processes its inputs at two time-scales. This presentation will cover the basics of NMT, explain the structure of the Bi-scale RNN, and discuss some language-specific considerations for Japanese translation.
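
To give a flavor of the two time-scale idea, here is a minimal sketch in PyTorch: a "fast" recurrent layer that updates at every character step, paired with a "slow" layer whose state is only softly updated when a learned gate opens, letting it track longer (roughly word-level) context. All names (`BiScaleDecoderSketch`, the gating scheme) are illustrative assumptions for this abstract, not the exact architecture or gating equations from the paper.

```python
import torch
import torch.nn as nn

class BiScaleDecoderSketch(nn.Module):
    """Illustrative two-timescale RNN decoder cell (simplified sketch,
    not the paper's exact Bi-scale RNN gating)."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.fast = nn.GRUCell(input_size, hidden_size)   # updates every character
        self.slow = nn.GRUCell(hidden_size, hidden_size)  # updates on a slower time-scale
        self.gate = nn.Linear(hidden_size, 1)             # learns when a "chunk" has ended

    def forward(self, x, h_fast, h_slow):
        # Fast layer: advances at the character level on every step.
        h_fast = self.fast(x, h_fast)
        # Gate in (0, 1): how strongly to let the slow layer absorb the fast state.
        g = torch.sigmoid(self.gate(h_fast))
        # Slow layer: candidate update, blended in softly so its state is
        # mostly frozen until the gate opens (e.g., at a word boundary).
        h_slow_candidate = self.slow(h_fast, h_slow)
        h_slow = g * h_slow_candidate + (1 - g) * h_slow
        return h_fast, h_slow

# Usage: step the cell over a character sequence.
cell = BiScaleDecoderSketch(input_size=32, hidden_size=64)
h_fast = torch.zeros(1, 64)
h_slow = torch.zeros(1, 64)
for x in torch.randn(10, 1, 32):  # 10 character embeddings
    h_fast, h_slow = cell(x, h_fast, h_slow)
```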