RWC 2024 DICOM & ISO/IEC 2022

Ruby World Conference 2024

seki at druby.org

December 05, 2024

Transcript

  1. Attribute examples

    (0008,0005): character set
    (0010,0010): patient name
    (0020,0032): image position (patient coordinate system)
    (0028,0010): pixel count (Rows)
    (0028,0011): pixel count (Cols)
  2. The attribute list, written Ruby-style

    Each tag encodes its value differently.

        dicom = [
          ...
          [[0x0008, 0x0005], "\\ISO 2022 IR 87\\ISO 2022 IR 13"], # charset
          ...
          [[0x0020, 0x0032], "-96.7773\\-36.77734\\-676.00"], # Image Position
          ...
          [[0x0028, 0x0010], 512], # Rows
          [[0x0028, 0x0011], 512], # Cols
          ...
        ]
  3. Encoding an attribute, step 1

    The list entry

        [[0x0008, 0x0005], "\\ISO 2022 IR 87\\ISO 2022 IR 13"], # charset

    is encoded as four fields:

        0008 0005  'CS'  001e  "\\ISO 2022 IR 87\\ISO 2022 IR 13"

    Tag: two 16-bit integers.
    VR: two characters (2 bytes) that say how the value is represented.
    Data length: a 16-bit or 32-bit integer, depending on the VR.
    Value: represented according to the VR; always an even number of bytes.
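
The four fields of step 1 can be assembled with `Array#pack`. This is only a rough sketch: the helper name `encode_short_element` and its `pad:` parameter are made up for illustration, and it assumes a little-endian layout with a 16-bit length field, which is what the dump in this deck shows for CS and UI values.

```ruby
# Hypothetical helper (not from the talk): builds one data element with
# explicit VR, little-endian integers and a 16-bit length field.
def encode_short_element(group, element, vr, value, pad: " ")
  body = value.b                       # work on raw bytes
  body += pad if body.bytesize.odd?    # the value must have an even length
  [group, element].pack('v2') +        # tag: two 16-bit little-endian integers
    vr.b +                             # VR: two ASCII characters
    [body.bytesize].pack('v') +        # data length: 16-bit little-endian integer
    body
end

bytes = encode_short_element(0x0008, 0x0005, 'CS', "\\ISO 2022 IR 87\\ISO 2022 IR 13")
puts bytes.unpack1('H*').scan(/../).first(8).join(' ')
# prints "08 00 05 00 43 53 1e 00"
```

The printed header matches the first eight bytes at offset 0x140 of the hex dump shown later in the deck.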
  4. Encoding an attribute, step 2

    Transfer syntax
    This file uses Little Endian with an explicit VR (Explicit Little).
    Implicit Little and Explicit Big also exist; there seems to be no Implicit Big.
    The meta information block records which transfer syntax is in use.
    The meta information itself is Explicit Little...
    Maybe this is a leftover from the days when every vendor and every generation had its own encoding.

        0008 0005  'CS'  001e  "\\ISO 2022 IR 87\\ISO 2022 IR 13"

        00000140 08 00 05 00 43 53 1e 00 5c 49 53 4f 20 32 30 32 |....CS..\ISO 202|
        00000150 32 20 49 52 20 38 37 5c 49 53 4f 20 32 30 32 32 |2 IR 87\ISO 2022|
        00000160 20 49 52 20 31 33 08 00 08 00 43 53 16 00 4f 52 | IR 13....CS..OR|
  5. DICM: the "this is a DICOM file" mark

        00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
        *
        00000080 44 49 43 4d 02 00 00 00 55 4c 04 00 b0 00 00 00 |DICM....UL......|
        00000090 02 00 01 00 4f 42 00 00 02 00 00 00 00 01 02 00 |....OB..........|
        000000a0 02 00 55 49 1a 00 31 2e 32 2e 38 34 30 2e 31 30 |..UI..1.2.840.10|
        000000b0 30 30 38 2e 35 2e 31 2e 34 2e 31 2e 31 2e 32 00 |008.5.1.4.1.1.2.|
        000000c0 02 00 03 00 55 49 3e 00 31 2e 32 2e 33 39 32 2e |....UI>.1.2.392.|
        000000d0 32 30 30 30 33 36 2e 39 31 34 32 2e 31 30 30 30 |200036.9142.1000|
        000000e0 33 33 30 32 2e 31 30 32 30 34 33 38 30 30 31 2e |3302.1020438001.|
        000000f0 33 2e 32 30 32 34 30 37 30 34 31 33 30 30 35 34 |3.20240704130054|
        00000100 2e 38 30 30 31 33 02 00 10 00 55 49 14 00 31 2e |.80013....UI..1.|
        00000110 32 2e 38 34 30 2e 31 30 30 30 38 2e 31 2e 32 2e |2.840.10008.1.2.|
        00000120 31 00 02 00 12 00 55 49 16 00 31 2e 32 2e 33 39 |1.....UI..1.2.39|
        00000130 32 2e 32 30 30 30 33 36 2e 39 31 34 32 2e 31 00 |2.200036.9142.1.|
        00000140 08 00 05 00 43 53 1e 00 5c 49 53 4f 20 32 30 32 |....CS..\ISO 202|
        00000150 32 20 49 52 20 38 37 5c 49 53 4f 20 32 30 32 32 |2 IR 87\ISO 2022|
        00000160 20 49 52 20 31 33 08 00 08 00 43 53 16 00 4f 52 | IR 13....CS..OR|
  6. (0002,0000) is the length of the whole meta information

    The VR is 'UL', the data length is 4 bytes, and the value is 0xb0.

        00000080 44 49 43 4d 02 00 00 00 55 4c 04 00 b0 00 00 00 |DICM....UL......|
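
Reading that first element back with `String#unpack` makes the layout concrete. This is a sketch over the 16 bytes at offset 0x80 of the dump; the variable names are my own and nothing here comes from the talk's library.

```ruby
# The 16 bytes at offset 0x80 of the dump: "DICM" followed by element (0002,0000).
head = %w(44 49 43 4d 02 00 00 00 55 4c 04 00 b0 00 00 00).map(&:hex).pack('C*')

magic = head[0, 4]
# v = 16-bit little endian, a2 = two raw bytes, V = 32-bit little endian
group, element, vr, length, meta_length = head[4..].unpack('v v a2 v V')

printf("%s (%04x,%04x) %s len=%d value=0x%x\n", magic, group, element, vr, length, meta_length)
# prints "DICM (0002,0000) UL len=4 value=0xb0"

# 128-byte preamble + "DICM" + this 12-byte element + the rest of the meta information
dataset_offset = 128 + 4 + 12 + meta_length
printf("0x%x\n", dataset_offset)
# prints "0x140"
```

0x140 is exactly where the (0008,0005) element begins in the dump, so the value 0xb0 counts the meta information bytes that follow this element.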
  7. This region is the meta information

        00000080 44 49 43 4d 02 00 00 00 55 4c 04 00 b0 00 00 00 |DICM....UL......|
        00000090 02 00 01 00 4f 42 00 00 02 00 00 00 00 01 02 00 |....OB..........|
        000000a0 02 00 55 49 1a 00 31 2e 32 2e 38 34 30 2e 31 30 |..UI..1.2.840.10|
        000000b0 30 30 38 2e 35 2e 31 2e 34 2e 31 2e 31 2e 32 00 |008.5.1.4.1.1.2.|
        000000c0 02 00 03 00 55 49 3e 00 31 2e 32 2e 33 39 32 2e |....UI>.1.2.392.|
        000000d0 32 30 30 30 33 36 2e 39 31 34 32 2e 31 30 30 30 |200036.9142.1000|
        000000e0 33 33 30 32 2e 31 30 32 30 34 33 38 30 30 31 2e |3302.1020438001.|
        000000f0 33 2e 32 30 32 34 30 37 30 34 31 33 30 30 35 34 |3.20240704130054|
        00000100 2e 38 30 30 31 33 02 00 10 00 55 49 14 00 31 2e |.80013....UI..1.|
        00000110 32 2e 38 34 30 2e 31 30 30 30 38 2e 31 2e 32 2e |2.840.10008.1.2.|
        00000120 31 00 02 00 12 00 55 49 16 00 31 2e 32 2e 33 39 |1.....UI..1.2.39|
        00000130 32 2e 32 30 30 30 33 36 2e 39 31 34 32 2e 31 00 |2.200036.9142.1.|
  8. (0002,0010) is the transfer syntax

    1.2.840.10008.1.2.1 means Explicit Little.

        00000100 2e 38 30 30 31 33 02 00 10 00 55 49 14 00 31 2e |.80013....UI..1.|
        00000110 32 2e 38 34 30 2e 31 30 30 30 38 2e 31 2e 32 2e |2.840.10008.1.2.|
        00000120 31 00 02 00 12 00 55 49 16 00 31 2e 32 2e 33 39 |1.....UI..1.2.39|
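
To see which transfer syntax a file declares, the UID can be cut out of the (0002,0010) element and looked up. This is a sketch: the `TRANSFER_SYNTAX` table is my own and lists only the three syntaxes mentioned in the talk, and the bytes are copied from offset 0x106 of the dump.

```ruby
# Illustrative table (not exhaustive): the three transfer syntaxes named in the talk.
TRANSFER_SYNTAX = {
  '1.2.840.10008.1.2'   => 'Implicit VR Little Endian',
  '1.2.840.10008.1.2.1' => 'Explicit VR Little Endian',
  '1.2.840.10008.1.2.2' => 'Explicit VR Big Endian',
}.freeze

# Element (0002,0010) as it appears at offset 0x106 of the dump.
elem = %w(02 00 10 00 55 49 14 00
          31 2e 32 2e 38 34 30 2e 31 30 30 30 38 2e 31 2e 32 2e 31 00).map(&:hex).pack('C*')

_group, _element, _vr, length = elem.unpack('v v a2 v')
# The UID here is 19 characters long, so one NUL byte pads it to an even length.
uid = elem[8, length].delete("\0").force_encoding(Encoding::UTF_8)

puts "#{uid} => #{TRANSFER_SYNTAX.fetch(uid, 'unknown')}"
# prints "1.2.840.10008.1.2.1 => Explicit VR Little Endian"
```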
  9. (0008,0005) is the character sets in use

    It declares that ISO 2022 IR 87, IR 13 and IR 6 are used.

        00000140 08 00 05 00 43 53 1e 00 5c 49 53 4f 20 32 30 32 |....CS..\ISO 202|
        00000150 32 20 49 52 20 38 37 5c 49 53 4f 20 32 30 32 32 |2 IR 87\ISO 2022|
        00000160 20 49 52 20 31 33 08 00 08 00 43 53 16 00 4f 52 | IR 13....CS..OR|
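  10. DICOM character sets

    DICOM has a mechanism for handling many character sets in a document.
    Without extensions: the document uses exactly one character set (ASCII, Latin-1, utf-8, GB18030, ...).
    With extensions: the document contains several character sets, based on the techniques of ISO/IEC 2022.
    The case without extensions is similar to Ruby's encoding.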
  11. ASCII: why this layout?

    [Figure: ASCII code table]
  12. JIS X 0201

    [Figure: JIS X 0201 code table]
  13. ISO-8859-1

    [Figure: ISO-8859-1 code table]
  14. ISO-8859-2

    [Figure: ISO-8859-2 code table]
  15. Split it into four regions

    [Figure: ISO-8859-2 code table]
  16. Split it into four regions

    [Figure: ISO-8859-1 code table]
  17. Split it into four regions

    [Figure: the code table divided into four regions: CL, GL, CR, GR]
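  18. Let's swap the characters in GL and GR

    [Figure: the GL and GR regions, with the sets that can be placed in them: left half of ASCII, left half of JIS X 0201, right half of Latin-1, katakana, kanji]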
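  19. Let's swap the characters in GL and GR

    [Figure: the GL and GR regions, with the sets that can be placed in them: left half of ASCII, left half of JIS X 0201, right half of Latin-1, katakana, kanji]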
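  20. Operate in two stages

    Designate: assign a character set (left half of ASCII, left half of JIS X 0201, right half of Latin-1, katakana, kanji) to one of G0, G1, G2, G3.
    Invoke (shift): bring one of G0 to G3 into GL or GR.

    [Figure: character sets, G0 to G3, and GL/GR connected by "designate" and "invoke (shift)"]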
  21. iso-2022-jp

    [Figure: G0 to G3 and GL/GR; the available sets are the left half of ASCII, the left half of JIS X 0201, and kanji]
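  22. iso-2022-jp

    [Figure: G0 to G3 and GL/GR; the available sets are the left half of ASCII, the left half of JIS X 0201, and kanji]

        "分散Ruby".encode('iso-2022-jp')
        1b 24 42 4a 2c 3b 36 1b 28 42 52 75 62 79

    Initial state: ASCII is designated to G0, and G0 is invoked into GL.
    1b 24 42: designate kanji to G0 (G0 is invoked into GL).
    1b 28 42: designate ASCII to G0 (G0 is invoked into GL).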
  23. euc-jp

    [Figure: G0 to G3 and GL/GR; the sets are the left half of ASCII, katakana, kanji (JIS X 0208), and kanji (JIS X 0212)]
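  24. euc-jp

    [Figure: G0 to G3 and GL/GR; the sets are the left half of ASCII, katakana, kanji (JIS X 0208), and kanji (JIS X 0212)]

        '分散ﾙﾋﾞｰ'.encode('euc-jp')
        ca ac bb b6 8e d9 8e cb 8e de 8e b0

    Initial state: ASCII is designated to G0, and G0 is invoked into GL; kanji is designated to G1, and G1 is invoked into GR.
    8e: invoke G2 into GR. It reverts after one character (single shift).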
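  25. It's starting to look like operations!

    Start from the initial state that the character code defines.
    Embed instructions in the string using CL/CR control characters.

        "分散Ruby".encode('iso-2022-jp')
        1b 24 42 4a 2c 3b 36 1b 28 42 52 75 62 79

    Initial state: ASCII is designated to G0, and G0 is invoked into GL.
    1b 24 42: designate kanji to G0 (G0 is invoked into GL).
    1b 28 42: designate ASCII to G0 (G0 is invoked into GL).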
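
The byte sequences on the iso-2022-jp and euc-jp slides can be reproduced with `String#encode`. A small sketch, assuming a UTF-8 source file; the `hex` helper is mine.

```ruby
# Formats a string's bytes as space-separated hex pairs.
def hex(str)
  str.bytes.map { |b| format('%02x', b) }.join(' ')
end

# iso-2022-jp: escape sequences re-designate G0, and every byte stays 7-bit (GL).
puts hex("分散Ruby".encode('iso-2022-jp'))
# prints "1b 24 42 4a 2c 3b 36 1b 28 42 52 75 62 79"

# euc-jp: no escape sequences; kanji use GR, and 0x8e single-shifts one kana.
puts hex('分散ﾙﾋﾞｰ'.encode('euc-jp'))
# prints "ca ac bb b6 8e d9 8e cb 8e de 8e b0"
```

In the first result the embedded instructions are the two ESC sequences; in the second the only instruction is the 0x8e control character in front of each half-width kana.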
  26. Without extensions

    IR seems to be the ISO registration number. Note that ISO_IR 13 is X 0201 kana, so it is not really sjis.

        defined term    Ruby encoding
        (none)          ascii
        ISO_IR 100      windows-1252
        ISO_IR 101      iso-8859-2
        ISO_IR 109      iso-8859-3
        ISO_IR 110      iso-8859-4
        ISO_IR 144      iso-8859-5
        ISO_IR 127      iso-8859-6
        ISO_IR 126      iso-8859-7
        ISO_IR 138      iso-8859-8
        ISO_IR 148      windows-1254
        ISO_IR 203      iso-8859-15
        ISO_IR 13       shift-jis (a subset)
        ISO_IR 166      tis-620
        ISO_IR 192      utf-8
        GB18030         GB18030
        GBK             gbk
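
Without extensions, decoding amounts to tagging the raw bytes with the one encoding the defined term maps to. The sketch below uses a three-row excerpt of the table; the constant and method names are made up for illustration.

```ruby
# Excerpt of the defined-term table; the names here are hypothetical.
WITHOUT_EXTENSIONS = {
  'ISO_IR 100' => 'windows-1252',
  'ISO_IR 101' => 'iso-8859-2',
  'ISO_IR 192' => 'utf-8',
}.freeze

# Tags the raw bytes with the mapped encoding, then transcodes to UTF-8.
def decode_without_extensions(raw, defined_term)
  raw.dup.force_encoding(WITHOUT_EXTENSIONS.fetch(defined_term)).encode('utf-8')
end

raw = "Pok\xE9mon".b   # bytes as they would sit in the file
puts decode_without_extensions(raw, 'ISO_IR 100')
# prints "Pokémon"
```

`force_encoding` changes only the label on the bytes; the actual conversion happens in `encode`.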
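  27. 2022 Extension

    Specify several character sets in (0008,0005).
    The listed character sets can be used.
    The first character set becomes the initial state.
    If the first one is omitted, it means IR 6 (ASCII).

        ["ISO 2022 IR 13", "ISO 2022 IR 87"]
        The default is ASCII (strictly, JIS X 0201 Roman) + kana; kanji is also used.

        ["", "ISO 2022 IR 87", "ISO 2022 IR 13"]
        The default is IR 6 (ASCII); kanji and kana are also used.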
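  28. 2022 Extension

    Each character set has a fixed set of operations it may use.
    The operations of the character set listed first become the initial state.
    The table is an excerpt: there are 16 defined terms and 17 operations.

    Operations that can be used inside a string:

        defined term       ESC           target   character set
        ISO 2022 IR 6      1b 28 42      G0/GL    ASCII
        ISO 2022 IR 100    1b 2d 41      G1/GR    right half of Latin-1
                           1b 28 42      G0/GL    ASCII
        ISO 2022 IR 13     1b 29 49      G1/GR    JIS X 0201 katakana
                           1b 28 4a      G0/GL    left half of JIS X 0201
        ISO 2022 IR 87     1b 24 42      G0/GL    JIS X 0208 kanji
        ISO 2022 IR 149    1b 24 29 43   G1/GR    Korean (euc-kr)

    If IR 13 comes first, the initial state has katakana in G1 and JIS X 0201 Roman in G0.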
  29. Strategy: stare at it

    The patient name example from the DICOM standard:

        ﾔﾏﾀﾞ^ﾀﾛｳ=山田^太郎=やまだ^たろう
        (0008,0005) : ["ISO 2022 IR 13", "ISO 2022 IR 87"]

        d4 cf c0 de 5e c0 db b3 3d 1b 24 42 3b 33 45 44 1b 28 4a 5e
        1b 24 42 42 40 4f 3a 1b 28 4a 3d 1b 24 42 24 64 24 5e 24 40
        1b 28 4a 5e 1b 24 42 24 3f 24 6d 24 26 1b 28 4a
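  30. Strategy

    Why not split the string into segments matching (escape sequence)|(GL*)|(GR*) and process each one? A bit like ERB, I suppose.

        ﾔﾏﾀﾞ^ﾀﾛｳ=山田^太郎=やまだ^たろう
        (0008,0005) : ["ISO 2022 IR 13", "ISO 2022 IR 87"]

        d4 cf c0 de 5e c0 db b3 3d 1b 24 42 3b 33 45 44 1b 28 4a 5e
        1b 24 42 42 40 4f 3a 1b 28 4a 3d 1b 24 42 24 64 24 5e 24 40
        1b 28 4a 5e 1b 24 42 24 3f 24 6d 24 26 1b 28 4a

    The opening bytes split as: GR (d4 cf c0 de), GL (5e), GR (c0 db b3), GL (3d), escape sequence (1b 24 42), GL (3b 33 45 44), ...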
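
The segmentation idea can be tried directly on the patient name bytes with `String#scan`. This is a simplified sketch rather than the talk's implementation: the pattern hard-codes the two escape sequences that occur in this example, ignores control characters, and the names `SEGMENT` and `kind` are mine.

```ruby
# Patient name bytes from the slide, (0008,0005) = ["ISO 2022 IR 13", "ISO 2022 IR 87"].
name = %w(d4 cf c0 de 5e c0 db b3 3d 1b 24 42 3b 33 45 44 1b 28 4a 5e
          1b 24 42 42 40 4f 3a 1b 28 4a 3d 1b 24 42 24 64 24 5e 24 40
          1b 28 4a 5e 1b 24 42 24 3f 24 6d 24 26 1b 28 4a).map(&:hex).pack('C*')

# Escape sequences are tried first, then runs of GL bytes, then runs of GR bytes.
SEGMENT = /\e\$B|\e\(J|[\x20-\x7f]+|[\xa0-\xff]+/n

# Classifies a segment by its first byte.
def kind(segment)
  case segment.getbyte(0)
  when 0x1b       then :ESC
  when 0x20..0x7f then :GL
  when 0xa0..0xff then :GR
  end
end

segments = name.scan(SEGMENT)
segments.each { |seg| puts "#{kind(seg)} #{seg.unpack1('H*')}" }
# first lines printed:
#   GR d4cfc0de
#   GL 5e
#   GR c0dbb3
#   GL 3d
#   ESC 1b2442
#   GL 3b334544
```

Putting the escape sequences first in the alternation matters: otherwise the `$B` after ESC would be swallowed by the GL run.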
  31. Looks doable! Done.

    The API: create a converter called Context, then call convert.

        charset = "ISO 2022 IR 100\\ISO 2022 IR 13"
        str = %w(50 6f 6b e9 6d 6f 6e 20 1b 29 49 ce df b9 d3 dd).map(&:hex).pack('C*')
        context = DCM_CharSet::Context.new(charset)
        ctext = context.convert(str)

        pp ctext
        ["Pok", "\xE9", "mon ", "\xCE\xDF\xB9\xD3\xDD"]

        pp ctext.map {|x| [x, x.encoding, x.encode('utf-8')]}
        [["Pok", #<Encoding:US-ASCII>, "Pok"],
         ["\xE9", #<Encoding:ISO-8859-1>, "é"],
         ["mon ", #<Encoding:US-ASCII>, "mon "],
         ["\xCE\xDF\xB9\xD3\xDD", #<Encoding:CP50221 (dummy)>, "ﾎﾟｹﾓﾝ"]]
  32. Let me walk through the code

        # coding: us-ascii
        module DCM_CharSet
          class InvalidCharSet < RuntimeError; end
          class NotAllowDefaultCharSet < InvalidCharSet; end

          class Context
            def initialize(dcm_00080005)
              ary = parse_charset(dcm_00080005)
              if ary.size == 1
                @wo_extensions = true
                @encoding = CharactorSetWOExtensions[ary.first]
              else
                @default_encoding = CharactorSet[ary.first]
                @wo_extensions = false
                @allow_encoding = {}
                ary.each do |x|
                  CharactorSet[x].each do |y|
                    @allow_encoding[y.escape_sequence] = y
                  end
                end
                seg = @allow_encoding.keys.map{Regexp.escape(_1)} +
                  ['[\000-\037]+', '[\200-\237]+', '[\040-\177]+', '[\200-\377]+']
                @reg = Regexp.new(seg.join('|'), 0)
              end
            end

            def parse_charset(charset)
              ary = charset ? charset.strip.split('\\').map {|x| x.strip.upcase} : []
              return ['ISO_IR 6'] if ary.empty?
              if ary.size == 1
                raise(InvalidCharSet.new(ary.first)) unless CharactorSetWOExtensions.include?(ary.first)
                return ary
              end
              ary[0] = 'ISO 2022 IR 6' if ary[0].empty?
              raise(NotAllowDefaultCharSet.new(ary[0])) if MultibyteCharactorSet.include?(ary[0])
              ary.each do |x|
                raise(InvalidCharSet.new(x)) unless CharactorSet.include?(x)
              end

    (The listing is cut off at the edge of the slide.)
  33. DCM_CharSet::Element

    This is Element! It has three attributes: code_element, which is "GL" or "GR"; the escape sequence; and the Ruby encoding.
    Element#encode encodes a String in the way configured on the Element. That is its main job.

        module DCM_CharSet
          class Element
            def initialize(code_element, escape_sequence, encoding)
              @code_element = code_element
              @escape_sequence = escape_sequence.pack('c*')
              @encoding = encoding
            end
            attr_reader :escape_sequence, :encoding, :code_element

            def _encode(str)
              str.dup.force_encoding(@encoding)
            end

            def encode(str)
              s = _encode(str)
              s.instance_variable_set(:@dicom_encoding_element, self)
              s.freeze
              s
            end

            def inspect
              "#<#{self.class.to_s}:#{@escape_sequence.inspect} #{@encoding}>"
            end
          end
  34. Start of conversion / initializing GL and GR

    Initialize the conversion method for GL and for GR. It is a Hash like
    { "GL" => DCM_CharSet::Element, "GR" => DCM_CharSet::Element }.
    graphic is the thing that gets "operated" on.

        def convert(str)
          return convert_wo_extensions(str) if @wo_extensions
          graphic = @default_encoding.map {|e| [e.code_element, e]}.to_h
          result = []
          str.scan(@reg).each do |seg|
            case seg.bytes.first
            when 033
              e = @allow_encoding[seg]
              graphic[e.code_element] = e
            when 0..037, 0200..0237 # CL, CR
              result << seg
            when 040..0177
              result << graphic['GL'].encode(seg)
            else
              result << graphic['GR'].encode(seg)
            end
          end
          result
        end

        def convert_wo_extensions(str)
          [str.force_encoding(@encoding)]
        end
  35. Process segment by segment

    An iterator that processes the string one segment at a time: scan is a perfect fit! The regular expression @reg is explained later.
    Then branch on the kind of segment.

        str.scan(@reg).each do |seg|
          case seg.bytes.first
  36. Process segment by segment

    For an escape sequence, perform the "operation" that changes the graphic settings:
    look up the corresponding Element and remember it under its GL or GR slot.

        when 033
          e = @allow_encoding[seg]
          graphic[e.code_element] = e
  37. Process segment by segment

    For CL/CR, append the segment without converting it.

        when 0..037, 0200..0237 # CL, CR
          result << seg
  38. Process segment by segment

    Encode with the Element currently set for 'GL' in graphic. 'GR' works the same way.

        when 040..0177
          result << graphic['GL'].encode(seg)
        else
          result << graphic['GR'].encode(seg)
  39. Preparing @reg

    (The slide shows Context#initialize and parse_charset again, the same listing as slide 32.)
  40. Preparing @reg

    This is the converter class for DICOM strings. Its argument is the character set setting from (0008,0005).
    parse_charset splits the (0008,0005) setting into an Array of character set names.

        class Context
          def initialize(dcm_00080005)
            ary = parse_charset(dcm_00080005)
  41. Preparing @reg

    The case without extensions is skipped here.

        if ary.size == 1
          @wo_extensions = true
          @encoding = CharactorSetWOExtensions[ary.first]
  42. Preparing @reg

    CharactorSet is a Hash that maps a character set name to its Elements (explained later).
    Collect the Elements usable in this document into a table, @allow_encoding: a Hash that maps an escape sequence to its Element.

        @default_encoding = CharactorSet[ary.first]
        @wo_extensions = false
        @allow_encoding = {}
        ary.each do |x|
          CharactorSet[x].each do |y|
            @allow_encoding[y.escape_sequence] = y
          end
        end
  43. Preparing @reg

    Join the escape sequences used in this document and the regular expressions for CL, CR, GL and GR with | to build @reg.

        seg = @allow_encoding.keys.map{Regexp.escape(_1)} +
          ['[\000-\037]+', '[\200-\237]+', '[\040-\177]+', '[\200-\377]+']
        @reg = Regexp.new(seg.join('|'), 0)
  44. CharactorSet

    A Hash that maps a DICOM character set name to the list of its operations (Elements).

          'ISO_IR 127' => 'iso-8859-6',
          'ISO_IR 126' => 'iso-8859-7',
          'ISO_IR 138' => 'iso-8859-8',
          'ISO_IR 148' => 'windows-1254', # FIXME
          'ISO_IR 203' => 'iso-8859-15',
          'ISO_IR 13' => 'shift-jis', #FIXME
          'ISO_IR 166' => 'tis-620',
          'ISO_IR 192' => 'utf-8',
          'GB18030' => 'GB18030',
          'GBK' => 'gbk'
        }

        AsciiElement = Element.new('GL', [0x1B, 0x28, 0x42], 'ascii')

        CharactorSet = {
          'ISO 2022 IR 6' => [AsciiElement],
          'ISO 2022 IR 100' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x41], 'iso-8859-1')
          ],
          'ISO 2022 IR 101' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x42], 'iso-8859-2')
          ],
          'ISO 2022 IR 109' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x43], 'iso-8859-3')
          ],
          'ISO 2022 IR 110' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x44], 'iso-8859-4')
          ],
          'ISO 2022 IR 144' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x4C], 'iso-8859-5')
  45. CharactorSet

    The three arguments are the invocation target, the escape sequence, and the Ruby encoding.
    The operation for ascii shows up again and again, so it is kept in a constant.

        AsciiElement = Element.new('GL', [0x1B, 0x28, 0x42], 'ascii')
  46. CharactorSet

    ISO 2022 IR 100 consists of two operations: one that invokes ascii into GL, and one that invokes IR 100 into GR.

        'ISO 2022 IR 100' => [
          AsciiElement,
          Element.new('GR', [0x1B, 0x2D, 0x41], 'iso-8859-1')
        ],
  47. A part that took a bit of effort

    IR 87 and IR 159 belong to GL, but I decided to shift them to GR and process them as euc-jp.

            Element.new('GR', [0x1B, 0x2D, 0x46], 'iso-8859-7')
          ],
          'ISO 2022 IR 138' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x48], 'iso-8859-8')
          ],
          'ISO 2022 IR 148' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x4D], 'iso-8859-9')
          ],
          'ISO 2022 IR 203' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x62], 'iso-8859-15')
          ],
          'ISO 2022 IR 13' => [
            Element.new('GL', [0x1B, 0x28, 0x4A], 'cp50221'),
            Element.new('GR', [0x1B, 0x29, 0x49], 'cp50221')
          ],
          'ISO 2022 IR 166' => [
            AsciiElement,
            Element.new('GR', [0x1B, 0x2D, 0x54], 'tis-620')
          ],
          'ISO 2022 IR 87' => [
            Element.new('GL', [0x1B, 0x24, 0x42], 'euc-jp').extend(E_shift_to_GR)
          ],
          'ISO 2022 IR 159' => [
            Element.new('GL', [0x1B, 0x24, 0x28, 0x44], 'euc-jp').extend(E_shift_to_GR)
          ],
          'ISO 2022 IR 149' => [
            Element.new('GR', [0x1B, 0x24, 0x29, 0x43], 'euc-kr')
          ],
  48. iso-2022-jp

    [Figure: G0 to G3 and GL/GR; the available sets are the left half of ASCII, the left half of JIS X 0201, and kanji]
  49. euc-jp

    [Figure: G0 to G3 and GL/GR; the sets are the left half of ASCII, katakana, kanji (JIS X 0208), and kanji (JIS X 0212)]
  50. iso-2022-jp

    [Figure: G0 to G3 and GL/GR; the available sets are the left half of ASCII, the left half of JIS X 0201, and kanji]
  51. Move it to GR and treat it as euc-jp

    [Figure: the iso-2022-jp diagram (G0 to G3, GL/GR; left half of ASCII, left half of JIS X 0201, kanji), with kanji moved to the GR side]
  52. A part that took a bit of effort

    IR 87 and IR 159 belong to GL, but I decided to shift them to GR and process them as euc-jp.

        'ISO 2022 IR 87' => [
          Element.new('GL', [0x1B, 0x24, 0x42], 'euc-jp').extend(E_shift_to_GR)
        ],
        'ISO 2022 IR 159' => [
          Element.new('GL', [0x1B, 0x24, 0x28, 0x44], 'euc-jp').extend(E_shift_to_GR)
        ],
  53. A part that took a bit of effort

    Here is the body of the module: x | 0x80.

        module E_shift_to_GR
          def _encode(str)
            str.each_byte.map {|x| x > 0x20 ? x | 0x80 : x}.pack('c*').force_encoding(@encoding)
          end
        end

        CharactorSetWOExtensions = {
          'ISO_IR 6' => 'ascii',
          'ISO_IR 100' => 'windows-1252',
          'ISO_IR 101' => 'iso-8859-2',
          'ISO_IR 109' => 'iso-8859-3',
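
The effect of `x | 0x80` can be checked on the kanji bytes from the patient name example. This is a standalone sketch of the same bit operation; `shift_to_gr` is a name I made up, not part of DCM_CharSet.

```ruby
# Sets the high bit of every byte above 0x20, moving GL codes into the GR range.
def shift_to_gr(raw)
  raw.each_byte.map { |x| x > 0x20 ? x | 0x80 : x }.pack('C*')
end

# The four bytes between ESC $ B and ESC ( J in the patient name example.
jis = %w(3b 33 45 44).map(&:hex).pack('C*')

euc = shift_to_gr(jis).force_encoding('euc-jp')
puts euc.bytes.map { |b| format('%02x', b) }.join(' ')
# prints "bb b3 c5 c4"
puts euc.encode('utf-8')
# prints "山田"
```

Because euc-jp keeps JIS X 0208 in GR, setting the high bit is enough to let Ruby's existing euc-jp encoding handle the kanji, with no separate table for the 7-bit form.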