encoded word

Mon, 3 Dec 2001 19:12:03 JST

In article <86pu5wfwzz.fsf ＠ aqua.ocn.ne.jp>,
  Shuhei KOBAYASHI <shuhei ＠ aqua.ocn.ne.jp> writes:

> Searching rfc2047 for "code-switching techniques" turns up:
> 
> |    Some character sets use code-switching techniques to switch between
> |    "ASCII mode" and other modes.  If unencoded text in an 'encoded-word'
> |    contains a sequence which causes the charset interpreter to switch
> |    out of ASCII mode, it MUST contain additional control codes such that
> |    ASCII mode is again selected at the end of the 'encoded-word'.  (This
> |    rule applies separately to each 'encoded-word', including adjacent
> |    'encoded-word's within a single header field.)
> 
> So it is not permitted.

That passage is handy for explanations like this one, but it is not the
essential point.  (Not that I imagine you are unaware of that.)

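Under that rule alone, for instance, it would be fine to end in a state
where ASCII is designated to G2 and G2 is invoked into GL.  That assumes,
of course, that some charset permits such a thing.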

Rather,

|   The
|   'encoded-text' in each 'encoded-word' must be well-formed according
|   to the encoding specified;

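is the essential statement, and it is more accurate to see the problem as
the content being invalid for the specified charset (ISO-2022-JP in this
case).  That is, you take the statement in RFC 1468 that ISO-2022-JP text
must end in ASCII,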
|    Also, the text must end in ASCII.

and use the fact that the byte sequence represented by the encoded-text
violates it as the grounds for saying the encoded-word is not valid.
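
As a rough illustration of that argument, here is a minimal Python sketch
that decodes the encoded-text of a single encoded-word and checks whether
the resulting ISO-2022-JP byte sequence ends in ASCII.  The function names
(ends_in_ascii, is_valid_iso2022jp_word, make_word) are made up for this
example, it only handles the B encoding, and it checks nothing about
ISO-2022-JP beyond the "must end in ASCII" rule, so treat it as a sketch
rather than a real validator.

```python
import base64
import re

# Escape sequences that ISO-2022-JP (RFC 1468) uses to switch character sets.
ESC_ASCII = b"\x1b(B"
ESC_ANY = re.compile(rb"\x1b(?:\(B|\(J|\$@|\$B)")

# One RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([^?]+)\?([^?]*)\?=")


def ends_in_ascii(raw: bytes) -> bool:
    """Return True if the last character set selected in raw is ASCII."""
    selected = ESC_ASCII  # RFC 1468: the text starts in ASCII
    for match in ESC_ANY.finditer(raw):
        selected = match.group(0)
    return selected == ESC_ASCII


def is_valid_iso2022jp_word(word: str) -> bool:
    """Check only the 'must end in ASCII' rule for one encoded-word."""
    m = ENCODED_WORD.fullmatch(word)
    if m is None:
        raise ValueError("not an encoded-word")
    charset, encoding, text = m.groups()
    # Assumption of this sketch: ISO-2022-JP with the B encoding only.
    if charset.lower() != "iso-2022-jp" or encoding.upper() != "B":
        raise ValueError("this sketch only handles ISO-2022-JP with B encoding")
    return ends_in_ascii(base64.b64decode(text))


def make_word(raw: bytes) -> str:
    """Hypothetical helper: wrap raw bytes as a B-encoded encoded-word."""
    return "=?ISO-2022-JP?B?" + base64.b64encode(raw).decode("ascii") + "?="


good = "日本語".encode("iso-2022-jp")  # the codec returns to ASCII at the end
bad = good[:-3]                        # drop the trailing ESC ( B

print(is_valid_iso2022jp_word(make_word(good)))  # True
print(is_valid_iso2022jp_word(make_word(bad)))   # False
```

The check runs on the decoded bytes of each encoded-word by itself, so a
word that leaves JIS X 0208 selected is rejected as malformed ISO-2022-JP
whatever the neighbouring encoded-words contain.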
-- 
[田中 哲][たなか あきら][Tanaka Akira]
"Multiply! It's Operation Wakame-chan ♡" (Little Worker, 桂遊生丸)