Quoted-Printable bug?

Thu Jun 20 21:09:22 JST 2002

Hi -

Thanks for your ultra-fast response :-)

Yeah, I forgot to mention an essential item: I use the Latin-9
language environment, originally with an escaped multi-byte encoding.

> Well, what charset or characters do you use?  If you use the
> `escaped' character X, put the following lines in the *scratch*
> buffer.
> 
> (let ((c ?X))
>   (list
>    (+ 0 c)
>    (char-charset c)
>    (char-codepoint c)
>    (char-octet c)))
> 
> Then, type C-j in the end of the last line and let us know the
> result.  Note that the major-mode for the *scratch* buffer
> should be `lisp-interaction-mode'.

Aaargh, I don't seem to get the escaped encoding back... If you can't
figure it out based on what I give you below, I'll have to start over,
using an empty .emacs. No matter what I set
enable-multibyte-characters to, I have 8-bit encoding, according to
describe-language-environment. The "echo ... | mimencode" stuff inside
an emacs shell now works for both settings of
enable-multibyte-characters. However, a problem still exists for SEMI:

Original message showing three differently accented, lower-case a's,
enclosed and separated by four non-accented, upper-case A's:

To: Justus.Piater at inrialpes.fr
--text follows this line--
AàAâAáA

With enable-multibyte-characters=t, mime-edit preview shows empty
boxes for the three accented characters.

Message after mime-edit-exit:

To: Justus.Piater at inrialpes.fr
MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--text follows this line--
A=EF=BF=BDA=EF=BF=BDA=EF=BF=BDA

This is wrong: the three accented a's are all different, yet each is
encoded identically as "=EF=BF=BD".
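
For reference, the octets EF BF BD are the UTF-8 encoding of U+FFFD,
the Unicode replacement character, which would suggest the accented
characters were already lost before the quoted-printable step. A minimal
sketch to check this, assuming Python's standard quopri module as a
stand-in decoder (this is not what SEMI does internally; the variable
names are illustrative only):

```python
import quopri

# Body produced with enable-multibyte-characters=t, copied from the message above.
qp_body = b"A=EF=BF=BDA=EF=BF=BDA=EF=BF=BDA"

# Undo the quoted-printable layer to recover the raw octets.
raw = quopri.decodestring(qp_body)

# Interpret the octets using the charset declared in the Content-Type header.
text = raw.decode("utf-8")

# Show the code point of every decoded character.
codepoints = [f"U+{ord(ch):04X}" for ch in text]
print(codepoints)
```

Every second entry in the printed list is U+FFFD, so the three different
accented letters cannot be recovered from this body.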

With enable-multibyte-characters=nil, the result is correct:

To: Justus.Piater at inrialpes.fr
MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--text follows this line--
A=E0A=E2A=E1A

When decoded, this gives garbage. I tried similar things using
ISO-8859-1 encoding instead of UTF-8, and that gives a different sort
of garbage.
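
One possible reading of the garbage: E0, E2 and E1 are the single-octet
Latin-9 (and Latin-1) codes for the three accented a's, but the header
declares charset=UTF-8, and those octets are not valid UTF-8 when
followed by a plain "A". A minimal sketch of that check, assuming
Python's standard quopri module and codec names (nothing here reflects
how SEMI or Emacs decode internally; the variable names are
illustrative):

```python
import quopri

# Body produced with enable-multibyte-characters=nil, copied from the message above.
qp_body = b"A=E0A=E2A=E1A"
raw = quopri.decodestring(qp_body)

# Read as Latin-9 (ISO-8859-15), the octets give back the original text.
as_latin9 = raw.decode("iso8859-15")

# Read as UTF-8, the charset the header declares, decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# For comparison: the quoted-printable form of the same text as real UTF-8.
utf8_qp = quopri.encodestring(as_latin9.encode("utf-8")).strip()

print(as_latin9, valid_utf8, utf8_qp)
```

Under these assumptions the body matches the original text only when the
declared charset is ISO-8859-15 or ISO-8859-1; a body that is really
UTF-8 would carry two escaped octets per accented letter.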

> Hm, the word AbA? seems to have been corrupted in my mailbox.

Yes. The original sequence consists of three differently accented,
lower-case a's, enclosed and separated by non-accented, upper-case
A's.

> The raw message contains utf-8 and qp encoded "A=E0A=E2A=E2A".
> Isn't it correct?  Or just it was caused by the bug?

No, this sequence is correct.

Let me know how you are doing. If need be, I'll start with a clean
.emacs and try to give you specifics of something reproducible :-/
Clearly, I don't yet fully understand how the coding systems work in
emacs.

Justus

-- 
Justus Piater                             Projet PRIMA, GRAVIR-IMAG
http://www-prima.inrialpes.fr/piater/     INRIA Rhône-Alpes