Python encode

since 2013-01-14

Related to the python_unicodedata article.

U+FF5E (FULLWIDTH TILDE) round-trips through cp932 but cannot be encoded as iso-2022-jp, whereas U+301C (WAVE DASH) can:

>>> unichr(0xff5e).encode('cp932').decode('cp932')
u'\uff5e'
 
>>> unichr(0x301c).encode('iso-2022-jp').decode('iso-2022-jp')
u'\u301c'
>>> unichr(0xff5e).encode('iso-2022-jp').decode('iso-2022-jp')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'iso2022_jp' codec can't encode character u'\uff5e' in position 0: illegal multibyte sequence
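One common workaround is to map U+FF5E to U+301C before encoding. The sketch below assumes Python 3, where `str` replaces the `unicode` type used in the sessions above. The helper name `to_iso2022jp` and the one-entry mapping table are illustrative assumptions, not part of the original notes.

```python
# Map FULLWIDTH TILDE (U+FF5E) to WAVE DASH (U+301C), which iso-2022-jp can encode.
# Hypothetical minimal table; extend it if other characters need the same treatment.
CP932_TO_JIS = {0xFF5E: 0x301C}

def to_iso2022jp(text):
    """Encode text as iso-2022-jp after replacing known problem characters."""
    return text.translate(CP932_TO_JIS).encode('iso-2022-jp')

encoded = to_iso2022jp('\uff5e')
print(ascii(encoded.decode('iso-2022-jp')))  # '\u301c'
```

Note that this substitution is lossy: decoding the result yields U+301C, not the original U+FF5E.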

With the 'ignore' error handler, the character is silently dropped:

>>> unichr(0xff5e).encode('iso-2022-jp', 'ignore').decode('iso-2022-jp')
u''
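The other built-in error handlers leave a visible trace instead of dropping the character. The following is a minimal comparison, assuming Python 3; `encode_with_handlers` is a hypothetical helper name.

```python
def encode_with_handlers(ch, encoding='iso-2022-jp'):
    """Return {handler: decoded result} for several built-in error handlers."""
    results = {}
    for handler in ('ignore', 'replace', 'backslashreplace', 'xmlcharrefreplace'):
        data = ch.encode(encoding, handler)   # no exception with these handlers
        results[handler] = data.decode(encoding)
    return results

for handler, text in encode_with_handlers('\uff5e').items():
    print(handler, ascii(text))
```

'replace' substitutes a question mark, while 'backslashreplace' and 'xmlcharrefreplace' keep the code point recoverable as ASCII text.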

U+0304 COMBINING MACRON also raises an error:

>>> unichr(0x0304).encode('iso-2022-jp').decode('iso-2022-jp')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'iso2022_jp' codec can't encode character u'\u0304' in position 0: illegal multibyte sequence
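To find such characters before encoding a whole string, each character can be tested individually and reported with its name from `unicodedata`. This is a sketch assuming Python 3; `find_unencodable` is a hypothetical helper name.

```python
import unicodedata

def find_unencodable(text, encoding='iso-2022-jp'):
    """Return (index, character, Unicode name) for each character that fails to encode."""
    problems = []
    for index, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Fall back to 'UNKNOWN' for characters without a Unicode name.
            problems.append((index, ch, unicodedata.name(ch, 'UNKNOWN')))
    return problems

for index, ch, name in find_unencodable('o\u0304 \uff5e'):
    print(index, 'U+%04X' % ord(ch), name)
```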

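Non-BMP characters also raise an error (on a narrow Python 2 build the message reports the high surrogate u'\ud859' rather than the full code point):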

>>> u'\U0002667e'.encode('iso-2022-jp').decode('iso-2022-jp')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'iso2022_jp' codec can't encode character u'\ud859' in position 0: illegal multibyte sequence
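The u'\ud859' in the message is the high surrogate of U+2667E, which suggests the session ran on a narrow Python 2 build that stores non-BMP characters as UTF-16 surrogate pairs. The arithmetic can be sketched as follows, assuming Python 3; `surrogate_pair` is a hypothetical helper name.

```python
def surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError('not a supplementary-plane code point')
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

high, low = surrogate_pair(0x2667E)
print('%04x %04x' % (high, low))  # d859 de7e
```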

python_encode.txt · Last modified: 2013/01/14 13:04 by Takuya Nishimoto