UTF-8 and resistance to change
Wouldn’t you know… While we (European people at least) are trying to get more and more open-source products to use UTF-8 by default, so that we can (finally) share the same encoding with people who use character sets so different from ours (mostly from the Middle to the Far East), resistance to change seems to be a significant problem in Japan as well (as everywhere else).
On the php-i18n mailing list (see post here), Dietrich Bollmann took the trouble of posting a list of comments he gets from time to time from web developers and web authors about using UTF-8 in websites and e-mails. Let’s give him some space for a large quote and then I’ll give you a few personal views about all this (note that “cellars” actually means “cellulars/mobile phones”):
Here a list with some of the answers I got (as I got them):

- Most cellars don't work with UTF-8.

...this is the one most important answer I got as lots of people in Japan use the time they spend in the subway to read and write their email with their cellar. Only some cellars work with UTF-8, most don't. And I often was told by friends that my email program (I normally use UTF-8) "doesn't work correctly" : ) More often I just didn't get any answer at all...

Based on this experience it is just natural that people don't switch to UTF-8. And even if more and more of the newer programs also work with UTF-8, probably it will still take a while until this "tradition" in the Japanese software developer community will change.

continuing with the answers:

1. I don't like UTF-8. It is too new, everybody is used to JIS and when using UTF-8 there are always lots of problems. With JIS things work well out of the box.
2. The file size becomes bigger as there are so many different Characters which have to be encoded and Japanese characters are encoded with three bytes in UTF-8.
3. There are too many different versions of UTF-8 which create problems. There is only one version of JIS which and therefor no version problems arise.
4. I only use UTF-8 if absolutely necessary, for example when Chinese and Japanese texts are on the same page.
5. when using UTF-8 the characters do not look nice.
6. There is no need for UTF-8: Japanese and Ascii is all we need in normal circumstances, why bother about other languages?
7. Similar Characters are grouped together and differences between similar Japanese Characters get lost.
8. Doesn't look good.
9. When only Chinese or only Japanese it looks good, when mixing languages the Characters the page gets ugly.
So, what do you think of that? I could not easily have imagined that reaction. In fact, it made me think of the North American reaction to charsets other than ASCII. It’s a strong resistance to change, expressed in many different ways (particularly strongly in point 6).
This list is particularly useful in understanding the problems end-users are facing and how things could be improved at the technical level for them to get a better experience with UTF-8.
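Point 2 of the list (file size) is the easiest one to check for yourself. Here is a minimal sketch, assuming Python 3 and its built-in codecs; the sample strings and the `encoded_sizes` helper are my own choices for illustration, not something taken from the mailing list post.

```python
# Compare how many bytes the same text needs in UTF-8 and in two
# common Japanese encodings (all three codecs ship with Python 3).

ENCODINGS = ("utf-8", "shift_jis", "euc_jp")


def encoded_sizes(text):
    """Return a dict mapping each encoding name to the encoded byte count."""
    return {name: len(text.encode(name)) for name in ENCODINGS}


japanese = "日本語のメール"   # 7 characters, hypothetical sample text
ascii_only = "e-mail"          # 6 characters, plain ASCII

for sample in (japanese, ascii_only):
    print(sample, encoded_sizes(sample))
```

With this sample, the Japanese text takes three bytes per character in UTF-8 against two in Shift_JIS or EUC-JP, while the ASCII text is the same size everywhere. So the complaint is real for pure Japanese text, although how much it matters in practice depends on how much of a page or e-mail is markup and ASCII.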
There is a difference between coding systems and charsets (and I recommend following the link in the side-menu to understand that a bit better), but in short, I think the charset might influence the fonts used: every computer has a long list of charsets available and, for each of these charsets, a set of fonts that some people actually drew (some of them usable with several charsets, some of them not). This means that if the people who drew the glyphs for “ttf-mikachan” (for UTF-8) did not do as good a job as those who drew the “xfonts-jisx0213” fonts (JIS only?), then people looking at UTF-8 text will think that it looks worse because it is UTF-8, although they could just change fonts and it would look nice too. Changing default fonts, however, is not the easiest thing to do on a computer (and it might just be impossible on mobile devices).
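To illustrate why I suspect fonts rather than the encoding itself, here is a small sketch, again assuming Python 3 with its built-in codecs and a sample word I picked myself. It shows that the same characters are stored as different bytes in Shift_JIS and UTF-8, but decode back to exactly the same text, so what ends up on screen is decided by the font that draws those characters.

```python
# The encoding only changes how characters are stored as bytes;
# the characters themselves are identical once decoded.

text = "漢字"  # hypothetical sample word ("kanji")

sjis_bytes = text.encode("shift_jis")
utf8_bytes = text.encode("utf-8")

# Different byte sequences on disk or on the wire...
print(sjis_bytes.hex(), utf8_bytes.hex())

# ...but the same characters after decoding.
from_sjis = sjis_bytes.decode("shift_jis")
from_utf8 = utf8_bytes.decode("utf-8")
print(from_sjis == from_utf8)
```

This does not cover point 7 of the list, which is about characters that the character set itself treats as one and the same; that one cannot be fixed by picking a different font alone.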
So, if you, out there, are looking into implementing UTF-8 on your website or in your e-mail client to communicate with Eastern people, please try to take into account the list of reasons why they don’t like it, and hopefully we’ll all end up with better applications and a full adoption of UTF-8.