Went to see Avatar in 3D tonight. Typical story, but the animation was amazing. The 3D was also better than what has been out there. But darn, for the past 4 hours since leaving the theatre my vision seems to have been stuck in stereoscopic mode or something. I see everything beyond 1-2 meters as blurry, as if watching the movie without glasses. Hope a good night's rest will cure that.
Google's Japanese IME
Google has released an input method editor (IME) for Japanese, similar in style to their Chinese IME. It can be found on their IME page. It looks to be available for Mac OS X, Windows XP SP2, Vista SP1, and Windows 7.
Microsoft Office 2010, typography, and proofing tools
Microsoft has released Office 2010 as a beta that you can use up to and including October 2010 (scheduled to be released in June 2010). You can download it as either 32 or 64 bit, although it seems the 64 bit download is a bit hidden since many buttons for downloading seem to lead to the default 32 bit download. If you follow the link at the Professional Plus site to 'Get It Now' you should be presented with links to both versions. At the moment Microsoft supports Chinese (Simplified), English, French, German, Japanese, Russian, and Spanish. If you are like me you just use the application in English, but then miss some of the proofing tools for, say, Japanese.
You can download language packs from the Microsoft Download Center. If you change the language to, say, Japanese you are presented with two download links at the bottom for the Japanese language pack. This language pack includes user interface changes for Japanese as well as proofing tools, OCR support, and fonts.
Once the pack is downloaded just run it and you can customize what you want to install. Since I am not interested in the UI aspects of the pack, I selected the top part and toggled the selection for everything to not install. Then for the entries <span lang="ja">国際フォント</span> (international fonts) and <span lang="ja">文章校正ツール</span> (proofing tools) I made sure to install everything. <span lang="ja">文章校正ツール</span> includes both <span lang="ja">日本語用校正ツール</span> and <span lang="ja">英語用校正ツール</span> and I guess you can most likely skip <span lang="ja">英語用校正ツール</span> since it is already installed. <span lang="ja">国際フォント</span> includes <span lang="ja">標準フォント</span> (standards font), which I am guessing is related to JIS X standards for font encodings.
Basic Windows 7 has 134 fonts installed. A basic English Office 2010 install increases this to 198 fonts installed. Installing the Japanese language pack proofing tools with fonts brings this to 228 fonts installed.
If you press the expansion arrow at the bottom-right of the Home part of the ribbon (or press CTRL-D) you will get the Font dialog. If you select the Advanced tab you can turn on features such as OpenType ligatures. This will mean that with text such as 'fl' or 'ffi' certain parts of the letters will connect instead of showing white space between the letters. This is the same technique used in printed media such as books.
Update: Michael Hendry was kind enough to point out that I was confusing <span lang="ja">標準</span> (standard/default) with <span lang="ja">基準</span> (standards/JIS/ISO).
Office 2010 Chinese language pack font list
It looks like the Chinese Office 2010 font list is the following (Changzhou SinoType, Founder, Microsoft, Stone):
- FZShuTi
- FZYaoTi
- LiSu
- Microsoft YaHei
- Microsoft YaHei Bold
- STCaiyun
- STFangsong
- STHupo
- STKaiti
- STLiti
- STSong
- STXihei
- STXingkai
- STXinwei
- STZhongsong
- YouYuan
From the language pack make sure to select <span lang="zh-CN">国际字体</span> (international fonts) and <span lang="zh-CN">校对工具</span> (proofing tools). Under <span lang="zh-CN">国际字体</span> we have <span lang="zh-CN">典型字体</span> (typical fonts) and under <span lang="zh-CN">校对工具</span> we have <span lang="zh-CN">简体中文校对工具</span> (Simplified Chinese proofing tools) and <span lang="zh-CN">英语校对工具</span> (English proofing tools).
Office 2010 font list
It seems the Office 2010 font list is the following (English installation):
- Algerian
- Arial Unicode MS
- Baskerville Old Face
- Bauhaus 93
- Bell MT
- Berlin Sans FB
- Bernard MT Condensed
- Bodoni MT Poster Compressed Light
- Book Antiqua
- Bookman Old Style
- Bookshelf Symbol 7
- Britannic Bold
- Broadway
- Brush Script MT Italic
- Californian FB
- Centaur
- Century Gothic
- Century
- Chiller
- Colonna MT
- Cooper Black
- Footlight MT Light
- Freestyle Script
- Garamond
- Haettenschweiler
- Harlow Solid Semi Expanded Italic
- Harrington
- High Tower Text
- Informal Roman
- Jokerman
- Juice ITC
- Kristen ITC
- Kunstler Script
- Latin Wide
- Lucida Bright
- Lucida Calligraphy Italic
- Lucida Fax
- Lucida Handwriting Italic
- Magneto Bold
- Matura MT Script Capitals
- Mistral
- Modern No. 20
- Modern
- Monotype Corsiva Italic
- MS Outlook
- MS Reference Sans Serif
- MS Reference Specialty
- MS Sans Serif
- MS Serif
- MT Extra
- Niagara Engraved
- Niagara Solid
- OCRB Regular
- Old English Text MT
- Onyx
- Parchment
- Playbill
- Poor Richard
- Ravie
- Roman
- Showcard Gothic
- Snap ITC
- Stencil
- Tempus Sans ITC
- Viner Hand ITC
- Vivaldi Italic
- Vladimir Script
- Wingdings 2
- Wingdings 3
Office 2010 Japanese language pack font list
It looks like the Japanese Office 2010 font list is the following (all by RICOH):
- HGGothicE
- HGGothicM Medium
- HGGyoshotai Medium
- HGKyokashotai Medium
- HGMaruGothicMPRO
- HGMinchoB Bold
- HGMinchoE
- HGPGothicE
- HGPGothicM Medium
- HGPGyoshotai Medium
- HGPKyokashotai Medium
- HGPMinchoB Bold
- HGPMinchoE
- HGPSoeiKakugothicUB
- HGPSoeiKakupoptai
- HGPSoeiPresence EB Extra-Bold
- HGSeikaishotaiPRO
- HGSGothicE
- HGSGothicM Medium
- HGSGyoshotai Medium
- HGSKyokashotai Medium
- HGSMinchoB Bold
- HGSMinchoE
- HGSoeiKakugothicUB
- HGSoeiKakupoptai
- HGSoeiPresenceE Extra-Bold
- HGSSoeiKakugothicUB
- HGSSoeiKakupoptai
- HGSSoeiPresence EB Extra-Bold
MathML and SVG in HTML 5 with Firefox
I've been using MathML for a while now for some of my documentation work on 3D graphics. Unfortunately the only way at the moment to include MathML, SVG, or both is to use an XHTML 1.1 modular doctype. In HTML 5 these have become embedded content parts of the specification. So for example, using MathML would be as simple as doing:
<!DOCTYPE HTML>
<html>
  <head>
    <meta charset="utf-8">
    <title>MathML test</title>
  </head>
  <body>
    <math>
      <mrow>
        <mi>y</mi>
        <mo>=</mo>
        <msup>
          <mi>x</mi>
          <mn>2</mn>
        </msup>
      </mrow>
    </math>
  </body>
</html>
Unfortunately the only browser to support either MathML or (parts of) HTML 5 at this moment is Firefox 3.5. However, the MathML or SVG embedded content did not render under 3.5. After reading John Resig's post about a new HTML parsing engine in Mozilla's Gecko engine, I set out to test this engine's support by downloading the latest nightly and setting html5.enable to true in about:config and, lo and behold, it renders as expected.
JSONP with Werkzeug
So I had implemented a simple JSON data server with Werkzeug for a classroom experiment. Unfortunately, in my haste to get everything up and running, I totally forgot that we cannot allow uploads of various custom-made webpages to this server. Those pages therefore live on other hosts, and with jQuery's $.ajax() everything just fails, since each call is then a cross-domain request that the browser's same-origin policy blocks.
So, normally you would do something like the following in order to return JSON data:
return json.dumps(data)
Which would be used with the $.ajax() call in a way like the following:
$.ajax({
    type: "POST",
    url: "http://example.com/json/something",
    data: "parameter=value",
    dataType: "json",
    error: function(XMLHttpRequest, textStatus, errorThrown){},
    success: function(data, msg){}
});
Which is perfectly fine for scripts getting and using the data on the same host/domain. But, as said before, this will fail with warnings similar to: "Access to restricted URI denied" code: "1012" nsresult: "0xdeadc0de (NS_ERROR_DOM_BAD_URI)".
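What "the same host/domain" means for the browser can be sketched roughly as follows. This is a simplified illustration of the same-origin comparison (scheme, host, and port must all match), not the browser's actual implementation; the function names and the classroom.example.org URL are made up for the example.

```python
from urllib.parse import urlsplit

# Assumption: only the default ports for http and https are filled in.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(page_url, request_url):
    """True if a script on page_url may make a plain AJAX call to request_url."""
    return origin(page_url) == origin(request_url)


api = "http://example.com/json/something"
print(same_origin("http://example.com/page.html", api))
print(same_origin("http://classroom.example.org/page.html", api))
```

The first call describes a page served from the JSON server itself, the second a page hosted elsewhere, which is the situation that triggers the error above.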
One way out of this is using JSONP. jQuery has a $.getJSON() function, which loads JSON data using an HTTP GET request. Now, the simplistic way to convert your code would be to change it as such:
$.getJSON("http://example.com/json/something",
function(data){}
);
But this causes another issue. $.getJSON() GETs the JSON data but, instead of using eval() on it, pulls the result into a script tag. The browser then parses the bare object literal as a block statement rather than as data, which causes, on Firefox at least, an invalid label error. In order to fix this you need to set up the JSON data server to properly support a callback argument, so that $.getJSON() can be used the way it is meant to be used:
$.getJSON("http://example.com/json/something?jsoncallback=?",
function(data){}
);
In the code above the additional parameter jsoncallback will, thanks to jQuery, get the question mark replaced by an alphanumeric string (typically in the form of jsonp followed by a timestamp). This value should be used to wrap the resulting JSON data with. This means you would have to change the initial Python code to something like this:
return request.args.get('jsoncallback') + '(' + json.dumps(data) + ')'
Of course this causes problems when you want to reuse the code both for AJAX on the same host/domain and for use from outside. So in order to make both work you can test whether the callback parameter is present and return the appropriate data. I came up with this little snippet for that:
import json

def jsonwrapper(self, request, data):
    # Wrap the JSON in the callback for JSONP requests; return plain JSON otherwise.
    callback = request.args.get('jsoncallback')
    if callback:
        return callback + '(' + json.dumps(data) + ')'
    else:
        return json.dumps(data)
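A standalone sketch of the same idea, using only the standard library so it runs without Werkzeug. The function name jsonp_body, the returned MIME types, and the callback-name check are my own additions for illustration and not part of the original server; the check is there because echoing an arbitrary jsoncallback value back into a script is risky.

```python
import json
import re

# Assumption: only plain, optionally dotted, identifiers are valid callback names.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*")


def jsonp_body(data, callback=None):
    """Return (body, mimetype): plain JSON, or JSONP when a callback is given."""
    payload = json.dumps(data)
    if not callback:
        return payload, "application/json"
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid callback name: %r" % callback)
    # A JSONP response is a script, so it is labelled as JavaScript.
    return "%s(%s)" % (callback, payload), "application/javascript"


print(jsonp_body({"answer": 42}))
print(jsonp_body({"answer": 42}, "jsonp1262822400000"))
```

In a Werkzeug view the pair could then be handed to Response(body, mimetype=mimetype), so that same-domain AJAX callers and JSONP callers each get a matching Content-Type.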
Character encoding in mailcap for mutt and w3m
I use mutt on my FreeBSD system to read my mail. To read HTML mail I simply use a .mailcap file with an entry such as
text/html; w3m -dump %s; nametemplate=%s.html; copiousoutput
This in effect dumps the HTML using w3m to a text file in order to safely display it. The problem that I had is that, because some emails that I receive are from a Japanese translators list, they are in Shift_JIS. When dumping, w3m doesn't properly detect the Shift_JIS encoding and as such the resulting output becomes garbled.
When I looked at the attachments in the mail with mutt's 'v' command I saw that mutt at least knows the encoding of the attachment, so I figured that there should be a way of using this information in my mailcap. There is indeed a way to do so, namely the charset variable. It turns out the mailcap format is specified in a full RFC, RFC 1524 to be exact. Mutt furthermore uses the Content-Type header to pull any specific settings into mailcap variables. So a Content-Type: text/html; charset=shift_jis means that %{charset} in the mailcap file will be expanded to shift_jis. We can use this with w3m's -I flag to set a proper encoding prior to dumping.
text/html; w3m -I %{charset} -dump %s; nametemplate=%s.html; copiousoutput
As such you can be relatively sure that the dumped text will be in the appropriate encoding. Of course it depends on a properly set Content-Type header, but if you cannot depend on that one you need to dig out the recovery tools already.
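To make the %{charset} expansion concrete, here is a rough Python sketch of what happens conceptually: the charset parameter is read from the Content-Type header and handed to w3m's -I flag. This is not how mutt itself is implemented; the function name and the mail.html path are made up, and leaving out -I when no charset is declared is my own assumption about a sensible fallback.

```python
from email.message import Message


def w3m_dump_command(content_type, path):
    """Build a w3m argument list, adding -I only when a charset is declared."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset lowercased, or None if absent.
    charset = msg.get_content_charset()
    command = ["w3m"]
    if charset:
        command += ["-I", charset]
    command += ["-dump", path]
    return command


print(w3m_dump_command("text/html; charset=Shift_JIS", "mail.html"))
print(w3m_dump_command("text/html", "mail.html"))
```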
Why using 'lorem ipsum' is bad for web site testing
The typesetting and webdesign industry has apparently been using the 'lorem ipsum' text for a while to provide a dummy text in order to test print and layout.
Aside from the fact that the text is a cut-off section of Cicero's De finibus bonorum et malorum, it also fails in one huge aspect, namely globalisation.
The text is Latin, and the Latin script is the simplest of all the scripts available to us on the world-wide web. If your website is English-only then, yes, you are pretty much done. However, a lot of us also have to support languages other than English, the easiest of which use Latin-derived scripts.
Latin, and subsequently English, are both written left-to-right. Hebrew and Arabic, to take two prime examples, are written right-to-left (leaving numerals aside for the moment). Of course, this is very important to test as well, since it means a lot of change is needed for your layout.
Especially when testing your design for sites that need to display multiple languages on the same page it is important to test with multilingual text. One of the things that should quickly become clear is whether or not the chosen character encoding is sufficient to represent every script on the page.
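As a small illustration of why Latin-only dummy text hides these problems, the sketch below checks a few sample strings against two encodings and flags the right-to-left ones. The sample strings and function names are my own picks for the example; any representative text per language would do.

```python
import unicodedata

# Hypothetical sample strings, one per language.
SAMPLES = {
    "English": "Hello, world",
    "Japanese": "こんにちは世界",
    "Hebrew": "שלום עולם",
    "Arabic": "مرحبا بالعالم",
}


def can_encode(text, encoding):
    """True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def is_right_to_left(text):
    """True if any character is strongly right-to-left (bidi class R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


for language, text in SAMPLES.items():
    print(language,
          "iso-8859-1:", can_encode(text, "iso-8859-1"),
          "utf-8:", can_encode(text, "utf-8"),
          "rtl:", is_right_to_left(text))
```

A lorem ipsum placeholder would pass every one of these checks, which is exactly why it says so little about a multilingual layout.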