07:37 pm, 18 Mar 06
kanji database
kanji.db.bz2: a 279 KB SQLite3 dump of Jim Breen's KANJIDIC2 (a Japanese kanji dictionary). It includes not only each kanji's readings and meanings, but also the grade level at which it is learned, as well as a frequency rank based on how often it appears in newspapers. Perfect for studying!
(Man, parsing XML is like pulling teeth.)
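For anyone curious what that conversion might involve, here is a rough Python sketch. It is not the script used to build kanji.db. The element names (`character`, `literal`, `misc/grade`, `misc/freq`, `reading` with `r_type`, `meaning` with `m_lang`) are assumed from the published KANJIDIC2 layout, the table columns are inferred from the query output in this post, and `SAMPLE`, `rows_from_xml` and `build_db` are made-up names for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Tiny hand-written sample in the assumed KANJIDIC2 layout (not real file contents).
SAMPLE = """<kanjidic2>
  <character>
    <literal>日</literal>
    <misc><grade>1</grade><freq>1</freq></misc>
    <reading_meaning><rmgroup>
      <reading r_type="ja_on">ニチ</reading>
      <reading r_type="ja_on">ジツ</reading>
      <reading r_type="ja_kun">ひ</reading>
      <meaning>day</meaning>
      <meaning>sun</meaning>
      <meaning m_lang="fr">jour</meaning>
    </rmgroup></reading_meaning>
  </character>
</kanjidic2>"""


def int_or_none(text):
    """Convert element text to int, keeping missing values as NULL."""
    return int(text) if text else None


def rows_from_xml(xml_text):
    """Yield one (literal, grade, freq, on, kun, meaning) tuple per character."""
    root = ET.fromstring(xml_text)
    for ch in root.iter("character"):
        on = [r.text for r in ch.iter("reading") if r.get("r_type") == "ja_on"]
        kun = [r.text for r in ch.iter("reading") if r.get("r_type") == "ja_kun"]
        # Assumption: meanings without an m_lang attribute are the English ones.
        meanings = [m.text for m in ch.iter("meaning") if m.get("m_lang") is None]
        yield (
            ch.findtext("literal"),
            int_or_none(ch.findtext("misc/grade")),
            int_or_none(ch.findtext("misc/freq")),
            "; ".join(on),
            "; ".join(kun),
            "; ".join(meanings),
        )


def build_db(xml_text, path=":memory:"):
    """Create a kanji table (schema guessed from the post) and fill it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE kanji (id INTEGER PRIMARY KEY, literal TEXT, grade INTEGER, "
        "freq INTEGER, on_reading TEXT, kun_reading TEXT, meaning TEXT)"
    )
    conn.executemany(
        "INSERT INTO kanji (literal, grade, freq, on_reading, kun_reading, meaning) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        rows_from_xml(xml_text),
    )
    conn.commit()
    return conn
```

Calling `build_db(SAMPLE)` gives an in-memory database that accepts the same kind of `select` statements shown in this post. For the full dictionary file, something like `ET.iterparse` would probably be kinder to memory than loading everything at once.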
Unfortunately,
sqlite> select count(*) from kanji where grade < 10;
2232
So I'll basically never be more literate than a middle schooler.
Anyway, here's a peek:
sqlite> select * from kanji where grade is not null order by grade asc, freq asc limit 10;
id          literal     grade       freq        on_reading      kun_reading      meaning
----------  ----------  ----------  ----------  --------------  ---------------  ---------------
2160        日          1           1           ニチ; ジツ        ひ; -び; -か        day; sun; Japan
76          一          1           2           イチ; イツ        ひと-; ひと          one
1455        人          1           5           ジン; ニン        ひと; -り; -       person
2177        年          1           6           ネン              とし               year
1763        大          1           7           ダイ; タイ        おお-; おお          large; big
1251        十          1           8           ジュウ; ジ        とお; と            ten
2151        二          1           9           ニ; ジ            ふた; ふた.          two
2598        本          1           10          ホン              もと               book; present;
1856        中          1           11          チュウ             なか; うち;          in; inside; mid
1270        出          1           13          シュツ; ス        で.る; -で;         exit; leave
Looks like SQLite's text output doesn't understand double-width characters...
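One possible workaround is to pad the columns by display width rather than by character count. The sketch below is only an illustration in Python, not something SQLite does: it uses the standard `unicodedata.east_asian_width` function and assumes that wide and fullwidth characters take exactly two terminal cells and everything else takes one (combining marks are ignored). `display_width`, `pad` and `format_row` are hypothetical helper names.

```python
import unicodedata


def display_width(text):
    """Count terminal cells, treating Wide/Fullwidth characters as two cells."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def pad(text, width):
    """Right-pad text with spaces until it occupies `width` cells."""
    return text + " " * max(0, width - display_width(text))


def format_row(values, widths):
    """Join the padded columns with two spaces, like a column-mode listing."""
    return "  ".join(pad(str(v), w) for v, w in zip(values, widths)).rstrip()
```

For example, `format_row(["日", 1, "ニチ; ジツ"], [10, 10, 14])` pads the kanji column with eight spaces instead of nine. This only lines up when the font really renders kanji and kana at twice the width of ASCII characters.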
Anyway, this is really useful! I'd like to write an AJAX kanji learning application using that database, but... I can't. Maybe I should try to learn myself some scripting language.
I don't think it has grade levels for Chinese, though.
Hmm. I guess my monospace font of choice doesn't feature those characters, so my browser substituted the first font in my proportional font choice list. Consequently, your pseudo-table is all out of whack. This font substitution business doesn't really work for monospace!