unicode: introduce UTF-8 character database