UTF-8 Katakana

        import re

        def is_katakana(src):
            # Matches the UTF-8 bytes of U+30A1..U+30F6 and U+30FB..U+30FE only
            r = re.search(br'^(\xe3(\x82[\xa1-\xbf]|\x83[\x80-\xb6]|\x83[\xbb-\xbe]))+$', src.encode('utf8'))
            return r is not None
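
The byte ranges in the pattern correspond to the code points U+30A1 to U+30F6 and U+30FB to U+30FE. A minimal sketch of the same check written against code points instead of encoded bytes, assuming Python 3 and a `str` argument; the name `is_katakana_codepoints` is hypothetical and not part of the snippet:

```python
import re

# Hypothetical helper: the same ranges as the byte pattern, as code points.
_KATAKANA_RE = re.compile(r'[\u30a1-\u30f6\u30fb-\u30fe]+')

def is_katakana_codepoints(src):
    # fullmatch requires the entire string to match, so '' is rejected
    return _KATAKANA_RE.fullmatch(src) is not None
```

One difference worth knowing: the `^...$` form accepts a single trailing newline, because `$` also matches just before a final `\n`, while `fullmatch` does not.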


Comments

shoma (on April 19, 2010):

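My code is:

import unicodedata

def is_katakana(text):
  # NFC composes a base character plus combining marks into one character
  text = unicodedata.normalize('NFC', text)
  for c in text:
    # The '' default avoids a ValueError for characters with no Unicode name
    if not unicodedata.name(c, '').startswith('KATAKANA'):
      return False
  return True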
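
The name-based check does not accept exactly the same set of characters as the byte pattern in the snippet. A small sketch of where the two differ, assuming Python 3; `is_katakana_by_name` is a hypothetical restatement of the approach above, written only for illustration:

```python
import unicodedata

def is_katakana_by_name(text):
    # Hypothetical restatement of the name-based check, for illustration only
    text = unicodedata.normalize('NFC', text)
    return all(unicodedata.name(c, '').startswith('KATAKANA') for c in text)

print(is_katakana_by_name('カタカナ'))  # ordinary full-width katakana
print(is_katakana_by_name('\u30f7'))    # KATAKANA LETTER VA, outside the regex ranges
print(is_katakana_by_name('\uff76'))    # HALFWIDTH KATAKANA LETTER KA
print(is_katakana_by_name(''))          # empty input
```

The name prefix test accepts U+30F7, which the byte pattern rejects, and it returns True for an empty string because the loop never runs. Half-width katakana is rejected by both, since its Unicode names begin with HALFWIDTH and NFC does not convert it to full width.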
