UTF-8 Katakana

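        import re

        def is_katakana(src):
            # Match the UTF-8 bytes of U+30A1-U+30F6 and U+30FB-U+30FE; the bytes pattern is needed on Python 3
            r = re.search(br'^(\xe3(\x82[\xa1-\xbf]|\x83[\x80-\xb6]|\x83[\xbb-\xbe]))+$', src.encode('utf8'))
            return r is not None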


Comments

shoma (on April 19, 2010):

My code is:

import unicodedata

def is_katakana(text):
  text = unicodedata.normalize('NFC', text)
  for c in text:
    # name() raises ValueError for unnamed characters unless a default is given
    if not unicodedata.name(c, '').startswith('KATAKANA'):
      return False
  return True
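
The regex version and the unicodedata version do not accept exactly the same inputs, so it may help to run them side by side. The following is only a minimal sketch, assuming Python 3; the names `is_katakana_bytes`, `is_katakana_by_name`, and `KATAKANA_UTF8` are hypothetical and used here just to keep the two approaches apart.

```python
import re
import unicodedata

# Byte-level pattern from the snippet: UTF-8 encodings of
# U+30A1-U+30F6 and U+30FB-U+30FE only.
KATAKANA_UTF8 = re.compile(
    br'^(\xe3(\x82[\xa1-\xbf]|\x83[\x80-\xb6]|\x83[\xbb-\xbe]))+$'
)

def is_katakana_bytes(text):
    """Hypothetical name for the regex-on-UTF-8 version."""
    return KATAKANA_UTF8.search(text.encode('utf8')) is not None

def is_katakana_by_name(text):
    """Hypothetical name for the unicodedata version."""
    text = unicodedata.normalize('NFC', text)
    # The '' default avoids ValueError on characters without a name.
    return all(unicodedata.name(c, '').startswith('KATAKANA') for c in text)

# Print both results for a few sample strings.
for sample in ['カタカナ', 'コーヒー', 'ひらがな', '\u30ff', '']:
    print(repr(sample), is_katakana_bytes(sample), is_katakana_by_name(sample))
```

Both functions agree on the first three samples. They differ on U+30FF (KATAKANA DIGRAPH KOTO), which lies outside the byte ranges in the regex but has a name starting with KATAKANA, and on the empty string, which the regex rejects and the name-based loop accepts.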
