A search interface for data from the Politics of Patents case study (part of Copim WP6). It parses data from the archive of RTF files and supplements it with additional data from the European Patent Office API. https://patents.copim.ac.uk

#
# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
#
# Add entries to this file in order to override the statistical model in terms
# of segmentation, readings and part-of-speech tags. Notice that entries do
# not have weights since they are always used when found. This is by design
# in order to maximize ease of use.
#
# Entries are defined using the following CSV format:
# <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
#
# Notice that a single half-width space separates tokens and readings, and
# that the number of tokens and the number of readings must match exactly.
#
# Also notice that the behavior of multiple entries with the same <text> is
# undefined.
#
# Whitespace-only lines are ignored. Comments are not allowed on entry lines.
#
# Custom segmentation for kanji compounds
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞
# Custom segmentation for compound katakana
トートバッグ,トート バッグ,トート バッグ,かずカナ名詞
ショルダーバッグ,ショルダー バッグ,ショルダー バッグ,かずカナ名詞
# Custom reading for former sumo wrestler
朝青龍,朝青龍,アサショウリュウ,カスタム人名
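
The format rules in the header comments can be checked mechanically before the dictionary is loaded. The following is a minimal sketch, assuming only the rules stated in the file itself (four comma-separated fields, tokens and readings split on a single half-width space, equal token and reading counts, no repeated `<text>`). The function name `parse_user_dictionary` and the `SAMPLE` string are hypothetical and are not part of Kuromoji, Lucene, or this repository; the tokenizer's own parser may behave differently in edge cases.

```python
import csv


def parse_user_dictionary(text):
    """Parse user-dictionary text into a list of entry dicts.

    Hypothetical helper for illustration only. It enforces the rules
    described in the file's header comments and nothing more.
    """
    entries = []
    seen = set()
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Whitespace-only lines and comment lines are skipped.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = next(csv.reader([line]))
        if len(fields) != 4:
            raise ValueError(f"line {line_no}: expected 4 fields, got {len(fields)}")
        surface, token_field, reading_field, pos_tag = fields
        # A single half-width space separates tokens and readings.
        tokens = token_field.split(" ")
        readings = reading_field.split(" ")
        if len(tokens) != len(readings):
            raise ValueError(
                f"line {line_no}: {len(tokens)} tokens but {len(readings)} readings"
            )
        # The file calls repeated <text> undefined, so this sketch rejects it.
        if surface in seen:
            raise ValueError(f"line {line_no}: duplicate entry for {surface!r}")
        seen.add(surface)
        entries.append(
            {"text": surface, "tokens": tokens, "readings": readings, "pos": pos_tag}
        )
    return entries


# Two entries copied from the dictionary above, plus a comment and a blank line.
SAMPLE = """\
# Custom segmentation for kanji compounds
関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞

朝青龍,朝青龍,アサショウリュウ,カスタム人名
"""

if __name__ == "__main__":
    for entry in parse_user_dictionary(SAMPLE):
        print(entry["text"], entry["tokens"], entry["readings"], entry["pos"])
```

Rejecting duplicates is a design choice of this sketch: since the file describes repeated `<text>` values as undefined, failing early seems safer than guessing which entry would win.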