Search Features: International Languages

Unicode Support

  • Unicode support allows indexing and searching of non-English text in any script covered by the Unicode standard.
  • In addition to Unicode support, dtSearch offers extensive alphabet customization options.
  • See the Unicode FAQ for more technical information.
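
As a rough illustration of what Unicode-aware indexing involves, the Python sketch below normalizes text, folds case, and builds a small inverted index over Greek, Russian and accented Latin text. It is a generic example and not dtSearch code; the names tokenize and build_index are hypothetical and do not correspond to any dtSearch API.

```python
import re
import unicodedata
from collections import defaultdict

# For str patterns in Python 3, \w matches letters and digits from any script.
WORD_RE = re.compile(r"\w+")


def tokenize(text):
    """Normalize to NFC, fold case, and split into word tokens."""
    normalized = unicodedata.normalize("NFC", text).casefold()
    return WORD_RE.findall(normalized)


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


docs = {
    1: "Καλημέρα κόσμε",
    2: "Привет, мир",
    3: "Cafe\u0301 au lait",  # 'e' followed by a combining acute accent
}
index = build_index(docs)

# A query typed with the precomposed character still finds document 3,
# because both sides pass through the same normalization.
print(index[tokenize("caf\u00e9")[0]])
```

Text written without spaces between words, such as Chinese or Japanese, would come out of this tokenizer as long unbroken runs, which is why such languages usually need a dedicated word breaker.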

Language-Neutral Search Options

  • The following search options work automatically on text in any language: fuzzy search (fuzziness adjustable from 0 to 10); natural language with automatic relevancy-ranking; variable term weighting; phrase; Boolean (and/or/not); proximity and directed proximity; wildcard; macro; numeric range; and fielded data (alone or combined with full-text searching).
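
To show why an option such as fuzzy search can be language-neutral, here is a minimal Python sketch that matches terms by edit distance over Unicode code points. It makes no claim about how dtSearch implements fuzzy search: the max_edits parameter is a hypothetical stand-in and is not the same thing as the 0 to 10 fuzziness scale mentioned above.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_matches(term, vocabulary, max_edits):
    """Return the vocabulary words within max_edits edits of term."""
    return [w for w in vocabulary if edit_distance(term, w) <= max_edits]


vocabulary = ["москва", "масква", "минск", "m\u00fcnchen", "munchen"]

print(fuzzy_matches("москва", vocabulary, 1))        # exact form plus one misspelling
print(fuzzy_matches("m\u00fcnchen", vocabulary, 1))  # with and without the umlaut
```

The comparison never inspects which alphabet the characters belong to, so the same routine works on Cyrillic, accented Latin, or any other script.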

Language Analyzer API Integration

  • The dtSearch Engine includes a language analyzer API for integrating morphological analyzers and custom or dictionary-based word breakers into its indexing process.
  • The dtSearch Engine offers integration with Basis Technology's Rosette Linguistics Platform for enhanced Chinese, Japanese and Korean text retrieval.
  • The dtSearch Engine also includes an API for substituting a non-English thesaurus for the existing English-language one.
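
For readers unfamiliar with dictionary-based word breaking, the Python sketch below shows one common approach, greedy longest-match segmentation, applied to Chinese text that contains no spaces. It is purely illustrative: the function name, the max_word_len parameter and the tiny dictionary are assumptions made for this example, and none of it reflects the actual dtSearch language analyzer API or the Rosette platform.

```python
def dictionary_word_breaker(text, dictionary, max_word_len=4):
    """Split text into words by greedy longest match against a dictionary.

    Characters that start no dictionary word are emitted as single tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Hypothetical miniature dictionary; a real one would hold many thousands of entries.
dictionary = {"北京", "大学", "北京大学", "学生"}

print(dictionary_word_breaker("北京大学学生", dictionary))
print(dictionary_word_breaker("我爱北京", dictionary))
```

A word breaker of this kind would run before indexing, so that each segmented word, rather than the whole unbroken sentence, becomes a searchable term.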