Make search ignore diacritics in database (Arabic)


    BAH
    Guest

    Arabic has diacritics (accents) that are zero-width combining characters. They are optionally typed after a letter and appear above or below it. Online search engines ignore them, and so does Windows Search. MS Office and LibreOffice also ignore them unless you specify that the search should be diacritic-sensitive. The problem is that most content search applications don’t ignore them.

    As an example, imagine that the vowels in the word “FoRuM” are accents. A database can contain “FRM”, “FoRM”, “FRuM”, or “FoRuM”. One should be able to search for “FRM” and find all of those variants. Can this be done with Anytxt?

    The Unicode code points of the main zero-width characters are: U+064B, U+064C, U+064D, U+064E, U+064F, U+0650, U+0651, U+0652, U+0670. In addition, there is a decorative character called kashida (U+0640), which is ignored as well.
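To make the request concrete, here is a minimal Python sketch of the matching behaviour described above. It is only an illustration and not how Anytxt works internally. The names `IGNORED`, `strip_marks`, and `matches` are hypothetical. The idea is to remove the listed code points from both the query and the stored text before comparing them.

```python
import re

# Code points listed in the post: U+064B..U+0652 (tanwin, short vowels,
# shadda, sukun), U+0670 (superscript alef), and U+0640 (kashida).
IGNORED = re.compile("[\u064B-\u0652\u0670\u0640]")


def strip_marks(text: str) -> str:
    """Return text with the listed diacritics and kashida removed."""
    return IGNORED.sub("", text)


def matches(query: str, candidate: str) -> bool:
    """Diacritic-insensitive substring match (hypothetical helper)."""
    return strip_marks(query) in strip_marks(candidate)


# The same three letters (kaf, teh, beh) in three spellings.
bare = "\u0643\u062A\u0628"                        # no diacritics
vowelled = "\u0643\u064E\u062A\u064E\u0628\u064E"  # fatha after each letter
stretched = "\u0643\u0640\u062A\u0628"             # kashida after the first letter

for stored in (bare, vowelled, stretched):
    print(matches(bare, stored))
```

Because both sides are stripped, a query typed with diacritics would also find text stored without them. A diacritic-sensitive mode would simply skip the stripping step.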
