Make search ignore diacritics in database (Arabic)


Viewing 1 post (of 1 total)

Author

Posts
June 25, 2023 at 4:16 am #13067

BAH
Guest

Arabic has diacritics (accents) that are combining marks, so they behave like zero-width characters. They are optionally typed after the letter and appear above or below it. Online search engines ignore them, and so does Windows Search. MS Office and LibreOffice also ignore them unless you specify that the search should be diacritic-sensitive. The problem is that most content search applications don’t ignore them.

As an example, imagine that the vowels in the word “FoRuM” are accents. A database could contain “FRM”, “FoRM”, “FRuM”, or “FoRuM”. One should be able to search for “FRM” and find all of those variants. Can this be done with AnyTXT?

The Unicode code points of the main diacritics are: U+064B, U+064C, U+064D, U+064E, U+064F, U+0650, U+0651, U+0652, U+0670. In addition, there’s a decorative elongation character called kashida (U+0640). It should be ignored as well.
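As a rough illustration of the requested behaviour, here is a minimal Python sketch that strips the code points listed above from both the query and the stored text before comparing them. It is not AnyTXT's implementation, which the post does not describe. The names `strip_marks` and `matches` are hypothetical, and the sample word (kaf, ta, ba) is only an assumed test case.

```python
import re

# Code points from the post: U+064B..U+0652 (tanween, short vowels, shadda,
# sukun), U+0670 (superscript alef), and the kashida U+0640.
IGNORED = re.compile("[\u064B-\u0652\u0670\u0640]")


def strip_marks(text: str) -> str:
    """Return text with the listed diacritics and kashida removed."""
    return IGNORED.sub("", text)


def matches(query: str, candidate: str) -> bool:
    """Diacritic-insensitive substring test: strip both sides, then compare."""
    return strip_marks(query) in strip_marks(candidate)


# Hypothetical sample data: one bare word plus three decorated variants.
bare = "\u0643\u062A\u0628"
variants = [
    bare,
    "\u0643\u064E\u062A\u064E\u0628\u064E",  # with fatha on each letter
    "\u0643\u064F\u062A\u064F\u0628",        # with damma on two letters
    "\u0643\u0640\u062A\u0640\u0628",        # stretched with kashida
]

print([matches(bare, v) for v in variants])  # → [True, True, True, True]
```

Stripping both sides means a query typed with diacritics also finds undecorated text. A real search tool would more likely apply the same stripping once at indexing time rather than on every comparison.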