$$normalize

Replaces special character forms with their simple-form equivalents (removing marks by default).

  • Allows post-processing of the result of Java's normalizer algorithm

Post Operations

  • ROBUST - Tries to replace as many Latin-like letters as possible with their Latin equivalents, including:
    • Removing combining diacritical marks (works with NFD/NFKD, which leave the characters decomposed)
    • Replacing stroked letters and others that have no decomposition (e.g. "ĐŁłŒ" -> "DLlOE")
    • Replacing white-space characters with a space and trimming the result
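The three ROBUST steps above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's `unicodedata` stands in for Java's normalizer, and `normalize_robust` and the `NON_DECOMPOSABLE` table are hypothetical names with a deliberately tiny mapping, not the function's actual implementation or replacement table.

```python
import unicodedata

# Hypothetical replacement table for letters that have no Unicode
# decomposition; the real table used by $$normalize is not shown in the source.
NON_DECOMPOSABLE = {
    "Đ": "D", "đ": "d", "Ł": "L", "ł": "l",
    "Œ": "OE", "œ": "oe", "Æ": "AE", "æ": "ae",
}


def normalize_robust(text: str, form: str = "NFKD") -> str:
    """Normalize, then apply a ROBUST-style clean-up (illustrative only)."""
    normalized = unicodedata.normalize(form, text)
    # 1. Drop combining marks; this only helps when the form decomposes (NFD/NFKD).
    without_marks = "".join(
        ch for ch in normalized if not unicodedata.combining(ch)
    )
    # 2. Replace letters that normalization leaves untouched.
    replaced = "".join(NON_DECOMPOSABLE.get(ch, ch) for ch in without_marks)
    # 3. Replace each white-space character with a space, then trim the ends.
    return "".join(" " if ch.isspace() else ch for ch in replaced).strip()


print(normalize_robust("ĐŁłŒ"))
```

With a composing form such as NFC, step 1 finds no separate marks to drop, which is consistent with the note that mark removal works with NFD/NFKD.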

Usage

{
  "$$normalize": /* String to normalize */,
  "form": "NFKD" /* or NFKC / NFD / NFC */,
  "post_operation": "ROBUST" /* or NONE */
}

"$$normalize([form],[post_operation]):{input}"

Note: Concrete values in the usage example are the default values.
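To make the relationship between the two usage forms concrete, here is a rough sketch that expands the inline form into the object form, filling in the defaults. The grammar encoded in the regular expression and the helper name `expand_inline` are assumptions made for illustration; the library's real parser is not described in the source.

```python
import re

# Defaults taken from the usage example.
DEFAULTS = {"form": "NFKD", "post_operation": "ROBUST"}

# Assumed grammar: "$$normalize" + optional "(arg,arg)" + ":" + input.
INLINE = re.compile(r"^\$\$normalize(?:\(([^)]*)\))?:(.*)$", re.DOTALL)


def expand_inline(expression: str) -> dict:
    """Expand the inline form into the object form (hypothetical helper)."""
    match = INLINE.match(expression)
    if match is None:
        raise ValueError(f"not a $$normalize expression: {expression!r}")
    args = [arg.strip() for arg in (match.group(1) or "").split(",")]
    result = {"$$normalize": match.group(2), **DEFAULTS}
    # Positional arguments map to form, then post_operation;
    # missing or blank arguments keep their default value.
    for name, value in zip(("form", "post_operation"), args):
        if value:
            result[name] = value
    return result


print(expand_inline("$$normalize(NFD):$"))
```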

Returns

string

Arguments

| Argument | Type | Values | Required / Default Value | Description |
| --- | --- | --- | --- | --- |
| Primary | string | | Yes | String to normalize |
| form | enum | NFKD / NFKC / NFD / NFC | NFKD | Normalizer form, as described in Java's documentation. The default is NFKD (decompose for compatibility). |
| post_operation | enum | ROBUST / NONE | ROBUST | Post operation to run on the result to remove or replace more letters |
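The difference between the four `form` values is easiest to see at the code-point level. The sketch below uses Python's `unicodedata` module as a stand-in for Java's normalizer (both implement the standard Unicode normalization forms); `code_points` is a hypothetical helper introduced only for this illustration.

```python
import unicodedata

# "â" (U+00E2) followed by "…" (U+2026, horizontal ellipsis)
SAMPLE = "\u00e2\u2026"


def code_points(form: str, text: str) -> list:
    """Return the code points of `text` after normalizing it with `form`."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]


for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, code_points(form, SAMPLE))

# NFC  ['U+00E2', 'U+2026']
# NFD  ['U+0061', 'U+0302', 'U+2026']
# NFKC ['U+00E2', 'U+002E', 'U+002E', 'U+002E']
# NFKD ['U+0061', 'U+0302', 'U+002E', 'U+002E', 'U+002E']
```

Only the decomposing forms (NFD, NFKD) split "â" into a base letter plus a combining mark, and only the compatibility forms (NFKC, NFKD) turn the ellipsis into three full stops.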

Examples

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize:$"
Output: "This is a funky String abcABC..."

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFKD):$"
Output: "This is a funky String abcABC..."

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFKD,NONE):$"
Output: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC..."

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFD):$"
Output: "This is a funky String abcABC…"

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFD,NONE):$"
Output: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFKC):$"
Output: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC..."

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFKC,NONE):$"
Output: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC..."

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFC):$"
Output: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"

Input: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"
Definition: "$$normalize(NFC,NONE):$"
Output: "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…"

Input: "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
Definition: "$$normalize:$"
Output: "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"

Input: "ĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıIJijĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňʼnŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſ"
Definition: "$$normalize:$"
Output: "AaAaAaCcCcCcCcDdDdEeEeEeEeEeGgGgGgGgHhHhIiIiIiIiIıIJijJjKkĸLlLlLlLlLlNnNnNnnNnOoOoOoOEoeRrRrRrSsSsSsSsTtTtTtUuUuUuUuUuUuWwYyYZzZzZzs"

Input: "ƀƁƂƃƄƅƆƇƈƉƊƋƌƍƎƏƐƑƒƓƔƕƖƗƘƙƚƛƜƝƞƟƠơƢƣƤƥƦƧƨƩƪƫƬƭƮƯưƱƲƳƴƵƶƷƸƹƺƻƼƽƾƿǀǁǂǃDŽDždžLJLjljNJNjnjǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟǠǡǢǣǤǥǦǧǨǩǪǫǬǭǮǯǰDZDzdzǴǵǶǷǸǹǺǻǼǽǾǿȀȁȂȃȄȅȆȇȈȉȊȋȌȍȎȏȐȑȒȓȔȕȖȗȘșȚțȜȝȞȟȠȡȢȣȤȥȦȧȨȩȪȫȬȭȮȯȰȱȲȳȴȵȶȷȸȹȺȻȼȽȾȿɀɁɂɃɄɅɆɇɈɉɊɋɌɍɎɏ"
Definition: "$$normalize:$"
Output: "bBƂƃbbCCcDDƋƌƍƎƏƐFfGƔhƖIKklƛƜƝƞƟOoƢƣPpRSsƩƪtTtƮUuƱƲYyZzƷƸƹƺƻƼƽƾƿǀǁǂǃDZDzdzLJLjljNJNjnjAaIiOoUuUuUuUuUuǝAaAaAEaeGgGgKkOoOoƷʒjDZDzdzGgHpNnAaAEaeOoAaAaEeEeIiIiOoOoRrRrUuUuSsTtȜȝHhȠdȢȣZzAaEeOoOoOoOoYylntȷȸȹACcLTszɁɂBUAEeJjQqRrYy"

Input: "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ"
Definition: "$$normalize:$"
Output: "AAAAAAAECEEEEIIIIDNOOOOO×OUUUUYÞßaaaaaaaeceeeeiiiiðnooooo÷ouuuuyþy"

Input: "ŒœÆæǢǣǼǽ"
Definition: "$$normalize:$"
Output: "OEoeAEaeAEaeAEae"