unicode length
- Posted by jmduro Dec 21, 2019
Here is a little function that reports the length of a UTF-8 string in characters (code points) rather than in bytes.
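public function ulength(sequence s)
    integer res
    integer i, lg

    i = 1
    res = 0
    lg = length(s)
    if lg < 2 then
        return lg
    end if
    -- Step through the bytes, jumping by the size each lead byte announces.
    while i <= lg do
        if and_bits(s[i], #80) = #00 then       -- 0xxxxxxx: 1 byte (ASCII)
            i += 1
        elsif and_bits(s[i], #E0) = #C0 then    -- 110xxxxx: 2-byte sequence
            i += 2
        elsif and_bits(s[i], #F0) = #E0 then    -- 1110xxxx: 3-byte sequence
            i += 3
        elsif and_bits(s[i], #F8) = #F0 then    -- 11110xxx: 4-byte sequence
            i += 4
        else                                    -- stray or invalid byte: count it as one
            i += 1
        end if
        res += 1
    end while
    return res
end function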
It also works with plain ASCII strings, so it could replace length() for strings.
Jean-Marc
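
For comparison, here is a minimal sketch of a different way to get the same count, not the function posted above: instead of jumping ahead by the size of each lead byte, it counts every byte that is not a UTF-8 continuation byte (10xxxxxx). The name ulength_alt and the test data are made up for illustration. On well-formed UTF-8 the two approaches should agree; on malformed input they can differ, because this version ignores stray continuation bytes while ulength() counts each one as a character.

```euphoria
-- Hypothetical alternative (not the original ulength): count the bytes
-- that start a character, i.e. every byte that is not 10xxxxxx.
function ulength_alt(sequence s)
    integer res = 0
    for i = 1 to length(s) do
        if and_bits(s[i], #C0) != #80 then
            res += 1
        end if
    end for
    return res
end function

sequence plain = "hello"
-- "héllo" written out as UTF-8 bytes: é is the two-byte sequence C3 A9
sequence accented = {#68, #C3, #A9, #6C, #6C, #6F}

printf(1, "%d %d\n", {length(plain), ulength_alt(plain)})        -- prints 5 5
printf(1, "%d %d\n", {length(accented), ulength_alt(accented)})  -- prints 6 5
```

The second line shows the point of the exercise: length() reports 6 bytes, while the character count is 5.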