Re: Terminology concerning strings



Hi Egmont,

On Wed, 2005-04-06 at 13:14, Koblinger Egmont wrote:
> a) One can allocate a larger buffer than strlen+1. For example,
> x=malloc(10); strcpy(x, "asdf"); in this example length is 4, size is 10.
> Or is size==5 in this case?

I am not sure whether the terminating 0 char should be counted. I would
say not, but if you do count it, then size = strlen + sizeof(<chartype>).
That extra sizeof(<chartype>) should affect the buffer allocation, not the
calculation of the string size. So for single-byte chars I would say
size = length (not length + 1).
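
To make the distinction concrete, here is a rough sketch in C for plain
single-byte chars. The function names (str_length, str_size,
str_alloc_size) are made up for illustration and do not come from any
library; the only real call used is strlen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: characters before the terminating 0. */
size_t str_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Hypothetical helper: bytes occupied by the characters themselves,
 * terminator not counted. */
size_t str_size(const char *s)
{
    return str_length(s) * sizeof(char);
}

/* Hypothetical helper: minimum number of bytes a buffer needs in order
 * to hold s plus its terminating 0. */
size_t str_alloc_size(const char *s)
{
    return str_size(s) + sizeof(char);
}
```

With "asdf" this gives length 4, size 4, and a minimum allocation of 5,
while the malloc(10) buffer from your example simply has a capacity of 10
that is independent of all three.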

> b) Each multibyte character (e.g. any accented letters in UTF-8) counts as 1
> for length, but at least two for size.

According to
http://www.gnu.org/software/libc/manual/html_node/Extended-Char-Intro.html
wchar_t is 4 bytes by default on GNU systems. The internal wide-character
representation always uses a fixed width per character; otherwise
something like x[3] wouldn't work without scanning the string. So if x in
the above example holds wchar_t, you overflow the buffer nicely ;) :
"asdf" plus its terminator is five elements of 4 bytes each, i.e. 20
bytes, not 10.
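
A small sketch of both cases, again in C. The helper names
(wide_alloc_size, wide_fits, utf8_length) are hypothetical, and
utf8_length assumes its input is valid UTF-8; wcslen and strlen are the
standard library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: bytes needed to store a wide string including
 * its terminating L'\0'. */
size_t wide_alloc_size(const wchar_t *s)
{
    assert(s != NULL);
    return (wcslen(s) + 1) * sizeof(wchar_t);
}

/* Hypothetical helper: 1 if a buffer of `capacity` bytes can hold s
 * with its terminator, 0 otherwise. */
int wide_fits(const wchar_t *s, size_t capacity)
{
    return wide_alloc_size(s) <= capacity;
}

/* Hypothetical helper: number of characters (code points) in a UTF-8
 * string, assuming the input is valid UTF-8. Every byte that is not a
 * continuation byte (10xxxxxx) starts a new character, so the whole
 * string has to be scanned. */
size_t utf8_length(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

On a system where sizeof(wchar_t) is 4, wide_alloc_size(L"asdf") is 20, so
wide_fits(L"asdf", 10) is 0. For the UTF-8 bytes "\xc3\xa9" (an accented
e), utf8_length returns 1 while strlen returns 2, which is your point b).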

Leonard.

-- 
mount -t life -o ro /dev/dna /genetic/research




