There are certain C standard library functions whose potentially insecure behaviour is well known. These functions, such as
sprintf() or
strcpy() are usually avoided these days. But today, while reviewing some security-related code, that there is one function where you wouldn't really expect any weird oder unusual behaviour. The function that I'm talking about is
strtoul().
Most people will automatically know and/or associate that it's the unsigned long variant of the
strtol() function that is used to convert strings to integers. Some may also think about using this function to verify that the result of the string to integer conversion can never be negative, and thus, strtoul() completely ignores negative numbers. But this is wrong: strtoul() does check for a + or - sign, and if it finds a "-" before the actual number to be converted, it negates the the scanned value prior to returning it. That's what the manpage and the official SuSv2/v3 documentation say, and it sounds innocuous. But it definitely isn't, and this little test program will show this:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <errno.h>
int main(int argc, char * argv[]) {
char * endptr;
unsigned long int value;
value = strtoul(argv[1], &endptr, 10);
if (value == ULONG_MAX) {
}
return 0;
}
Let's see how this program behaves when feeding it some numbers:
To sum this up:
- Positive numbers are scanned correctly, except for ULONG_MAX.
- Negative numbers are scanned, and their (negative) value is casted to an unsigned type. In the case of -1 that becomes 4294967295, this also (incorrectly) signals an error during conversion, in the case of -2 it doesn't.
- Numbers that are too large are signaled as such.
In my opinion, this behaviour is clearly unacceptable, as it is completely counter-intuitive, it can lead to incorrect error signaling, and is a hazard for anybody who wants to stick to unsigned integers only. It's nothing but a thinly disguised
(unsigned long)strtol(...). So don't use it. Use strtol() and explicitly manually exclude negative values instead, and explicitly cast it to an unsigned value afterwards. That limits the maximum value that you can scan, but hey, the world isn't perfect.