String overflows with scanf
If you use the %s and %[ conversions without limiting their length, then
the number of characters read is limited only by where the next whitespace
character appears (or, for %[, the next character outside the specified
set). This almost certainly means that invalid input could make your
program crash, because input that is too long would overflow whatever
buffer you have provided for it. No matter how long your buffer is, a
user could always supply input that is longer. A well-written program
reports invalid input with a comprehensible error message, not with a
crash.
Fortunately, it is possible to avoid scanf buffer overflow by either
specifying a field width or using the a flag.
When you specify a field width, you need to provide a buffer (using
malloc or a similar function) of type char *. (See Memory allocation,
for more information on malloc.) You need to make sure that the field
width you specify is at least one less than the number of bytes allocated
to your buffer, because scanf stores a terminating null character after
the characters it reads, and the field width does not count that
character.
On the other hand, you do not need to allocate a buffer if you specify
the a flag character; scanf will do it for you. Simply pass scanf a
pointer to an unallocated variable of type char *, and scanf will
allocate however large a buffer the string requires, and return the
result in your argument. This is a GNU-only extension to scanf
functionality.
Here is a code example that shows first how to safely read a string of
fixed maximum length by allocating a buffer and specifying a field width,
then how to safely read a string of any length by using the a flag.
    #include <stdio.h>
    #include <stdlib.h>   /* declares malloc */

    int main (void)
    {
      char *string1, *string2;

      /* 25 bytes is more than the 21 needed for 20 characters
         plus the terminating null character. */
      string1 = (char *) malloc (25);

      puts ("Please enter a string of 20 characters or fewer.");
      scanf ("%20s", string1);
      printf ("\nYou typed the following string:\n%s\n\n", string1);

      puts ("Now enter a string of any length.");
      /* The a flag (a GNU extension) makes scanf allocate the buffer. */
      scanf ("%as", &string2);
      printf ("\nYou typed the following string:\n%s\n", string2);

      return 0;
    }
There are a couple of things to notice about this example program.
First, notice that the second argument passed to the first scanf call is
string1, not &string1. The scanf function requires pointers as the
arguments corresponding to its conversions, but a string variable is
already a pointer (of type char *), so you do not need the extra layer of
indirection here. However, you do need it for the second call to scanf.
We passed it an argument of &string2 rather than string2, because we are
using the a flag, which allocates a buffer big enough to contain the
characters it read, then stores a pointer to that buffer in the variable
whose address we passed.
The second thing to notice is what happens if you type a string of more
than 20 characters at the first prompt. The first scanf call will only
read the first 20 characters, then the second scanf call will gobble up
all the remaining characters without even waiting for a response to the
second prompt. This is because scanf does not read a line at a time, the
way the getline function does. Instead, it immediately attempts to match
its template string to whatever characters are in the stdin stream. The
second scanf call matches all remaining characters from the overly long
string, stopping at the first whitespace character. Thus, if you type
12345678901234567890xxxxx in response to the first prompt, the program
will immediately print the following text without pausing:
    You typed the following string:
    12345678901234567890

    Now enter a string of any length.

    You typed the following string:
    xxxxx
(See sscanf, for a better example of how to parse input from the user.)