Functions to Read and Write Text or Binary Data - Page 10
May 24, 2001
The most common text file-reading function,
readline, was presented above. It reads and returns
a single line from a file object, including any newline character
on the end of the line. If there is nothing more to be read from
the file, readline returns an empty string. This
makes it easy to, for example, count the number of lines in a
file:
fileobject = open("myfile", 'r')
count = 0
while fileobject.readline() != "":
    count = count + 1
print(count)
fileobject.close()
The readline method reads and returns
a single line from a file object, up to and including the next
newline.
For this particular problem, an even shorter way of counting all
of the lines is to use the built-in readlines method,
which reads all of the lines in a file and returns them
as a list of strings, one string per line (with trailing newlines
still included):
fileobject = open("myfile", 'r')
print(len(fileobject.readlines()))
fileobject.close()
readlines can be used to read in all
lines from a file object, and return them as a list of
strings.
An optional argument to readline or
readlines can limit the amount of data they read in
at any one time.
Of course, if you happen to be counting all of the lines in a
particularly huge file, readlines might cause your computer to run out
of memory, since it reads the entire file into memory at
once. It is also possible to overflow memory with
readline, if you have the misfortune to try to read
a line from a huge file that contains no newline characters,
although this is highly unlikely. To handle such circumstances,
both readline and readlines can take an
optional argument limiting the amount of data they read at any
one time. See the Python reference documentation for details.
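As a minimal sketch of that optional argument, the line count shown earlier could be done in bounded chunks instead of one large read. The function name count_lines_chunked and the default hint of 65536 are illustrative choices, not taken from the text, and the sketch assumes a current Python 3 interpreter, where readlines accepts a size hint and stops collecting lines once roughly that much data has been read.

```python
def count_lines_chunked(path, hint=65536):
    """Count lines without holding the whole file in memory.

    `hint` is an approximate cap on how much data each readlines()
    call gathers; the default value is an arbitrary assumption.
    """
    count = 0
    fileobject = open(path, 'r')
    try:
        while True:
            # Returns a list of complete lines, or [] at end of file.
            lines = fileobject.readlines(hint)
            if not lines:
                break
            count = count + len(lines)
    finally:
        fileobject.close()
    return count
```

Calling count_lines_chunked("myfile") should give the same count as the readlines version, while only one chunk of lines is in memory at a time.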
Use the read method to read a byte
sequence from a file — either the entire file, or a fixed
number of bytes.
Use binary mode (i.e., "rb" or
"wb") when working with binary
files.
On some occasions, you might wish to read all of the data in a
file into a single string, especially if the data is not actually
a string, and you simply want to get it all into memory so you
can treat it as a byte sequence. Or you might wish to read data
from a file as strings of a fixed size. For example, you might be
reading data without explicit newlines, where each line is
assumed to be a sequence of characters of a fixed size. To do
this, use the read method. Without any argument, it will
read all of the rest of a file and return that data as a string.
With a single integer argument, it will read that number of
bytes, or fewer if there is not enough data left in the file to
satisfy the request, and return them as a string. A
possible problem arises on Windows and Macintosh machines, where
text mode translations occur if you open the file in text mode,
which is the default for open. On Macintosh, any "\r"
will be converted to "\n", while on Windows
"\r\n" pairs will be converted to "\n"
and "\32" will be taken as an EOF character. To eliminate this
issue, pass the 'b' (binary) flag, as in open("file",
'rb') or open("file", 'wb'), to open the file
in binary mode. Binary mode also works
transparently on UNIX platforms.
# Open a file for reading.
input = open("myfile", 'rb')
# Read the first four bytes as a header string.
header = input.read(4)
# Read the rest of the file as a single piece of data.
data = input.read()
input.close()
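The fixed-size case mentioned above, where a file holds records of a known length and no newlines, can be sketched with repeated calls to read. The helper name read_records is hypothetical, and the sketch assumes Python 3, where a file opened with 'rb' returns bytes objects rather than ordinary strings.

```python
def read_records(path, size):
    """Return a list of fixed-size chunks read from a binary file.

    The final chunk may be shorter than `size` if the file length
    is not an exact multiple of it.
    """
    records = []
    infile = open(path, 'rb')
    try:
        while True:
            chunk = infile.read(size)
            if not chunk:
                # An empty result means the end of the file was reached.
                break
            records.append(chunk)
    finally:
        infile.close()
    return records
```

The loop relies on the behavior described above: read returns fewer bytes than requested only when the file runs out, and an empty result once nothing is left.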
The converses of the readline and
readlines methods are the write and
writelines methods. Note that there is no writeline
method. write writes a single string, which could
span multiple lines if newline characters are embedded within the
string, for example something like:
myfile.write("Hello")
write does not write out a newline after it writes
its argument; if you want a newline in the output, you must put
it there yourself. If you open a file in text mode (using 'w'), any
'\n' characters will be mapped back to the platform-specific line
endings (i.e., '\r\n' on Windows or '\r' on Macintosh
platforms). Again, opening the file in binary mode (i.e., 'wb')
will avoid this.
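A small sketch can make both points visible: write adds no newline of its own, and reading the result back in binary mode shows exactly which bytes text mode put on disk. The helper names write_pieces and raw_bytes and the file name demo.txt are made up for illustration, and the code assumes Python 3.

```python
import os
import tempfile


def write_pieces(path, pieces):
    """Write each string with write(); no newline is added for us."""
    out = open(path, 'w')
    try:
        for piece in pieces:
            out.write(piece)
    finally:
        out.close()


def raw_bytes(path):
    """Read the file back in binary mode, with no translation applied."""
    infile = open(path, 'rb')
    try:
        return infile.read()
    finally:
        infile.close()


path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# No newlines supplied, so the two strings run together in the file.
write_pieces(path, ["Hello", "World"])
print(raw_bytes(path))

# Newlines supplied by hand; the bytes on disk depend on the platform.
write_pieces(path, ["Hello\n", "World\n"])
print(raw_bytes(path))
```

On a UNIX system the second result should contain plain '\n' characters, while on Windows each one would be expected to appear as '\r\n'.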
writelines is something of a misnomer; it doesn't
necessarily write lines — it simply takes a list of strings
as an argument, and writes them, one after the other, to the
given file object, without writing newlines. If the strings in
the list end with newlines, they will be written as lines,
otherwise they will be effectively concatenated together in the
file. However, writelines is a precise inverse of
readlines, in that it can be used on the list
returned by readlines to write a file identical to
the file readlines read from. For example, assuming
myfile.txt exists and is a text file, this bit of
code will create an exact copy of myfile.txt called
myfile2.txt:
input = open("myfile.txt", 'r')
lines = input.readlines()
input.close()
output = open("myfile2.txt", 'w')
output.writelines(lines)
output.close()
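The copy above works because readlines leaves the trailing newlines in place. When the list comes from somewhere else, the strings may lack them, and writelines will run them together. The following sketch, with the hypothetical helpers write_list and with_newlines, shows one way to add the missing newlines first. It assumes Python 3.

```python
def write_list(path, strings):
    """Write a list of strings with writelines(); nothing is added between them."""
    out = open(path, 'w')
    try:
        out.writelines(strings)
    finally:
        out.close()


def with_newlines(strings):
    """Return a copy of `strings` in which every item ends with a newline."""
    result = []
    for s in strings:
        if not s.endswith("\n"):
            s = s + "\n"
        result.append(s)
    return result
```

For instance, write_list("out.txt", ["a", "b"]) would leave the two strings concatenated in the file, whereas write_list("out.txt", with_newlines(["a", "b"])) would write them as two separate lines. The file name out.txt is only a placeholder.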
The Quick Python Book