
Files


Changing a text file in place

Q: There is obviously a way to change a small piece of a string on a line and then write it back out to a file, but I am having difficulty; I am new to Python.
A:
It seems to me that what you're trying to do is
something like:
import re

lines = open("somefile.txt").readlines()

for i in range(len(lines)):
	if re.search(myre, lines[i]):	# myre is the pattern you are looking for
		lines[i] = lines[i].replace('foo','bar')
open("otherfile.txt","w").writelines(lines)


or, in Python 1.5.2 or earlier, basically the same thing, after an "import string", with the assignment inside the loop changed to

		lines[i] = string.replace(lines[i],'foo','bar')


(this also works in Python 2.0, for compatibility).
I suspect that the loop you're using instead is more like the following:

for line in lines:
	if re.search(myre, line):
		line = string.replace(line,'foo','bar')


i.e., you're assigning the result of the replacement to the loop variable. That only rebinds the name line; the lines list itself is left intact and unchanged. This seems the likeliest explanation.
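To make the difference concrete, here is a minimal sketch in current Python 3 syntax (not the 1.5.2/2.0 syntax used above). The function names and the sample data are made up for illustration; only the two loop shapes come from the discussion.

```python
import re

def replace_rebinding(lines, pattern, old, new):
    """Buggy variant: rebinding the loop variable leaves `lines` untouched."""
    for line in lines:
        if re.search(pattern, line):
            line = line.replace(old, new)  # only rebinds the local name
    return lines

def replace_by_index(lines, pattern, old, new):
    """Working variant: assign the new string back into the list."""
    for i, line in enumerate(lines):
        if re.search(pattern, line):
            lines[i] = line.replace(old, new)
    return lines

sample = ["foo one\n", "skip me\n", "foo two\n"]
unchanged = replace_rebinding(list(sample), r"^foo", "foo", "bar")
changed = replace_by_index(list(sample), r"^foo", "foo", "bar")
```

Here `unchanged` still equals `sample`, while `changed` has the replacement applied to the two matching lines.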
Note that, if you want to do the replacement "in place" (write out to the same file you've read), the fileinput module is very handy for that purpose:

for line in fileinput.input("thefile.txt", inplace=1):
	if re.search(myre, line):
		line = string.replace(line,'foo','bar')
	print line


The 'print line' is redirected by fileinput to the file being read. The original file is temporarily 'shunted aside' under a fake name; it will be removed at the end of the loop, unless the further argument backup='.bak' [or whatever] is also passed to the .input function. This saves having to keep the whole file in memory at once, which is handy for huge files, but is no help if you want to write the results to a *different* file. In that case, your approach based on readlines/modify the list/writelines is just fine for most files (for truly huge files, you'll no doubt want to operate line by line rather than all-at-once). But you do have to make your modifications to *the list*, with some statement such as 'lines[i] = ...whatever...'.
- Alex Martelli
You see, the last line should actually read:

	print line,    #note the trailing ","


The lines you get from "fileinput.input('thefile.txt', inplace=1)" keep their trailing end-of-line character if it was there in the input file. The print statement adds an end-of-line character of its own, whether or not the string already ends with one, *unless* you end the print statement with a comma, in which case nothing is added.
- Carel Fellinger
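For readers on Python 3, the same in-place edit might look like the sketch below. It assumes the standard fileinput module; the function name `replace_in_place` and its parameters are hypothetical. The print function's `end=""` plays the role of the trailing comma discussed above.

```python
import fileinput
import re

def replace_in_place(path, pattern, old, new, backup=""):
    """Rewrite `path` in place, replacing `old` with `new` on matching lines.

    Pass backup=".bak" (or similar) to keep a copy of the original file.
    """
    with fileinput.input(path, inplace=True, backup=backup) as stream:
        for line in stream:
            if re.search(pattern, line):
                line = line.replace(old, new)
            # Standard output is redirected into the file while the loop runs;
            # end="" avoids adding a second newline to each line.
            print(line, end="")
```

A call such as `replace_in_place("thefile.txt", myre, "foo", "bar")` would then do the whole job, with `myre` standing for your own pattern.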

You want to get all lines from a file and do something to each

Option 1

while 1:
	line = fp.readline()
	if not line:
		break
	# do something with line

- Steve Lamb

Option 2

line = fp.readline()
while line:
	# do something with line
	line = fp.readline()

- Alan Gauld

Option 3

while 1:
	line = fetchline()
	if end of data:
		break
	if empty:
		continue
	if comment:
		continue
	process line

- Fredrik Lundh
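In current Python 3 a file object can be iterated directly, so the three options above usually collapse into a plain for loop. The sketch below follows the shape of Option 3; the generator name `meaningful_lines` and the choice of "#" as comment marker are assumptions made for the example, not part of the original posts.

```python
import io

def meaningful_lines(fp, comment_prefix="#"):
    """Yield the stripped lines of fp, skipping blank lines and comments."""
    for line in fp:                          # the loop ends at end of data
        line = line.strip()
        if not line:                         # empty
            continue
        if line.startswith(comment_prefix):  # comment
            continue
        yield line

# io.StringIO stands in for a real file opened with open(...)
text = io.StringIO("# header\n\nalpha\n  beta  \n# trailing\n")
for line in meaningful_lines(text):
    pass  # process line
```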

You want only the non empty lines of a file

A = []
for Line in H.readlines():
	if len(Line) > 1: # Don't keep the empty lines (containing only '\n')
		A.append(Line)

Doing a copy of a file without blank lines

out = open(r'C:\bar.txt', 'w')
for line in open(r'c:\foo.txt').readlines():
	if len(line)>1:
		out.write(line)
out.close()
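The two recipes above (keeping the non-empty lines, and copying a file without its blank lines) could be written in Python 3 roughly as follows. Both function names are hypothetical. Note one deliberate difference: this sketch treats lines containing only whitespace as blank too, which the len(line) > 1 test does not.

```python
def non_blank_lines(lines):
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line

def copy_without_blank_lines(src_path, dst_path):
    """Copy src_path to dst_path line by line, dropping blank lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(non_blank_lines(src))
```

Because the file is read line by line, the whole content never has to sit in memory at once.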

Handling files at startup

import sys, macfs

if len(sys.argv) > 1:
	# -- Drag and drop: the files are passed as arguments
	filelist = sys.argv[1:]
else:
	# -- Double click on the applet, no arguments:
	# ask for a file with the macfs dialog
	filelist = []
	temp, res = macfs.StandardGetFile()
	if res:
		toto = temp.as_pathname()
		filelist.append(toto)
	else: # user canceled
		sys.exit()
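The macfs module only exists in old Macintosh builds of Python. As a platform-neutral sketch of the same startup logic in Python 3, the dialog can be passed in as a callback. The names `files_at_startup` and `ask_user` are hypothetical; a GUI program could, for example, pass tkinter.filedialog.askopenfilename, which returns an empty string when the user cancels.

```python
import sys

def files_at_startup(argv, ask_user):
    """Return the files given on the command line, else ask the user for one.

    ask_user is any callable returning a path, or an empty value on cancel.
    """
    if len(argv) > 1:
        # Drag and drop (or command line): files arrive as arguments
        return list(argv[1:])
    chosen = ask_user()
    if not chosen:
        # User canceled the dialog
        sys.exit()
    return [chosen]
```

A typical call would be `filelist = files_at_startup(sys.argv, ask_user)`.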

You don't know which platform the file comes from

# -- line ending
# file and doneFolder are assumed to be defined earlier
import os, string
from os.path import join, split

f = open(file, 'rb')
temp = f.read()
f.seek(0)
crNum = string.count(temp, '\r')
lfNum = string.count(temp, '\n')
if lfNum == crNum: # dos: as many '\r' as '\n'
	f.close()
	# move the original to doneFolder, name cut to 27 characters + extension
	os.rename(file, join(doneFolder, split(file)[1][:27]) + '.dos')
	temp = string.replace(temp, '\r\n', '\n')
	f = open(file, 'w')
	f.write(temp)
	f.close()
	f = open(file)
elif crNum: # other files containing '\r'
	f.close()
	os.rename(file, join(doneFolder, split(file)[1][:27]) + '.uix')
	temp = string.replace(temp, '\r', '\n')
	f = open(file, 'w')
	f.write(temp)
	f.close()
	f = open(file)
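A rough Python 3 sketch of the same idea, working on bytes read with open(path, 'rb'). It counts '\r\n' pairs separately from lone '\r' and lone '\n', which is a slightly different test from the one above; the function names and the 'dos'/'mac'/'unix' labels are choices made for this example.

```python
def detect_newline_style(data):
    """Guess the dominant line ending of a bytes object.

    Returns "dos", "mac", "unix", or None when there is no line ending.
    """
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # lone carriage returns
    lf = data.count(b"\n") - crlf   # lone line feeds
    counts = {"dos": crlf, "mac": cr, "unix": lf}
    if not any(counts.values()):
        return None
    return max(counts, key=counts.get)

def normalize_newlines(data):
    """Convert every '\\r\\n' and lone '\\r' to '\\n'."""
    # Replace the pairs first so that their '\r' is not converted twice
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

In Python 3, simply opening a file in text mode with the default newline setting already translates all three conventions to '\n' on reading, so the explicit version is mainly useful when the original style must be reported or preserved.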

Walking a directory tree

This uses some Python 2.0 features, so be careful.

def crawl(root):
	from os.path import join, exists, isdir
	from os import listdir, chdir
	dirs = [root]
	while dirs:
		dir = dirs.pop()
		# This seems a little-known trick in Python: changing to the
		# current directory allows the subsequent isdir-in-a-loop to
		# work on just the file name, instead of making the OS reparse
		# the whole darn path from the root again each time.
		# Depending on the OS, can save gobs of time.
		chdir(dir)
		for f in filter(isdir, listdir(dir)):
			path = join(dir, f)
			if f == "CVS":
				entries = join(path, "Entries")
				if exists(entries):
					process(entries)
				else:
					raise TypeError("CVS dir w/o Entries!! " + path)
			else:
				dirs.append(path)

It does assume the argument given to it is a full path from the root. So change this line:

	dirs = [root]

to

	dirs = [os.path.abspath(root)]

and fiddle the imports however you like to make that pretty. In my 2.0 code, I use

from os.path import abspath as grant

...

	dirs = [grant(root)]

Whatever, it's a one-time cost at function startup.
For a *really* funky time, replace

	while dirs:
		dir = dirs.pop()

with

	for dir in dirs:

and you get a dirt-simple breadth-first search. This relies on (a) that "dirs" grows *inside* the loop; and (b) that Python's iteration protocol doesn't "know" the loop length in advance. The "for" keeps marching over the ever-growing "dirs" until dirs stops growing and the hidden iteration index oversteps its end. Sometimes-nice bonus: at the end, dirs contains a list of all the directories you've visited!
- Tim Peters
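The "for dir in dirs" trick can be tried on its own with the sketch below, written for Python 3. It leaves out the chdir optimization and the CVS handling, skips symbolic links to avoid loops, and sorts the directory entries so that the order is reproducible; the function name `breadth_first_dirs` is hypothetical.

```python
import os

def breadth_first_dirs(root):
    """Return root and all directories below it, in breadth-first order."""
    dirs = [os.path.abspath(root)]
    for d in dirs:  # dirs keeps growing while the loop walks over it
        for name in sorted(os.listdir(d)):
            path = os.path.join(d, name)
            if os.path.isdir(path) and not os.path.islink(path):
                dirs.append(path)
    return dirs
```

For everyday use, current Python also ships os.walk, which handles directory traversal without hand-written bookkeeping.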



Version: 1.65 Updated: 11/11/00