Alen
The speed difference between read and readline on large files, say 100 MB, is a factor of 100 or so, if not more.
It's a factor of three:
#!/usr/bin/env python3
import timeit


def f1():
    # Read the whole file into one string.
    with open('file.txt') as fin:
        fin.read()


def f2():
    # Iterate over the file line by line, discarding each line.
    with open('file.txt') as fin:
        for _ in fin:
            pass


def main():
    t1 = timeit.Timer('f1()', 'from __main__ import f1')
    t2 = timeit.Timer('f2()', 'from __main__ import f2')
    for t in t1, t2:
        # 3 measurements, each the total time of 5 calls
        print(t.repeat(3, 5))


if __name__ == '__main__':
    main()
[guest@localhost readlines]$ head -3 file.txt
one two three four five six seven
one two three four five six seven
one two three four five six seven
[guest@localhost readlines]$ wc -l file.txt
6165478 file.txt
[guest@localhost readlines]$ stat -c %s file.txt
209626252
[guest@localhost readlines]$
[guest@localhost readlines]$ ./timecmp.py
[1.6103930499994021, 1.6278253110003789, 1.5783134789999167]
[4.757401738000226, 4.788496722000673, 4.74195558700012]
[guest@localhost readlines]$
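For anyone who wants to reproduce this: judging by the head, wc and stat output above, the test file looks like the same 34-byte line repeated 6165478 times (6165478 * 34 = 209626252). A minimal sketch for recreating it, assuming that really is the whole content; the name make_file is just an illustrative helper, not something from the original script:

```python
#!/usr/bin/env python3
# Assumption: every line of file.txt is identical to the three shown by head.
LINE = 'one two three four five six seven\n'  # 33 characters + newline = 34 bytes


def make_file(path, n_lines):
    """Write n_lines copies of LINE to path and return the number of bytes written."""
    batch = 10000
    chunk = LINE * batch  # write in batches to avoid millions of tiny writes
    full, rest = divmod(n_lines, batch)
    with open(path, 'w', encoding='ascii', newline='\n') as fout:
        for _ in range(full):
            fout.write(chunk)
        fout.write(LINE * rest)
    return n_lines * len(LINE)


# Usage (writes about 200 MB to the current directory):
# make_file('file.txt', 6165478)
```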
Add
Added .split('\n'): the .read() variant became slower.
#!/usr/bin/env python3
import timeit


def f1():
    # Read the whole file, then split it into a list of lines.
    with open('file.txt') as fin:
        fin.read().split('\n')


def f2():
    # Iterate over the file line by line, discarding each line.
    with open('file.txt') as fin:
        for _ in fin:
            pass


def main():
    t1 = timeit.Timer('f1()', 'from __main__ import f1')
    t2 = timeit.Timer('f2()', 'from __main__ import f2')
    for t in t1, t2:
        # 3 measurements, each the total time of 5 calls
        print(t.repeat(3, 5))


if __name__ == '__main__':
    main()
[guest@localhost readlines]$ ./timecmp.py
[5.052329235999423, 5.18557034100013, 5.0736290980003105]
[4.788822866001283, 4.763918844999353, 4.774715057999856]
[guest@localhost readlines]$
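Note that the two variants above still don't do quite the same work: f1 builds a full list of lines, while f2 throws each line away. A rough sketch of a more even comparison, where every variant returns a list; the function names and the compare helper are made up for illustration, and I'm not claiming any particular result, run it on your own file:

```python
#!/usr/bin/env python3
import timeit


def read_split(path):
    # Whole file, then split on '\n'. If the file ends with a newline,
    # the resulting list has one extra empty string at the end.
    with open(path) as fin:
        return fin.read().split('\n')


def read_splitlines(path):
    # Whole file, then str.splitlines(): no trailing empty element.
    with open(path) as fin:
        return fin.read().splitlines()


def iterate(path):
    # Line-by-line iteration, but keeping the lines (newlines included),
    # unlike f2 above, which discards them.
    with open(path) as fin:
        return [line for line in fin]


def compare(path, repeat=3, number=5):
    """Return the best total time of `number` calls for each variant."""
    results = {}
    for func in (read_split, read_splitlines, iterate):
        timer = timeit.Timer(lambda: func(path))
        results[func.__name__] = min(timer.repeat(repeat, number))
    return results


# Usage:
# for name, seconds in compare('file.txt').items():
#     print(name, seconds)
```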