Discussion:
[Tutor] iteration help
richard kappler
2015-08-20 13:27:02 UTC
Running python 2.7 on Linux

While for and if loops always seem to give me trouble. They seem obvious
but I often don't get the result I expect and I struggle to figure out why.
Appended below is a partial script. Ultimately, this script will read a
log, parse out two times from each line of the log, a time the line was
written to the log (called serverTime in the script) and an action time from
elsewhere in the line, then get the difference between the two. I don't
want every difference, but rather the average per hour, so I have a line
count. The script will output the average time difference for each hour.
I've got most of the pieces working in test scripts, but I'm stymied with
the single output bit.

The idea is that the script takes the hour from the server time of the
first line of the log and sets that as the initial serverHr. That works,
has been tested. Next the script is supposed to iterate through each line
of the log (for line in f1) and then check that there is a time in the line
(try), and if not skip to the next line. That works, has been tested.

As each line is iterated over, my intent was that the variable newServerHr
(read from the current line) is compared to serverHr and if they are the
same, the script will increase the count by one and add the difference to a
cumulative total then go to the next line. If the newServerHr and serverHr
are not the same, then we have entered a new clock hour, and the script
should calculate averages and output those, zero all counts and cumulative
totals, then carry on. The idea being that out of 117,000 ish lines of log
(the test file) that have inputs from 0200 to 0700, I would get 6 lines of
output.

I've got everything working properly in a different script except I get 25
lines of output instead of 6, writing something like 16 different hours
instead of 02 - 07.

In trying to chase down my bug, I wrote the appended script, but it outputs
117,000 ish lines (times 02-07, so that bit is better), not 6. Can someone
tell me what I'm misunderstanding?

#!/usr/bin/env python

import re

f1 = open('ATLA_PS4_red5.log', 'r')
f2 = open('recurseOut.log', 'a')

# read server time of first line to get hour
first_line = f1.readline()
q = re.search(r'\d\d:\d\d:\d\d', first_line)
q2 = q.start()
serverHr = (first_line[q2:q2+2])


for line in f1:
    try:
        s = line
        # read server time
        a = re.search(r'\d\d:\d\d:\d\d', s) # find server time in line
        b = a.start() # find 1st position of srvTime
        newServerHr = (s[b:b+2]) # what hour is it now?
        if newServerHr != serverHr:
            f2.write('hour ' + newServerHr + '\n')
        else:
            serverHr == newServerHr

    except:
        pass
--
All internal models of the world are approximate. ~ Sebastian Thrun
_______________________________________________
Tutor maillist - ***@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Joel Goldstick
2015-08-20 13:48:37 UTC
1. You don't need s, you can use line directly.
2. In your else: code, you want = not == since you want to assign the
new value to the serverHr. That line does nothing now since it is
comparing two values, but making no decision based on the comparison.
3. I'm guessing you are coming from another language. In Python,
people generally use lowercase names with underscores between words.
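
To make point 2 concrete, here is a minimal sketch of the hour-change logic, not the full script. It assumes the assignment belongs in the branch where the hour differs: in the else branch the two values are already equal, so assigning there would change nothing and every later line would still compare against the first hour. The name count_lines_per_hour and the sample lines are made up for illustration, and the sketch counts lines per hour rather than averaging time differences.

```python
import re

TIME_RE = re.compile(r'\d\d:\d\d:\d\d')

def count_lines_per_hour(lines):
    """Return a list of (hour, line_count) pairs, one per run of equal hours."""
    results = []
    current_hour = None
    count = 0
    for line in lines:
        match = TIME_RE.search(line)
        if match is None:
            continue  # no time on this line, skip it
        hour = match.group()[:2]
        if hour != current_hour:
            if current_hour is not None:
                results.append((current_hour, count))  # report the finished hour
            current_hour = hour  # assignment (=), done when the hour changes
            count = 0
        count += 1
    if current_hour is not None:
        results.append((current_hour, count))  # flush the final hour
    return results

# Hypothetical sample data, not taken from the real log.
sample = [
    '02:15:01 first event',
    '02:59:59 second event',
    'line with no time in it',
    '03:00:00 third event',
]
print(count_lines_per_hour(sample))
```

With this shape the output has one entry per clock hour, and the last hour is not lost, because it is flushed after the loop ends.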
--
Joel Goldstick
http://joelgoldstick.com
Alan Gauld
2015-08-20 17:16:47 UTC
Post by richard kappler
While for and if loops always seem to give me trouble.
A picky point, but it is conceptually very important.

'while' and 'for' are loops - because they loop back
and repeat code.

'if' is not a loop. It is a selector. It only executes
its code once but selects one of several options.

There are basically only three(*) concepts in programming
so far as code structure goes:
1) sequence - one instruction after another
2) repetition - code that repeats or loops
3) selection - code that chooses to go down one of many possible paths.

With those three structures you can write any program.
So it is very important that you keep those concepts
separate in your mind. They do very different things.
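
As a small illustration of the three structures side by side (a sketch only; the function name describe and its input are made up for this example):

```python
def describe(numbers):
    labels = []                  # sequence: statements run one after another
    for n in numbers:            # repetition: the body runs once per item
        if n % 2 == 0:           # selection: pick one path, executed once
            labels.append('even')
        else:
            labels.append('odd')
    return labels

result = describe([1, 2, 3])
```

The 'if' runs once each time the loop reaches it; it is the 'for' that does the repeating.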

As to your specific issue I see Joel has given you
some pointers there.

(*)
Some people like to include a fourth: modularity.
The ability to create reusable code blocks such
as functions. But technically it is not needed,
it's just a nice-to-have.
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


Mark Lawrence
2015-08-20 17:54:44 UTC
Post by richard kappler
While for and if loops always seem to give me trouble.
How do you write an if loop?
Post by richard kappler
#!/usr/bin/env python
import re
f1 = open('ATLA_PS4_red5.log', 'r')
f2 = open('recurseOut.log', 'a')
# read server time of first line to get hour
first_line = f1.readline()
q = re.search(r'\d\d:\d\d:\d\d', first_line)
q2 = q.start()
serverHr = (first_line[q2:q2+2])
Are you absolutely certain that this will always be set correctly?
Post by richard kappler
for line in f1:
    try:
        s = line
The line above does nothing effective so remove it.
Post by richard kappler
        # read server time
        a = re.search(r'\d\d:\d\d:\d\d', s) # find server time in line
        b = a.start() # find 1st position of srvTime
        newServerHr = (s[b:b+2]) # what hour is it now?
        if newServerHr != serverHr:
            f2.write('hour ' + newServerHr + '\n')
        else:
            serverHr == newServerHr
Is it possible that some lines don't contain a valid time? In that case
re.search returns None rather than a match object, so a.start() raises an
AttributeError, and your bare except hides it.
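
A quick check of what a failed match looks like (a sketch; the sample line is made up). re.search returns None when nothing matches, and it is str.find that reports a miss as -1. Calling .start() on None raises AttributeError, which the bare except in the posted script would silently swallow.

```python
import re

line = 'a log line with no time in it'  # hypothetical line without a timestamp

match = re.search(r'\d\d:\d\d:\d\d', line)
no_match_is_none = match is None        # re.search found nothing

find_result = line.find('12:00:00')     # str.find signals "not found" with -1

try:
    match.start()
    error_name = None
except AttributeError as exc:           # None has no .start() method
    error_name = type(exc).__name__
```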
Post by richard kappler
    except:
        pass
Remove the try and plain except as it'll mask any problems that you get
in the code. It also prevents CTRL-C or similar from breaking infinite
loops that you've accidentally written.
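
One way to drop the try/except entirely, sketched under the assumption that the only expected problem is a line with no time in it: test the result of re.search and skip the line explicitly. The function name server_hours and the sample lines are hypothetical.

```python
import re

def server_hours(lines):
    """Collect the two-digit hour from every line that contains a time."""
    hours = []
    for line in lines:
        match = re.search(r'\d\d:\d\d:\d\d', line)
        if match is None:
            continue                     # skip lines without a time
        start = match.start()
        hours.append(line[start:start + 2])
    return hours

hours = server_hours(['x 02:00:01 y', 'nothing here', '07:59:59 z'])
```

Any other error (a typo in a variable name, say) now surfaces as a traceback instead of being hidden, and CTRL-C still interrupts the loop.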
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
