Technology Answer: SQL TOP 1 analog for lists in Python

Here is an example of my input csv file:

...
0.7,0.5,0.35,14.4,0.521838919218
0.7,0.5,0.35,14.4,0.521893472678
0.7,0.5,0.35,14.4,0.521948026139
0.7,0.5,0.35,14.4,0.522002579599
...

I need to select the first row whose last float is greater than a random number. My current implementation is very slow (the script repeats this lookup many times inside outer loops):

for line in foo:
   if float(line[-1]) > random.random():
      res = line
      break
...

How can I make this better and faster?

EDIT:

I was advised to use bisect for this task, but I don't know how to do it.

From Stack Overflow

A much faster approach is to use bisect, which does a binary search instead of scanning every row (this assumes the float list is sorted). You can do it like this:

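import bisect
import random

# csv fields are strings, so convert the last column to float.
float_list = [float(line[-1]) for line in foo]

# bisect.bisect returns the index of the first value greater than its argument.
index = bisect.bisect(float_list, random.random())
if index < len(float_list):
    result = foo[index]
else:
    result = None  # no row has a last value greater than the random number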

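The float list has to be sorted in ascending order for this to work. Build it once, outside the loops, so that each lookup costs only a binary search.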
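As a rough, self-contained sketch of the bisect approach, the snippet below parses the four sample rows from the question and wraps the lookup in a function. The names SAMPLE, rows, keys and first_row_above are made up for illustration, and the sample text stands in for the real csv file:

```python
import bisect
import csv
import io
import random

# Sample rows copied from the question; a real script would read the csv file.
SAMPLE = """0.7,0.5,0.35,14.4,0.521838919218
0.7,0.5,0.35,14.4,0.521893472678
0.7,0.5,0.35,14.4,0.521948026139
0.7,0.5,0.35,14.4,0.522002579599
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
# Convert the last column once; it is assumed to be in ascending order.
keys = [float(row[-1]) for row in rows]


def first_row_above(threshold):
    """Return the first row whose last value is > threshold, or None."""
    # bisect_right skips values equal to threshold, so the match is strict.
    index = bisect.bisect_right(keys, threshold)
    return rows[index] if index < len(rows) else None


print(first_row_above(random.random()))
```

Because keys is built only once, each call does a binary search over the list instead of a full scan, which is where the speedup over the original loop should come from.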

You might also be able to use the equivalent SQL query directly if you import the CSV file into SQLite. Python's standard library includes the sqlite3 module, which you can use to query the database.
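A minimal sketch of the SQLite route, assuming an in-memory database and a hypothetical table named samples with columns a, b, c, d and p (none of these names come from the question). SQLite has no TOP keyword; ORDER BY with LIMIT 1 plays that role:

```python
import csv
import io
import random
import sqlite3

# Sample rows copied from the question; a real script would read the csv file.
SAMPLE = """0.7,0.5,0.35,14.4,0.521838919218
0.7,0.5,0.35,14.4,0.521893472678
0.7,0.5,0.35,14.4,0.521948026139
0.7,0.5,0.35,14.4,0.522002579599
"""

# Table and column names are hypothetical, chosen only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (a REAL, b REAL, c REAL, d REAL, p REAL)")
# An index on p lets SQLite seek to the first match instead of scanning.
conn.execute("CREATE INDEX idx_samples_p ON samples (p)")

data = [tuple(float(v) for v in row) for row in csv.reader(io.StringIO(SAMPLE))]
conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?, ?)", data)


def top_one_above(threshold):
    """Return the row with the smallest p greater than threshold, or None."""
    cur = conn.execute(
        "SELECT a, b, c, d, p FROM samples WHERE p > ? ORDER BY p LIMIT 1",
        (threshold,),
    )
    return cur.fetchone()


print(top_one_above(random.random()))
```

Whether this beats bisect on an in-memory list depends on the data size and how often the lookup runs, so it is worth timing both on the real file.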

Thursday, May 5, 2011
