24 """
25 Class for calls to external programs.
26 """
27
28 import tempfile, os, time, subprocess
29
30 import Biskit.tools as t
31 import Biskit.settings as s
32 from Biskit.LogFile import StdLog
33 from Biskit.Errors import BiskitError
34 from Biskit.ExeConfigCache import ExeConfigCache
35
36 class RunError( BiskitError ):
37 pass
38
39 class TemplateError( BiskitError ):
40 pass
41
42 class Executor:
43 """
44 All calls of external programs should be done via this class or subclasses.
45
46 Executor gets the necessary information about a program (binary,
47 environment variables, etc) from ExeConfigCache, creates an input file
48 or pipe from a template (if available) or an existing file, wraps the
49 program call into ssh and nice (if necessary), spawns an external process
50 via subprocess.Popen, communicates the input file or string, waits for
51 completion and collects the output file or string, and cleans up
52 temporary files.
53
54 There are two ways of using Executor
55 ====================================
56
57 1. (recommended) Create a subclass of Executor for a certain program
58 call. Methods to override would be:
59
60 - __init__ ... to set your own default values
61 (call parent __init__!)
62 - prepare ... called BEFORE program execution
63 - cleanup ... called AFTER program execution
64 (call parent cleanup!)
65 - finish ... called AFTER successful program execution
66 - isFailed ... to detect the success status after program execution
67 - fail ... called if execution fails
68
69 Additionally, you should provide a simple program configuration file
70 in biskit/external/defaults/. See L{Biskit.ExeConfig} for
71 details and examples!
72
73 2. Use Executor directly.
74 An example is given in the __main__ section of this module.
75 You first have to create an Executor instance with all the
76 parameters, then call its run() method and collect the result.
77
78 In the most simple cases this can be combined into one line:
79
80 >>> out, error, returncode = Executor('ls', strict=0).run()
81
82 strict=0 means that ExeConfig does not insist on an existing exe_ls.dat
83 file and instead looks for a program called 'ls' in the search path.
84
85
86 Templates
87 =========
88 Templates are files or strings that contain place holders,
89 for example::
90
91 file_in=%(f_in)s
92 file_out=%(f_out)s
93
94 At run time, Executor will create an input file or pipe from the
95 template by replacing all place holders with values from its own
96 fields. Let's assume the above example is put into a file 'in.template'.
97
98 >>> x = Executor( 'ls', template='in.template', f_in='in.dat')
99
100 ... will then pass the following input to the ls program::
101
102 file_in=in.dat
103 file_out=/tmp/tmp1HYOvO
104
105 However, the following input template will raise an error::
106
107 file_in=%(f_in)s
108 seed=%(seed)i
109
110 ...because Executor doesn't have a 'seed' field. You could provide
111 one by overwriting Executor.__init__. Alternatively, you can
112 provide seed as a keyword to the original Executor.__init__:
113
114 >>> x = Executor('ls', template='in.template',f_in='in.dat', seed=1.5)
115
116 This works because Executor.__init__ puts all unknown key=value pairs
117 into the object's name space and passes them on to the template.
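
The substitution described above is ordinary Python %-formatting against the
instance dictionary. The following is a minimal sketch, assuming Python 3 and
a hypothetical stand-in class `TemplateHolder` (not part of Biskit) that only
mimics how Executor stores its fields and extra keywords:

```python
class TemplateHolder:
    # Hypothetical stand-in for Executor: keeps known fields and
    # puts all unknown key=value pairs into the instance dictionary.
    def __init__(self, template, f_in=None, f_out=None, **kw):
        self.template = template
        self.f_in = f_in
        self.f_out = f_out
        self.__dict__.update(kw)

    def fill(self):
        # Every %(name)s place holder is looked up in self.__dict__.
        return self.template % self.__dict__


template = "file_in=%(f_in)s\nseed=%(seed)i\n"

# 'seed' is passed as an extra keyword, so the template can be completed.
completed = TemplateHolder(template, f_in="in.dat", seed=1.5).fill()

# Without 'seed' the lookup fails with a KeyError naming the place holder.
try:
    TemplateHolder(template, f_in="in.dat").fill()
    missing = None
except KeyError as why:
    missing = why.args[0]
```

Note that the `%(seed)i` place holder formats the value as an integer, so a
float such as 1.5 is truncated in the completed input.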
118
119
120 Communicating Input
121 ===================
122
123 Programs often expect scripts, commands or additional parameters
124 from StdIn or from input files. Executor tries to support many
125 scenarios -- which one is chosen mainly depends on the
126 L{ExeConfig} `pipes` setting in exe_<program>.dat and on the
127 `template` parameter given to Executor.__init__. (Note: Executor
128 loads the ExeConfig instance for the given program `name` into its
129 `self.exe` field.)
130
131 Here is an overview of the different scenarios and how to
132 activate them:
133
134 1. B{ no input (default behaviour)}
135
136 The program only needs command line parameters
137
138 Condition:
139
140 - template == None
141
142 2. B{ input pipe from STDIN
143 (== ``echo 'some input string' | myprogram``) }
144
145 Condition:
146
147 - exe.pipes == 1 / True
148 - template != None (or f_in points to existing file)
149
150 Setup:
151
152 1. `template` points to an existing file:
153
154 Executor reads the template file, completes it in
155 memory, and pushes it directly to the program.
156
157 2. `template` points to string that doesn't look like a file name:
158
159 Executor completes the string in memory (using
160 `self.template % self.__dict__`) and pushes it
161 directly to the program. This is the fastest option
162 as it avoids file access altogether.
163
164 3. `template` == None but f_in points to an *existing* file:
165
166 Executor will read this file and push it unmodified to
167 the program via StdIn. (This is something of an exception: if
168 used at all, f_in usually points to a *non-existing* file that
169 will receive the completed input.)
170
171 3. B{ input from file
172 (== ``myprogram < input_file``) }
173
174 Condition:
175
176 - exe.pipes == 0 / False
177 - template != None
178 - push_inp == 1 / True (default)
179
180 Setup:
181
182 1. `template` points to an existing file:
183
184 Executor reads the template file, completes it in
185 memory, saves the completed file to disc (creating or
186 overriding self.f_in), opens the file and passes the
187 file handle to the program (instead of STDIN).
188
189 2. `template` points to string that doesn't look like a file name:
190
191 Same as 3.1, except that the template is not read
192 from disc but directly taken from memory (see 2.2).
193
194 4. B{ input from file passed as argument to the program
195 (== ``myprogram input_file``) }
196
197 Condition:
198
199 - exe.pipes == 0 / False
200
201 For this it is up to you to provide the correct program
202 argument.
203
204 Setup:
205
206 1. Use template completion:
207
208 The best option would be to set an explicit file name
209 for `f_in` and include this file name into `args`, Example::
210
211 exe = ExeConfigCache.get('myprogram')
212 assert not exe.pipes
213
214 x = Executor( 'myprogram', args='input.in', f_in='input.in',
215 template='/somewhere/input.template', cwd='/tmp' )
216
217 Executor creates your input file on the fly, which is then
218 passed as first argument.
219
220 2. Without template completion:
221
222 Similar, just that you don't give a template::
223
224 x = Executor( 'myprogram', args='input.in', f_in='input.in',
225 cwd='/tmp' )
226
227 It would then be up to you to provide the correct
228 input file in `/tmp/input.in`. You could override the
229 L{prepare()} hook method for creating it.
230
231 There are other ways of doing the same thing.
232
233
234 Look at L{generateInp()} to see what is actually going on.
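
The three ways of delivering input described above (pipe, file handle in place
of STDIN, file name as argument) can be sketched with plain subprocess calls.
This is an illustration under assumptions, not Executor's own code: it assumes
Python 3, and the child program is a hypothetical one-off script that
upper-cases whatever it reads.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child program: reads the file named in its first argument,
# or STDIN if there is none, and writes the upper-cased text to STDOUT.
CHILD_SCRIPT = ("import sys\n"
                "src = open(sys.argv[1]) if len(sys.argv) > 1 else sys.stdin\n"
                "sys.stdout.write(src.read().upper())\n")
CHILD = [sys.executable, "-c", CHILD_SCRIPT]

text = "file_in=in.dat\n"

# Scenario 2: the completed input is pushed through a STDIN pipe.
p = subprocess.Popen(CHILD, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                     universal_newlines=True)
via_pipe, _ = p.communicate(text)

# Scenarios 3 and 4 need the completed input on disc first.
fd, f_in = tempfile.mkstemp(".inp")
with os.fdopen(fd, "w") as f:
    f.write(text)

try:
    # Scenario 3: an open file handle replaces STDIN (myprogram < input_file).
    with open(f_in) as handle:
        p = subprocess.Popen(CHILD, stdin=handle, stdout=subprocess.PIPE,
                             universal_newlines=True)
        via_handle, _ = p.communicate()

    # Scenario 4: the file name is passed as an argument (myprogram input_file).
    p = subprocess.Popen(CHILD + [f_in], stdout=subprocess.PIPE,
                         universal_newlines=True)
    via_argument, _ = p.communicate()
finally:
    os.remove(f_in)
```

All three calls hand the same text to the child; they differ only in who
opens the input and how the child finds it.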
235
236
237 References
238 ==========
239
240 - See also L{Biskit.IcmCad} for an example of how to override and
241 use Executor.
242
243 - See also L{Biskit.ExeConfig} for a description of program
244 configuration.
245 """
246
247 def __init__( self, name, args='', template=None, f_in=None, f_out=None,
248 f_err=None, strict=1, catch_out=1, push_inp=1, catch_err=0,
249 node=None, nice=0, cwd=None, log=None, debug=0,
250 verbose=None, **kw ):
251
252 """
253 Create Executor. *name* must point to an existing program configuration
254 unless *strict*=0. Executor will create a program input from
255 the template and its own fields and put it into f_in. If f_in but
256 no template is given, the unchanged f_in is used as input. If neither
257 is given, the program is called without input. If a node is given,
258 the process is wrapped in a ssh call. If *nice* != 0, the process
259 is preceded by nice. *cwd* specifies the working directory. By
260 default, this setting is taken from the configuration file which
261 defaults to the current working directory.
262
263 @param name: program name (configured in .biskit/exe_name.dat)
264 @type name: str
265 @param args: command line arguments
266 @type args: str
267 @param template: template for input file -- this can be the template
268 itself or the path to a file containing it
269 (default: None)
270 @type template: str
271 @param f_in: target for completed input file (default: None, discard)
272 @type f_in: str
273 @param f_out: target file for program output (default: None, discard)
274 @type f_out: str
275 @param f_err: target file for error messages (default: None, discard)
276 @type f_err: str
277 @param strict: strict check of environment and configuration file
278 (default: 1)
279 @type strict: 1|0
280 @param catch_out: catch output in file (f_out or temporary)
281 (default: 1)
282 @type catch_out: 1|0
283 @param catch_err: catch errors in file (f_err or temporary)
284 (default: 0)
285 @type catch_err: 1|0
286 @param push_inp: push input file to process via stdin ('< f_in') (default: 1)
287 @type push_inp: 1|0
288 @param node: host for calculation (None->no ssh) (default: None)
289 @type node: str
290 @param nice: nice level (default: 0)
291 @type nice: int
292 @param cwd: working directory, overwrites ExeConfig.cwd (default: None)
293 @type cwd: str
294 @param log: execution log (None->STDOUT) (default: None)
295 @type log: Biskit.LogFile
296 @param debug: keep all temporary files (default: 0)
297 @type debug: 0|1
298 @param verbose: print progress messages to log (default: log != STDOUT)
299 @type verbose: 0|1
300 @param kw: key=value pairs with values for template file
301 @type kw: key=value
302
303 @raise ExeConfigError: if environment is not fit for running
304 the program
305 """
306 self.exe = ExeConfigCache.get( name, strict=strict )
307 self.exe.validate()
308
309 self.f_out = t.absfile( f_out )
310 if not f_out and catch_out:
311 self.f_out = tempfile.mktemp( '.out' )
312
313 self.f_err = t.absfile( f_err )
314 if not f_err and catch_err:
315 self.f_err = tempfile.mktemp( '.err' )
316
317 self.keep_out = f_out is not None
318 self.catch_out = catch_out
319 self.catch_err = catch_err
320
321 self.f_in = f_in
322 self.keep_inp = f_in is not None
323 self.push_inp = push_inp
324
325 self.args = args
326 self.template = template
327
328 self.node = node
329 self.nice = nice
330 self.debug = debug
331
332 self.cwd = cwd or self.exe.cwd
333
334
335 self.log = log or StdLog()
336 self.verbose = verbose
337 if self.verbose is None:
338 self.verbose = (log is not None)
339
340
341 self.runTime = 0
342 self.output = None
343 self.error = None
344 self.returncode = None
345 self.pid = None
346
347 self.result = None
348
349 self.initVersion = self.version()
350
351 self.__dict__.update( kw )
352
353
354 def version( self ):
355 """Version of class (at creation).
356 @return: version
357 @rtype: str
358 """
359 return 'Executor $Revision: 2.17 $'
360
361
362 def communicate( self, cmd, inp, bufsize=-1, executable=None,
363 stdin=None, stdout=None, stderr=None,
364 shell=0, env=None, cwd=None ):
365 """
366 Start and communicate with the new process. Called by execute().
367 See subprocess.Popen() for a detailed description of the parameters!
368 This method should work for pretty much any purpose but may fail for
369 very long pipes (more than 100000 lines).
370
371 @param inp: (for pipes) input sequence
372 @type inp: str
373 @param cmd: command
374 @type cmd: str
375 @param bufsize: see subprocess.Popen() (default: -1)
376 @type bufsize: int
377 @param executable: see subprocess.Popen() (default: None)
378 @type executable: str
379 @param stdin: subprocess.PIPE or file handle or None (default: None)
380 @type stdin: int|file|None
381 @param stdout: subprocess.PIPE or file handle or None (default: None)
382 @type stdout: int|file|None
383 @param stderr: subprocess.PIPE or file handle or None (default: None)
384 @type stderr: int|file|None
385 @param shell: wrap process in shell; see subprocess.Popen()
386 (default: 0, use exe_*.dat configuration)
387 @type shell: 1|0
388 @param env: environment variables (default: None, use exe_*.dat config)
389 @type env: {str:str}
390 @param cwd: working directory (default: None, means self.cwd)
391 @type cwd: str
392
393 @return: output and error output
394 @rtype: str, str
395
396 @raise RunError: if OSError occurs during Popen or Popen.communicate
397 """
398 try:
399 p = subprocess.Popen( cmd.split(),
400 bufsize=bufsize, executable=executable,
401 stdin=stdin, stdout=stdout, stderr=stderr,
402 shell=shell or self.exe.shell,
403 env=env or self.environment(),
404 cwd=cwd or self.cwd )
405
406 self.pid = p.pid
407
408 output, error = p.communicate( inp )
409
410 self.returncode = p.returncode
411
412 except OSError, e:
413 raise RunError, \
414 "Couldn't run or communicate with external program: %r"\
415 % e.strerror
416
417 return output, error
418
419
420 def execute( self, inp=None ):
421 """
422 Run external command and block until it is finished.
423 Called by L{ run() }.
424
425 @param inp: input to be communicated via STDIN pipe (default: None)
426 @type inp: str
427
428 @return: execution time in seconds
429 @rtype: float
430
431 @raise RunError: see communicate()
432 """
433 start_time = time.time()
434
435 cmd = self.command()
436
437 shellexe = None
438 if self.exe.shell and self.exe.shellexe:
439 shellexe = self.exe.shellexe
440
441 stdin = stdout = stderr = None
442
443 if self.exe.pipes:
444 stdin = subprocess.PIPE
445 stdout= subprocess.PIPE
446 stderr= subprocess.PIPE
447 else:
448 inp = None
449 if self.f_in and self.push_inp:
450 stdin = open( self.f_in )
451 if self.f_out and self.catch_out:
452 stdout= open( self.f_out, 'w' )
453 if self.f_err and self.catch_err:
454 stderr= open( self.f_err, 'w' )
455
456 if self.verbose:
457 self.log.add('executing: %s' % cmd)
458 self.log.add('in folder: %s' % self.cwd )
459 self.log.add('input: %r' % stdin )
460 self.log.add('output: %r' % stdout )
461 self.log.add('errors: %r' % stderr )
462 self.log.add('wrapped: %r'% self.exe.shell )
463 self.log.add('shell: %r' % shellexe )
464 self.log.add('environment: %r' % self.environment() )
465 if self.exe.pipes and inp:
466 self.log.add('%i byte of input pipe' % len(str(inp)))
467
468 self.output, self.error = self.communicate( cmd, inp,
469 bufsize=-1, executable=shellexe, stdin=stdin,
470 stdout=stdout, stderr=stderr,
471 shell=self.exe.shell,
472 env=self.environment(), cwd=self.cwd )
473
474 if self.exe.pipes and self.f_out:
475 open( self.f_out, 'w').writelines( self.output )
476
477 if self.verbose: self.log.add(".. finished.")
478
479 return time.time() - start_time
480
481
482 def run( self, inp_mirror=None ):
483 """
484 Run the calculation. This calls (in that order):
485 - L{ prepare() },
486 - L{ execute() },
487 - L{ postProcess() },
488 - L{ finish() } OR L{ fail() },
489 - L{ cleanup() }
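
The call order listed above can be made visible with a small recorder class.
This is a simplified sketch, assuming Python 3; `HookRecorder` and its
`failed` / `io_error` switches are hypothetical and only mimic the control
flow of run(), not the real program execution:

```python
class RunError(Exception):
    # Stand-in for the RunError of this module.
    pass


class HookRecorder:
    # Hypothetical class that logs each hook as it is called.
    def __init__(self, failed=False, io_error=False):
        self.failed = failed
        self.io_error = io_error
        self.calls = []
        self.result = None

    def prepare(self):
        self.calls.append("prepare")

    def execute(self):
        self.calls.append("execute")
        if self.io_error:
            raise IOError("simulated I/O problem")

    def postProcess(self):
        self.calls.append("postProcess")

    def isFailed(self):
        return self.failed

    def fail(self):
        self.calls.append("fail")

    def finish(self):
        self.calls.append("finish")
        self.result = "done"

    def cleanup(self):
        self.calls.append("cleanup")

    def run(self):
        try:
            self.prepare()
            self.execute()
            self.postProcess()
        except IOError as why:
            # An I/O problem counts as failure; cleanup still runs.
            try:
                self.fail()
            finally:
                self.cleanup()
            raise RunError(why)

        try:
            if self.isFailed():
                self.fail()
            else:
                self.finish()
        finally:
            self.cleanup()
        return self.result


ok = HookRecorder()
ok_result = ok.run()

bad = HookRecorder(failed=True)
bad_result = bad.run()
```

In both cases cleanup() is the last hook called; only finish() sets a result.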
490
491 @param inp_mirror: file name for formatted copy of inp file
492 (default: None) [not implemented]
493 @type inp_mirror: str
494
495 @return: calculation result
496 @rtype: any
497 """
498 try:
499 self.prepare()
500
501 self.inp = self.generateInp()
502
503 self.runTime = self.execute( inp=self.inp )
504
505 self.postProcess()
506
507 except IOError, why:
508 try:
509 self.fail()
510 finally:
511 self.cleanup()
512 raise RunError, why
513
514 try:
515 if self.isFailed():
516 self.fail()
517 else:
518 self.finish()
519 finally:
520 self.cleanup()
521
522 return self.result
523
524
525 def command( self ):
526 """
527 Compose command string from binary, arguments, nice, and node.
528 Override (perhaps).
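
How the pieces are joined can be sketched as a free function. This is an
illustration assuming Python 3; `compose_command` and its `nice_bin` /
`ssh_bin` defaults are hypothetical stand-ins for the values this method
takes from the program configuration and Biskit.settings:

```python
def compose_command(binary, args="", nice=0, node=None,
                    nice_bin="nice", ssh_bin="ssh"):
    # Binary plus optional command line arguments.
    exe = binary
    if args:
        exe = exe + " " + args

    str_nice = str_ssh = ""

    # Prefix with nice only for a non-zero nice level.
    if nice != 0:
        str_nice = "%s -%i" % (nice_bin, nice)

    # Prefix with ssh only if a remote node is given.
    if node is not None:
        str_ssh = "%s %s" % (ssh_bin, node)

    return ("%s %s %s" % (str_ssh, str_nice, exe)).strip()


plain = compose_command("/bin/ls", "-l")
remote = compose_command("/bin/ls", "-l", nice=5, node="host1")

# An unused prefix leaves a double space in the string; this is harmless
# if the command is later split on whitespace.
tokens = compose_command("/bin/ls", node="host1").split()
```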
529
530 @return: the command to execute
531 @rtype: str
532 """
533 exe = t.absbinary( self.exe.bin )
534
535 if self.args:
536 exe = exe + ' ' + self.args
537
538 str_nice = str_ssh = ''
539
540 if self.nice != 0:
541 str_nice = "%s -%i" % (s.nice_bin, self.nice)
542
543 if self.node is not None:
544 str_ssh = "%s %s" % (s.ssh_bin, self.node )
545
546 cmd = "%s %s %s" % (str_ssh, str_nice, exe )
547 cmd = cmd.strip()
548
549 return cmd
550
551
552 def environment( self ):
553 """
554 Setup the environment for the process. Override if needed.
555
556 @return: environment dictionary
557 @rtype: dict OR None
558 """
559 if not self.exe.replaceEnv:
560 return None
561
562 return self.exe.environment()
563
564
565 def prepare( self ):
566 """
567 Called before running the external program, override!
568 """
569 pass
570
571
572 def postProcess( self ):
573 """
574 Called directly after running the external program, override!
575 """
576 pass
577
578
579 def cleanup( self ):
580 """
581 Clean up after external program has finished (failed or not).
582 Override, but call in child method!
583 """
584 if not self.keep_out and not self.debug and self.f_out:
585 t.tryRemove( self.f_out )
586
587 if not self.keep_inp and not self.debug:
588 t.tryRemove( self.f_in )
589
590 if self.f_err and not self.debug:
591 t.tryRemove( self.f_err )
592
593
594 def fail( self ):
595 """
596 Called if external program failed, override!
597 """
598 pass
599
600
601 def finish( self ):
602 """
603 Called if external program finished successfully, override!
604 """
605 self.result = self.output, self.error, self.returncode
606
607
608 def isFailed( self ):
609 """
610 Detect whether external program failed, override!
611 """
612 return 0
613
614
615 def fillTemplate( self ):
616 """
617 Create complete input string from template with place holders.
618
619 @return: input
620 @rtype: str
621
622 @raise TemplateError: if unknown option/place holder in template file
623 """
624 inp = self.template
625
626 try:
627 if os.path.isfile( inp ):
628 inp = open( inp, 'r' ).read()
629 return inp % self.__dict__
630
631 except KeyError, why:
632 s = "Unknown option/place holder in template file."
633 s += "\n template file: " + str( self.template )
634 s += "\n Template asked for an option called " + str( why[0] )
635 raise TemplateError, s
636
637
669
670
671 def generateInp( self ):
672 """
673 Prepare the program input (file or string) from a template (if
674 present, file or string).
675
676 @return: input file name OR (if pipes=1) content of input file
677 @rtype: str
678
679 @raise TemplateError: if error while creating template file
680 """
681 try:
682 inp = None
683
684 if self.template:
685 inp = self.fillTemplate()
686
687 return self.convertInput( inp )
688
689 except Exception, why:
690 s = "Error while creating template file."
691 s += "\n template file: " + str( self.template )
692 s += "\n why: " + str( why )
693 s += "\n Error:\n " + t.lastError()
694 raise TemplateError, s
695
696
697
698
699
700 import Biskit.test as BT
701
702 class Test(BT.BiskitTest):
703 """Executor test"""
704
705 TAGS = [ BT.EXE ]
706
708 import tempfile
709 self.fout = tempfile.mktemp('_testexecutor.out')
710
713
715 """Executor test (run emacs ~/.biskit/settings.cfg)"""
716 ExeConfigCache.reset()
717
718 self.x = ExeConfigCache.get( 'emacs', strict=0 )
719 self.x.pipes = 1
720
721 args = '.biskit/settings.cfg'
722 if not self.local:
723 args = '-kill ' + args
724
725 self.e = Executor( 'emacs', args=args, strict=0,
726 f_in=None,
727 f_out=self.fout,
728 verbose=self.local, cwd=t.absfile('~') )
729
730 self.r = self.e.run()
731
732 if self.local:
733 print 'Emacs was running for %.2f seconds'%self.e.runTime
734
735 self.assert_( self.e.pid != None )
736
737 if __name__ == '__main__':
738
739 BT.localTest()
740