12. Section name
Triggering jobs locally is trivial
If running remotely is the only option, debugging is very hard
Running things locally makes it a lot easier
No messing around with paths and configuration
(this has a flip side – more on this later)
13. Section name
It’s a library more than a framework
Avoid the “Hollywood principle” (“don’t call us, we’ll call you”) and make it easy to customize
19. Section name
Separate scheduling and execution
Schedule something to run later/somewhere else
A recent baby step towards this is a very simple fix for running modules dynamically:
$ luigi --module MyModule MyTask --foo xyz --bar 123
The next step would be to do something like
$ luigi --module MyModule MyTask --foo xyz --bar 123 --execute-remotely
A full implementation would include a bunch of command line options to probe status, kill tasks, etc
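Running a module dynamically, as in the --module example, boils down to importing a module by name and looking up the task class on it. A minimal sketch using only the standard library; this is not Luigi's actual implementation, and both `load_task` and the stand-in `MyModule` built below are hypothetical:

```python
import importlib
import sys
import types


def load_task(module_name, task_name, **params):
    """Import `module_name`, look up `task_name` on it and instantiate it.

    Hypothetical helper for illustration; not part of Luigi.
    """
    module = importlib.import_module(module_name)
    task_cls = getattr(module, task_name)
    return task_cls(**params)


# Stand-in for a user module, registered so that import_module can find it.
_demo = types.ModuleType("MyModule")


class MyTask:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar


_demo.MyTask = MyTask
sys.modules["MyModule"] = _demo

task = load_task("MyModule", "MyTask", foo="xyz", bar=123)
print(type(task).__name__, task.foo, task.bar)
```

Because the task is identified purely by strings (module name, class name, parameters), the same description could in principle be handed to another machine to execute, which is the point of separating scheduling from execution.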
22. Section name
Built-in crontab replacement
import datetime
import luigi

@luigi.schedule  # proposed decorator
class MyTask(luigi.Task):
    param = luigi.DateParameter(default=datetime.date.today())

    def run(self):
        …
The @luigi.schedule decorator would then:
1. Register that my_module.MyTask should be scheduled (by telling the central planner?)
2. Trigger it continuously from somewhere (central planner?)
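The two steps could be sketched roughly as follows. This is only an illustration of the idea in plain Python, not a Luigi API: `schedule`, `SCHEDULED` and `trigger_all` are hypothetical names, the registry lives in the current process rather than in the central planner, and `MyTask` is a plain class rather than a `luigi.Task`.

```python
import datetime

# Hypothetical in-process registry; the slide suggests the central planner
# might hold this information instead.
SCHEDULED = {}


def schedule(task_cls):
    """Class decorator that records the task under 'module.ClassName'."""
    key = f"{task_cls.__module__}.{task_cls.__name__}"
    SCHEDULED[key] = task_cls
    return task_cls  # the class itself is left unchanged


def trigger_all(registry):
    """One pass of the trigger loop: instantiate and run every registered task."""
    return {name: cls().run() for name, cls in registry.items()}


@schedule
class MyTask:
    def __init__(self, param=None):
        # Default to today's date, evaluated when the task is created.
        self.param = param or datetime.date.today()

    def run(self):
        return f"ran for {self.param.isoformat()}"


results = trigger_all(SCHEDULED)
print(results)
```

A real version would call something like `trigger_all` periodically and rely on task completeness checks to avoid redoing finished work.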
23. Section name
ETA for tasks
Using a persistent task history database, you could train a simple k-NN regressor to predict how long a task will run
Then use this with the dependency graph to predict when any task will finish
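A minimal sketch of both steps, under stated assumptions: the history is a made-up list of (input size, run time) pairs per task, the k-NN prediction simply averages the k nearest past runs, and every task is assumed to start as soon as its dependencies finish. The names `knn_predict` and `predict_finish_times` are hypothetical.

```python
from functools import lru_cache


def knn_predict(history, x, k=3):
    """Average the durations of the k history points whose feature is closest to x.

    history: list of (feature, duration_seconds) pairs.
    """
    nearest = sorted(history, key=lambda rec: abs(rec[0] - x))[:k]
    return sum(duration for _, duration in nearest) / len(nearest)


def predict_finish_times(deps, durations):
    """Predicted finish time of each task, in seconds from now.

    deps: task -> list of tasks it depends on (must be acyclic).
    durations: task -> predicted run time in seconds.
    Assumes every task starts as soon as all of its dependencies are done.
    """
    @lru_cache(maxsize=None)
    def finish(task):
        start = max((finish(d) for d in deps.get(task, [])), default=0.0)
        return start + durations[task]

    return {task: finish(task) for task in durations}


# Made-up history: (input size in GB, observed run time in seconds).
history = {
    "Extract": [(1, 60), (2, 120), (4, 240), (8, 480)],
    "Transform": [(1, 30), (2, 50), (4, 100), (8, 220)],
    "Report": [(1, 10), (2, 10), (4, 20), (8, 30)],
}
deps = {"Transform": ["Extract"], "Report": ["Transform"]}

# Predict each task's run time for a 3 GB input, then propagate through the graph.
durations = {name: knn_predict(recs, x=3, k=2) for name, recs in history.items()}
eta = predict_finish_times(deps, durations)
print(eta)  # {'Extract': 180.0, 'Transform': 255.0, 'Report': 270.0}
```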
24. More features in the central planner
Kill a task
Re-launch a task
Launch a new task
25. Section name
Support for other languages
Luigi is written in Python – but the RPC is language agnostic.
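Language agnostic means a client only has to produce and parse messages, not run Python. As a rough illustration, assuming a JSON-based message format: the layout, the helpers `encode_request` and `decode_request`, and the `add_task` method name are hypothetical and not taken from Luigi's wire format.

```python
import json


def encode_request(method, **params):
    """Serialize a call as JSON text (hypothetical message layout)."""
    return json.dumps({"method": method, "params": params}, sort_keys=True)


def decode_request(payload):
    """Parse the JSON text back into (method, params)."""
    message = json.loads(payload)
    return message["method"], message["params"]


payload = encode_request(
    "add_task", task_id="MyTask(foo=xyz, bar=123)", status="PENDING"
)
method, params = decode_request(payload)
print(payload)
```

Any language with a JSON library and a network client could build the same payload, which is what would let tasks be written outside Python.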