custom uptodate¶
The basics of uptodate was already introduced. Here we look in more detail into some implementations shipped with doit. And the API used by those.
result-dependency¶
In some cases you can not determine if a task is “up-to-date” only based on input files, the input could come from a database or an external process. doit defines a “result-dependency” to deal with these cases without need to create an intermediate file with the results of the process.
i.e. Suppose you want to send an email every time you run doit on a mercurial repository that contains a new revision number.
from doit.tools import result_dep
def task_version():
return {'actions': ["hg tip --template '{rev}:{node}'"]}
def task_send_email():
return {'actions': ['echo "TODO: send an email"'],
'uptodate': [result_dep('version')]}
Note the result_dep with the name of the task (‘version’). doit will keep track of the output of the task version and will execute send_email only when the mercurial repository has a new version since last time doit was executed.
The “result” from the dependent task compared between different runs is given by its last action. The content for python-action is the value of the returned string or dict. For cmd-actions it is the output send to stdout plus stderr.
result_dep also supports group-tasks. In this case it will check that the result of all subtasks did not change. And also the existing sub-tasks are the same.
run_once()¶
Sometimes there is no dependency for a task but you do not want to execute it all the time. With “run_once” the task will not be executed again after the first successful run. This is mostly used together with targets.
Suppose you need to download something from internet. There is no dependency, but you do not want to download it many times.
from doit.tools import run_once
def task_get_pylogo():
url = "http://python.org/images/python-logo.gif"
return {'actions': ["wget %s" % url],
'targets': ["python-logo.gif"],
'uptodate': [run_once],
}
Note that even with run_once the file will be downloaded again in case the target is removed.
$ doit
. get_pylogo
$ doit
-- get_pylogo
$ rm python-logo.gif
$ doit
. get_pylogo
timeout()¶
timeout
is used to expire a task after a certain time interval.
i.e. You want to re-execute a task only if the time elapsed since the last time it was executed is bigger than 5 minutes.
import datetime
from doit.tools import timeout
def task_expire():
return {
'actions': ['echo test expire; date'],
'uptodate': [timeout(datetime.timedelta(minutes=5))],
'verbosity': 2,
}
timeout
is function that takes an int
(seconds) or timedelta
as a
parameter. It returns a callable suitable to be used as an uptodate
callable.
config_changed()¶
config_changed
is used to check if any “configuration” value for the task has
changed. Config values can be a string or dict.
For dict’s the values are converted to string (actually it uses python’s repr()) and only a digest/checksum of the dictionaries keys and values are saved.
from doit.tools import config_changed
option = "AB"
def task_with_params():
return {'actions': ['echo %s' % option],
'uptodate': [config_changed(option)],
'verbosity': 2,
}
check_timestamp_unchanged()¶
check_timestamp_unchanged
is used to check if specified timestamp of a given
file/dir is unchanged since last run.
The timestamp field to check defaults to mtime
, but can be selected by
passing time
parameter which can be one of: atime
, ctime
, mtime
(or their aliases access
, status
, modify
).
Note that ctime
or status
is platform dependent.
On Unix it is the time of most recent metadata change,
on Windows it is the time of creation.
See Python library documentation for os.stat and Linux man page for
stat(2) for details.
It also accepts an cmp_op
parameter which defaults to operator.eq
(==).
To use it pass a callable which takes two parameters (prev_time, current_time)
and returns True if task should be considered up-to-date, False otherwise.
Here prev_time
is the time from the last successful run and current_time
is the time obtained in current run.
If the specified file does not exist, an exception will be raised.
If a file is a target of another task you should probably add
task_dep
on that task to ensure the file is created before it is checked.
from doit.tools import check_timestamp_unchanged
def task_create_foo():
return {
'actions': ['touch foo', 'chmod 750 foo'],
'targets': ['foo'],
'uptodate': [True],
}
def task_on_foo_changed():
# will execute if foo or its metadata is modified
return {
'actions': ['echo foo modified'],
'task_dep': ['create_foo'],
'uptodate': [check_timestamp_unchanged('foo', 'ctime')],
}
uptodate API¶
This section will explain how to extend doit
writing an uptodate
implementation. So unless you need to write an uptodate
implementation
you can skip this.
Let’s start with trivial example. uptodate is a function that returns a boolean value.
def fake_get_value_from_db():
return 5
def check_outdated():
total = fake_get_value_from_db()
return total > 10
def task_put_more_stuff_in_db():
def put_stuff(): pass
return {'actions': [put_stuff],
'uptodate': [check_outdated],
}
You could also execute this function in the task-creator and pass the value to to uptodate. The advantage of just passing the callable is that this check will not be executed at all if the task was not selected to be executed.
Example: run-once implementation¶
Most of the time an uptodate implementation will compare the current value of something with the value it had last time the task was executed.
We already saw how tasks can save values by returning dict on its actions. But usually the “value” we want to check is independent from the task actions. So the first step is to add a callable to the task so it can save some extra values. These values are not used by the task itself, they are only used for dependency checking.
The Task has a property called value_savers
that contains a list of
callables. These callables should return a dict that will be saved together
with other task values. The value_savers
will be executed after all actions.
The second step is to actually compare the saved value with its “current” value.
The uptodate callable can take two positional parameters task
and values
. The callable can also be represented by a tuple (callable, args, kwargs).
task
parameter will give you access to task object. So you have access to its metadata and opportunity to modify the task itself!
values
is a dictionary with the computed values saved in the last- successful execution of the task.
Let’s take a look in the run_once
implementation.
def run_once(task, values):
def save_executed():
return {'run-once': True}
task.value_savers.append(save_executed)
return values.get('run-once', False)
The function save_executed
returns a dict. In this case it is not checking
for any value because it just checks it the task was ever executed.
The next line we use the task
parameter adding
save_executed
to task.value_savers
.So whenever this task is executed this
task value ‘run-once’ will be saved.
Finally the return value should be a boolean to indicate if the task is
up-to-date or not. Remember that the ‘values’ parameter contains the dict with
the values saved from last successful execution of the task.
So it just checks if this task was executed before by looking for the
run-once
entry in `values
.
Example: timeout implementation¶
Let’s look another example, the timeout
. The main difference is that
we actually pass the parameter timeout_limit
. Here we present
a simplified version that only accepts integers (seconds) as a parameter.
class timeout(object):
def __init__(self, timeout_limit):
self.limit_sec = timeout_limit
def __call__(self, task, values):
def save_now():
return {'success-time': time_module.time()}
task.value_savers.append(save_now)
last_success = values.get('success-time', None)
if last_success is None:
return False
return (time_module.time() - last_success) < self.limit_sec
This is a class-based implementation where the objects are made callable
by implementing a __call__
method.
On __init__
we just save the timeout_limit
as an attribute.
The __call__
is very similar with the run-once
implementation.
First it defines a function (save_now
) that is registered
into task.value_savers
. Than it compares the current time
with the time that was saved on last successful execution.
Example: result_dep implementation¶
The result_dep
is more complicated due to two factors. It needs to modify
the task’s setup_tasks
.
And it needs to check the task’s saved values and metadata
from a task different from where it is being applied.
A result_dep
implies that its dependency is also a setup
.
We have seen that the callable takes a task parameter that we used
to modify the task object. The problem is that modifying setup_tasks
when the callable gets called would be “too late” according to the
way doit works. When an object is passed uptodate
and this
object’s class has a method named configure_task
it will be called
during the task creation.
The base class dependency.UptodateCalculator
gives access to
an attribute named tasks_dict
containing a dictionary with
all task objects where the key
is the task name (this is used to get all
sub-tasks from a task-group). And also a method called get_val
to access
the saved values and results from any task.
See the result_dep source.