=ADD= =reftype= 14 =number= 02-10 =url= ftp://ftp.risc.uni-linz.ac.at/pub/techreports/2002/02-10.ps.gz =year= 2002 =month= 03 =author= Bosa; Karoly + Schreiner; Wolfgang =title= Task Logging, Rescheduling, and Peer Checking in Distributed Maple =abstract= We have extended the parallel computer algebra environment Distributed Maple with fault tolerance mechanisms such that the time spent in a long-running computation is no longer wasted by the eventual occurrence of a session failure. The first mechanism is the logging of task return values and of shared object values such that after a failure the newly started session can (transparently to the application program) reuse already computed results. The second mechanism is the migration of tasks such that the session may tolerate the failure of individual nodes without overall failure. The third mechanism is the redirection of messages such that a session may also tolerate the failure of the connections between nodes without overall failure. =sponsor= . =keywords= fault tolerance, distributed systems, computer algebra, parallel computing