Seite drucken

FESA

Fehlereindämmende selbststabilisierende Algorithmen für große, infrastrukturlose, vernetzte Systeme

Kontakt Prof. Dr. rer. nat. Volker Turau
Beginn 1. September 2008
Ende 31. Dezember 2011
Finanzierung Deutsche Forschungsgemeinschaft (DFG)

Projektbeschreibung

Ziel dieses Vorhabens ist die Entwicklung einer Methodik zur Erhöhung der Fehlertoleranz großer infrastrukturloser Netze. Fehlertoleranz ist die Eigenschaft eines Systems, seine Funktionsweise auch dann aufrechtzuerhalten, wenn unvorhergesehene Ereignisse oder Fehler in Hard- oder Software auftreten. In großen, infrastrukturlosen Netzen kann dies nur durch dezentrale Verfahren, wie etwa Selbststabilisierung, erreicht werden. Ein Verfahren ist selbststabilisierend, wenn es aus sich heraus, also ohne äußeres Zutun, nach transienten Fehlern selbsttätig in einen korrekten Zustand zurückkehrt. Die Entwicklung selbststabilisierender Algorithmen verfolgte bisher das Ziel, die Zeitspanne vom Auftreten eines Fehlers bis zur Wiedererlangung eines korrekten Systemzustandes zu minimieren. Dabei wird in Kauf genommen, dass während der Übergangsphase ein beträchtlicher Teil des Netzes seinen Dienst nicht erbringen kann.

Das Projekt hat das Ziel, die Auswirkungen von transienten Fehlern nicht nur zeitlich sondern auch räumlich zu begrenzen, d.h. es wird eine Fehlereindämmung angestrebt. Es soll eine Methodik zum Entwurf selbststabilisierender Algorithmen entwickelt werden, bei denen nur Knoten in der unmittelbaren Umgebung des Fehlerverursachers in die Fehlerbehebung involviert sind. Dadurch kann der überwiegende Teil des Netzes auch während der Übergangsphase seinen Dienst erbringen. Angestrebt wird ein problemunabhängiges Verfahren, welches existierende selbststabilisierende Algorithmen so erweitert, dass die resultierenden Algorithmen fehlereindämmend sind. Dies hat den enormen Vorteil, dass bereits existierende Algorithmen zusätzlich die Eigenschaft der Fehlereindämmung erhalten. Primäres Anwendungsfeld sind Algorithmen für drahtlose Netze, wie etwa Sensornetze. Dazu werden die entsprechenden Randbedingungen wie asynchrones Modell, broadcast als Kommunikationsprimitive, unzuverlässige Kommunikation und beschränkte Ressourcen berücksichtigt.

FESA wird durch die Deutsche Forschungsgemeinschaft (DFG) für 3 Jahre gefördert.

Publikationen

Sven Köhler, Volker Turau und Gerhard Mentges. Self-stabilizing local k-placement of replicas with minimal variance. In Proceedings of the 14th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS'12), Springer, Oktober 2012, pp. 16–30. Toronto, Canada.
@InProceedings{Telematik_KT_SSS12, author = {Sven K{\"o}hler and Volker Turau and Gerhard Mentges}, title = {Self-stabilizing local k-placement of replicas with minimal variance}, booktitle = {Proceedings of the 14th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS'12)}, pages = {16-30}, series = {Lecture Notes in Computer Science}, volume = {7596}, publisher = {Springer}, day = {1-4}, month = oct, year = 2012, location = {Toronto, Canada}, }
Abstract: Large scale distributed systems require replication of resources to amplify availability and to provide fault tolerance. The placement of replicated resources significantly impacts performance. This paper considers local k-placements: Each node of a network has to place k replicas of a resource among its direct neighbors. The load of a node in a given local k-placement is the number of replicas it stores. The local k-placement problem is to achieve a preferably homogeneous distribution of the loads. We present a novel self-stabilizing, distributed, asynchronous, scalable algorithm for the k-placement problem such that the standard deviation of the distribution of the loads assumes a local minimum.
Sven Köhler und Volker Turau. Fault-containing self-stabilization in asynchronous systems with constant fault-gap. Distributed Computing, 25(3):207–224, 2012.
@Article{Telematik_KT_2011_DC, author = {Sven K{\"o}hler and Volker Turau}, title = {Fault-containing self-stabilization in asynchronous systems with constant fault-gap}, pages = {207-224}, journal = {Distributed Computing}, volume = {25}, number = {3}, year = 2012, issn = {0178-2770}, }
Abstract: This paper presents a new transformation which adds fault-containment properties to silent self-stabilizing algorithms. The transformation features a constant slow-down factor and the fault-gap—that is the minimal time between two containable faults—is also constant. The transformation scales well to arbitrarily large systems and avoids global synchronization. The presented transformation is the first with a constant fault-gap and requires no knowledge of the system size.
Sven Köhler und Volker Turau. Space-efficient fault-containment in dynamic networks. In Proceedings of the 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS'11), Springer, Oktober 2011, pp. 311–325. Grenoble, France.
@InProceedings{Telematik_KT_SSS11, author = {Sven K{\"o}hler and Volker Turau}, title = {Space-efficient fault-containment in dynamic networks}, booktitle = {Proceedings of the 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS'11)}, pages = {311-325}, series = {Lecture Notes in Computer Science}, volume = {6976}, publisher = {Springer}, day = {10-12}, month = oct, year = 2011, location = {Grenoble, France}, }
Abstract: Bounding the impact of transient small-scale faults by self-stabilizing protocols has been pursued with independent objectives: Optimizing the system's reaction upon topological changes (e.g. super-stabilization), and reducing system recovery time from memory corruptions (e.g. fault-containment). Even though transformations adding either super-stabilization or fault-containment to existing protocols exists, none of them preserves the other. This paper makes a first attempt to combine both objectives. We provide a transformation adding fault-containment to silent self-stabilizing protocols while simultaneously preserving the property of self-stabilization and the protocol's behavior in face of topological changes. In particular, the protocol's response to a topology change remains unchanged even if a memory corruption occurs in parallel to the topology change. The presented transformation increases the memory footprint only by a factor of 4 and adds O(1) bits per edge. All previously known transformations for fault-containing self-stabilization increase the memory footprint by a factor of 2m/n.
Volker Turau und Bernd Hauck. A fault-containing self-stabilizing (3 - 2/(Delta+1))-approximation algorithm for vertex cover in anonymous networks. Theoretical Computer Science, 412(33):4361–4371, 2011.
@Article{Telematik_HT_2011_TCSVC, author = {Volker Turau and Bernd Hauck}, title = {A fault-containing self-stabilizing (3 - 2/(Delta+1))-approximation algorithm for vertex cover in anonymous networks}, pages = {4361-4371}, journal = {Theoretical Computer Science}, volume = {412}, number = {33}, year = 2011, keywords = {Self-stabilizing algorithms, Fault tolerance, Distributed algorithms, Graph algorithms}, }
Abstract: The non-computability of many distributed tasks in anonymous networks is well known. This paper presents a deterministic self-stabilizing algorithm to compute a 3 - (2 / (Delta+1))-approximation of a minimum vertex cover in anonymous networks. The algorithm operates under the distributed unfair scheduler, stabilizes after O(n+m) moves respectively O(Delta) rounds, and requires O(log n) storage per node. Recovery from a single fault is reached within a constant time and the contamination number is O(Delta). For trees the algorithm computes a 2-approximation of a minimum vertex cover.
Sven Köhler und Volker Turau. A new Technique for proving Self-Stabilization under the Distributed Scheduler. In Proceedings of the 12th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS'10), Springer, September 2010, pp. 65–79. New York, NY, USA.
@InProceedings{Telematik_KT_ProofTechnique, author = {Sven K{\"o}hler and Volker Turau}, title = {A new Technique for proving Self-Stabilization under the Distributed Scheduler}, booktitle = {Proceedings of the 12th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS'10)}, pages = {65-79}, series = {Lecture Notes in Computer Science}, volume = {6366}, publisher = {Springer}, day = {20-22}, month = sep, year = 2010, location = {New York, NY, USA}, }
Abstract: Proving stabilization of a complex algorithm under the distributed scheduler is a non-trivial task. This paper introduces a new method which allows to extend proofs for the central scheduler to the distributed scheduler. The practicability of the method is shown by applying it to two existing algorithms, for which stabilization under the distributed scheduler was an open problem.
Sven Köhler und Volker Turau. Fault-containing self-stabilization in asynchronous systems with constant fault-gap. In Proceedings of the 30th IEEE International Conference on Distributed Computing Systems (ICDCS'10), IEEE Computer Society, Juni 2010, pp. 418–427. Genoa, Italy.
@InProceedings{Telematik_KT_FaultContain, author = {Sven K{\"o}hler and Volker Turau}, title = {Fault-containing self-stabilization in asynchronous systems with constant fault-gap}, booktitle = {Proceedings of the 30th IEEE International Conference on Distributed Computing Systems (ICDCS'10)}, pages = {418-427}, publisher = {IEEE Computer Society}, day = {21-25}, month = jun, year = 2010, location = {Genoa, Italy}, }
Abstract: This paper presents a new transformation which adds fault-containment properties to any silent self-stabilizing protocol. The transformation features a constant slow-down factor and the fault-gap – that is the minimal time between two containable faults – is constant. The transformation scales well to arbitrarily large systems and avoids global synchronization.