UPPAAL Stratego Examples and Use Cases
Traffic Dilemma
Suppose we need to travel from Aalborg to Sydney as soon as possible, but we are not sure which mode of transport is the fastest: a bike, a car, or a train. Each of them has different speeds and risks depending on traffic conditions. Below is a stochastic timed automaton capturing the trip possibilities:
We start in the location Aalborg on the left, pack our things, and leave within 2 minutes. Then we can choose/control our preferred mode of travel:
- Bike is relatively slow but the most predictable option: it takes us to Sydney in 42 to 45 minutes and thus depends very little on uncontrollable traffic conditions.
- Car offers the possibility of an Easy and fast trip in 20 minutes if traffic permits (probability weight 10 yields a likelihood of 10/11), but there is a 1-in-11 risk of getting stuck in Heavy traffic, which can prolong the trip to up to 140 minutes.
- Train runs on a periodic schedule, so there may be a waiting time of up to 6 minutes. It is quite likely (10 out of 11) that we board the train successfully and spend up to 35 minutes in location Go before reaching Sydney. There is also a small risk (1 out of 11) that the train is canceled and we end up in Wait, where we must decide either to switch to another train or to GoBack home and make the decision again in Aalborg.
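The trade-offs between the three modes can be roughed out with a quick Monte Carlo sketch. This is illustrative only: it assumes uniform distributions over the stated intervals and a bike fallback after a cancelled train, which only approximates the stochastic semantics of the model above:

```python
import random

def bike():
    return random.uniform(42, 45)          # predictable: 42-45 minutes

def car():
    if random.random() < 10 / 11:          # Easy traffic, likelihood 10/11
        return random.uniform(0, 20)       # assumed uniform, up to 20 minutes
    return random.uniform(20, 140)         # Heavy traffic, up to 140 minutes

def train():
    t = random.uniform(0, 6)               # waiting for the periodic train
    if random.random() < 10 / 11:          # boarding succeeds
        return t + random.uniform(0, 35)   # time spent in Go
    return t + bike()                      # assumed fallback: go back and bike

for mode in (bike, car, train):
    n = 100_000
    est = sum(mode() for _ in range(n)) / n
    print(mode.__name__, round(est, 1))
```

Under these assumptions the car is the fastest on average but has the heaviest tail, which is exactly the tension the strategies below resolve.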
Given our model, we can inspect sample trip trajectories via an SMC simulate query, where the locations are encoded as different levels:
simulate 1 [<=100] {
Kim.Aalborg + 2*Kim.Bike + 4*Kim.Easy + 6*Kim.Heavy +
8*Kim.Train + 10*Kim.Go + 12*Kim.Wait + 14*Kim.Sydney }
Below are two sample trajectories computed by Uppaal Stratego:
Two sample trajectories: a normal train trip in 20 minutes (left), and a train followed by bike in 50 minutes (right).
The following queries and resulting plot show where and when the time is being spent:
simulate 100 [<=100] {
14+Kim.Aalborg, 12+Kim.Bike, 10+Kim.Easy, 8+Kim.Heavy,
6+Kim.Train, 4+Kim.Go, 2+Kim.Wait, Kim.Sydney
}
Pr[<=60](<> Kim.Sydney) // probability of reaching Sydney within 60 minutes is about 0.97 (not certain, thus not safe)
E[<=200 ; 500] (max: trip) // mean trip is about 28 minutes
Generate and evaluate a safe strategy to get to Sydney within 60 minutes:
strategy GoSafe = control: A<> Kim.Sydney && time<=60
simulate 100 [<=100] {
14+Kim.Aalborg, 12+Kim.Bike, 10+Kim.Easy, 8+Kim.Heavy,
6+Kim.Train, 4+Kim.Go, 2+Kim.Wait, Kim.Sydney
} under GoSafe
A<> (Kim.Sydney && time<=60) under GoSafe // satisfied: using GoSafe we certainly reach Sydney within 60 minutes
E[<=200; 500] (max: trip) under GoSafe // mean trip time of GoSafe is about 33 minutes
The plot shows that most of the time is spent in Bike and Sydney and some in Train; therefore Bike is the sure way to reach Sydney, but we can also risk catching a train within the first 10 minutes (if the train does not come, we fall back to the Bike option).
Learn a fast strategy by minimizing the time to get to Sydney:
strategy GoFast = minE (trip) [<=200] : <> Kim.Sydney
simulate 100 [<=100] {
14+Kim.Aalborg, 12+Kim.Bike, 10+Kim.Easy, 8+Kim.Heavy,
6+Kim.Train, 4+Kim.Go, 2+Kim.Wait, Kim.Sydney
} under GoFast
Pr[<=60](<> Kim.Sydney) under GoFast // probability is about 0.94 (rushing is risky!)
E[<=200; 500] (max: trip) under GoFast // mean trip time is 15.4 minutes (almost 2x faster than no strategy)
The plot shows that most of the time is spent in Easy and Heavy and then Sydney; therefore Car is on average the fastest choice, but we can get stuck in Heavy traffic and be late.
In order to get a strategy that is both safe and fast, we optimize GoSafe by minimizing the trip time to Sydney:
strategy GoFastSafe = minE (trip) [<=200] : <> Kim.Sydney under GoSafe
simulate 100 [<=100] {
14+Kim.Aalborg, 12+Kim.Bike, 10+Kim.Easy, 8+Kim.Heavy,
6+Kim.Train, 4+Kim.Go, 2+Kim.Wait, Kim.Sydney
} under GoFastSafe
Pr[<=60](<> Kim.Sydney) under GoFastSafe // probability is >0.99 (we could not find a counter-example)
E[<=200 ; 500] (max: trip) under GoFastSafe // mean trip time is 22.4 minutes (faster than GoSafe but not as fast as GoFast)
The result shows that Train is the safe and fast option, but occasionally one may need to go back and take the bike.
The model and queries can be downloaded here: traffic.xml.
A walkthrough video presented at ATVA 2014: ATVA14-walkthrough.wmv.
Newspaper and Travel
Model: newspaper.xml
Safe and Optimal Cruise Control
This example is dedicated to Ernst-Rüdiger Olderog [LMT15]. It demonstrates the use of symbolic techniques to synthesize a safe discrete controller for continuous dynamics, and then the use of statistical techniques to optimize the control for maximum progress, all within the single integrated tool Uppaal Stratego.
The goal of cruise control is to maintain a steady vehicle speed. As roads become more congested, an adaptive cruise control must also maintain a safe distance to the vehicle ahead. The picture below shows a setup with our Ego car and a car in Front.
Ego is equipped with a distance sensor. The distance changes dynamically based on the relative speed between the cars, and each car controls its speed by accelerating or braking. The goal is to synthesize a control strategy deciding whether to accelerate, maintain speed, brake, or even move backwards based on the measured relative distance.
Our proposed solution is to model the system as a Stochastic Priced Timed Game, abstract away the stochastic and continuous dynamics, solve the Timed Game with a safety goal (deciding when Ego should accelerate or brake so as to maintain a minimal safe distance no matter what Front chooses to do), then take the safe Ego strategy back to the original Stochastic Priced setup and optimize the safe controller strategy into a fast one by using learning techniques.
Modeling
First we model the controllable options of such a system, namely the actions of the car Ego picking its acceleration values, via solid edges:
The Front car can exercise the same actions, but for us its transitions are not controllable (dashed edges). In addition, Front may go beyond the reach of Ego's distance sensor (the Faraway location):
We introduce a scheduler to limit the decision making of both cars to once per second. The scheduler dictates that Ego makes its choice first, without knowledge of Front's choice; then Front may choose its acceleration (possibly countering Ego's choice), and after one second the discrete approximation of the velocities and distance is recalculated:
The approximate integer distance is calculated based on averages using the function below:
int distance; // approximated distance between cars
int velocityEgo, velocityFront; // approximated velocities
int accelerationEgo, accelerationFront; // discrete acceleration
void updateDiscrete()
{
    int oldVel, newVel;
    oldVel = velocityFront - velocityEgo;
    velocityEgo = velocityEgo + accelerationEgo;
    velocityFront = velocityFront + accelerationFront;
    newVel = velocityFront - velocityEgo;
    if (distance > maxSensorDistance) {
        distance = maxSensorDistance + 1;  // beyond sensor range: clamp
    } else {
        distance += (oldVel + newVel)/2;   // trapezoidal update of the gap
    }
}
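Averaging the old and new relative velocities makes the discrete update exact whenever the relative acceleration is constant within a one-second step, since the trapezoid rule integrates a linear velocity exactly. A small Python sketch (illustrative, not part of the model) checking the update against the closed-form integral:

```python
def continuous_gap_change(rel_vel, rel_acc, dt=1.0):
    # Exact change of the gap over one step with constant relative
    # acceleration: the integral of v(t), i.e. v*dt + a*dt^2/2.
    return rel_vel * dt + rel_acc * dt * dt / 2

def trapezoid_gap_change(rel_vel, rel_acc, dt=1.0):
    # The model's update: average of the relative velocity
    # before and after the step.
    new_vel = rel_vel + rel_acc * dt
    return (rel_vel + new_vel) / 2 * dt

# The two agree for every relative velocity and acceleration:
for v in range(-22, 31):
    for a in (-4, -2, 0, 2, 4):
        assert continuous_gap_change(v, a) == trapezoid_gap_change(v, a)
```

Note that the model works in integers, so `(oldVel + newVel)/2` truncates when the sum is odd; if the acceleration values are even, the sum stays even and no truncation occurs.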
The actual dynamics of continuous distance and velocities is specified by ordinary differential equations over “hybrid clocks” as an invariant expression:
hybrid clock rVelocityEgo, rVelocityFront; // continuous "real" velocities
hybrid clock rDistance;                    // continuous "real" distance between cars
hybrid clock D;                            // "cost" to be minimized: integral over distance

double distanceRate(double velFront, double velEgo, double dist)
{
    if (dist > maxSensorDistance) return 0.0;
    else return velFront - velEgo;
}
The hybrid clocks here are special: they do not influence the behavior of the model but rather monitor the "cost" of a simulation run; consequently they are abstracted away in symbolic model checking queries and enabled in statistical queries. Likewise, the stochastic features become plain non-determinism under symbolic model checking queries. The RESET location provides a special hint for statistical learning (the syntax is going to change in the future): the looping edge resets the variables which should not influence the strategy. In this case all variables except the process locations are ignored as noise.
We validate the discrete approximation by inspecting both discrete and continuous trajectories. The plot shows that discrete variables have exactly the same values as continuous ones at integer time points:
simulate 1 [<=20] { rDistance, dist, rVelocityFront - rVelocityEgo, velocityFront - velocityEgo }
However, verification shows that the distance is not always safe:
A[] distance > 5 // FALSE - unsafe
Statistical model checking also shows that the probability of a crash is unacceptably high if Ego is driven randomly:
Pr[<=100] (<> distance <= 5) // [0.85, 0.86] (95% CI)
The resulting plot shows that the probability density is highest in the beginning and the cumulative probability is over 85% after 100 seconds:
Synthesis of Safe Strategy
Controller synthesis uses symbolic game-solving techniques where Front is an opponent/adversary, Ego is the controllable player, and the goal is to maintain the safe distance. As with other symbolic techniques, the continuous dynamics is abstracted away, and the Uppaal Tiga query stores the resulting strategy under the name safe:
strategy safe = control: A[] dist > 5 // success
The synthesized strategy is validated by simulating the model behavior with controller choices taken from the safe strategy. The plot shows that the distance trajectories stay above 5, i.e. the model no longer exhibits crashes:
simulate 10 [<=100] rDistance under safe
Once we have the safe strategy, we can also estimate the smallest safe distance for any relative velocity. The plot below is the result of many infimum queries (symbolic model checking), one for each possible relative velocity value v:
inf { velocityFront - velocityEgo == v } : dist under safe
The plot shows that the distance has a quadratic dependency on the velocity when the velocity is negative (i.e. Front is approaching Ego) and is a bit more complicated when Front is moving away.
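The quadratic shape for negative relative velocities admits a simple back-of-the-envelope explanation (a sketch under assumed numbers, not values taken from the model): if the gap closes at v units per second and Ego can reduce the closing speed by at most a per one-second decision, the gap shrinks by v + (v - a) + (v - 2a) + ... before the closing speed is cancelled, which is roughly v^2/(2a) plus a linear term from the discretization. A small Python sketch with a hypothetical maximum relative deceleration a = 2:

```python
def gap_needed(closing_speed, a=2):
    """Total gap lost while braking an initial closing speed down to zero,
    one decision per second; a is a hypothetical max relative deceleration."""
    lost, v = 0, closing_speed
    while v > 0:
        lost += v      # the gap shrinks by the current closing speed
        v -= a         # Ego improves the relative velocity by a
    return lost

# The loss grows quadratically with the closing speed:
for v in (4, 8, 12, 16, 20):
    print(v, gap_needed(v))
```

For v divisible by a this sum equals exactly v^2/(2a) + v/2, hence the quadratic trend in the infimum plot.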
Maximizing Progress by Minimizing Distance
The previous section showed some properties of the safe strategy. Apart from being safe, the strategy also exhibits rich (non-deterministic) behavior, meaning there are many ways to stay safe. An immediate remark is that even though the distance is safe, it can still be arbitrarily large, i.e. Ego may choose to stay home or even drive backwards without making any progress. To guarantee maximal progress, Ego should follow Front as closely as possible, i.e. progress is maximized when the distance between the cars is minimized. A statistical learning query is used to select particular controllable transitions from the non-deterministic set of safe transitions, with the objective of minimizing the expected cumulative distance D after 100 seconds while driving under the safe strategy:
strategy safeFast = minE (D) [<=100]: <> time >= 100 under safe
The learned result is another strategy, stored under the name safeFast, which can also be checked using statistical queries. For example, here are a number of random runs using the safeFast strategy:
simulate 10 [<=100] rDistance under safeFast
Notice that the distance trajectories no longer diverge. The efficiency of the strategies is contrasted by comparing the probability distributions of the distances, which shows that under the safeFast strategy the mean distance is reduced from 138 to 24:
Pr[rDistance <= maxSensorDistance + 1] (<> time >= 100) under safe
Pr[rDistance <= maxSensorDistance + 1] (<> time >= 100) under safeFast
In fact the new strategy is so effective at following the front car that the range of relative velocities is restricted to the narrow interval [-4,4] (as opposed to [-22,30] under safe), as shown in the histogram of distances over velocities:
The box plot also shows that the mean distances stay low and only rarely diverge beyond 70.
Uppaal Stratego model: cruise.xml