# Research

I investigate formal methods for the modeling and analysis of **hybrid and probabilistic systems** in the areas of **medical cyber-physical systems (CPSs)**, **systems and synthetic biology**, and **system design**. I am particularly interested in developing and applying techniques for the **automated verification, control and synthesis** of such systems, a task made challenging by quantitative aspects such as stochastic and continuous nonlinear dynamics, as well as by the hybrid dynamics that naturally arise in CPSs.

In my current and past research, I have worked on several case studies where formal correctness guarantees and correct-by-design model and controller synthesis are critical, including the **artificial pancreas** for closed-loop diabetes therapy, **implantable cardiac devices** for the treatment of arrhythmias, and **biological networks** for disease prediction and the engineering of molecular devices.

##### Data-driven robust control for insulin therapy

The artificial pancreas aims to automate the treatment of type 1 diabetes (T1D) by connecting an insulin pump and a glucose sensor through control algorithms. However, **fully closed-loop therapy is challenging** because the blood glucose levels to be controlled depend on disturbances related to patient behavior, mainly **meals and physical activity.**

To handle meal and exercise uncertainties, we construct **data-driven models of meal and exercise behavior** and develop a **robust model-predictive control (MPC)** system able to reject such uncertainties, thereby eliminating the need for meal announcements by the patient. The data-driven models, called uncertainty sets, are built from data so that they cover the underlying (unknown) distribution with prescribed **probabilistic guarantees**. Our robust MPC system then computes the insulin therapy that minimizes the worst-case performance with respect to these uncertainty sets, providing a principled way to deal with uncertainty. State estimation follows a similar principle and exploits a prediction model to find the most likely state and disturbance estimate given the observations.
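
As a rough illustration of the min-max structure of robust MPC (not our actual controller), consider a toy linear glucose model and a finite uncertainty set of sampled meal scenarios; all dynamics, units and numbers below are hypothetical:

```python
import numpy as np

def worst_case_cost(u_seq, scenarios, g0=120.0, target=110.0, b=2.0):
    """Worst-case sum of squared deviations from the glucose target
    over all meal-disturbance scenarios in the uncertainty set."""
    worst = -np.inf
    for d_seq in scenarios:
        g, cost = g0, 0.0
        for u, d in zip(u_seq, d_seq):
            g = g + d - b * u          # toy linear glucose model
            cost += (g - target) ** 2
        worst = max(worst, cost)
    return worst

def robust_mpc_step(scenarios, horizon=3, candidates=np.linspace(0, 10, 21)):
    """Pick the constant insulin rate minimizing the worst-case cost
    (a crude stand-in for solving the min-max program exactly)."""
    best_u, best_cost = None, np.inf
    for u in candidates:
        c = worst_case_cost([u] * horizon, scenarios)
        if c < best_cost:
            best_u, best_cost = u, c
    return best_u

# Uncertainty set: sampled meal disturbances (glucose rise per step)
scenarios = [[30, 10, 0], [20, 20, 5], [0, 40, 10]]
u = robust_mpc_step(scenarios)
```

The real controller solves the min-max program over full insulin sequences and data-driven uncertainty sets; the grid search here only conveys the worst-case objective.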

We evaluate our design on synthetic scenarios, including high-carbohydrate intake and unexpected meal delays, and on large clusters of virtual patients learned from population-wide survey data sets (CDC NHANES).

##### SMT-based synthesis of safe and robust PID controllers

In this work, we present a new method for the **automated synthesis of PID controllers with safety and performance guarantees for hybrid systems with stochastic and nonlinear dynamics**.
Synthesized controllers are robust by design since they minimize the probability of reaching an unsafe state under random disturbances.

It is well known that safety verification is a difficult problem (undecidable) for nonlinear hybrid systems, hence it must be solved using approximation methods. We build on the frameworks of delta-satisfiability (implemented in tools like dReal and iSAT) and probabilistic delta-reachability (ProbReach tool) to reason formally about nonlinear and stochastic dynamics by providing solutions with numerical guarantees up to an arbitrary precision.
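
The flavour of delta-decidability can be conveyed by a minimal interval branch-and-prune check, a toy stand-in for what dReal and iSAT do at scale, here deciding whether f(x) = x² − 2 has a zero in a box up to precision delta:

```python
def isqr(lo, hi):
    """Interval square [lo, hi]^2."""
    cands = [lo * lo, hi * hi]
    return (0.0 if lo <= 0.0 <= hi else min(cands), max(cands))

def f_bounds(lo, hi):
    """Interval enclosure of f(x) = x^2 - 2 on [lo, hi]."""
    slo, shi = isqr(lo, hi)
    return slo - 2.0, shi - 2.0

def delta_sat(lo, hi, delta=1e-3):
    """Branch-and-prune check of 'f(x) = 0 has a solution in [lo, hi]',
    answering unsat, or delta-sat (|f(x)| <= delta somewhere)."""
    flo, fhi = f_bounds(lo, hi)
    if flo > delta or fhi < -delta:      # 0 provably outside the enclosure
        return False
    if hi - lo < delta:                  # box small enough: report delta-sat
        return True
    mid = 0.5 * (lo + hi)
    return delta_sat(lo, mid, delta) or delta_sat(mid, hi, delta)
```

A delta-sat answer may be spurious within precision delta, but an unsat answer is a numerical guarantee, which is exactly the asymmetry these solvers exploit.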

We evaluate our approach by synthesizing provably safe and robust controllers for the artificial pancreas case study, where we manage to avoid hypoglycemia (low blood sugar) and reject large random meal disturbances.

##### Robust design synthesis for probabilistic systems

We consider the problem of synthesizing **optimal and robust designs** (given as parametric Markov chains) that 1) are robust to pre-specified levels of variation in the system parameters; 2) satisfy strict performance, reliability and other quality constraints; and 3) are Pareto-optimal with respect to a set of quality optimisation criteria.

The resulting Pareto front consists of quality-attribute regions induced by the parametric designs (see picture). The size and shape of these regions provide key insights into system robustness, since they capture the sensitivity of the quality attributes to parameter changes (i.e., small regions = robust designs).
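
One plausible way to make "smaller region = more robust" concrete (not the exact dominance relation of the tool) is to score each design's quality-attribute interval by its midpoint penalised by its width, and take the Pareto front under the resulting dominance:

```python
def dominates(a, b, alpha=0.5):
    """a, b: lists of (lo, hi) intervals, one per quality attribute
    (lower is better). Score = midpoint + alpha * width, so designs
    with smaller (more robust) regions are preferred at equal quality.
    a dominates b if it is no worse on every attribute and strictly
    better on at least one."""
    def score(iv):
        lo, hi = iv
        return 0.5 * (lo + hi) + alpha * (hi - lo)
    sa, sb = [score(x) for x in a], [score(x) for x in b]
    return all(x <= y for x, y in zip(sa, sb)) and any(x < y for x, y in zip(sa, sb))

def pareto_front(designs):
    """Keep the designs not dominated by any other design."""
    return [d for d in designs
            if not any(dominates(e, d) for e in designs if e is not d)]

# Three single-attribute designs: tight, wide, and shifted regions
designs = [[(1.0, 2.0)], [(1.0, 5.0)], [(4.0, 6.0)]]
front = pareto_front(designs)
```

Here the tight region [(1, 2)] dominates the wide one [(1, 5)] despite sharing its best-case value, illustrating the sensitivity-aware bias toward robust designs.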

The method is implemented in the RODES tool and is based on a combination of **GPU-accelerated parameter synthesis for Markov chains** (to compute the quality attribute regions for fixed discrete parameters) through the PRISM-PSY tool, and **multi-objective optimization based on an extended, sensitivity-aware dominance relation**.

##### Optimal synthesis of stochastic chemical reaction networks

**Period:** 2016-2017 | **Collaborators:** Max Whitby, Luca Laurenti, Luca Cardelli, Milan Ceska, Marta Kwiatkowska, Martin Franzle

The automatic derivation of Chemical Reaction Networks (CRNs) with prescribed behavior is one of the holy grails of synthetic biology, enabling design automation of molecular devices and the construction of predictive biochemical models.

In this work, we provide the **first method for optimal syntax-guided synthesis of stochastic CRNs**, able to derive not just rate parameters but also the structure of the network. Borrowing from the programming-language community, we propose a **sketching language** for CRNs that allows the network to be specified as a partial program, using holes and variables to capture unknown components. Under the Linear Noise Approximation of the chemical master equation, a CRN sketch has a semantics in terms of parametric Ordinary Differential Equations (ODEs).

We support rich correctness properties that describe the temporal profile of the network as **constraints over the mean and variance of chemical species, and their higher-order derivatives**. In this way, we can synthesize networks where, e.g., one of the species exhibits a bell-shaped profile or has variance greater than its expectation.
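
For intuition, the LNA moment equations of the simplest production-degradation network can be integrated directly, and checking a constraint such as "steady-state variance equals the mean" amounts to inspecting the ODE solution (a toy example, not our synthesis procedure):

```python
import numpy as np

def lna_birth_death(k_prod, k_deg, m0=0.0, v0=0.0, T=20.0, dt=0.01):
    """LNA moment ODEs for the CRN  0 ->(k_prod) X,  X ->(k_deg) 0:
         dm/dt = k_prod - k_deg * m
         dv/dt = -2 * k_deg * v + k_prod + k_deg * m
    Integrated with forward Euler; returns the mean and variance
    trajectories of species X."""
    steps = int(T / dt)
    m, v = m0, v0
    ms, vs = [m], [v]
    for _ in range(steps):
        dm = k_prod - k_deg * m
        dv = -2.0 * k_deg * v + k_prod + k_deg * m
        m, v = m + dt * dm, v + dt * dv
        ms.append(m)
        vs.append(v)
    return np.array(ms), np.array(vs)

# This candidate satisfies the constraint "steady-state variance
# equals the steady-state mean" (Poisson-like noise):
ms, vs = lna_birth_death(k_prod=5.0, k_deg=1.0)
```

A synthesizer would treat k_prod and k_deg as unknowns of the sketch and ask an ODE-aware SMT solver whether values exist making such constraints hold.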

We synthesize CRNs that satisfy the sketch constraints and a correctness specification while minimizing a given cost function (capturing the structural complexity of the network). The optimal synthesis algorithm employs **SMT solvers over reals and ODEs** (iSAT) and a meta-sketching abstraction that speeds up the search through cost constraints.

We evaluate the approach on three key problems: synthesis of networks with a bell-shaped profile (occurring in signaling cascades), a CRN implementation of a stochastic process with prescribed noise levels, and synthesis of sigmoidal profiles in the phosphorelay network.

##### Attacking ECG biometrics

**Period:** 2016-2017 | **Collaborators:** Simon Eberz, Andrea Patane, Ivan Martinovic, Marta Kwiatkowska, Marc Roeschlin

With the increasing popularity of mobile and wearable devices, biometric recognition has become ubiquitous. Unlike passwords, which rely on the user knowing something, biometrics make use of either distinctive physiological properties or behavior.

In this work, we study a **systematic attack against ECG biometrics**, i.e., authentication schemes that leverage the electrocardiogram signal of the wearer, and evaluate the attack on a commercial device, the Nymi Band.
We instantiated the attacks using different techniques: a hardware-based waveform generator, a computer sound card, and the playback of ECG signals encoded as .wav files using an off-the-shelf audio player.

We collected training data from 40+ participants using a variety of ECG monitors, including a medical monitor, a smartphone-based mobile monitor and the Nymi Band. We then enrolled users into the Nymi Band and tested whether data from any of the above ECG sources can be used for a signal-injection attack. Using data collected directly on the Nymi Band, we achieve a success rate of 81%, which decreases to 43% with data from other devices.

To improve the success rate with data from other devices, available to the attacker through e.g. the medical records of the victim, we devise a method for **learning optimal mappings between devices** (using training data), i.e., functions that transform the ECG signal of one device as if it were produced by another device. Thanks to this method, we achieved a **62% success rate with data not produced by the Nymi Band.**
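
The idea of learning a cross-device mapping can be sketched as a least-squares fit of an amplitude-distortion function on paired signals; the polynomial form and the toy "devices" below are illustrative, not the mappings used in the paper:

```python
import numpy as np

def learn_mapping(src, dst, order=3):
    """Fit a polynomial amplitude mapping f so that f(src) ~ dst,
    by least squares on paired training samples from two devices."""
    A = np.vander(src, order + 1)          # columns [x^3, x^2, x, 1]
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return lambda x: np.vander(np.atleast_1d(x), order + 1) @ coeffs

# Toy setup: device B reads a distorted version of device A's signal
rng = np.random.default_rng(0)
a = rng.uniform(-1, 1, 200)                # samples from device A
b = 0.8 * a**3 + 1.5 * a + 0.1             # unknown distortion to recover
f = learn_mapping(a, b)                    # learned A -> B mapping
```

Once learned, `f` is applied to signals from the attacker's source device before injection, so they resemble signals the target device would have produced.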

##### Closed-loop quantitative verification of rate-adaptive pacemakers

**Period:** 2014-2016 | **Collaborators:** Marta Kwiatkowska, Andrea Patane, Alexandru Mereacre, Harriet Lea-Banks

Rate-adaptive (RA) pacemakers adjust the pacing rate depending on the patient's level of activity, detected by processing physiological signal data. RA pacemakers are the only choice when the heart rate cannot naturally adapt to increasing demand (e.g. AV block). RA parameters depend on many patient-specific factors, and effective personalisation of such treatments can only be achieved through extensive exercise testing, which is normally intolerable for a cardiac patient.

We introduce a **data-driven and model-based approach for the quantitative verification of RA pacemakers and formal analysis of personalised treatments**. We developed a **novel dual-sensor pacemaker model** where the adaptive rate is computed by blending information from an **accelerometer**, and a metabolic sensor based on the **QT interval** (a feature of the ECG). The approach builds on the HeartVerify tool to provide statistical model checking of the probabilistic heart-pacemaker models, and supports model personalization from ECG data (see heart model page). Closed-loop analysis is achieved through the **online generation of synthetic, model-based QT intervals and acceleration signals**.

We further extend the model personalization method to estimate parameters from a patient population, thus enabling safety verification of the device. We evaluate the approach on three subjects and a pool of virtual patients, providing rigorous, quantitative insights into the closed-loop behaviour of the device under different exercise levels and heart conditions.

##### Precise parameter synthesis for continuous-time Markov chains

**Period:** 2014-2016 | **Collaborators:** Milan Ceska, Marta Kwiatkowska, Frits Dannenberg, Peter Pilar, Lubos Brim

Given a parametric continuous-time Markov chain (pCTMC), we aim to find parameter values such that a CSL property is guaranteed to hold (**threshold synthesis**) or, in the case of quantitative properties, such that the probability of satisfying the property is maximised or minimised (**max synthesis**).

The solution of the threshold synthesis problem (see picture) is a decomposition of the parameter space into true and false regions that are guaranteed to, respectively, satisfy and violate the property, plus an arbitrarily small undecided region. In the max synthesis problem, by contrast, we identify an arbitrarily small region guaranteed to contain the actual maximum/minimum.
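
The decomposition can be sketched as a refinement loop over parameter boxes, assuming an oracle for guaranteed probability bounds; in this toy example the satisfaction probability is simply p(θ) = θ, so the bounds are exact:

```python
def threshold_synthesis(p_bounds, box, threshold, eps=1e-2):
    """Decompose a 1-D parameter box into true/false regions for the
    query 'p(theta) >= threshold', leaving undecided boxes narrower
    than eps. p_bounds(lo, hi) must return guaranteed (min, max)
    bounds of p over [lo, hi]."""
    true_rs, false_rs, undecided = [], [], []
    stack = [box]
    while stack:
        lo, hi = stack.pop()
        pmin, pmax = p_bounds(lo, hi)
        if pmin >= threshold:
            true_rs.append((lo, hi))       # guaranteed to satisfy
        elif pmax < threshold:
            false_rs.append((lo, hi))      # guaranteed to violate
        elif hi - lo < eps:
            undecided.append((lo, hi))     # give up below precision eps
        else:
            mid = 0.5 * (lo + hi)
            stack += [(lo, mid), (mid, hi)]
    return true_rs, false_rs, undecided

# Toy reachability probability p(theta) = theta (monotone, exact bounds)
t, f, u = threshold_synthesis(lambda lo, hi: (lo, hi), (0.0, 1.0), 0.5)
```

In the actual method the bounds come from a parametric extension of probabilistic model checking over multi-dimensional boxes, and the refinement is GPU-accelerated.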

We developed **synthesis methods based on a parametric extension of probabilistic model checking** to compute probability bounds, combined with refinement and sampling of the parameter space. We implemented **GPU-accelerated algorithms** for synthesis in the PRISM-PSY tool, which we applied to a variety of biological and engineered systems, including models of molecular walkers, the mammalian cell cycle, and the Google file system.

##### HeartVerify: Model-Based Quantitative Verification of Cardiac Pacemakers

**Period:** 2015-2017 | **Collaborators:** Alexandru Mereacre, Marta Kwiatkowska, Andrea Patane, Benoit Barbot

HeartVerify is a framework for the **analysis and verification of pacemaker software and personalised heart models**. Models are specified in MATLAB Stateflow and are analysed using the Cosmos tool for statistical model checking, thus enabling the analysis of complex nonlinear dynamics of the heart, where precise (numerical) verification methods typically fail.
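
At its core, statistical model checking estimates a satisfaction probability by Monte Carlo simulation, with the number of runs chosen from the Chernoff-Hoeffding bound; a minimal sketch, independent of Cosmos and with a made-up property:

```python
import math
import random

def smc_estimate(simulate, eps=0.05, delta=0.01, seed=0):
    """Statistical model checking: Monte Carlo estimate of P(property),
    accurate to additive error eps with confidence 1 - delta, using the
    Chernoff-Hoeffding sample size n >= ln(2/delta) / (2 * eps^2).
    simulate(rng) runs one trace and returns True iff it satisfies
    the property."""
    n = math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))
    rng = random.Random(seed)
    hits = sum(simulate(rng) for _ in range(n))
    return hits / n, n

# Toy stochastic model: the property holds on 30% of random runs
p_hat, n = smc_estimate(lambda rng: rng.random() < 0.3)
```

This is why SMC scales to the nonlinear heart dynamics where numerical probabilistic model checking fails: each run only requires simulating the Stateflow model, at the price of statistical rather than exact guarantees.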

The approach is modular in that it allows configuring and testing different heart and pacemaker models in a plug-and-play fashion, without changing their communication interface or the verification engine. It supports the verification of probabilistic properties, together with additional analyses such as simulation, generation of probability density plots (see figure) and parametric analyses. HeartVerify comes with different heart and pacemaker models, including methods for **model personalization from ECG data** and rate-adaptive pacing.

##### Hardware-in-the-loop energy optimization

**Period:** 2015-2016 | **Collaborators:** Marta Kwiatkowska, Alexandru Mereacre, Benoit Barbot, Andrea Patane, Chris Barker

Energy efficiency of cardiac pacemakers is crucial because it reduces the frequency of device re-implantations, thus improving the quality of life of cardiac patients.

To achieve effective energy optimisation, we take a **hardware-in-the-loop (HIL) simulation approach**: the pacemaker model is encoded into hardware and interacts with a computer simulation of the heart model. In addition, a power monitor is attached to the pacemaker to provide real-time power consumption measurements. The approach is model-based and supports generic control systems. Controller (e.g., pacemaker) and plant (e.g., heart) models are specified as networks of parameterised timed input/output automata (using MATLAB Stateflow) and translated into executable code.

We realise a **fully automated optimisation workflow**, where HIL simulation is used to build a probabilistic power-consumption model from power measurement data. The obtained model is used by the optimisation algorithm to find optimal pacemaker parameters that, e.g., minimise consumption or maximise battery lifetime. We additionally employ parameter synthesis methods to restrict the search to safe parameters, and use timed Petri nets as an intermediate representation of the executable specification, which facilitates efficient code generation and fast simulations.

##### Probabilistic timed modelling of cardiac dynamics and personalization from ECG

**Period:** 2015-2017 | **Collaborators:** Marta Kwiatkowska, Benoit Barbot, Alexandru Mereacre, Andrea Patane

We developed a timed automata translation of the IDHP (Integrated Dual-chamber Heart and Pacer) model by Lian et al. Timed components realize the conduction delays between functional components of the heart and action potential propagation between nodes is implemented through synchronisation between the involved components. In this way, the model can be easily extended with other accessory conduction pathways in order to reproduce specific heart conditions.

The IDHP model can also be **parametrised from the patient's electrocardiogram (ECG)** in order to reproduce the specific physiological characteristics of the individual. For this purpose, we extended the model to generate synthetic ECG signals that reflect the heart events occurring at simulation time. Starting from raw ECG recordings, our method automatically finds model parameters such that the synthetic signal best mimics the input ECG, by minimising the statistical distance between the two. The resulting parameters correspond to probabilistic delays, reflecting the variability of ECG features.
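
The parameter-fitting step can be sketched as minimising a statistical distance between synthetic and observed features over a grid of candidate delay parameters; the Gaussian delay model and all numbers below are hypothetical:

```python
import numpy as np

def fit_delay(observed_rr, candidates):
    """Pick the model delay parameter whose synthetic RR intervals best
    match the observed ones; the distance compares mean and standard
    deviation of the two samples."""
    rng = np.random.default_rng(1)
    obs_stats = np.array([observed_rr.mean(), observed_rr.std()])
    best, best_d = None, np.inf
    for mu in candidates:
        synth = rng.normal(mu, 0.05, size=1000)   # model: Gaussian delay
        d = np.abs(np.array([synth.mean(), synth.std()]) - obs_stats).sum()
        if d < best_d:
            best, best_d = mu, d
    return best

# 'Recorded' RR intervals around 0.8 s (roughly 75 bpm)
observed = np.random.default_rng(2).normal(0.8, 0.05, size=500)
mu_hat = fit_delay(observed, candidates=np.linspace(0.5, 1.2, 71))
```

The real method fits probabilistic delays of the timed-automata model against several ECG features at once, but the structure (simulate, compare distributions, pick the closest parameters) is the same.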

The IDHP heart model can be downloaded from the tool page, while personalization from ECG data is implemented in the HeartVerify tool.

##### SMT-based synthesis of gene regulatory networks

**Period:** 2013-2014 | **Collaborators:** Hillel Kugler, Boyan Yordanov, Christoph Wintersteiger, Youssef Hamadi

Unraveling the structure and logic of gene regulation is crucial to understanding how organisms develop, as is the derivation of predictive models able to reproduce experimental observations and explain unknown biological interactions.

This research centers on the **synthesis of biological programs**, with specific focus on Boolean gene regulatory networks (GRN). We developed methods based on the idea of **synthesis by sketching**, which enables explicit modeling of hypotheses and unknown information, specified as e.g. choices or uninterpreted functions. Through the formalization as an SMT problem, the method can automatically and efficiently resolve the unknown information in order to obtain a model that is consistent with observations.
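
At a miniature scale, resolving the unknowns of a Boolean GRN sketch amounts to finding update functions (truth tables) consistent with the observations, which an SMT solver does symbolically; the brute-force version below conveys the idea on a made-up two-regulator gene:

```python
from itertools import product

def synthesize_grn(observations, n_inputs=2):
    """Enumerate all Boolean update functions of n_inputs regulators
    (each encoded by its truth table) and return those consistent with
    every observed (inputs -> next_state) pair; the free truth-table
    rows play the role of the sketch's 'holes'."""
    rows = list(product([0, 1], repeat=n_inputs))
    consistent = []
    for table in product([0, 1], repeat=2 ** n_inputs):
        f = dict(zip(rows, table))
        if all(f[ins] == out for ins, out in observations):
            consistent.append(f)
    return consistent

# Observations for a gene with an activator (first input) and a
# repressor (second input): active iff activated and not repressed
obs = [((1, 0), 1), ((0, 0), 0), ((1, 1), 0)]
models = synthesize_grn(obs)
```

The unconstrained input combination (0, 1) is left open by the data, so two consistent models remain; an SMT encoding scales this search to networks of many genes and to spatial and perturbation data.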

We applied this approach to synthesize the **first GRN model of the sea urchin embryo** (an important model organism in biology) that precisely and fully reproduces the available temporal and spatial expression data and the effects of perturbation experiments. The data we used are the result of 30+ years of research in the Davidson lab at Caltech.

##### Network analysis for bioaccumulation and bioremediation in contaminated food webs

**Period:** 2012-2015 | **Collaborators:** Marianna Taffi, Pietro Lio, Claudio Angione, Mauro Marini, Sandra Pucciarelli

In this project we develop computational methods and models to study **pollution dynamics in ecological networks** and to shed light on three key questions: *How are persistent organic pollutants (e.g. PCBs, which bind to fat tissue) transferred in food webs through feeding connections? Which species play a key role in pollutant distribution? How can effective bioremediation strategies mediated by pollutant-degrading bacteria be synthesized?*

We present a computational framework to 1) reconstruct bioaccumulation networks from (partial) data; 2) identify key species in contamination dynamics through a new network index based on sensitivity analysis; and 3) analyze the multiscale effects of microbial bioremediation on species-level contamination by integrating metabolic networks of biodegrading bacteria and bioaccumulation networks.

We consider the case of **PCBs bioaccumulation in the Adriatic food web and aerobic PCBs bioremediation** by a strain of *P. putida* (see the Tools page to download the models).

##### Parameter synthesis for pacemaker design optimization

Verification is useful in establishing key correctness properties of cardiac pacemakers, but has limitations, in that it is not clear how to redesign the model if it fails to satisfy a given property. Instead, parameter synthesis aims to automatically find optimal values of parameters to guarantee that a given property is satisfied.

In this project, we develop methods to **synthesize pacemaker parameters that are safe, robust and, at the same time, able to optimise a given quantitative requirement** such as energy consumption or clinical indicators. We solve this problem by **combining symbolic methods (SMT solving)** for ruling out parameters that violate heart safety (red areas in the figure) **with evolutionary strategies** for optimising the quantitative requirement.
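
The division of labour can be sketched as a safety predicate (standing in for the SMT check) guarding a (1+1)-evolution strategy; the safety interval and the energy model below are invented for illustration:

```python
import random

def is_safe(p):
    """Stand-in for the SMT check: pacemaker parameters violating
    heart safety (here, a pacing delay outside [0.1, 0.4] s, a made-up
    interval) are ruled out."""
    return 0.1 <= p <= 0.4

def energy(p):
    """Toy energy model: consumption grows as the delay shrinks."""
    return 1.0 / p

def es_optimize(iters=200, sigma=0.05, seed=0):
    """(1+1)-evolution strategy restricted to the safe region:
    mutate the parameter, accept only safe improvements."""
    rng = random.Random(seed)
    p = 0.2                                  # safe initial guess
    for _ in range(iters):
        q = p + rng.gauss(0.0, sigma)
        if is_safe(q) and energy(q) < energy(p):
            p = q                            # accept safe improvement
    return p

p_star = es_optimize()
```

In the actual workflow the safe set is characterised symbolically once by the SMT solver, so the evolutionary search never wastes evaluations on parameters that could endanger the patient.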

##### Formal analysis of bone remodelling

**Period:** 2011-2013 | **Collaborators:** Emanuela Merelli, Pietro Lio, Marco Viceconti, Ezio Bartocci, Mohammad Ali Moni

In this project we study bone remodelling, i.e., the biological process of bone renewal, through the use of computational methods and formal analysis techniques. Bone remodelling is a paradigm for many homeostatic processes in the human body and consists of two highly coordinated phases: resorption of old bone by cells called osteoclasts, and formation of new bone by osteoblasts. Specifically, we aim to understand how diseases caused by the weakening of the bone tissue arise as the resulting process of multiscale effects linking the molecular signalling level (RANK/RANKL/OPG pathway) and the cellular level.

To address crucial spatial aspects such as cell localisation and molecular gradients, we developed a **modelling framework based on spatial process algebras and a stochastic agent-based semantics**, realised in the Repast Simphony tool. This resulted in the **first agent-based model and tool for bone remodelling**, see also the Tools page.

In addition, we explored probabilistic model checking methods to precisely assess the probability of a given bone disease to occur, and also hybrid approximations to tame the non-linear dynamics of the system.

##### A multi-level model for self-adaptive systems

Self-adaptive systems are able to modify their own behaviour according to their environment and their current configuration, in order to fulfil an objective, to better respond to problems, or to maintain desired conditions.

In this project, we introduce a **hierarchical approach to formal modelling and analysis of self-adaptive systems**. It is based on a structural level *S*, describing the adaptation dynamics of the system, and a behavioural level *B* describing the admissible dynamics of the system. The *S* level imposes structural constraints on the *B* level, which has to adapt whenever it no longer can satisfy them (top-down adaptation).

We introduce **weak and strong adaptability relations**, capturing the ability of a system to adapt along some evolution path or along all possible evolutions, respectively, and show that **adaptability checking**, i.e. deciding whether a system is weakly or strongly adaptable, **can be reduced to a CTL model checking problem.**
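
On a finite transition system, the "some path" flavour corresponds to an EF-style reachability check and the "all paths" flavour to an AF check; a minimal fixpoint sketch on a toy system (the states and transitions are invented for illustration):

```python
def ef(states, trans, goal):
    """States satisfying EF goal: backward reachability fixpoint."""
    sat = set(goal)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s not in sat and any(t in sat for t in trans.get(s, [])):
                sat.add(s)
                changed = True
    return sat

def af(states, trans, goal):
    """States satisfying AF goal: least fixpoint of 'in goal, or every
    successor already satisfies AF goal' (transitions assumed total)."""
    sat = set(goal)
    changed = True
    while changed:
        changed = False
        for s in states:
            succs = trans.get(s, [])
            if s not in sat and succs and all(t in sat for t in succs):
                sat.add(s)
                changed = True
    return sat

# From 'a' adaptation may reach the constraint-satisfying state 'ok',
# but the lasso through 'b' can avoid it forever.
states = ['a', 'b', 'ok']
trans = {'a': ['b', 'ok'], 'b': ['b'], 'ok': ['ok']}
weak = 'a' in ef(states, trans, {'ok'})      # weakly adaptable
strong = 'a' in af(states, trans, {'ok'})    # not strongly adaptable
```

The same system is thus weakly but not strongly adaptable, which is exactly the distinction the two relations are meant to capture.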