Stochastic and computational methods applied to biology

Mutation frequencies in a birth-death branching process
Supervisor(s): Tibor Antal

A tumour can grow from one cell to billions of cells. Over billions of cell divisions, errors in DNA replication accumulate. So a tumour is genetically diverse. This diversity is important for at least two reasons. First, it is a key factor in resistance to treatment and disease recurrence. Second, genetic data can provide a window into the past evolutionary trajectory of a tumour. On both counts, mathematics has helped our collective understanding. Mathematical models offer precise, quantitative descriptions which, when combined with data, illuminate otherwise obscure genetic processes. This thesis is not so concerned with data. Rather we study some of the most simple and fundamental probabilistic illustrations of a growing cell population (not limited to cancer). Our broad intention is to explore the quantitative relationship between a cell population’s growth and its genetic information.

Link to online thesis

Cancer recurrence times and early detection from branching process models
Supervisor(s): Tibor Antal

Detecting cancer early is recognized as one of the most effective strategies to improve prognosis and chances of survival. However, while screening and diagnostic technologies are advancing, their effectiveness relies on a quantitative understanding of the complex biological processes underlying the development of a tumor. Suitable theoretical frameworks to investigate and describe these processes are often provided by mathematical models, whose results can then be reinterpreted in the context of cancer evolution. In this thesis we consider two mathematical descriptions for the growth of tumors and related cellular populations. The first one leads to estimates for the first time that a metastasis generated by a primary lesion becomes detectable.
The second presents instead an abstract representation for an emerging and promising type of screening tests called liquid biopsies. Quantitative features of these models are also compared with relevant clinical data. Furthermore, both approaches are based on stochastic mathematical tools that are introduced at the beginning of this thesis.

Link to thesis online

Distributions of RNA polymerase and transcript numbers in models of gene expression describing the mRNA life-cycle
Supervisor(s): Ramon Grima, Nikola Popovic

A eukaryotic cell consists of three parts: the cell membrane, the nucleus, and the cytoplasm. The cytoplasm fills the space between the membrane and the nucleus. The DNA, which consists of genes (sections of DNA) and contains the genetic information responsible for the development and function of an organism, is housed inside the nucleus. The nucleus is also where the information stored in a gene is copied into a new molecule of messenger RNA (mRNA); this process of mRNA production is called transcription. After being produced, the mRNA molecule carries the copied message to the ribosomes, which are cellular machines that use the encoded information to synthesise proteins in the cytoplasm. The mechanism of protein production is called translation. This fundamental process, by which cells convert the information encoded in DNA into proteins, is called gene expression. Due to technological advancements, several experimental techniques can be used to measure the number of mRNA and protein molecules in single cells. Experimental data show that measured numbers vary randomly over time and that this variation is different from cell to cell. Consequently, there are random fluctuations in the gene expression process, and they have a profound effect on cellular functions.
In the last few decades, scientists have tried to shed light on the underlying mechanisms of gene expression by using mathematical models that can help extract information from experimental observations. The best-known and most widely used model of mRNA dynamics is the so-called telegraph model. This model consists of three sets of events: (i) the gene switches between active and inactive states, i.e. the transcription process can or cannot begin. The simple biological explanation for this switching is that, for transcription to be initiated, certain protein molecules that participate in this process, called transcription factors, must be present and located near the gene. (ii) When the gene is active, the transcription of the mRNA may begin, and (iii) when an mRNA molecule is produced, it may decay. In other words, in the telegraph model, all these events are modelled as a system of four chemical reactions: gene activation, gene inactivation, mRNA production, and mRNA degradation; all reactions are random. Mathematically speaking, this system of chemical reactions can be described by an equation whose solution is the probability distribution of mRNA molecule numbers; specifically, this distribution gives the probability of finding a certain number of mRNA molecules in a cell at a certain time. The analytical expression of this distribution, paired with experimental data, can be used to estimate the rates at which reactions occur in the model, which may provide us with a better understanding of which processes happen faster or slower than others during gene expression. The telegraph model is a simplified representation of transcription and unfortunately cannot explain several biological observations, because gene expression is a much more complicated process in real life. In this thesis, we present the construction and mathematical analysis of three more biologically realistic models of gene expression.
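As an illustration only, the four reactions of the telegraph model described above can be simulated with the standard Gillespie stochastic simulation algorithm. This is a minimal sketch and not the analytical approach taken in the thesis; the function name `simulate_telegraph` and the rate names `k_on`, `k_off`, `k_tx` and `k_deg` are assumptions chosen for this example.

```python
import random


def simulate_telegraph(k_on, k_off, k_tx, k_deg, t_end, seed=None):
    """Gillespie simulation of the telegraph model (illustrative sketch).

    Rate names are hypothetical labels for this example:
      k_on  : gene activation rate
      k_off : gene inactivation rate
      k_tx  : mRNA production rate while the gene is active
      k_deg : per-molecule mRNA degradation rate
    Starts with an inactive gene and no mRNA; returns the pair
    (gene_on, mrna) describing the state at time t_end.
    """
    rng = random.Random(seed)
    t = 0.0
    gene_on = False
    mrna = 0
    while True:
        # Propensities of the four reactions in the current state.
        rates = [
            0.0 if gene_on else k_on,   # gene activation
            k_off if gene_on else 0.0,  # gene inactivation
            k_tx if gene_on else 0.0,   # mRNA production
            k_deg * mrna,               # mRNA degradation
        ]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its rate.
        r = rng.random() * total
        if r < rates[0]:
            gene_on = True
        elif r < rates[0] + rates[1]:
            gene_on = False
        elif r < rates[0] + rates[1] + rates[2]:
            mrna += 1
        elif mrna > 0:
            mrna -= 1
    return gene_on, mrna


if __name__ == "__main__":
    # Sample the mRNA count at a late time across many simulated cells.
    counts = [simulate_telegraph(1.0, 1.0, 10.0, 1.0, 20.0, seed=i)[1]
              for i in range(1000)]
    print("sample mean mRNA:", sum(counts) / len(counts))
```

Repeating the simulation across many independent runs gives an empirical estimate of the mRNA number distribution at a chosen time. For these illustrative rates, the sample mean should lie close to the known steady-state mean of the telegraph model, (k_tx / k_deg) * k_on / (k_on + k_off).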
The first model of interest in this thesis is a model of gene expression that takes into account the significant role of a polymerase molecule in the process of transcription; polymerase is an enzyme that helps to assemble the mRNA molecule by attaching to and moving along the DNA. This model is an extended version of the telegraph model, with the difference that the gene fluctuates between three different states. There are two permissive states of the gene, on and off (the transcription factor is bound, and gene activity depends on the binding state of the polymerase molecule), and one non-permissive state of the gene (neither transcription factor nor polymerase is bound). This change in gene states is not commonly modelled, but our work shows the importance of including this biological detail in our model.

The second model that we study focuses on the complex process of mRNA transcription. Some well-known biochemical steps of transcription are not often modelled in detail, and here we develop a model that takes these steps into account, e.g. transcriptional initiation, polymerase movement along DNA, polymerase detachment from DNA, polymerase pausing on DNA, and transcription termination. By performing mathematical analysis, we try to understand how the common telegraph model emerges from this more complex model, while also showing how fluctuations in the number of RNA molecules depend on model parameters.

Our third model considers oscillatory signal-dependent dynamics that affect the mRNA life cycle. It is well known that internal and external signals affect cell fate; hence, this model can provide us with some insights into the underlying mechanism of how gene expression responds to stimuli. We show that, depending on the frequency of signal oscillations and the measurement time, the signal can increase or decrease the fluctuations in the mRNA number in the cytoplasm compared to those in the nucleus.

Link to thesis online

This article was published on 2025-04-22