RiskPaths¶

RiskPaths is a well-known MODGEN model that is primarily used for teaching purposes and described here[5] in terms of the model itself and here in terms of implementation[6]. It models fertility in soviet-era eastern Europe, examining fertility as a function of time and union state. In the model, a woman can enter a maximum of two unions in her lifetime. The first union is divided into two sections: a (deterministic) 3 year period during which fertility is at a maximum, followed by a (stochastic) period with lower fertility.

riskpaths

Counts of transitions by age: first pregnancy (purple), beginning of first union (blue), end of first union (ochre), start of second union (green), end of second union (red).

Note: the flat mortality rate used in this model skews mortality events towards younger ages.

Examples Source Code

The code for all the examples can be obtained by either:

pulling the docker image virgesmith/neworder:latest (more here), or
installing neworder, downloading the source code archive from the neworder releases page and extracting the examples subdirectory.

Inputs¶

The input data is basically refomatted versions of the original inputs (commented) from the MODGEN example:

import numpy as np
from enum import Enum

# classification UNION_STATE
class UnionState(Enum):
  NEVER_IN_UNION = 0
  FIRST_UNION_PERIOD1 = 1
  FIRST_UNION_PERIOD2 = 2
  AFTER_FIRST_UNION = 3
  SECOND_UNION = 4
  AFTER_SECOND_UNION = 5

class Parity(Enum):
  CHILDLESS = 0
  PREGNANT = 1


def partition(start: float, finish: float, step: float=1.) -> np.ndarray[np.float64, np.dtype[np.float64]]:
  """ Helper function to return an inclusive equal-spaced range, i.e. finish will be the last element """
  # ensure finish is always included
  return np.append(np.arange(start, finish, step), finish)

# Dynamics parameters

# Age of Consent at which the fertility rates begin
min_age = 15.0
max_age = 100.0

# //LABEL(RiskPaths) RiskPaths
# parameters {
#   logical CanDie = FALSE; // Switch to turn mortality on

#    // Age-specific death probabilities
#   double  ProbMort[LIFE] = {
#       (100) .01, 1,
#   };

# flat mortality up to age 100 in 1y intervals
mortality_rate = np.full(int(max_age), 0.01)
mortality_rate[-1] = 1.0
mortality_delta_t = 1.0

# fertility rates given in 2.5y chunks from 15 to 40 incl
fertility_delta_t = 2.5
AgeintState = partition(min_age, 40.0, fertility_delta_t)

#    // Age baseline for first pregnancy
#   double  AgeBaselinePreg1[AGEINT_STATE] = {
#       0, 0.2869, 0.7591, 0.8458, 0.8167, 0.6727, 0.5105, 0.4882, 0.2562, 0.2597, 0.1542, 0,
#   };

#  f(AgeintState)
p_preg = np.array([0.0, 0.2869, 0.7591, 0.8458, 0.8167, 0.6727, 0.5105, 0.4882, 0.2562, 0.2597, 0.1542, 0.0])

#    // Age baseline for 1st union formation
#   double  AgeBaselineForm1[AGEINT_STATE] = {
#       0, 0.030898, 0.134066, 0.167197, 0.165551, 0.147390, 0.108470, 0.080378, 0.033944, 0.045454, 0.040038, 0,
#   };

p_u1f = np.array([0.0, 0.030898, 0.134066, 0.167197, 0.165551, 0.147390, 0.108470, 0.080378, 0.033944, 0.045454, 0.040038, 0.0])

# union1 lasts at least 3 years
min_u1 = 3.0

#    // Relative risks of union status on first pregnancy
#   double  UnionStatusPreg1[UNION_STATE] = {
#       0.0648, 1.0000, 0.2523, 0.0648, 0.8048, 0.0648,
#   };
# See UnionState
#                     sgl      u1a     u1b     sgl     u2      sgl
r_preg = np.array([0.0648, 1.0000, 0.2523, 0.0648, 0.8048, 0.0648])

# taking this to mean union2 formation (=length of post-union1 single period)
#    // Separation Duration Baseline of 2nd Formation
#   double  SeparationDurationBaseline[DISSOLUTION_DURATION] = {
#       0.1995702, 0.1353028, 0.1099149, 0.0261186, 0.0456905,
#   };


# UnionDuration = [1, 3, 5, 9, 13]
# currently need to modify above to have equal spacing
union_delta_t = 2.0
#                         1          3          5          7          9         11         13
r_u2f = np.array([0.1995702, 0.1353028, 0.1099149, 0.1099149, 0.0261186, 0.0261186, 0.0456905])

# Something wrong here: more data than dims
#    // Union Duration Baseline of Dissolution
#   double  UnionDurationBaseline[UNION_ORDER][UNION_DURATION] = {
#       0.0096017, (2) 0.0199994, 0.0213172, 0.0150836, 0.0110791,
#       (2) 0.0370541, (2) 0.012775, (2) 0.0661157,
#   };
# };

#                         1           3          5          7          9         11           13
r_diss2 = np.array([[0.0096017, 0.0199994, 0.0199994, 0.0199994, 0.0213172, 0.0150836, 0.0110791],
                   [0.0370541, 0.0370541, 0.012775,   0.012775,   0.012775, 0.0661157, 0.0661157]])

[file: ./examples/riskpaths/data.py]

Implementation¶

The model implementation is in continuous time, unlike the original MODGEN implementation. Firstly, age at death is sampled for the population, then the time(s) of the union transitions. This former is done in the model constructor:

class RiskPaths(neworder.Model):
  def __init__(self, n: int):

    super().__init__(neworder.NoTimeline(), neworder.MonteCarlo.deterministic_identical_stream)

    # initialise population - time of death only
    self.population = pd.DataFrame(index=neworder.df.unique_index(n),
                                   data={"TimeOfDeath": self.mc.first_arrival(data.mortality_rate, data.mortality_delta_t, n, 0.0),
                                         "TimeOfPregnancy": neworder.time.NEVER,
                                         "Parity": Parity.CHILDLESS,
                                         "Unions": 0,
                                        })

[file: ./examples/riskpaths/riskpaths.py]

As there are no branching transitions, the times of the events (should they occur) can be sampled directly. The possible transitions, all of which have an impact on the fertility rate, are:

enter first union, a fixed length "honeymoon" period during which fertility is highest
enter second phase of first union
leave first union
enter second union
leave second union

Each of these are sampled as an open-ended (i.e. might not happen) nonhomogeneous Poisson process, and events that happen after the individual's sampled age at death are discarded.

Once an individual's union history is known, birth events can be sampled (births have no impact on union status in this model). Thus the step method samples union state and then pregnancy (code not show here for brevity):

  def step(self) -> None:

    # first sample union state transitions, which influence pregnancy
    self.__union()
    # now sample state-dependent pregnancies
    self.__pregnancy()

[file: ./examples/riskpaths/riskpaths.py]

Output¶

Once the end of the timeline has been reached, the finalise method:

  def finalise(self) -> None:
    neworder.log("mean unions = %f" % np.mean(self.population.Unions))
    neworder.log("pregnancy ratio = %f" % np.mean(self.population.Parity == Parity.PREGNANT))

[file: ./examples/riskpaths/riskpaths.py]

simply prints a couple of summary statistics:

[py 0/1]  mean unions = 0.923190
[py 0/1]  pregnancy ratio = 0.467840

The histogram above was generated with code that can be found in the examples, see the links above.