## Building a collagen fibrils model for Molecular Dynamics simulations (MD)

### Fibril model geometry

The simulations performed in this study were generated using a mesoscopic model where solvated tropocollagen molecules are described as a collection of particles interacting according to multi-body potentials. Further details on the development of the mesoscopic model can be found in a series of previous publications (Buehler, 2006a, 2006b). In the mesoscopic model, collagen microfibril geometry is based on Protein Data Bank entry 3HR2 representing the microfibrillar structure of type I collagen measured in situ by X-ray crystallography (Orgel et al., 2006). The full atomistic information of the triple helical tropocollagen molecule is simplified and represented as a single chain of ‘beads’ or ‘super-atoms’. Every bead in the mesoscopic model represents several atoms of the full-atomistic model (Fig. 1). The influence of solvation water is also taken into account in the model. This simplification allows us to reach time scales of microseconds and length scales of several hundred nanometers that cannot be achieved using full-atomistic resolution. The coordinates of the atoms of a single tropocollagen molecule are averaged using spline approximation along the length of the molecule. Equidistant beads are then created along the spline to create the coarse-grained model (Fig. 1A and B). The distance between the beads is set to an equilibrium distance r0 of 14.0 Å, which approximates the diameter of the tropocollagen molecule. After aligning the molecule along its principal axis, the fibril is then built by replication of the tropocollagen to form a cylinder of diameter d=20nm (Fig. 1D and E). The periodicity of the molecules is given by the X-ray crystallography measurement (Orgel et al., 2006). The model of the fibril is built to exhibit five gap/overlap regions along its length and is composed of 155 tropocollagen molecules which represent a total of 33,790 beads. The structure has been built to ensure its periodicity along the fibril axis (x-axis). During the simulation, the model is replicated by using periodic boundary conditions along its length to simulate an infinitely long fibril. The length of the model is larger than a single molecule length (~300 nm) in order to prevent artefactual interactions between the two ends of a molecule through the periodic boundaries. The collagen fibrils show the characteristic staggered arrangement as observed in experiments.

### Modeling enzymatic cross-links

In this study, we focus on the role of enzymatic cross-links in the mechanics of collagen fibrils. Experimental analyses of the molecular geometry suggest that intermolecular enzymatic cross-links primarily develop between lysine or hydro-xylysine residues at the ends of tropocollagen molecules (Viguet-Carrin et al., 2006; Eyre and Wu, 2005; Bailey et al., 1998; Robins and Bailey, 1977). Enzymatic cross-links form a covalent bond between side chains of the residues of two tropocollagen molecules. We study the influence of divalent and trivalent cross-links. Divalent and trivalent cross-links have been assumed to link two and three different collagen molecules respectively. Based on a distance criterion, the last bead of a collagen molecule forms a covalent bond with one or two adjacent molecules to create a divalent or a trivalent cross-link respectively. Fig. 1C shows a representation of the collagen cross-links in the model. The amount of both divalent and trivalent cross-links varies from 0 to 100% where 100% represents two terminal cross-links per collagen molecules (2 mol/mol), which is the maximum number of enzymatic cross-links that can be formed per molecule (Eyre and Wu, 2005). The distribution of the cross-links in the fibril is generated randomly.

### Coarse-grained model parameterization of collagen molecules

The total energy of the system is given by the sum of all bond energy, three-body and pairwise interaction energy as

where Ebond represent the energy contribution due to stretching, Eangle, the energy due to bending and Einter, the energy due to intermolecular interactions. Each term is given by the sum of all the inter-bead interactions of this kind as

The bonding energy contributions have been modeled with a bilinear law, to approximate the nonlinear stress- strain behavior of a single collagen molecule under tensile loading, as described previously (Buehler and Gao, 2006; Buehler and Wong, 2007). The force between two particles is

where

where k(0)T and k(1)T are spring constants for the small and large deformations. The stretching energy Φbond(r) is given by integrating Fbond(r) over the radial distance. The bending energy contributions of triplets of particles

are defined by:

where kB has been computed based on the bending stiffness of the molecule (Buehler, 2006a, 2006b). The parameter φi represents the equilibrium angle between two beads of the coarse-grained model. Several different equilibrium angles have been selected in order to mimic a collagen molecule’s initial geometry (Fig. 1B). The angles have been measured on the coarse-grained model representing a single collagen molecule and rounded to the nearest integer to reduce the number of variables. This results in eight equilibrium angles, ranging from 170 to 1801. Intermolecular interactions are modeled by the Lennard–

Jones (LJ) potential:

where σLJ is the distance parameter and εLJ the energy parameter which determine the strength of the intermolecular adhesion without the presence of any cross-links (Buehler, 2006b). The influence of water has been taken into account during the fitting of the coarse-grained parameters with full-atomistic results and is not explicitly modeled here. No viscosity has been included here and its influence will be analyzed in a subsequent study. All coarse-grained parameters for solvated tropocollagen molecules are summarized in the following Table:

### Coarse-grained model parameterization of cross-links

The cross-link properties are modeled using a bottom-up approach, in which we incorporate results from earlier work into the model. We selected two representative structures of divalent and trivalent cross-links (dehydro-lysino-norleucine and Lysyl-pyridinoline (Eyre and Wu, 2005)) and created a full atomistic model using Material studio 4.4 (Accelerys, Inc.).

The mechanical behavior of the cross-links has been assessed by steered molecular dynamics in LAMMPS (Plimpton, 1995). Using ReaxFF reactive force field, we are able to evaluate the mechanical behavior of the cross-links until failure (Fig. 2). For the trivalent cross-link, a tensile test has been performed in the three possible directions. Due to very similar behavior, the mechanical response of the cross-link has been averaged and fit to a single set of bond parameters in its coarse-grained representation. The responses of the cross- links have been assumed to be bilinear. Fibrils models have been built with the parameters of the cross-links presented in Table 1. To explore the influence of cross-links mechanical properties, we also build three fibril models with cross-links densities of 20%, 60% and 100% where the stiffness of the cross-links has been increased by multiplying the tensile stiffness parameters of trivalent cross-links by a factor two; all other parameters remain the same.

### Simulation parameters and procedure

All molecular dynamics simulations are performed using the LAMMPS code (Plimpton, 1995). The integration time step is Δt=10 fs. Equilibration calculation is carried out in NPT ensemble by applying Nose–Hoover thermostat set to 300 K and Hoover barostat set to 0 Pa along fibril length. Relaxation times for the thermostat and barostat are fixed to 1 and 10 ps respectively. Tensile loading is applied within the NVT ensemble. A single fibril is equilibrated in vacuum for 20 ns and the structure reaches the equilibrium state with no further structural change measured by root-mean-square deviation of bead positions. We also ensure that the fibril length after equilibration remains similar for all models to confirm the stability of the model and to prevent any folding of the fibril. Despite the relaxation along the fibril axis, the length variation between the different models is minimal (<0.35%). A visual test was performed in addition to guarantee the conservation of the fibrillar geometry. After equilibration calculation, the structure displays the characteristic D-period of collagen fibrils in agreement with experiments (Wenger et al., 2007; Hulmes et al., 1981; Hulmes, 2002). To model tensile deformation of collagen fibril we apply homogeneous strain to the entire fibril model along the x-axis with the equivalent strain rate of 3.3m/s. This strain rate is five orders of magnitude larger than what is commonly used experimentally for stretching collagen material as the consequence of the timescale limitation of themolecular model. Total times spans of several microseconds are the most that can be simulated because of the small integration time step.We use the virial stress to compute the stress tensor (Tsai, 1979) for analysis of the strain–stress behavior of the fibril and collagen molecules.