A Guide to Monitoring NASH Progression in Diet-Induced Mouse Disease Models

Fred Beasley, PhD

Friday, December 18th, 2020

Numerous diet-based models have been developed to induce nonalcoholic fatty liver disease and steatohepatitis (NAFLD/NASH) in mice. Chronic consumption of high-energy feeds enriched with fat causes steatotic buildup, lipotoxic inflammation, hepatic dysfunction, and liver fibrosis. Next-generation diets¹ include fructose and cholesterol as accelerants toward the diseased liver state.

A key challenge in using diet-induced NASH mice for preclinical research is the selection and quantification of disease endpoints. In humans, underlying metabolic syndrome is a typical precursor to the later onset of NASH. For both the underlying and liver stages of disease progression, some hallmarks of the human condition are recapitulated in the mouse, while others are not. Even among conserved symptoms, some manifest differently in the mouse despite similar physiological underpinnings. Deciding which endpoints to measure and how to interrogate them requires careful consideration. This guide also presents new trends in sampling techniques.

Endpoint selection aims

When selecting disease endpoints to measure in the mouse, one should consider:

Translational relevance to the human condition, especially with regard to markers measured in clinical trials.
Economic and practical limitations of the researcher's budget, skillset, and supporting analytical infrastructure.
Feasibility of noninvasive, nonterminal sampling, so that disease progression/resolution can be measured serially and longitudinally, and terminal values can be compared to baseline for individuals rather than between groups.

Reliable quantification of disease induction is crucial. Individual mice respond variably to NASH diets, even within an inbred background. Some will prove resistant to the diet, and it is desirable to exclude them from the therapeutic phase of the study. Subjects should be sorted into treatment arms once baseline values are high enough to offer a measurable window in which a therapeutic effect can be confirmed. Sorting should minimize differences in baseline mean and variability between groups. This will be more challenging if the trial is prophylactic in design, since low responders will not be obvious and larger group sizes will be needed to appropriately power the study.
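
The sorting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: baseline serum ALT is used as the sorting variable, `min_alt` is a hypothetical cutoff for excluding non-responders, and the serpentine (snake) assignment is one simple balancing scheme among several, not a method prescribed by this guide.

```python
from statistics import mean

def sort_into_arms(baseline_alt, n_arms, min_alt):
    """Assign responders to treatment arms in serpentine order of baseline ALT.

    baseline_alt: dict mapping animal ID -> baseline ALT (U/L)
    min_alt: hypothetical cutoff; animals below it are treated as non-responders
    """
    responders = {k: v for k, v in baseline_alt.items() if v >= min_alt}
    ranked = sorted(responders, key=responders.get, reverse=True)
    arms = [[] for _ in range(n_arms)]
    for i, animal in enumerate(ranked):
        block, pos = divmod(i, n_arms)
        # Reverse direction on every other block so that no single arm
        # always receives the highest value in a block.
        arm = pos if block % 2 == 0 else n_arms - 1 - pos
        arms[arm].append(animal)
    return arms

# Illustrative values only, not data from any study.
alt = {"m1": 40, "m2": 210, "m3": 180, "m4": 150,
       "m5": 120, "m6": 95, "m7": 260, "m8": 230}
arms = sort_into_arms(alt, n_arms=2, min_alt=90)
for idx, arm in enumerate(arms):
    print(idx, arm, round(mean(alt[a] for a in arm), 1))
```

Comparing the printed arm means (and, in practice, their spread) is a quick check that the arms start from similar baselines before dosing begins.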

Standards for the Evaluation of NASH

Liver biopsy is the gold standard for baseline and post-treatment evaluation of human patients in Phase 2/3 NASH trials. The FDA's draft guidance on NASH trial design recommends the following efficacy endpoints:

Resolution of steatohepatitis on overall histopathological reading and no worsening of liver fibrosis, or
Improvement in liver fibrosis greater than or equal to one stage and no worsening of steatohepatitis, or
Both resolution of steatohepatitis and improvement in fibrosis.
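
As a rough illustration, the three endpoint options listed above can be expressed as a small decision function. The inputs are assumptions made for this sketch: NASH resolution and NASH worsening are passed as pathologist judgments (booleans), and fibrosis is passed as integer stages before and after treatment. The function name is hypothetical.

```python
def efficacy_endpoints_met(nash_resolved, nash_worsened,
                           fibrosis_before, fibrosis_after):
    """Return the list of endpoint options satisfied by one paired biopsy."""
    if nash_resolved and nash_worsened:
        raise ValueError("NASH cannot be both resolved and worsened")
    fibrosis_improved = fibrosis_before - fibrosis_after >= 1
    fibrosis_worsened = fibrosis_after > fibrosis_before
    met = []
    if nash_resolved and not fibrosis_worsened:
        met.append("NASH resolution without fibrosis worsening")
    if fibrosis_improved and not nash_worsened:
        met.append("fibrosis improvement without NASH worsening")
    if nash_resolved and fibrosis_improved:
        met.append("NASH resolution and fibrosis improvement")
    return met
```

An empty list means the subject is a non-responder on all three options.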

Human NASH is scored along one of several consensus composite indices. This requires a pathologist to evaluate criteria including steatosis severity, inflammation, and hepatocyte morphology. Fibrosis is also evaluated by the pathologist along an index that measures the localization and extent of scarring. These indices have been adapted for NASH/fibrosis as they present themselves in the mouse².

Liver biopsy is necessarily an invasive surgical procedure, with high challenge, cost, and risk burdens, and scoring is inherently subjective. To generate a more robust data set, patient serum is also obtained and used to measure surrogate markers of liver injury and impairment. Additional analytical chemistry interrogates metabolites that are salient to NASH.

Two Phase 3 trial drug candidates have recently completed their interim (18-month) analyses. Intercept Pharmaceuticals' Ocaliva (REGENERATE) generated positive data³ and is continuing with the long-term safety arm of the trial. Genfit's elafibranor (RESOLVE-IT) did not achieve key objectives and was discontinued⁴. Regardless of outcome, the trial endpoints and patient enrollment strategies provide a useful context to discuss NASH study design for the mouse.

Direct liver interrogation of NASH

Human endpoints: Ocaliva improved biopsy-confirmed liver fibrosis by ≥1 stage in a greater number of patients than placebo. Ocaliva did not worsen patient NASH composite score. Elafibranor did not achieve significant improvement in either metric.

Human endpoints explained: Liver biopsies were obtained at baseline screening and at 18 months. A NAFLD activity score⁵ (NAS) and a fibrosis staging score were determined by pathologist evaluations. The NAS is a composite score ranging from 0 to 8, comprising: steatosis grade (0-3) + lobular inflammation degree (0-3) + hepatocellular ballooning grade (0-2). Fibrosis staging ranges from 0 to 4; at stage 3, "bridging" fibrosis is observed to connect lobules and portal areas.
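
The NAS arithmetic described above can be captured in a short helper. This is a minimal sketch: the function name and the range checks are illustrative, while the component ranges are those stated in the text.

```python
def nafld_activity_score(steatosis, lobular_inflammation, ballooning):
    """Return the NAS (0-8) as the sum of its three component grades."""
    components = {
        "steatosis": (steatosis, 3),
        "lobular_inflammation": (lobular_inflammation, 3),
        "ballooning": (ballooning, 2),
    }
    for name, (grade, maximum) in components.items():
        if not isinstance(grade, int) or not 0 <= grade <= maximum:
            raise ValueError(
                f"{name} must be an integer from 0 to {maximum}, got {grade!r}"
            )
    return steatosis + lobular_inflammation + ballooning
```

For example, steatosis grade 2, lobular inflammation 1, and ballooning 1 give a NAS of 4.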

Mouse endpoint equivalents: Survival biopsy of the mouse liver is possible⁶. This requires advanced technical expertise, and risks animal infection or death. If survival biopsies are not feasible, a cohort of mice can be designated for terminal biopsy collection to serve as a representative baseline group. With this approach, experimental groups cannot be sorted based on biopsy results and no longitudinal histological data are generated; furthermore, sub-responders cannot be excluded.

A mouse NAS (0-8) has been adapted from the human scoring system. The same parameters are interrogated. A key limitation is that hepatocyte ballooning rarely exceeds a score of 1. Fibrosis staging can also be evaluated, but stage 3 (bridging) typically requires extremely long induction periods when using purely dietary means, and thus may not be a practical endpoint to aim for.

Recent trends: To overcome the qualitative and subjective nature of a pathologist's review, quantitative data can be derived from stained and immunostained liver tissue. This is also useful for describing NASH fibrosis in shorter timeframes. It is increasingly common for histology providers to offer high content imaging and algorithm analysis of endpoints including but not limited to:

% area fibrosis (collagen deposition using PicroSirius Red or Masson's Trichrome stain)
% area steatosis (lipid droplets using hematoxylin & eosin stain); this can also be delineated into micro- and macro-vesicular steatosis levels
Inflammation (% area using galectin 3 immunostain, immune cell density using H&E stain)
Hepatic stellate cell activation (cell density using α smooth muscle actin stain)
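
The "% area" readouts listed above all reduce to the same ratio: stained pixels divided by tissue pixels. A minimal sketch follows, assuming that an upstream segmentation step (not shown, and specific to each provider) has already produced two binary masks of equal size; the mask names are hypothetical.

```python
def percent_area(stain_mask, tissue_mask):
    """Percent of tissue pixels that are positive for the stain.

    Both masks are 2D sequences of truthy/falsy values with the same shape.
    Stain-positive pixels outside the tissue mask are ignored.
    """
    tissue = 0
    stained = 0
    for stain_row, tissue_row in zip(stain_mask, tissue_mask):
        for is_stained, is_tissue in zip(stain_row, tissue_row):
            if is_tissue:
                tissue += 1
                if is_stained:
                    stained += 1
    if tissue == 0:
        raise ValueError("tissue mask contains no tissue pixels")
    return 100.0 * stained / tissue
```

Normalizing to tissue area rather than to the whole image keeps the value comparable between sections that fill different fractions of the field of view.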

An emerging technique may soon become broadly available for noninvasive and quantitative measurement of rodent liver fibrosis. Shear wave elastography (SWE)⁷ involves pulsing acoustic force toward the liver and measuring the velocity of the shear waves that are generated; the calculated V_s has a strong correlation to the severity of liver fibrosis. SWE is an upgraded modality of transient elastography that incorporates imaging for improved resolution of liver disease foci.
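
Shear wave velocity is often converted to a stiffness value. Under the common simplifying assumption that liver behaves as an incompressible, isotropic, linearly elastic medium of density ρ, Young's modulus E follows from V_s as shown below. The density of roughly 1000 kg/m³ and the velocity of 1.5 m/s are illustrative assumptions, not values from the cited study.

```latex
E = 3\rho V_s^{2}, \qquad
\rho = 1000\ \mathrm{kg/m^3},\ V_s = 1.5\ \mathrm{m/s}
\;\Rightarrow\;
E = 3 \times 1000 \times 1.5^{2} = 6750\ \mathrm{Pa} = 6.75\ \mathrm{kPa}
```

Because E scales with the square of V_s, small changes in measured velocity translate into proportionally larger changes in reported stiffness.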

NASH surrogate and biomarker interrogation of liver diseases

Human endpoints: Ocaliva robustly decreased levels of the circulating biomarkers alanine and aspartate aminotransferases (ALT, AST) and γ-glutamyl transferase (GGT). In a Phase 2b trial (GOLDEN-505)⁸, elafibranor had a moderate effect on ALT but robustly improved GGT and alkaline phosphatase (ALP).

Human endpoints explained: Liver function tests quantify hepatic enzymes that are released into circulation more abundantly in cases of liver injury or dysfunction. ALT and AST are gold standard biomarkers for hepatocellular diseases including hepatitis, toxicity, and cirrhosis; GGT and ALP describe cholestasis and oxidative stress. Sampling for these is noninvasive, can be done with great frequency, and routinely serves as a surrogate for liver biopsy in Phase 2 trials.

Mouse endpoint equivalents: All four biomarkers are routinely used to monitor NASH and other liver diseases in mice. ALT is extensively used for sorting animals by disease severity at baseline. If mouse blood is not required for additional analyses, these biomarkers might be sampled as frequently as every two weeks, although a four-week sampling interval is more common.

An animal's fasted/fed state and the route of blood sampling may influence biomarker levels. Downstream analytical methodologies may stipulate the use of specific anticoagulants that can also impact values. Best practices should include standardization of sample collection time of day, route, and collection vessels used throughout a study.

Recent trends: ALT, AST, ALP, and GGT are biomarkers for liver dysfunction, but not NASH per se. In the clinic, the lack of widely accepted noninvasive tests specifically developed to quantify NASH is a barrier to patient diagnosis, but numerous candidate assays are gaining popularity. These include NIS4, a multianalyte assay that quantifies microRNA miR-34a (steatosis/inflammation marker), alpha-2 macroglobulin and glycoprotein YKL-40 (fibrosis markers), and HbA1c (metabolic marker)⁹. These biomarkers merit validation in different rodent models of NASH as noninvasive tools to quantify disease progression.

Serum chemistry and metabolic comorbidities

Human endpoints: Ocaliva trial recruits were required to present with at least one comorbidity of NASH, including obesity or type 2 diabetes. Elafibranor recruits were evaluated at baseline and post-treatment for blood glucose, serum triglycerides, and insulin resistance.

Human endpoints explained: NASH is commonly preceded by metabolic syndrome, defined as the presence of at least three of five of the following cluster factors: abdominal obesity, high serum triglycerides, low HDL cholesterol, high blood pressure, and hyperglycemia/insulin resistance. NASH drugs may work along a mechanistic axis that acts systemically on fat or glucose metabolism, and thus have beneficial secondary effects on metabolic syndrome.

Mouse endpoint equivalents: Routine measurement of body weight (weekly or daily) is best practice for most mouse studies, and especially practical for high-fat NASH diet models. Body weight is convenient for sorting animals at baseline and for weeding out sub-responders. If infrastructure permits, the relative proportions of fat to lean mass can be measured using dual-energy X-ray absorptiometry (DXA). For improved resolution, depot-specific fat masses can be quantified using magnetic resonance imaging (MRI). Hepatomegaly (liver as % of body weight) is also an easily scored terminal endpoint.

For many mouse dietary NASH models, the facets of hypertriglyceridemia and hyperglycemia may be absent or only weakly recapitulated¹⁰,¹¹. The former is not crucial, as fat accumulation within the liver is a more pressing focus and a proven model prerequisite. Regarding the latter: while fasted blood glucose may appear normal in diet-induced NASH mice, a glucose tolerance test may confirm these animals are in fact insulin resistant. Glucose clearance rates and HOMA-IR are practical efficacy endpoints that may also be used for sorting purposes in addition to disease monitoring.
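
Both readouts mentioned above are simple to compute once the measurements are in hand. The sketch below uses the standard HOMA-IR formula (fasting glucose in mmol/L multiplied by fasting insulin in µU/mL, divided by 22.5) and a trapezoidal area under the curve for the glucose tolerance test. The 22.5 constant was derived from human reference values, so in mice the result is best read as a relative index for comparing animals or time points; function names and example values are illustrative.

```python
def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uU_ml):
    """HOMA-IR index from fasting glucose (mmol/L) and insulin (uU/mL)."""
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5

def glucose_auc(times_min, glucose):
    """Trapezoidal area under a glucose tolerance curve.

    times_min: sampling times in minutes, strictly increasing
    glucose: blood glucose at each time point (same length as times_min)
    """
    if len(times_min) != len(glucose) or len(times_min) < 2:
        raise ValueError("need at least two matched time/glucose points")
    auc = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        auc += dt * (glucose[i] + glucose[i - 1]) / 2.0
    return auc

# Illustrative values only, not data from any study.
print(homa_ir(4.5, 5.0))
print(glucose_auc([0, 15, 30, 60, 120], [8, 20, 18, 12, 8]))
```

A higher AUC indicates slower glucose clearance; comparing each animal's AUC to its own baseline fits the longitudinal sampling strategy discussed earlier.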

Sampling strategy summary

The table below summarizes various potential endpoints for preclinical NASH studies.

Technique	Invasive?	Useful for sorting?	Utility for longitudinal sampling?	Popularity	Comments
Body weight	No	Yes; routine	Routine, weekly	High	Routine best practice for rodent studies; no cost
Serum ALT	No	Yes; routine	Routine, every 4 weeks	High	Most commonly used surrogate for direct liver interrogation; affordable
Serum AST, ALP, GGT	No	Yes	Common, every 4 weeks	Mid	AST is more commonly analyzed than GGT or ALP; AST is usually not affected to the same degree as ALT
Serum glucose	No	May be useful to disqualify outliers	Poor (not reliable)	Mid	Many diet-based models do not induce appreciable hyperglycemia; easy and affordable
Serum triglycerides	No	Not reliable	Poor (not reliable)	Mid	Many diet-based models do not elevate, or may actually decrease serum triglycerides
Glucose tolerance test/ HOMA-IR	No	Yes	Fair, every 4 weeks	Mid	Time consuming; requires technical expertise; measuring insulin for HOMA-IR adds cost
% body fat composition (DXA)	No	Yes	Good, every 4 weeks	Low	Requires specialized instrumentation and training
Body composition (MRI)	No	Yes	Good, every 4 weeks	Low	Requires specialized instrumentation and training
Hepatomegaly (% liver weight)	Yes; terminal	No; may be useful as a baseline endpoint for non-longitudinal sampling strategy	No	High	Routinely calculated when livers are harvested for other analyses
Survival liver biopsy	Yes	Yes	No, repeat sampling is strongly discouraged	Mid	Most direct way to interrogate liver for sorting animals and obtaining longitudinal baseline data; requires specialized training; can provide numerous post-hoc histology readouts
Terminal liver biopsy	Yes; terminal	No; may be useful as a baseline endpoint for non-longitudinal sampling strategy	No	High	Most direct way to interrogate liver; requires larger animal cohorts so that the study is adequately powered, as sub-responders will be sampled; can provide numerous post-hoc histology readouts
Liver stiffness (SWE)	No	Low	Good, every 4-8 weeks	Low	Relatively new technique that does not yet have the track record to support widespread use

References:

1. The more things change, the more they stay the same: the Amylin liver NASH (AMLN) Diet.

2. Liang W et al. Establishment of a general NAFLD scoring system for rodent models and comparison to human liver pathology. PLoS One 2014; 9: e115922.

3. Younossi ZM et al. Obeticholic acid for the treatment of nonalcoholic steatohepatitis: interim analysis from a multicentre, randomized, placebo-controlled phase 3 trial. Lancet 2019; 394: 2184-2196.

4. Genfit press release. GENFIT: Announces results from interim analysis of RESOLVE-IT Phase 3 trial of elafibranor in adults with NASH and fibrosis. May 11, 2020.

5. Kleiner DE et al. Design and validation of a histological scoring system for nonalcoholic fatty liver disease. Hepatology 2005; 41: 1313-1321.

6. Oldham S et al. Incorporation of a survivable liver biopsy procedure in mice to assess nonalcoholic steatohepatitis (NASH) resolution. J Vis Exp 2019; 16: 146.

7. Lin S-H et al. Non-invasive assessment of liver fibrosis in a rat model: shear wave elasticity imaging versus real-time elastography. Ultrasound Med Biol 2013; 39: 1215-1222.

8. Ratziu V et al. Elafibranor, an agonist of the peroxisome proliferator-activated receptor α and δ, induces resolution of nonalcoholic steatohepatitis without fibrosis worsening. Gastroenterology 2016; 150: 1147-1159.

9. Harrison SA et al. A blood-based biomarker panel (NIS4) for non-invasive diagnosis of nonalcoholic steatohepatitis and liver fibrosis: a prospective derivation and global validation study. Lancet Gastroenterol Hepatol 2020; 5: 970-985.

10. Hansen HH et al. Human translatability of the GAN diet-induced obese mouse model of nonalcoholic steatohepatitis. BMC Gastroenterol 2020; 20: 210.

11. Boland ML et al. Towards a standard diet-induced and biopsy-confirmed mouse model of nonalcoholic steatohepatitis: impact of dietary fat source. World J Gastroenterol 2019; 25: 4904-4920.