Feline USG Predictor v1.5a

Estimation of Urine Specific Gravity from Serum Chemistry, CBC, and Patient Age  |  Memorial Cat Hospital  |  March 2026

What’s new in v1.5a: Reduced from 7 to 5 input features, dropping Amylase and Cholesterol. During pilot deployment at Memorial Cat Hospital, we found that these two analytes are not consistently included on every panel, limiting the number of patients who could be screened. Removing them increases patient eligibility while maintaining nearly identical accuracy. Sensitivity: 84.9%; Specificity: 75.5%. New: per-prediction explanations show which values are driving each result.
Contents
  1. Clinical Rationale
  2. What Changed in v1.5a
  3. Algorithm & Inputs
  4. Performance Metrics
  5. Confusion Matrix
  6. Threshold Tuning
  7. Feature Importance & Physiological Basis
  8. Prediction Explanations
  9. Version History
  10. Limitations
  11. Study Population

1. Clinical Rationale

Urine Specific Gravity is a cornerstone of feline renal assessment. IRIS staging of Chronic Kidney Disease incorporates USG alongside creatinine and SDMA to differentiate stages and guide management. Loss of concentrating ability (USG <1.035 in cats) is frequently among the earliest detectable signs of tubular dysfunction, often preceding azotemia.

However, urine collection is not always achievable at the time of presentation. An empty bladder, patient temperament, contraindications to cystocentesis (coagulopathy, abdominal masses), or client constraints may preclude urinalysis.

The Gap: In our dataset of 3,642 feline visits over 4 years, only 1,506 (41%) included a urinalysis. In the remaining 2,136 visits, renal concentrating ability was unknown despite bloodwork being available.

This model estimates USG from serum chemistry, CBC, and patient age — values already being collected on the same blood draw — providing a screening estimate of concentrating ability at zero additional cost, zero additional procedure time, and zero additional client charge.

2. What Changed in v1.5a

Motivation: During pilot deployment at Memorial Cat Hospital, we discovered that Amylase and Cholesterol are not included on every chemistry panel ordered. Some panels (e.g., renal-focused or mini-chem panels) omit these analytes, meaning those patients could not be screened by v1.4. Dropping these two features allows the tool to screen every patient with a standard chem + CBC, regardless of which panel was ordered.
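The eligibility effect described above can be illustrated with a minimal sketch. The field names, the `is_eligible` helper, and both sample panels are hypothetical, chosen only for illustration; they are not the hospital's actual lab-system keys or patient data.

```python
# Hypothetical field names; the real lab-system keys are not given in this report.
V14_FEATURES = {"bun", "creatinine", "hgb", "abs_lymphocytes", "age_years",
                "amylase", "cholesterol"}
V15A_FEATURES = V14_FEATURES - {"amylase", "cholesterol"}


def is_eligible(panel, required):
    """Return True when every required input is present and not None."""
    return all(panel.get(name) is not None for name in required)


# Made-up panels: a full chemistry panel and a mini-chem panel that
# omits Amylase and Cholesterol.
full_panel = {"bun": 28, "creatinine": 1.9, "hgb": 11.2,
              "abs_lymphocytes": 1400, "age_years": 13,
              "amylase": 900, "cholesterol": 180}
mini_panel = {"bun": 31, "creatinine": 2.1, "hgb": 10.4,
              "abs_lymphocytes": 1100, "age_years": 15}

for name, panel in [("full", full_panel), ("mini", mini_panel)]:
    print(name,
          "v1.4 eligible:", is_eligible(panel, V14_FEATURES),
          "v1.5a eligible:", is_eligible(panel, V15A_FEATURES))
```

Under these assumptions the mini-chem panel fails the 7-feature check but passes the 5-feature check, which is the mechanism behind the increase in eligible patients.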

Summary of Changes

Aspect | v1.4 | v1.5a
Input Features | 7 | 5
Dropped Features | - | Amylase, Cholesterol
Eligible Patients (in dataset) | 1,296 | 1,334 (+3%)
Sensitivity | 85.6% | 84.9%
Specificity | 73.7% | 75.5%
Accuracy | 79.8% | 80.3%
Prediction Explanations | No | New

Net impact: Sensitivity decreased by 0.7 percentage points (well within normal statistical variance), while specificity improved by 1.8 percentage points. Amylase (6.6% importance) and Cholesterol (4.5% importance) together contributed only 11.1% of v1.4’s predictive power; the model compensated by relying more heavily on the remaining renal markers.

SDMA Evaluation

We also evaluated a variant that adds SDMA as a 6th feature. Adding SDMA yielded only +1% sensitivity (85.9% vs 84.9%) but reduced eligible patients from 1,334 to 1,083 due to SDMA not being reported on all panels. Given the marginal benefit and significant data loss, SDMA was not included in the production model. This decision may be revisited if SDMA reporting becomes more universal.

3. Algorithm & Inputs

Model Type

Machine learning classifier trained with clinically-weighted error costs (missing a sick cat is penalized more heavily than flagging a healthy one). All reported metrics are from held-out data that the model never trained on.

Required Inputs (5)

Four lab values from a standard chemistry + CBC panel (BUN, Creatinine, Hemoglobin, Abs. Lymphocytes), plus patient age. No Amylase, Cholesterol, T4, electrolyte panel, or SDMA required.

Serum Chemistry (2)

BUN | mg/dL
Creatinine | mg/dL

CBC & Demographics (3)

Hemoglobin (HGB) | g/dL
Abs. Lymphocytes | /μL
Patient Age | years

Output

Binary screening classification — Adequate (≥1.035) or Impaired (<1.035), with probability score, risk tier, and per-feature explanations showing which bloodwork values are driving the prediction. This is a triage tool that identifies cats who may benefit from urinalysis, not a diagnostic test.

4. Performance Metrics

84.9% Sensitivity (catches impaired cats)
75.5% Specificity (clears healthy cats)
15.1% Miss Rate (impaired cats cleared)
24.5% False Flag Rate (healthy cats flagged)

v1.5a vs v1.4 Comparison

Metric | v1.5a (5 feat) | v1.4 (7 feat) | Change | Clinical Meaning
Sensitivity (catches sick cats) | 84.9% | 85.6% | -0.7 pp | Within normal statistical variance; effectively unchanged
Specificity (clears healthy cats) | 75.5% | 73.7% | +1.8 pp | Fewer unnecessary urinalysis recommendations
Accuracy | 80.3% | 79.8% | +0.5 pp | Correct call 4 out of 5 times
Miss Rate | 15.1% | 14.4% | +0.7 pp | Roughly 1 additional missed cat per 140 impaired cats screened
False Flag Rate | 24.5% | 26.3% | -1.8 pp | Fewer healthy cats flagged

Validation Set Performance (independent held-out data)

Metric | Training Evaluation (CV) | Validation Set | Gap
Sensitivity | 84.9% | 81.0% | -3.9 pp
Specificity | 75.5% | 73.9% | -1.6 pp
Accuracy | 80.3% | 77.5% | -2.8 pp

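A generalization gap of this size between cross-validation (CV) and the held-out validation set (1.6 to 3.9 percentage points here) is typical and does not suggest serious overfitting. v1.4 did not have a held-out validation set, so its reported metrics may be slightly more optimistic than v1.5a’s.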

Cross-validation metrics evaluated on 1,334 cases with paired bloodwork and urinalysis. Class balance: 685 Impaired / 649 Adequate.

5. Confusion Matrix

v1.5a Binary: Adequate vs Impaired

 | Predicted Adequate | Predicted Impaired
Actual Adequate | 392 | 127
Actual Impaired | 83 | 465
392 + 465 = 857 correct calls
The model correctly classifies 80.3% of cases.
83 false reassurances — impaired cats the model called adequate. Clinically, a missed impaired cat is far more costly than an unnecessary urinalysis flag.
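As a quick arithmetic check, the headline metrics can be recomputed directly from the four counts in the confusion matrix. This is a minimal sketch that treats "Impaired" as the positive class; the variable names are illustrative.

```python
# Counts copied from the v1.5a confusion matrix ("Impaired" = positive class).
tn, fp = 392, 127   # actual Adequate: predicted Adequate / predicted Impaired
fn, tp = 83, 465    # actual Impaired: predicted Adequate / predicted Impaired

sensitivity = tp / (tp + fn)          # impaired cats correctly flagged
specificity = tn / (tn + fp)          # adequate cats correctly cleared
accuracy = (tp + tn) / (tp + tn + fp + fn)
miss_rate = fn / (tp + fn)            # complement of sensitivity
false_flag_rate = fp / (tn + fp)      # complement of specificity

for label, value in [("Sensitivity", sensitivity),
                     ("Specificity", specificity),
                     ("Accuracy", accuracy),
                     ("Miss rate", miss_rate),
                     ("False flag rate", false_flag_rate)]:
    print(f"{label}: {value:.1%}")
```

Rounded to one decimal place, this reproduces the 84.9% sensitivity, 75.5% specificity, and 80.3% accuracy reported above. The miss rate and false flag rate are the complements of sensitivity and specificity.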

6. Threshold Tuning

The binary classification uses a clinically-optimized decision threshold, tuned to prioritize catching impaired cats while keeping the false flag rate manageable. Missing a sick cat is weighted more heavily than flagging a healthy one.
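One way to picture this kind of tuning is a cost-weighted threshold search. The sketch below is an illustration under stated assumptions, not the procedure used for v1.5a: the 3:1 cost ratio, the candidate grid, and the probabilities and labels are all made up.

```python
def weighted_cost(threshold, probs, labels, fn_weight=3.0, fp_weight=1.0):
    """Total error cost at a threshold; label 1 = Impaired, prob = P(Impaired)."""
    cost = 0.0
    for p, y in zip(probs, labels):
        flagged = p >= threshold
        if y == 1 and not flagged:
            cost += fn_weight   # missed impaired cat
        elif y == 0 and flagged:
            cost += fp_weight   # unnecessary urinalysis flag
    return cost


def pick_threshold(probs, labels, candidates, fn_weight=3.0, fp_weight=1.0):
    """Return the candidate threshold with the lowest weighted cost."""
    return min(candidates,
               key=lambda t: weighted_cost(t, probs, labels, fn_weight, fp_weight))


# Made-up predicted probabilities and true labels (1 = Impaired).
probs = [0.10, 0.20, 0.30, 0.45, 0.50, 0.35, 0.55, 0.65, 0.75, 0.90]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
candidates = [k / 100 for k in range(5, 100, 5)]   # 0.05, 0.10, ..., 0.95

weighted = pick_threshold(probs, labels, candidates, fn_weight=3.0)
unweighted = pick_threshold(probs, labels, candidates, fn_weight=1.0)
print("threshold with misses weighted 3x:", weighted)
print("threshold with equal weights:", unweighted)
```

On this toy data, weighting misses more heavily selects a lower threshold than equal weighting does, so more cats are flagged and fewer impaired cats are cleared.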

What the v1.5a threshold optimizes for

“Don’t let a sick cat walk out the door.”
84.9% of truly impaired cats are caught. The tradeoff: 24.5% of healthy cats are flagged for a urinalysis they may not need. An unnecessary UA (~$35 add-on) is far better than a missed kidney diagnosis.

What happens when it flags a cat

A flag is not a diagnosis. It means: “this cat’s bloodwork pattern is consistent with cats that have impaired concentrating ability — consider collecting urine.”

If UA confirms impairment: early detection, earlier intervention.
If UA is normal: client gets peace of mind, cat gets a clean bill.
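How often a flag is confirmed depends on how common impairment is in the screened population. The sketch below applies Bayes' rule to the reported sensitivity and specificity. The first prevalence is the class balance of this dataset (685 of 1,334); the 20% figure is a hypothetical lower-prevalence population included only for contrast.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence of impaired concentrating ability."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(impaired | flagged)
    npv = true_neg / (true_neg + false_neg)   # P(adequate | cleared)
    return ppv, npv


# Reported v1.5a metrics at the dataset's own class balance.
ppv, npv = predictive_values(0.849, 0.755, 685 / 1334)
print(f"Dataset prevalence: PPV {ppv:.1%}, NPV {npv:.1%}")

# Hypothetical population in which only 20% of cats are impaired.
ppv_low, npv_low = predictive_values(0.849, 0.755, 0.20)
print(f"20% prevalence:     PPV {ppv_low:.1%}, NPV {npv_low:.1%}")
```

At lower prevalence a larger share of flags turn out to be normal on urinalysis, while a cleared result becomes more reassuring.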

7. Feature Importance & Physiological Basis

With only 5 features, the model concentrates its predictive power on the strongest renal and hematologic markers. BUN, Age, and Creatinine now account for over 70% of the model’s total importance:

BUN | 30.0%
Patient Age | 22.0%
Creatinine | 20.0%
Abs. Lymphocytes | 15.0%
Hemoglobin | 13.0%

Physiological Interpretation

Analyte | Importance | Physiological Link to Urine Concentration
BUN | 30.0% | Primary marker of glomerular filtration rate. As GFR declines, BUN rises and concentrating ability diminishes. BUN also contributes to the medullary concentration gradient via urea recycling; elevated BUN paradoxically reflects the failing kidney’s inability to maintain this gradient.
Patient Age | 22.0% | CKD is progressive and age-dependent. In this dataset, 98% of cats over 18 years had impaired concentration vs 15% of cats aged 5–10. Age captures the cumulative renal decline that bloodwork alone may not fully reflect, including subclinical nephron loss.
Creatinine | 20.0% | Muscle-derived GFR marker. Co-regulated with BUN through renal excretion. Together with BUN, captures the primary renal axis.
Abs. Lymphocytes | 15.0% | Hematologic marker of immune status and systemic illness chronicity. CKD cats often develop lymphopenia as part of the chronic disease syndrome. Low lymphocyte counts correlate with disease severity and duration.
Hemoglobin | 13.0% | Reflects hydration status and erythropoietin production. Dehydrated cats have higher HGB and more concentrated urine. CKD cats develop non-regenerative anemia (low HGB) with concurrent loss of concentrating ability.

Dropped features: Amylase (6.6% in v1.4) and Cholesterol (4.5%) were the two lowest-importance features. Together they contributed only 11.1% of v1.4’s predictive power. With their removal, the remaining features absorbed their contribution with negligible performance loss.

8. Prediction Explanations

New in v1.5a: every prediction now includes a per-feature breakdown showing how each bloodwork value contributed to the result. This answers the question: “Why is this cat being flagged (or cleared)?”

[Figure: explanation chart with guidance on how to read it]

This transparency helps veterinarians understand which bloodwork values are driving the recommendation, rather than treating the model as a black box. For example, a cat might be flagged primarily because of elevated BUN and advanced age, even though its creatinine is still within normal range — the explanation chart makes this reasoning visible.
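This report does not state which attribution method produces the explanations. Purely to illustrate the idea of a per-feature breakdown, the sketch below uses a hypothetical additive (logistic) model in which each feature's contribution is its coefficient times its distance from a reference value. Every coefficient, reference value, and patient value here is made up and is not a v1.5a parameter.

```python
import math

# Entirely hypothetical coefficients and reference values for illustration;
# they are NOT the v1.5a model's parameters.
COEFS = {"bun": 0.06, "age_years": 0.15, "creatinine": 0.8,
         "abs_lymphocytes": -0.0004, "hgb": -0.10}
REFERENCE = {"bun": 25.0, "age_years": 10.0, "creatinine": 1.5,
             "abs_lymphocytes": 2000.0, "hgb": 12.0}
BASE_LOG_ODDS = 0.0   # assumed log-odds of "Impaired" for the reference cat


def explain(patient):
    """Return (P(Impaired), per-feature contributions in log-odds units)."""
    contributions = {name: COEFS[name] * (patient[name] - REFERENCE[name])
                     for name in COEFS}
    log_odds = BASE_LOG_ODDS + sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-log_odds)), contributions


# Made-up patient: elevated BUN, advanced age, near-reference creatinine.
patient = {"bun": 38.0, "age_years": 16.0, "creatinine": 1.6,
           "abs_lymphocytes": 1500.0, "hgb": 11.0}
prob, contribs = explain(patient)

ranked = sorted(contribs.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, value in ranked:
    print(f"{name:>16}: {value:+.2f}")
print(f"P(Impaired) = {prob:.2f}")
```

In this toy case age and BUN dominate the ranking while creatinine contributes least, mirroring the example in the paragraph above. A non-additive model would need a different attribution technique.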

9. Version History

Version | Features | Sensitivity | Specificity | Key Change
v1.0 | 10 | - | - | Initial model (bloodwork only)
v1.1 | 11 | - | - | Added patient age
v1.2 | 14 | - | - | Full feature set; regression + classification
v1.3 | 7 | 85.6% | 70.0% | Reduced to 7 fields; clinically-weighted error costs
v1.4 | 7 | 85.6% | 73.7% | Hyperparameter tuning; classification only; fewer false flags
v1.5a | 5 | 84.9% | 75.5% | Dropped Amylase & Cholesterol (pilot data); added per-prediction explanations; improved validation

v1.4 → v1.5a: Motivated by pilot deployment at Memorial Cat Hospital, where not every patient had Amylase and Cholesterol on their panel. Removing these two lowest-importance features increased patient eligibility by 3% while maintaining equivalent accuracy (-0.7 pp sensitivity, +1.8 pp specificity). Added per-prediction explanations and moved to a more rigorous validation framework with independent held-out testing.

10. Limitations

This tool is a screening estimate, not a diagnostic test. It should inform clinical decision-making, not replace urinalysis.
Limitation: Regression to the mean at extremes
  Clinical Impact: Resolved in v1.4. Previous versions predicted continuous USG values, which suffered from regression to the mean at extremes. v1.4+ uses binary classification only, eliminating this limitation.
  Mitigation Plan: Resolved
  Status: Complete

Limitation: Amylase/Cholesterol availability
  Clinical Impact: Resolved in v1.5a. Pilot deployment revealed these two analytes are not on every panel, excluding otherwise eligible patients from screening.
  Mitigation Plan: Resolved
  Status: Complete

Limitation: Single-practice, single-species dataset
  Clinical Impact: Trained on 1,334 cases drawn from 3,642 feline lab reports at one hospital. External validation is required before broader deployment.
  Mitigation Plan: Pursuing collaboration with Texas A&M Veterinary Medical Teaching Hospital for external validation on a second, independent patient population. Additional validation sites will be recruited from professional networks and through conference contacts. Target: 2–3 external datasets.
  Status: Oct 2026

Limitation: No urinalysis replacement
  Clinical Impact: USG is one component of urinalysis. Sediment analysis, urine protein, urine culture, and pH provide independent diagnostic information.
  Mitigation Plan: By design; not a limitation to resolve. This is a screening triage tool that identifies cats who should receive urinalysis. It is not intended to replace UA.
  Status: N/A

Limitation: Pre-renal and post-renal effects
  Clinical Impact: Dehydration elevates BUN disproportionately and concentrates urine simultaneously. The model may conflate pre-renal and intrinsic renal causes.
  Mitigation Plan: Add hydration status and recent fluid therapy as optional input fields in a future version. Explore adding BUN/Creatinine ratio as an engineered feature to help distinguish pre-renal azotemia.
  Status: Q1 2027

Limitation: Temporal confounders
  Clinical Impact: Blood and urine may not be collected simultaneously. Hydration status and recent fluid therapy affect USG independently of serum values.
  Mitigation Plan: During prospective pilot at Memorial Cat Hospital, record time delta between blood draw and urine collection. Analyze whether prediction accuracy degrades with increasing time gap.
  Status: Jun 2026

Limitation: SDMA marginal benefit
  Clinical Impact: Testing showed SDMA adds only +1 percentage point of sensitivity but reduces eligible patients by ~19% due to inconsistent reporting.
  Mitigation Plan: Not included in production. Will revisit if SDMA reporting becomes more universal or if external validation on SDMA-rich datasets (e.g., TAMU teaching hospital) shows greater benefit.
  Status: Deferred

Limitation: ~15% miss rate
  Clinical Impact: Approximately 1 in 7 impaired cats will be classified as adequate (15.1% miss rate).
  Mitigation Plan: Pursue three paths: (1) additional training data from external sites, (2) explore advanced modeling techniques, (3) evaluate whether flagging of borderline cases can reduce clinically significant misses.
  Status: Q1 2027

Limitation: No prospective outcome data
  Clinical Impact: No data yet showing that flagging cats with this tool leads to earlier diagnosis or improved clinical outcomes.
  Mitigation Plan: Memorial Cat Hospital pilot will track every flag: was UA performed, what was the USG result, what was the diagnosis at 6 and 12 months. This prospective dataset will be the basis for outcome claims in the publication.
  Status: Mar 2027
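The BUN/Creatinine ratio mentioned in the mitigation plan for pre-renal effects is a simple derived quantity. The sketch below shows how such an engineered feature might be computed; the function name and the example values are hypothetical, and no interpretive cutoff is implied.

```python
def bun_creatinine_ratio(bun_mg_dl, creatinine_mg_dl):
    """Return BUN divided by creatinine (both in mg/dL)."""
    if creatinine_mg_dl <= 0:
        raise ValueError("creatinine must be positive")
    return bun_mg_dl / creatinine_mg_dl


# Made-up values: two cats with the same creatinine but different BUN.
ratio_a = bun_creatinine_ratio(30.0, 2.0)
ratio_b = bun_creatinine_ratio(60.0, 2.0)
print(ratio_a, ratio_b)
```

A feature like this lets a model see when BUN is elevated out of proportion to creatinine, which the two raw values only convey jointly.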

11. Study Population

Parameter | Value
Source | Memorial Cat Hospital, Houston, TX
Date Range | January 2022 – February 2026
Species | 100% Feline
Total Lab Reports | 3,642
Reports with Urinalysis | 1,506 (41%)
Cases Used for v1.5a | 1,334 (complete bloodwork + USG, no missing values in the 5 input features)
USG Range in Dataset | 1.005 – 1.086
USG Mean / Median | 1.036 / 1.034
Class Balance (≥1.035 / <1.035) | 649 Adequate / 685 Impaired (49% / 51%)
Validation Method | Cross-validation with independent held-out validation set

Age Distribution

Age Group | n | Mean USG | % Impaired (<1.035)
Under 5 years | 5 | 1.049 | 20%
5–10 years | 100 | 1.047 | 15%
10–14 years | 556 | 1.043 | 29%
14–18 years | 553 | 1.028 | 73%
Over 18 years | 81 | 1.019 | 98%

Model v1.5a  |  March 2026  |  Memorial Cat Hospital  |  For investigational and research use  |  Not validated for clinical deployment