A mid-market SaaS company came to us with a problem they could describe but not diagnose. They had 50,000 contacts in HubSpot, a marketing team running monthly email campaigns, an SDR team running outbound sequences, and results that were declining quarter over quarter despite increasing activity.
Their email open rates had dropped from 24% to 17% in six months. Their outbound sequence reply rate was 1.8% — well below the 5-8% benchmark for well-targeted cold outreach. Their MQL-to-SQL conversion rate had fallen to 12%, and the sales team had started maintaining their own prospect spreadsheets rather than trusting the leads from HubSpot.
The marketing team blamed messaging. The sales team blamed lead quality. Leadership blamed both teams. Nobody blamed the data.
Here's what we found, what we did, and what changed.
The Audit: What 50,000 Contacts Actually Looked Like
We ran a structured data quality audit across all five dimensions: completeness, accuracy, freshness, validity, and uniqueness. The results explained everything.
Completeness
We defined their critical field set based on their GTM motion: email address, job title, company name, industry, employee count, country, and seniority level. Seven fields.
| Field | % Populated | Assessment |
|---|---|---|
| Email address | 94% | Acceptable |
| Job title | 58% | Poor |
| Company name | 87% | Fair |
| Industry | 41% | Critical |
| Employee count | 33% | Critical |
| Country | 72% | Fair |
| Seniority level | 29% | Critical |
Their average completeness across critical fields was 59%. This meant that 41% of the data their routing, scoring, and segmentation depended on simply didn't exist.
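The 59% figure is the unweighted mean of the seven population rates in the table. A minimal sketch that reproduces it and applies the same idea to a single record; the dictionary keys and function names are illustrative, not HubSpot property names.

```python
# Population rates copied from the completeness table above (percent).
FIELD_POPULATION = {
    "email": 94,
    "job_title": 58,
    "company_name": 87,
    "industry": 41,
    "employee_count": 33,
    "country": 72,
    "seniority_level": 29,
}


def average_completeness(rates: dict[str, float]) -> float:
    """Unweighted mean of per-field population rates."""
    return sum(rates.values()) / len(rates)


def record_completeness(record: dict, fields=tuple(FIELD_POPULATION)) -> float:
    """Share of critical fields that hold a non-blank value on one contact."""
    filled = sum(1 for f in fields if str(record.get(f) or "").strip())
    return filled / len(fields)


print(round(average_completeness(FIELD_POPULATION)))  # 59
```

An unweighted mean treats every field as equally important. If routing depends more heavily on some fields than others, a weighted mean would be a reasonable variation.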
Freshness
We checked their Last Modified Date as a proxy for freshness (they had no enrichment date tracking).
- Contacts modified in last 90 days: 12%
- Contacts modified 91-365 days ago: 31%
- Contacts not modified in 365+ days: 57%
More than half their database hadn't been touched in over a year. Given field-specific decay rates (job titles at 65.8% annually, emails at 37.3%), we estimated that the 57% in the 365+ day bucket had, on average, 2.1 materially inaccurate fields per record.
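The bucketing above can be sketched in a few lines. This assumes each contact's Last Modified Date has already been exported as a Python `date`; the function names are hypothetical.

```python
from datetime import date


def freshness_bucket(last_modified: date, today: date) -> str:
    """Assign a contact to one of the three age buckets used in the audit."""
    age_days = (today - last_modified).days
    if age_days <= 90:
        return "0-90 days"
    if age_days <= 365:
        return "91-365 days"
    return "365+ days"


def freshness_distribution(dates: list[date], today: date) -> dict[str, float]:
    """Percentage of contacts falling in each bucket."""
    counts = {"0-90 days": 0, "91-365 days": 0, "365+ days": 0}
    for d in dates:
        counts[freshness_bucket(d, today)] += 1
    total = len(dates) or 1  # avoid division by zero on an empty export
    return {bucket: 100 * n / total for bucket, n in counts.items()}
```

Last Modified Date moves on any edit, not only on enrichment, so this distribution is an optimistic proxy for how fresh the underlying field values are.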
Uniqueness
We ran a probabilistic duplicate analysis across the full database. Results:
- Exact-match duplicates (same email): 1,840 pairs (3.7%)
- Fuzzy-match duplicates (same person, different email/spelling): estimated 4,200 additional pairs (8.4%)
- Estimated total duplicate rate: 12.1%
HubSpot's native dedup tool had surfaced 340 pairs: fewer than one in five of the exact-match duplicates alone, and under 6% of the estimated 6,040 total pairs.
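To show the difference between the two duplicate classes, here is a simplified sketch using only the Python standard library. It is a stand-in, not the probabilistic matcher used in the audit: exact matches group on a normalized email, and fuzzy matches compare names within the same company using `difflib.SequenceMatcher`. The field names and the 0.85 threshold are assumptions.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def exact_duplicates(contacts):
    """Group contact ids that share the same normalized email address."""
    by_email = defaultdict(list)
    for c in contacts:
        email = (c.get("email") or "").strip().lower()
        if email:
            by_email[email].append(c["id"])
    return [ids for ids in by_email.values() if len(ids) > 1]


def fuzzy_duplicates(contacts, threshold=0.85):
    """Pair contacts at the same company whose names are near-identical.

    The 0.85 threshold is an assumed starting point, not a tuned value.
    """
    by_company = defaultdict(list)
    for c in contacts:
        by_company[(c.get("company") or "").strip().lower()].append(c)
    pairs = []
    for company, group in by_company.items():
        if not company:
            continue  # no blocking key, so skip rather than compare everyone
        for a, b in combinations(group, 2):
            score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
            if score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs
```

Grouping by company first keeps the pairwise comparison tractable on a database of this size. Fuzzy pairs should still go to a human review queue rather than being merged automatically.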
Validity
We ran their email list through a verification service:
- Valid: 71%
- Invalid (hard bounce risk): 8%
- Risky (catch-all, disposable, role-based): 14%
- Unknown: 7%
The 8% invalid rate meant approximately 4,000 email addresses in their database would hard-bounce on the next send. At their current send volumes, this was enough to push their bounce rate above the 2% ISP threshold.
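The arithmetic behind that risk is simple enough to check before any send. A minimal sketch, assuming every invalid address included in a send hard-bounces; the send size is a parameter because the actual send volumes are not stated here, and the example uses a hypothetical send to the whole database.

```python
ISP_BOUNCE_THRESHOLD = 0.02  # the 2% threshold cited above


def projected_bounce_rate(invalid_in_send: int, send_size: int) -> float:
    """Hard-bounce rate if every invalid address in the send bounces."""
    if send_size <= 0:
        raise ValueError("send_size must be positive")
    return invalid_in_send / send_size


# Hypothetical full-database send: ~4,000 invalid addresses out of ~50,000.
rate = projected_bounce_rate(4_000, 50_000)
print(f"{rate:.1%}", rate > ISP_BOUNCE_THRESHOLD)  # 8.0% True
```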
The Composite Score
We scored every contact on a 0-100 scale across all five dimensions, then assigned letter grades:
| Grade | Count | % of Database | Description |
|---|---|---|---|
| A (80-100) | 4,120 | 8.2% | Campaign-ready, fully enriched, fresh |
| B (65-79) | 14,280 | 28.4% | Usable with minor gaps |
| C (50-64) | 12,890 | 25.6% | Significant gaps, enrichment needed |
| D (30-49) | 11,450 | 22.8% | Unreliable for outbound use |
| F (0-29) | 7,507 | 14.9% | Unusable — suppress or archive |
Only 36.6% of their database (grades A and B) was reliably usable for revenue operations. The rest was consuming enrichment budget, inflating contact counts, degrading deliverability, and producing misleading reports.
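A composite score of this kind can be sketched as a weighted sum of per-dimension scores mapped onto the grade bands in the table. The bands below come from the table; the equal weights are an assumption, since the weighting is not stated in this write-up.

```python
# Grade bands taken from the table above: (minimum score, grade).
GRADE_BANDS = [(80, "A"), (65, "B"), (50, "C"), (30, "D"), (0, "F")]

# Equal weights are an assumption; the audit's actual weighting is not stated.
WEIGHTS = {
    "completeness": 0.2,
    "accuracy": 0.2,
    "freshness": 0.2,
    "validity": 0.2,
    "uniqueness": 0.2,
}


def composite_score(dimension_scores: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


def letter_grade(score: float) -> str:
    """Map a 0-100 composite score to the A-F bands."""
    for minimum, grade in GRADE_BANDS:
        if score >= minimum:
            return grade
    return "F"
```

For example, a contact scoring 100 on completeness, validity, and uniqueness, 50 on accuracy, and 0 on freshness lands at 70, a B.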
The Remediation: What We Did
Phase 1: Triage (Day 1)
Suppress the F-grade records. The 7,507 contacts scoring below 30 were immediately suppressed from all marketing lists and sales sequences. These records were consuming send capacity and deliverability capital with zero return. Suppression alone improved their effective campaign targeting by removing the worst 15% of their audience.
Flag the invalid emails. The 4,000+ invalid email addresses were flagged as Email Status = Invalid and excluded from all email workflows. This prevented the next campaign from pushing their bounce rate above the ISP threshold.
Merge the exact-match duplicates. The 1,840 exact-match duplicate pairs were merged using a primary record selection rule: keep the record with the richer engagement history. This immediately cleaned up attribution and reduced double-counting in reports.
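The primary record selection rule can be expressed as a small function. This is a sketch of the rule only, not of HubSpot's merge behavior: the engagement field names are hypothetical, and backfilling the primary's empty fields from the secondary record is an assumption of this example.

```python
ENGAGEMENT_FIELDS = ("email_opens", "email_clicks", "form_submissions", "meetings")


def engagement_depth(record: dict) -> int:
    """Count logged engagement events; the field names are hypothetical."""
    return sum(record.get(k, 0) for k in ENGAGEMENT_FIELDS)


def merge_pair(a: dict, b: dict) -> dict:
    """Merge two duplicates, keeping the more-engaged record as primary.

    Ties go to the first record. Empty fields on the primary are filled
    from the secondary so no populated value is lost in the merge.
    """
    if engagement_depth(a) >= engagement_depth(b):
        primary, secondary = a, b
    else:
        primary, secondary = b, a
    merged = dict(primary)
    for key, value in secondary.items():
        if merged.get(key) in (None, ""):
            merged[key] = value
    return merged
```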
Phase 2: Enrichment (Days 2-3)
We ran a waterfall enrichment pass across the remaining 42,740 active contacts, prioritizing by grade:
- C-grade contacts first (25.6% of database) — these had the highest ROI for enrichment because they were close to usable but missing 2-3 critical fields
- D-grade contacts second — lower probability of enrichment success but still worth attempting
- B-grade contacts third — filling minor gaps to push them to A-grade
The waterfall used four providers in sequence: Apollo for US-centric firmographic data, Cognism for European contacts, Hunter.io for email discovery, and ZeroBounce for email validation.
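The general waterfall pattern looks roughly like the sketch below. Providers are passed in as plain callables, so nothing here reflects any vendor's actual API; in practice each callable would wrap one provider's client, and the order could vary by contact region.

```python
from typing import Callable

# A provider takes a contact and returns a dict of field values it could find.
Provider = Callable[[dict], dict]


def waterfall_enrich(
    contact: dict,
    providers: list[tuple[str, Provider]],
    needed: tuple[str, ...],
) -> tuple[dict, dict]:
    """Try providers in order, stopping once every needed field is filled.

    Returns the enriched copy and a map of field -> provider that supplied it.
    """
    enriched = dict(contact)
    sources: dict[str, str] = {}
    for name, lookup in providers:
        missing = [f for f in needed if not enriched.get(f)]
        if not missing:
            break  # nothing left to fill, so skip the remaining paid calls
        found = lookup(enriched)
        for field in missing:
            if found.get(field):
                enriched[field] = found[field]
                sources[field] = name
    return enriched, sources
```

Recording which provider supplied each field makes it possible to compare provider hit rates later and reorder the sequence accordingly.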
Phase 3: Quality Gates (Day 4)
We implemented quality gates in their HubSpot workflows:
Campaign-ready gate: Before a contact is enrolled in any marketing email campaign, check: Email Status = Valid AND Job Title is known AND Company Name is known. Contacts that fail are routed to an enrichment workflow instead of the campaign.
Sequence-ready gate: Before a contact is enrolled in a sales sequence, check: Email Status = Valid AND Job Title is known AND Last Enriched Date is within 180 days. Stale contacts get re-enriched before sequence enrollment.
Routing-ready gate: Before lead routing fires, check: Country is known AND (Employee Count is known OR Industry is known). Missing fields trigger enrichment before routing, preventing exceptions.
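In HubSpot these gates live in workflow branches, but the boolean logic is easier to review when written out. The sketch below mirrors the three checks as described; the dictionary keys are hypothetical stand-ins for the corresponding contact properties.

```python
from datetime import date, timedelta


def known(contact: dict, field: str) -> bool:
    """A field counts as known when it holds a non-blank value."""
    return bool(str(contact.get(field) or "").strip())


def campaign_ready(c: dict) -> bool:
    return c.get("email_status") == "Valid" and known(c, "job_title") and known(c, "company_name")


def sequence_ready(c: dict, today: date, max_age_days: int = 180) -> bool:
    enriched = c.get("last_enriched_date")
    fresh = enriched is not None and (today - enriched) <= timedelta(days=max_age_days)
    return c.get("email_status") == "Valid" and known(c, "job_title") and fresh


def routing_ready(c: dict) -> bool:
    return known(c, "country") and (known(c, "employee_count") or known(c, "industry"))


def route(c: dict, gate) -> str:
    """Send passing contacts onward and failures to the enrichment workflow."""
    return "proceed" if gate(c) else "enrichment_workflow"
```

A contact with no Last Enriched Date at all fails the sequence gate here, which treats never-enriched records the same as stale ones.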
Phase 4: Ongoing Monitoring (Day 5+)
We set up the freshness tracking system described in our data decay strategy guide:
- Last Enriched Date custom property, updated automatically on every enrichment action
- Freshness distribution report on the RevOps dashboard
- Automated re-enrichment workflow triggered at 180-day freshness threshold for active contacts
- Monthly decay audit cadence
The Results: 90 Days Later
Email Performance
| Metric | Before | After (90 days) | Change |
|---|---|---|---|
| Hard bounce rate | 3.2% | 0.4% | -87% |
| Open rate | 17% | 28% | +65% |
| Click-through rate | 1.1% | 2.8% | +155% |
| Unsubscribe rate | 0.8% | 0.3% | -63% |
The deliverability recovery was the most dramatic improvement. By suppressing invalid addresses before they bounced and maintaining ongoing email validation, their sender reputation recovered within 6 weeks. The open rate improvement was partly deliverability-driven (more emails reaching the inbox) and partly targeting-driven (campaigns reaching the right people with accurate personalization).
Outbound Performance
| Metric | Before | After (90 days) | Change |
|---|---|---|---|
| Sequence reply rate | 1.8% | 6.2% | +244% |
| Meetings booked per SDR/month | 4.2 | 11.8 | +181% |
| Lead-to-meeting conversion | 3.1% | 8.7% | +181% |
The outbound improvement was driven by two factors: reps were now sequencing contacts with accurate job titles (so personalization was relevant) and fresh email addresses (so messages actually arrived). The SDR team stopped maintaining shadow spreadsheets within three weeks.
Pipeline Impact
| Metric | Before | After (90 days) | Change |
|---|---|---|---|
| MQL-to-SQL conversion | 12% | 23% | +92% |
| Pipeline generated/month | $380K | $720K | +89% |
| Routing exceptions/month | 340 | 28 | -92% |
The Lessons
Lesson 1: Volume is not value. 50,000 contacts sounds impressive. 18,400 campaign-ready contacts is the real number. Every metric — open rates, reply rates, conversion rates — improves when you stop diluting your campaigns with unusable records.
Lesson 2: Enrichment ROI is highest on C-grade records. The contacts that are close to usable but missing 2-3 fields produce the highest return per enrichment dollar. Focus enrichment on the B-to-A and C-to-B transitions, not on trying to rescue F-grade records.
Lesson 3: Quality gates prevent re-accumulation. Without gates, the database will degrade back to its previous state within 6-9 months. Automated quality checks at campaign enrollment, sequence enrollment, and routing entry points maintain the standard without manual effort.
Lesson 4: Sales trust follows data trust. The SDR team's behavior changed within weeks of seeing accurate data in HubSpot. When reps can trust that the job title, company, and phone number in the CRM are current, they use the CRM. When they can't, they use LinkedIn and spreadsheets. Data quality is a prerequisite for CRM adoption, not a consequence of it.
Ready to Score Your Database?
MarketingSoda Refine's free database health scan runs the same five-dimension audit we used in this case study. Connect your HubSpot via OAuth and receive an A-F grade distribution across your contact database in 60 seconds.