Retail & eCommerce, Nielsen, ODBC, KNIME

Case Study

Cracking Nielsen's ETL: From 20 Days to 3

Faster POS data: cutting Nielsen ETL from 20 days to 3.

P&G
Fortune 500 CPG
Founder's Track Record

While at P&G Canada, Arturo rebuilt the Nielsen POS ETL, bypassing the slow, failure-prone Nielsen UI with direct ODBC automation. Sales and planning teams got fresher data than even Walmart's own buyers had, turning a data bottleneck into a competitive advantage.

Key Results

85% Faster Data SLA: Nielsen POS data delivery cut from 20 days to 3
$5M Annual POS Uplift: Revenue driven by better promotional timing and faster market response
3 Category Captaincies: Walmart recognized P&G as category captain in Oral Care, Feminine Care, and Baby Care

The Transformation

Before -> After
20-day data SLA -> 3-day data SLA
Manual Nielsen UI downloads -> Automated ODBC pipeline
Frequent extraction failures -> Auto-retry with checksum validation
Analysts spending 3 days/week on extraction -> Zero-touch automated delivery
Stale data in category reviews -> Fresher data than Walmart buyers had

The Challenge

Nielsen POS data was the lifeblood of category management at P&G Canada, but the standard extraction pipeline was painfully slow. Downloads through Nielsen's UI took up to 20 days to complete — and that was when they worked. Frequent timeouts and session failures meant analysts spent hours babysitting extractions, restarting jobs, and reconciling partial datasets. By the time data reached sales and planning teams, it was already stale by two or three weeks.

The impact went beyond inconvenience. P&G's Walmart Canada team was losing ground in category reviews because competitors with faster data pipelines were making promotional and assortment decisions weeks ahead of P&G. Category managers couldn't respond to market shifts they couldn't see yet. The sales team knew intuitively that something was off at certain stores, but couldn't back it up with current numbers. Meanwhile, the analysts tasked with extracting the data were spending three days of their week just getting it out of Nielsen's system, time that should have gone to analysis.

Our Approach

We bypassed Nielsen's UI entirely and went straight to the source. We reverse-engineered Nielsen's backend database structure and established a direct ODBC connection — something their standard tooling didn't support or document. This required mapping their internal table relationships and understanding how Nielsen segmented data by geography, category, and time period at the database level.
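As a rough illustration of what pulling one geography, category, and time slice over a direct database connection can look like, here is a minimal sketch. It is not the original pipeline: the table `pos_facts`, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the ODBC connection so the example runs anywhere. The `?` placeholder style shown here is also the one pyodbc uses.

```python
import sqlite3

# Hypothetical table and column names; the real Nielsen schema is not
# described in this case study.
QUERY = """
    SELECT store_id, week_ending, units, dollars
    FROM pos_facts
    WHERE geography = ? AND category = ? AND week_ending BETWEEN ? AND ?
    ORDER BY week_ending, store_id
"""


def fetch_segment(conn, geography, category, start, end):
    """Pull one geography/category/time slice with a parameterized query."""
    cur = conn.cursor()
    cur.execute(QUERY, (geography, category, start, end))
    return cur.fetchall()


def make_demo_connection():
    """In-memory SQLite stand-in for the ODBC connection, with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pos_facts (geography TEXT, category TEXT, "
        "week_ending TEXT, store_id INTEGER, units INTEGER, dollars REAL)"
    )
    conn.executemany(
        "INSERT INTO pos_facts VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("ON", "Oral Care", "2020-01-04", 1, 10, 50.0),
            ("ON", "Oral Care", "2020-01-11", 1, 12, 60.0),
            ("ON", "Baby Care", "2020-01-04", 1, 7, 90.0),
            ("QC", "Oral Care", "2020-01-04", 2, 5, 25.0),
        ],
    )
    return conn
```

Calling `fetch_segment(make_demo_connection(), "ON", "Oral Care", "2020-01-01", "2020-01-31")` returns only the two Ontario Oral Care rows. Keeping the filters as query parameters means the same function can be reused for every segment.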

From there, we built an automated extraction pipeline using KNIME workflows running on dedicated VM infrastructure. Each workflow handled a specific data domain (POS volumes, dollar sales, distribution metrics, share of shelf) and ran as a cron-scheduled job that pulled data without manual intervention. The extraction logic was designed around Nielsen's data update cadence: we knew which tables refreshed on which days, so the pipeline prioritized the most time-sensitive metrics.
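The cadence-aware prioritization described above could be sketched along these lines. The actual pipeline used KNIME workflows, so this Python version only illustrates the idea; the refresh calendar and the priority order below are invented for the example and do not reflect Nielsen's real schedule.

```python
from datetime import date

# Hypothetical refresh calendar: weekdays (Monday=0) on which each data
# domain's source tables are assumed to update.
REFRESH_DAYS = {
    "pos_volumes": {0, 3},
    "dollar_sales": {0, 3},
    "distribution_metrics": {1},
    "share_of_shelf": {4},
}

# Hypothetical priority: a lower number means more time-sensitive.
PRIORITY = {
    "pos_volumes": 0,
    "dollar_sales": 1,
    "distribution_metrics": 2,
    "share_of_shelf": 3,
}


def domains_to_pull(today):
    """Return the domains that refreshed today, most time-sensitive first."""
    due = [name for name, days in REFRESH_DAYS.items() if today.weekday() in days]
    return sorted(due, key=PRIORITY.__getitem__)
```

A daily cron entry could call `domains_to_pull(date.today())` and trigger only the workflows it returns, so a job never runs against tables that have not changed.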

Reliability was as important as speed. We built retry logic and row-count validation checks into every extraction step so that failed or partial pulls recovered automatically instead of requiring manual restarts. A checksum comparison against the previous extract caught cases where Nielsen updated data retroactively — which happened more often than you'd expect. We also designed the output format to plug directly into P&G's internal reporting templates, eliminating the reformatting step that previously added another full day to every cycle.
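A minimal sketch of the retry, row-count, and checksum checks described above, using only the Python standard library. The function names, the `expected_rows` argument (assumed to come from something like a separate count query), and the backoff settings are assumptions made for illustration, not the original implementation.

```python
import hashlib
import time


def checksum(rows):
    """Order-independent SHA-256 digest of a list of row tuples."""
    digest = hashlib.sha256()
    for line in sorted(repr(row) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def extract_with_retry(pull, expected_rows, max_attempts=3, delay_s=0.0):
    """Call pull() until it returns expected_rows rows; raise after max_attempts."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            rows = pull()
            if len(rows) == expected_rows:
                return rows
            # A short result is treated as a partial pull and retried.
            last_error = ValueError(
                f"partial pull: got {len(rows)} of {expected_rows} rows"
            )
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc
        if attempt < max_attempts:
            time.sleep(delay_s * attempt)  # simple linear backoff
    raise RuntimeError(f"extraction failed after {max_attempts} attempts") from last_error


def restated(previous_rows, current_rows):
    """True if the data for the same period changed since the previous extract."""
    return checksum(previous_rows) != checksum(current_rows)
```

Here `pull` is any zero-argument callable that returns the extracted rows. Comparing `checksum` values for the overlapping period flags retroactive restatements without diffing every row by hand.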

Once the pipeline was stable, we set up weekly distribution of refreshed datasets to the entire P&G Walmart Canada team — sales leads, category managers, and supply chain analysts all received current, validated data on the same cadence. Instead of each analyst running their own extraction and getting different results depending on when they pulled, everyone worked from the same dataset. We also built category-specific views that pre-filtered the data for each analyst's scope, cutting their setup time to zero.
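The category-specific views can be pictured as one shared dataset filtered per analyst scope. The sketch below assumes rows are dictionaries with a `category` key and uses made-up analyst names and scopes; the original views were built inside P&G's own tooling, which this case study does not detail.

```python
from collections import defaultdict


def build_views(rows, scopes):
    """Split one validated dataset into per-analyst views by category scope."""
    by_category = defaultdict(list)
    for row in rows:
        by_category[row["category"]].append(row)
    return {
        analyst: [row for cat in categories for row in by_category.get(cat, [])]
        for analyst, categories in scopes.items()
    }


# Hypothetical scopes and sample rows, for illustration only.
EXAMPLE_SCOPES = {
    "analyst_a": ["Oral Care"],
    "analyst_b": ["Feminine Care", "Baby Care"],
}
EXAMPLE_ROWS = [
    {"category": "Oral Care", "store_id": 1, "units": 10},
    {"category": "Baby Care", "store_id": 1, "units": 7},
    {"category": "Feminine Care", "store_id": 2, "units": 4},
]
```

Because every view is derived from the same validated extract, two analysts looking at the same category always see identical numbers.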

The Outcome

The SLA dropped from 20 days to 3 — an 85% reduction in data latency. But the real impact was downstream. Sales teams could respond to market movements within days instead of weeks. Category managers walked into Walmart reviews with data that was fresher than what Walmart's own buyers had access to — a significant competitive edge in a retail environment where data advantage directly translates to shelf influence.
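For reference, the 85% figure follows directly from the two SLA values, and the same numbers can be read as a speed-up of a little under seven times:

```latex
\[
\frac{20 - 3}{20} = \frac{17}{20} = 0.85 = 85\%,
\qquad
\frac{20}{3} \approx 6.7
\]
```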

That advantage contributed directly to P&G securing Walmart Category Captaincy in Oral Care, Feminine Care, and Baby Care — the retailer's recognition that P&G was the most analytically capable partner in those categories. Category Captaincy meant P&G helped Walmart make assortment and shelving decisions for the entire category, not just their own products.

The faster insights translated to roughly $5M in annual POS uplift, driven by better promotional timing, smarter assortment recommendations, and faster reaction to competitor moves. Analysts who had been spending 3 days a week on data extraction redirected that time to actual analysis — the kind of deep-dive work that informed quarterly business reviews and annual planning. The pipeline ran for years after initial deployment with minimal maintenance, becoming part of the team's invisible infrastructure.

What the Client Says

Arturo rapidly learned new technologies and applied them to business problems—redesigned our bulk data extraction to be 10× faster within two months, then enabled others to scale the approach.

Aamir Mohatarem

Senior Manager - Business Analytics, Procter & Gamble

Ready to turn data into decisions?

Let's discuss how Clarivant can help you achieve measurable ROI in months.