Microsoft's marketing and sales teams gather a tremendous amount of data, much of it unstructured and complex. To generate value from this data, it is critical to identify it and integrate it with the CRM systems. The measurement of ROI also depends on this data. However, inconsistency and the lack of a data architecture resulted in highly inaccurate and missing data, which often lacked the essentials needed by sales teams and undermined the efforts of marketers and sellers. Urgent corrective action was required to fill the database gaps without compromising data privacy and compliance.
Idea and Solution
To design the architecture and the matching algorithm, it was important to study all available data sources and to analyze and collate a list of unique identifiers, i.e. email domain mappings and top parent relationships. First, an Excel template was designed to take in the raw data in a standardized format. The template includes a macro that provides validation and mistake-proofing and matches the column headers of the raw data file to the standard column names required by the package. Next, an SSIS package was designed to take the raw data and the mapping files as input and process them for matching at two levels: a 100% email domain match and an 85% company name fuzzy match. Finally, a customized SQL script was created to debug, troubleshoot, and audit the matching results.
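The two matching levels described above can be sketched in Python. This is a minimal illustration, not the SSIS package itself: the mapping data, the function name match_record, and the use of difflib's similarity ratio as a stand-in for the fuzzy matching algorithm are all assumptions, since the source does not show the real mapping files or say how the 85% score is computed.

```python
from difflib import SequenceMatcher

# Hypothetical mapping data; the real mapping files are not shown in the source.
DOMAIN_TO_ACCOUNT = {"contoso.com": "Contoso Ltd", "fabrikam.com": "Fabrikam Inc"}
ACCOUNT_NAMES = ["Contoso Ltd", "Fabrikam Inc", "Northwind Traders"]
FUZZY_THRESHOLD = 0.85  # the 85% level named in the text


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not affect the score."""
    return " ".join(name.lower().split())


def match_record(email, company):
    """Return (account, level), where level is 'domain', 'fuzzy', or None."""
    # Level 1: exact (100%) match on the email domain.
    domain = email.rsplit("@", 1)[-1].strip().lower() if "@" in email else ""
    if domain in DOMAIN_TO_ACCOUNT:
        return DOMAIN_TO_ACCOUNT[domain], "domain"

    # Level 2: fuzzy match on company name (assumed scoring: difflib ratio).
    best_name, best_score = None, 0.0
    for candidate in ACCOUNT_NAMES:
        score = SequenceMatcher(None, normalize(company), normalize(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    if best_score >= FUZZY_THRESHOLD:
        return best_name, "fuzzy"
    return None, None
```

Running the exact domain match first means the more expensive fuzzy comparison is only reached by records that the domain mapping cannot resolve, and recording the level alongside each match gives an audit step something to filter on.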
A set of data pipelines was built to validate the data, apply business logic, and minimize the loss of critical data.
The package has a built-in feature that automatically deletes all PII and other personal information from the databases after package execution, in order to comply with privacy standards such as GDPR.
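The purge-after-execution behaviour can be sketched as follows. This is a sketch under stated assumptions rather than the package's actual cleanup step: it uses an SQLite connection as a stand-in for the real database, and the table name staging_leads, the column names, and the functions purge_pii and run_package are all hypothetical. It also blanks the personal columns while keeping the matched result, which is one possible reading of "deleted"; removing whole rows would be another.

```python
import sqlite3

# Assumed names of the personal-data columns; not taken from the source.
PII_COLUMNS = ("email", "first_name", "last_name", "phone")


def purge_pii(conn, table="staging_leads"):
    """Set the assumed PII columns to NULL and return the number of rows updated."""
    assignments = ", ".join(f"{col} = NULL" for col in PII_COLUMNS)
    cur = conn.execute(f"UPDATE {table} SET {assignments}")
    conn.commit()
    return cur.rowcount


def run_package(conn, process):
    """Run the processing step, then purge PII even if that step raises."""
    try:
        return process(conn)
    finally:
        purge_pii(conn)
```

Placing the purge in a finally block means personal data is cleared whether the processing step succeeds or fails, so a failed run does not leave PII behind in the staging table.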
This packaged product has been a milestone in improving data quality. It not only improves the quality of leads for the CRM systems but also allows better measurement of ROI, resulting in better marketing campaign targeting and data-driven decision making. Moreover, the package, initially created for one geographical area, has been expanded globally to take advantage of best-practice sharing.