Do you know where your data really comes from? (It’s ok… not many people do.) We’ve seen it time and time again: business leaders think they’re purchasing data that’s been triangulated from multiple sources for accuracy, or they believe the only way to get pristine data is to pay $4.00 a pop for hand-cleansed records. There are some common misconceptions out there about where data comes from (especially business data), and we’re here to clear them up.
We created this 3-part blog series to help demystify how vendors actually collect business data, as well as examine the dramatic performance gap between traditional and modern methods.
In Part 1 of the series, we define the techniques used to gather business data. Part 2 covers common misconceptions about data sourcing. Part 3 explores how AI and deep machine learning are revolutionizing the game. Let’s dive in:
1. Surveys & Call Centers
Whether it’s a phone call or an email, businesses are contacting people in various ways to gather and verify information like employment, company revenue, locations, and service needs.
Some data providers have mutually beneficial partnerships with these call centers: the providers supply contacts to call on, and in return, they receive verified information back.
2. Crowdsourcing
With crowdsourcing, members of a community contribute to a collective pool of information (think Wikipedia). These platforms often include gamification to encourage members to contribute. The contributions are triangulated and, applying the ‘Wisdom of the Crowd’ statistical principle, a consensus answer is derived.
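To make the ‘Wisdom of the Crowd’ idea concrete, here is a minimal sketch (the function name and sample figures are hypothetical, not from any vendor): combining independent crowd submissions with a median is robust to a few wildly wrong answers.

```python
from statistics import median

def crowd_estimate(submissions):
    """Combine independent crowd submissions into a single answer.

    Using the median rather than the mean keeps the result robust
    to a few wildly wrong contributions (outliers).
    """
    return median(submissions)

# Hypothetical revenue estimates (in $M) submitted by five contributors;
# note the single outlier at 50.0
estimates = [12.0, 10.5, 11.0, 50.0, 11.5]
print(crowd_estimate(estimates))  # → 11.5, unaffected by the outlier
```

The mean of the same list would be pulled up to 19.0 by the outlier, which is why crowd-aggregation systems typically favor outlier-resistant statistics.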
3. Data Aggregation
Data aggregation is when a provider purchases data from multiple sources, uses triangulation to attempt verification, then sells the data along with value-adding professional services.
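A simple way to picture triangulation across purchased sources is a field-by-field majority vote. The sketch below is illustrative only (the record layout and vendor data are invented): fields where at least two sources agree are treated as verified, and fields with no agreement are left unresolved.

```python
from collections import Counter

def triangulate(records):
    """Merge one company's record from several purchased sources,
    keeping, for each field, the value most sources agree on.

    A value is only accepted if at least two sources report it;
    otherwise the field is marked None (unverified).
    """
    merged = {}
    fields = {field for rec in records for field in rec}
    for field in fields:
        values = [rec[field] for rec in records if field in rec]
        value, votes = Counter(values).most_common(1)[0]
        merged[field] = value if votes > 1 else None
    return merged

# Hypothetical records for one company bought from three vendors
sources = [
    {"name": "Acme Corp", "employees": 250},
    {"name": "Acme Corp", "employees": 240},
    {"name": "ACME Corporation", "employees": 250},
]
print(triangulate(sources))  # {'name': 'Acme Corp', 'employees': 250}
```

Real aggregators use fuzzier matching (e.g. treating "Acme Corp" and "ACME Corporation" as the same entity), but the core idea is the same: agreement across sources stands in for ground truth.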
4. Website Scraping
Web scraping extracts structured data from a company’s website, such as the home page, body copy, and About page, and also crawls HTML meta information from various web pages.
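As a rough sketch of what "crawling HTML meta information" means, the snippet below pulls the page title and named meta tags out of a toy About page using Python's standard-library parser (the class name and sample HTML are hypothetical; a real scraper would first fetch pages over HTTP).

```python
from html.parser import HTMLParser

class MetaScraper(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> tags,
    the kind of HTML metadata a business-data scraper harvests per page."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs and "content" in attrs:
            self.meta[attrs["name"]] = attrs["content"]
        elif tag == "title":
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = data.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

# A toy About page standing in for a fetched document
html = """<html><head>
<title>Acme Corp - About</title>
<meta name="description" content="Acme Corp makes widgets.">
</head><body>...</body></html>"""

scraper = MetaScraper()
scraper.feed(html)
print(scraper.meta)
```

The scraper ends up with `{'title': 'Acme Corp - About', 'description': 'Acme Corp makes widgets.'}`; multiplied across millions of pages, this is the raw material scraping-based vendors work from.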
5. AI & Deep Machine Learning
The human brain is built to observe, think abstractly, and find patterns, but it can’t do so at high volume. Machines, on the other hand, can compute at superhuman speeds, but they require initial and periodic human input to understand the patterns involved.
With modern data sourcing, machines are taught to crawl the web, extract relevant information, and model the thinking of business professionals. The system applies logic, semantics, principles, and theories to fill in knowledge gaps where necessary. Humans stay in the loop to QA the algorithm (not the data itself), enabling the collection of highly accurate data at scale.