Experts Utilize The World Wide Web To Build The Ultimate B2B Reference Set

In data science, a ‘match’ is only as good as the reference set being used to render it. If a match gets recorded against an inaccurate reference set, then significant gaps in business insights arise.

Data vendors have historically matched accounts using a traditional approach called ‘fuzzy matching’. However, an innovation powered by machine learning and natural language processing is revolutionizing the way businesses organize and leverage data across the company.

Below are details about this critical data management topic, including definitions, the unique challenges of B2B account matching, expert commentary, and the innovation changing the world of B2B data.

An Overview of Data Matching

Data matching is the process of identifying, mapping, and merging records from two or more sources of information that describe the same entity, resulting in a single record with an updated view of that entity.

In an organization’s CRM, typically an account record will display a variety of information that is leveraged across all business units, to make strategic and tactical decisions. Internal data sources include transactional points like finance, credit, web forms, a marketing automation system, and Salesforce, whereas external data sources include EverString, ZoomInfo, D&B, Circle Back, and others.

To understand account matching, here is a common example:

  1. The database has an existing account record titled “ACME, Corporation”.
  2. An incoming data stream has new information relevant to this account, so it attempts to locate and update the record.
  3. If the new data source is labeled slightly differently (such as “The ACME Corp” instead of “ACME, Corporation”), matching ensures those entities are still linked together to avoid duplication.
  4. If it does not find an existing match, it will create a new account record and fill in the fields as much as possible.
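
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's actual method: the `NOISE_WORDS` list, the `normalize` helper, and the `upsert_account` function are all hypothetical names, and real matching systems use far richer signals than a cleaned-up company name.

```python
import re

# Assumption: a small, hand-picked list of filler words and legal suffixes
# to ignore when comparing company names.
NOISE_WORDS = {"the", "inc", "corp", "corporation", "co", "llc", "ltd"}

def normalize(name: str) -> str:
    """Lowercase the name, drop punctuation, and remove noise words."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(t for t in tokens if t not in NOISE_WORDS)

def upsert_account(accounts: dict, incoming: dict) -> str:
    """Update the matching account if one exists, otherwise create it."""
    key = normalize(incoming["name"])
    if key in accounts:
        # Steps 2-3: a match was found, so refresh its fields
        # but keep the existing account name.
        fields = {k: v for k, v in incoming.items() if k != "name"}
        accounts[key].update(fields)
        return "updated"
    # Step 4: no match, so create a new account record.
    accounts[key] = dict(incoming)
    return "created"

accounts = {}
first = upsert_account(accounts, {"name": "ACME, Corporation", "employees": 1500})
second = upsert_account(accounts, {"name": "The ACME Corp", "employees": 3000})
```

Here the second record is linked to the first rather than duplicated, because both names reduce to the same normalized key.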

The Critically-Important Reference Set

Any matching process requires a ‘reference set’. The reference set is like the DNA blueprint for a database. It represents the master source that any incoming data is verified against.

Consider the difference between fingerprint matching technology and DNA testing. The former method is useful at helping identify someone, but the latter provides an unsurpassed, submicroscopic level of granularity. Just like DNA contains trace bits of code that make each human unique, the internet has trace bits of data that tell a unique story about each business, each industry, and each market.

When most data vendors discuss matching, they are using a fairly static reference set. This reference set is likely being maintained with human and machine inputs, to varying degrees of accuracy and coverage. In today’s fast-paced world, business data is constantly evolving. The best, most real-time source of information today is the world wide web.

Traditional Approach: A no-win compromise between accuracy and coverage

Under a traditional approach, the reference set being used is significantly limited. The source is often form-fills, webinar attendance, or surveys. Although it may include millions of records, an outdated reference set could also include numerous dead or shell companies, resulting in suboptimal accuracy and coverage rates.

Modern Method: The World Wide Web renders a 99.8% match-rate

Instead of relying on traditional methods, smart matching uses the ultimate reference set: the world wide web. Trace bits of data are crawled and extracted from search engine results pages, then gathered and synthesized to produce the most accurate, up-to-date reference set from which the best data matching can occur.

For example, imagine a business focuses on selling to enterprise-level accounts, so it has a filter set in its marketing automation system for 2,000 employees or more. Then, this week Forbes Magazine reports a massive surge in hiring at ACME, Inc., doubling its workforce from 1,500 to 3,000 employees.

  • With traditional data matching, the organization will miss this opportunity to sell to ACME, Inc. because the account data does not show an accurate employee count that reflects present-day marketplace news.
  • With modern matching techniques, the database will be enriched with this new target account, along with a multitude of other relevant business information based on the organization’s target industry, trends, news, and advanced machine-learning algorithms trained to think like executive business leaders.
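
The ACME, Inc. scenario comes down to a simple threshold check on the employee count. The sketch below uses the figures from the example; the field names and the `is_enterprise_target` helper are hypothetical and only illustrate how a stale value keeps an account out of the filter.

```python
# Filter from the example: enterprise accounts have 2,000 employees or more.
ENTERPRISE_MIN_EMPLOYEES = 2000

def is_enterprise_target(account: dict) -> bool:
    """Return True if the account passes the enterprise employee filter."""
    return account["employees"] >= ENTERPRISE_MIN_EMPLOYEES

# Stale record: still shows the headcount from before the hiring surge.
stale = {"name": "ACME, Inc.", "employees": 1500}

# Refreshed record: same account, enriched with the current headcount.
refreshed = {**stale, "employees": 3000}

stale_passes = is_enterprise_target(stale)          # filtered out
refreshed_passes = is_enterprise_target(refreshed)  # now a target account
```

The account is identical in both cases; only the freshness of the reference data decides whether it ever reaches the sales team.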

Fuzzy vs. Smart Matching

The Current State of Matching

Fuzzy matching is an approximate string-matching technique used to produce a match when an identical pair cannot be found. In place of an exact match, fuzzy matching assigns each candidate a similarity percentage and picks the closest possible entity. Fuzzy matching is not ideal because that estimate is less reliable than what data scientists can now achieve with machine-learning advancements.
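
To make the percentage idea concrete, here is a rough sketch of fuzzy matching using `difflib.SequenceMatcher` from the Python standard library. The `best_fuzzy_match` function and the 0.8 default threshold are assumptions chosen for illustration; production systems typically use other similarity measures and tuned cutoffs.

```python
from difflib import SequenceMatcher

def best_fuzzy_match(name, candidates, threshold=0.8):
    """Return (closest candidate, score), or (None, score) below the threshold."""
    if not candidates:
        return None, 0.0
    # Score every candidate with a similarity ratio between 0.0 and 1.0.
    scored = [
        (SequenceMatcher(None, name.lower(), c.lower()).ratio(), c)
        for c in candidates
    ]
    score, match = max(scored)
    # Accept the closest candidate only if it clears the cutoff.
    if score >= threshold:
        return match, score
    return None, score

# Usage: the result depends entirely on the chosen threshold.
match, score = best_fuzzy_match(
    "ACME Corp", ["ACME Corporation"], threshold=0.7
)
```

The score is an estimate based purely on character overlap, so two unrelated companies with similar names can outscore two differently written names for the same company.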

What’s Coming Next With Smart Matching

The World Wide Web is full of information, both business and non-business related. With AI and machine learning, smart matching filters what is relevant for a business and what is not and then finds patterns from the bits of data from all pages indexed on the web. Those patterns are then linked to the company entities through advanced algorithms, with a level of accuracy never seen before.

Smart matching happens when you can teach a machine to answer the question ‘what would a human being do to come to the conclusion that this data belongs to a particular company?’

The Vital Difference Between Data Providers

Traditional Data Resellers

Value-added data resellers purchase bulk data stores from various sources, enrich them, and then resell the data in various segments at high fees. Along with the data, they typically offer a customer data platform (CDP) or predictive marketing tool to help add value.

Although these resellers offer professional services to help teams operationalize the information, the underlying data quality problem is not solved. When data is verified using a flawed reference set, it may still register a match even though it is pairing up to previously incorrect information.

Datapoint triangulation does not address the ultimate problem either. Although triangulation can improve accuracy, if the reference set is erroneous, then the match will be flawed, even though it is recorded as ‘complete’. Examples of value-added data resellers include D&B, DiscoverOrg, ZoomInfo, Experian, and Equifax.

Modern Data Sourcing

The optimal source for a reference set is one that is used and continuously refreshed by the entire world. Modern data providers leverage this up-to-date information from the World Wide Web, along with machine-learning technology that has been taught how to model the evaluation process of leading business executives.

Using business-oriented natural language processing (NLP), models are developed and machines are deployed to crawl, extract, and synthesize bits of data left on the web.

Even if a business does not have a website yet, it still has a digital presence. Smart matching technology leverages any indexed page on the world wide web to extract data, even if the business has no actual company website or domain.

By the numbers, EverString covers:

  • 10M companies with a website in the U.S.
  • 5M companies with some digital presence in the U.S.
  • 12M companies without a website, but still some digital presence in the U.S.
  • 60M dead-to-shell companies in the U.S.
  • 10M companies across the rest of the world

Results of Smart Account-Matching

Uncover New Opportunities

Fuzzy matching excludes a significant portion (between 30% and 70%) of potential high-fit accounts, meaning those potential customers will never exist in the company’s ecosystem. Even if the entity does exist, no data enrichment will occur since an outdated reference set is used. In other words, current information is excluded and therefore never triggers the account into a marketing nurture or sales activation campaign.

Smart matching leverages the most current reference set, the World Wide Web, to generate a 99.8% match-rate, helping businesses stay on top of new sales opportunities that otherwise would have gone unexplored.

Unified, 360-Degree Customer View

Smart account-matching creates a comprehensive and united view of an account, including both internal and external data at any level of granularity needed. The result is a robust data foundation that includes tens of thousands of different account attributes and buyer behavior patterns, spanning the entire organization.

Q&A With EverString COO

We sat down with EverString’s Chief Operating Officer, Amit Rai, to answer frequently asked questions about the innovation in data management that is transforming how businesses leverage data.

Read the full Q&A here >>

In Closing

Data matching plays a critical role in the health of a company database. When matching errors occur, duplicate or inaccurate data is funneled into the business, causing leaders to miss significant insights and revenue opportunities.

Smart matching technology transforms how data matching happens by using the trillions of digital traces consumers and businesses leave behind on the World Wide Web to ensure a near-perfect match across a company’s digital ecosystem. The result is a significantly more comprehensive and robust data foundation, full of continuously enriched attributes that drive business insights from a centralized source of truth.

Learn More About Smart Matching

See how experts have developed the ultimate machine learning reference source for B2B data.

Download this White Paper and learn about:

• Data matching science terminology
• The critically-important data reference set
• The vital difference in data providers
• Results of smart account-matching
• Answers to frequently asked questions (FAQ)
