What is Data Mining?

You may have noticed that sometimes, the Internet seems almost psychic. Facebook knows who your friends are before you add them, and Google ads suggest products and services you actually need. You may visit a website for the first time, only to find that the sidebar ads know where you live and are suggesting restaurant deals in your area.

Contrary to appearances, these companies don't have crystal balls. They're using the magic of data mining to apply the information they do have about you and make extraordinarily educated guesses.

What is data mining?

The details of data mining are pretty complex, but at its core, it's the process of gathering vast amounts of data and then extracting useful patterns from it. The algorithms involved draw on statistics and machine learning, and in the right hands they can produce marketing gold for businesses.

Data mining gathers and sorts through thousands, millions, or even billions of data points. This large-scale information discovery can be either descriptive (summarizing what is already in the data) or predictive (estimating what is likely to happen next), and can be used to detect one or more of several different types of patterns:

  • Anomaly detection
  • Association learning
  • Classification
  • Cluster detection
  • Regression

One of these things is not like the others

Anomaly detection looks for data points that deviate from an established norm, such as a purchase that doesn't fit a customer's usual spending. This type of data mining is often used as part of fraud defense. Credit card companies use anomaly detection to flag suspicious transactions, which can then be verified with the cardholder before processing.

While anomaly detection isn't commonly used from a marketing standpoint, it's definitely a useful tool for protection. This process makes it easier to pinpoint suspicious activity and prevent possible disaster.
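
To make the idea concrete, here is a minimal sketch of one simple way to spot an out-of-character purchase. It assumes a rule of thumb the article doesn't specify: a new charge is suspicious if it sits more than three standard deviations from the cardholder's average. The purchase amounts and the function name flag_anomalies are made up for illustration, and real fraud systems weigh far more than the amount.

```python
from statistics import mean, stdev

def flag_anomalies(history, new_amounts, threshold=3.0):
    """Return the new amounts that sit more than `threshold` standard
    deviations away from the average of the past purchases."""
    baseline = mean(history)
    spread = stdev(history)  # sample standard deviation of past purchases
    if spread == 0:
        # Every past purchase was identical, so any different amount stands out
        return [amount for amount in new_amounts if amount != baseline]
    return [amount for amount in new_amounts
            if abs(amount - baseline) / spread > threshold]

# Hypothetical purchase history, in dollars (not real data)
history = [20, 35, 18, 42, 27, 31, 25]
print(flag_anomalies(history, [30, 2500]))  # [2500]
```

The $30 charge looks like business as usual, while the $2,500 one is nowhere near the cardholder's normal range and gets flagged.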

If you like this, try that

Anyone who's bought something from Amazon is familiar with the effects of association learning through data mining. Amazon doesn't disclose its algorithms (and probably encourages the rumors that it has a team of programmers changing them every 30 minutes or so), but the merchant giant uses association learning to make personalized online recommendations.

Even without Amazon's zealously guarded algorithm secrets, the "if you like X, you'll like Y" formula that can be derived from association learning can benefit any business. With a plethora of products to choose from, consumers often appreciate a nudge in a direction that interests them.
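
As a rough sketch of how the "if you like X, you'll like Y" idea can work, the example below counts how often other products show up in the same shopping baskets as a given item. This is not Amazon's method, which isn't public. The baskets and the function name recommend are invented for illustration, and the score is the standard "confidence" measure: the share of baskets containing X that also contain Y.

```python
from collections import Counter

def recommend(baskets, item):
    """Rank other items by how often they appear in baskets containing `item`.

    Each score is the share of baskets with `item` that also hold the
    other product (the confidence of the rule item -> other)."""
    with_item = [basket for basket in baskets if item in basket]
    if not with_item:
        return []  # nobody bought this item, so there is nothing to learn from
    together = Counter(other for basket in with_item
                       for other in basket if other != item)
    scored = [(other, count / len(with_item))
              for other, count in together.items()]
    # Highest confidence first; ties broken alphabetically for stable output
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical shopping baskets (not real data)
baskets = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"sleeping bag", "pillow"},
    {"tent", "sleeping bag", "pillow"},
]
print(recommend(baskets, "tent"))
# [('sleeping bag', 0.75), ('lantern', 0.5), ('pillow', 0.25)]
```

Three out of four tent buyers also bought a sleeping bag, so that becomes the first suggestion.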

People who buy car insurance like coffee mugs

With oceans of data to sort through, cluster detection is an essential form of data mining that recognizes sub-categories or distinct clusters that nobody defined in advance, and that people reading through piles of reports would otherwise miss. This type of data mining can point out purchasing habits among certain groups, providing an excellent basis for targeted marketing.
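
The article doesn't name a specific clustering method, so the sketch below uses k-means, one common choice, as an assumption. It repeats two steps until nothing changes: assign each point to its nearest cluster center, then move each center to the average of its points. The data points, the starting centers, and the function name kmeans are all illustrative.

```python
from math import dist

def kmeans(points, centroids, max_iter=100):
    """Group points around the given starting centroids.

    Returns (labels, centroids), where labels[i] is the cluster of points[i]."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        # Step 1: assign every point to its nearest centroid
        new_labels = [min(range(len(centroids)),
                          key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we are done
        labels = new_labels
        # Step 2: move each centroid to the average of its points
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids

# Hypothetical customers as (visits per month, items per order)
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = kmeans(points, centroids=points[:2])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Even though both starting centers sit among the occasional shoppers, the loop ends up separating them from the frequent buyers without being told that those two groups exist.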

Separating the wheat from the chaff

Classification sorts new items into pre-determined categories, using a structure learned from examples that have already been labeled. This type of data mining makes things like automated email folder routing possible. For example, spam filters use sophisticated classification algorithms to weed out messages asking you to buy Viagra or donate large sums of money to Nigerian princes.
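
For a feel of how a spam filter can learn its categories, here is a toy version of naive Bayes, a classic technique for this job. It is an assumption on our part, not a description of any particular email provider's filter. The training messages and the function names train and classify are invented, and a real filter would learn from many thousands of messages.

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label). Returns the counts the classifier needs."""
    word_counts = {}          # label -> Counter of words seen under that label
    label_counts = Counter()  # label -> number of training messages
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the label with the highest naive Bayes score for `text`."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # how common the label is overall
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word doesn't zero out the score
            score += math.log((word_counts[label][word] + 1)
                              / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled messages (not real data)
model = train([
    ("buy cheap pills now", "spam"),
    ("send money to claim prize", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch at noon tomorrow", "ham"),
])
print(classify(model, "claim cheap prize"))     # spam
print(classify(model, "monday lunch meeting"))  # ham
```

The filter never sees a rule like "prize means spam." It only learns which words tend to show up under which label, then applies that to new mail.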

Learning from the past

With regression, data from past behavior is collected and used to model how certain factors relate to an outcome, which makes it possible to predict your future actions. Again, the algorithms are complex, but Facebook uses regression data mining to weigh certain factors and pinpoint new behaviors to encourage, or features to offer, though there might have been an element of anomaly detection behind the decision to introduce Timeline.
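
The simplest form of regression fits a straight line through past observations and extends it forward. The sketch below uses ordinary least squares with a single factor, which is an assumption for illustration and far simpler than anything a social network would run. The weekly figures and the function name fit_line are made up.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How x and y vary together, divided by how much x varies on its own
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: week number vs. posts a user made that week
weeks = [1, 2, 3, 4, 5]
posts = [3, 5, 7, 9, 11]

slope, intercept = fit_line(weeks, posts)
prediction = slope * 6 + intercept  # estimate for week 6
print(slope, intercept, prediction)  # 2.0 1.0 13.0
```

Real data is never this tidy, but the principle holds: learn the trend from the past, then project it onto a case you haven't seen yet.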

How does data mining factor into your life?

If you spend any time online, whether for business or pleasure, you're affected by data mining. Companies that want your business use your information in various ways, including:

  • Targeted advertising, such as related products and geographical information
  • Spam that is sent to your email address when you sign up for related services
  • Phone calls from survey companies or lead generation firms using data harvested online
  • Snail mail, including offers related to things you've expressed interest in online
  • Friend suggestions through social media sites like Facebook, Twitter, and Google+
  • Police and security profiling, which sometimes relies on Internet data to identify suspicious activity like credit card fraud and illegal downloading

Data mining practices represent a good reason to protect your privacy online. Never give out personal information to an untrusted source, and avoid posting your email address, phone number, or mailing address on public websites. This can help you avoid spam, junk mail, and other forms of targeted advertising—including those eerily prescient banner ads.
