Data vs. Information

Learning Outcomes

Define and distinguish between “data” and “information”

Many people assume that the terms “data” and “information” are interchangeable. However, there is a distinct difference between the two. Data can be any character, text, word, or number and, if not put into context, means little or nothing to a human. Information, by contrast, is data formatted in a manner that allows human beings to use it in some significant way. An individual has an almost unlimited amount of data associated with them. This data is of little use to a business in its raw, unorganized form. It is not until the data is formatted or compiled into something meaningful that the business has information about the individual.

For example, suppose the department store Big Box is collecting data about its customers from a loyalty card program and online customer surveys. It collects the following data about a particular customer:

  • Age: 34
  • Big Box Account #: 123456
  • Gender: Female
  • Zip Code: 22322
  • Children: 2
  • Marital Status: Married
  • Last Purchase: Jogging Pants

These pieces of data alone are not particularly useful to Big Box. It is not until the data is compiled that Big Box begins to get a “picture” of the customer behind account #123456. Transforming this data into information, Big Box can infer that this customer is a married female who has 2 children and enjoys jogging. Because she lives in zip code 22322, Big Box also knows that she is most likely to shop at its Halifax Mall store, since the mall is in the same zip code as her home address. If Big Box wants to market to her successfully, it will use this information to include her in an upcoming activewear promotion. Since she has children, it will also include her in promotions that feature children’s wear. The key to collecting data and turning it into useful information is that it is a continual process.
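
As a rough illustration of the step from data to information, the sketch below takes the raw fields listed above and compiles them into a short customer profile with a list of promotions. The field names, the `build_profile` function, and the two lookup tables are assumptions invented for this example; they are not a description of how any real retailer's system works.

```python
# Raw data: individual facts with no context (values from the example above).
raw_data = {
    "age": 34,
    "account": "123456",
    "gender": "Female",
    "zip_code": "22322",
    "children": 2,
    "marital_status": "Married",
    "last_purchase": "Jogging Pants",
}

# Hypothetical lookup tables, invented for this illustration.
STORES_BY_ZIP = {"22322": "Halifax Mall"}
INTEREST_BY_ITEM = {"Jogging Pants": "jogging"}


def build_profile(data):
    """Compile raw fields into information a marketer can act on."""
    promotions = []
    interest = INTEREST_BY_ITEM.get(data["last_purchase"])
    if interest == "jogging":
        promotions.append("activewear")
    if data["children"] > 0:
        promotions.append("children's wear")
    return {
        "summary": (
            f"{data['marital_status']} {data['gender'].lower()}, "
            f"{data['children']} children, interest: {interest}"
        ),
        "likely_store": STORES_BY_ZIP.get(data["zip_code"]),
        "promotions": promotions,
    }


profile = build_profile(raw_data)
print(profile["summary"])
print(profile["likely_store"])
print(profile["promotions"])
```

The dictionary `raw_data` is the data; the dictionary returned by `build_profile` is the information, because it places the same facts in a context that supports a decision.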

So, Big Box includes Customer #123456 in a future mailing, and when she comes into the store and makes a purchase, her loyalty card records that she purchased several items in the toddler clothing department. This data becomes useful information when Big Box sends out mailings about its annual “Santa Comes to Town” promotion. The purchase data suggests that Customer #123456 has a toddler, and toddlers love to come see Santa!

Later in the year, Customer #123456 makes an online purchase of a pair of men’s work boots and a men’s heavyweight coat. The data that comes into Big Box may look like this:

  • Customer #123456
  • Date: 10/5/2018
  • Item #56-9876 Cougar Work Boots, Size 11
  • Item #43-2341 Men’s Heavyweight Denim Coat, Size XL

This is not very interesting data by itself. But now Big Box can use it to build even better information about Customer #123456. Big Box can infer that someone in her household, most likely her husband, probably works outdoors, possibly in a skilled trade; hence the need for work boots and a heavyweight coat. When Big Box spends its promotion dollars on a men’s suit sale, it will not target Customer #123456, because it has “information” about her household, gathered from these individual pieces of data. As Customer #123456 makes additional purchases, visits the company’s website, and responds to special offers, Big Box will collect more and more data. Every piece of data collected adds to the information Big Box has about this particular customer. Now, imagine this data is collected on every customer for every purchase over a period of years. The quantity of raw data collected is staggering, and the challenge for Big Box is to store this data in a manner that allows it to be turned into information. This is where data warehousing and data mining come into play.
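
To get a feel for how individual purchase records accumulate into information about a customer, here is a minimal sketch that tallies purchases by department. The purchase log, the department names, and the `summarize` function are hypothetical and exist only for this illustration; a real data warehouse would hold far more records and fields.

```python
from collections import Counter, defaultdict

# Hypothetical purchase log: (customer number, department).
# The departments are invented labels for the purchases described above.
purchases = [
    ("123456", "activewear"),
    ("123456", "toddler clothing"),
    ("123456", "toddler clothing"),
    ("123456", "men's workwear"),
    ("123456", "men's workwear"),
]


def summarize(purchase_log):
    """Count purchases per department for each customer."""
    summary = defaultdict(Counter)
    for customer, department in purchase_log:
        summary[customer][department] += 1
    return summary


summary = summarize(purchases)
# Departments this customer buys from, with purchase counts.
print(dict(summary["123456"]))
```

Each tuple in `purchases` is a single piece of data. The per-department counts are a first step toward information, since they show at a glance which promotions are likely to be relevant to the customer.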