By Ken Gifford
Have you ever taken someone who really likes ice cream to Baskin-Robbins? Invariably, whenever I take my wife there, she has a pretty good idea of what she has a taste for. However, when presented with the bewildering array of choices available, she has a hard time making up her mind. This kind of situation is often referred to today as “Information Overload”: we have an idea of the information we’re searching for, but our senses become saturated with all of the data presented to us. It’s a bit like taking a drink from a fire hose.
Clay Shirky, an American writer on Internet technologies, states:
“It’s not information overload. It’s filter failure.”
Wikipedia puts it like this: “Some cognitive scientists and graphic designers have emphasized the distinction between raw information and information in a form we can use in thinking. In this view, information overload may be better viewed as ‘organization underload’. That is, they suggest that the problem is not so much the volume of information, but the fact that we cannot discern how to use it well in the raw or biased form it is presented to us.”
For my part, I like to make the distinction between data and information. Data is just that: raw values in their simplest form. Information is data distilled into something actionable.
In the industrial world, this is usually represented as trends. Trended data is simply a continuous line depicting a data point’s value over time. Most of us have seen the image of someone giving a presentation in front of an easel with a red line sloping up or down across it. That is a simple representation of a trend line.
When analyzing time-series data, however, a trend chart becomes very useful for detecting sudden changes and seeing exactly when they happened. This lets the person analyzing the data quickly zero in on a time range for deeper analysis. Because of this, trends are the primary view in any industrial data collection tool kit.
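To make that concrete, here is a minimal sketch in plain Python (not tied to any particular historian or trending tool) of the kind of windowed comparison a trend chart lets you do by eye; the one-reading-per-minute tag, the window size, and the threshold are all hypothetical.

```python
# A minimal sketch: scan a series of time-stamped readings for a sudden shift
# in the trend line, so an analyst knows which time window to zoom in on.
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical example: one reading per minute from a single tag,
# with the value jumping at minute 40.
start = datetime(2024, 1, 1, 8, 0)
readings = [(start + timedelta(minutes=i), 72.0 + (5.0 if i >= 40 else 0.0))
            for i in range(60)]

WINDOW = 10        # samples in each comparison window
THRESHOLD = 2.0    # jump size (in engineering units) worth flagging

def find_sudden_changes(points, window=WINDOW, threshold=THRESHOLD):
    """Compare the average of the previous window to the next window and
    report timestamps where the trend shifts by more than the threshold."""
    flagged = []
    values = [v for _, v in points]
    for i in range(window, len(points) - window):
        before = mean(values[i - window:i])
        after = mean(values[i:i + window])
        if abs(after - before) > threshold:
            flagged.append(points[i][0])
    return flagged

for ts in find_sudden_changes(readings)[:1]:
    print(f"Trend shift detected near {ts:%Y-%m-%d %H:%M}")
```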
But what about the question of what data to collect? This is truly a loaded question. Whenever we start a new data collection project with a customer, we ask, “What data do you wish to collect?” Invariably, the answer is “All of it.” This is challenging in several respects. It is usually an indication that the end user is unfamiliar with which data is most useful in determining how well a process is running. Before we dismiss the request, though, we should remember that there are often many normally unseen factors that can affect a process. Collecting “everything” gives those involved an opportunity to understand what is going on when something truly out of the ordinary occurs. It also means there must be some really good “filters” on the data so a user can sort and view information as needed to make good decisions.
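As a small illustration of what a “filter” can mean in practice, the sketch below narrows a pile of raw (timestamp, tag, value) readings down to one tag over one time window; the tag names and readings are made up for the example.

```python
# A minimal sketch of filtering "collect everything" data down to one tag
# over one time window. All tag names and values here are illustrative.
from datetime import datetime

raw = [
    (datetime(2024, 1, 1, 8, 0), "LINE1.OVEN.TEMP", 350.2),
    (datetime(2024, 1, 1, 8, 0), "LINE1.CONVEYOR.SPEED", 1.2),
    (datetime(2024, 1, 1, 9, 0), "LINE1.OVEN.TEMP", 361.8),
    (datetime(2024, 1, 1, 9, 0), "LINE2.OVEN.TEMP", 349.9),
]

def filter_readings(readings, tag, start, end):
    """Return only the readings for one tag inside a time window."""
    return [(ts, v) for ts, t, v in readings if t == tag and start <= ts <= end]

oven = filter_readings(raw, "LINE1.OVEN.TEMP",
                       datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 10, 0))
for ts, value in oven:
    print(f"{ts:%H:%M}  {value}")
```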
Before deciding on a “deep data dive” approach, one should also consider another factor: cost. If you plan to put in a data collection system that collects from thousands of sources, realize that a system of that size will cost more. One approach is to size the system large enough to collect the “high value” data points, plus some overhead, say an additional 100 data sources. These overhead sources (“tags,” as they are referred to in software-land) can then be repurposed as needed to collect a variety of points when trying a new spec or when a recurring issue crops up on a specific piece of equipment. This keeps the overall cost of the project down while giving you the flexibility to do deeper analysis when you need it.
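As a rough sketch of that sizing approach, the snippet below simply tallies a hypothetical set of high-value tag counts and adds the 100-tag overhead mentioned above; the categories and counts are invented for illustration, not a sizing rule.

```python
# Count the "high value" tags you know you need, then add a block of spare
# tags that can be repurposed for trials or troubleshooting.
high_value_tags = {
    "line_1_temperatures": 24,
    "line_1_pressures": 12,
    "line_2_temperatures": 24,
    "line_2_pressures": 12,
    "utilities_and_energy": 40,
}
overhead_tags = 100  # spare capacity to repurpose for new specs or problem equipment

total = sum(high_value_tags.values()) + overhead_tags
print(f"High-value tags: {sum(high_value_tags.values())}")
print(f"Overhead (repurposable) tags: {overhead_tags}")
print(f"Size the system for at least {total} tags")
```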
I hope you’ve enjoyed this series on automated data collection. If you have any questions, or just want to have a conversation about data collection, call us at 800-844-8405.