Predictive Maintenance On The Cheap

“In the grim darkness of the far future, there is only ~~war~~ machine learning.”

I try to find a place to park my stuff, because the “open” office plan
practically means I have to partake in a daily Hunger Games to find a place
to sit. I am very unsuccessful, as I get instantly stabbed by a coworker who
has already found a sword in a chest. I resign myself to the nearby sofa as
my final resting place. Taking my final breath, I gaze at the screen of my
would-be killer, and I see some time series data, in the form of very
colorful squiggly lines.

Now that I’m back in the real world, I go up to my colleague, who I’ll call
Bob from now on, and ask “What’s that?” like the curious intern that I am.
Turns out, Bob was looking at the power generation data from one of our
solar power plants. After conversing for a short while, I get a new quest: try
to identify any panels that might be broken, or (the kicker) will break down
in a short time frame.

Predictive Maintenance

Predictive maintenance basically means that you try to find the faults before
they find you. Simple as that. In theory, you might get:

…reduced maintenance costs by 25%, reduced breakdowns by 70%, and increased
productivity by 25%…

if Deloitte is to be believed.

Now, there are a bunch of ways one might go about doing that, provided you
have some previous data: KNN, XGBoost, deep learning, etc. In that case it's
almost a run-of-the-mill classification task. But watt if (yeah, sue me) you
don't have anything? Just the instantaneous current produced, logged every
five minutes, per panel. Well, we should be able to calculate how much
current a given panel should be producing, right? There is a way to do that,
sure, but it requires measuring how much light is shining on the panel. And
we didn't have irradiance sensors. That is, we had an irradiance sensor.
Singular. For 10 acres of not equally flat terrain. Also,
even though the pilot solar power station was just 10 acres and had 2 kilowatts
of capacity, we had another potential use case that would require the solution
to scale up to 1800 acres. That's give or take 50,000 data points, every
five minutes. So, the new requirements: find something that’s light on storage
(so as not to contribute to the ever-increasing cloud bills),
and light on compute (so that it can scale well).
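
For the curious, the calculation I'm alluding to is the textbook
back-of-the-envelope one: expected output scales with how much light hits
the panel, which is exactly why you can't do it without an irradiance
reading. A toy sketch, with every number invented for illustration:

def expected_power_watts(irradiance_w_per_m2, area_m2, efficiency):
    # First-order estimate: power scales with the light hitting the panel
    return irradiance_w_per_m2 * area_m2 * efficiency

# A typical residential-class panel under a bright sky
expected_power_watts(800, 1.6, 0.18)  # ~230 W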

There Has To Be A Paper About This

That was the thought that sprang to mind as I prayed to our lord and
saviour, Google Scholar. After a couple of literature reviews, there it was:
Outlier Detection Rules for Fault Detection in Solar Photovoltaic Arrays
(Zhao et al., 2013). I'll spare you the nitty-gritty, but know that the
paper's authors had sane performance baselines and actually tested the
methods by building a small array of solar panels and slowly torturing them.
The paper's main idea was something called the Hampel Identifier, which
boils down to the median absolute deviation from the median.
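
In symbols, with a tunable threshold $t$ (usually 3), the rule is roughly:

\[
\mathrm{MAD} = \operatorname{median}_i \bigl( \lvert x_i - \operatorname{median}(x) \rvert \bigr),
\qquad
\lvert x_i - \operatorname{median}(x) \rvert > t \cdot \frac{\mathrm{MAD}}{0.6745}
\;\Longrightarrow\; x_i \text{ is an outlier.}
\]

(The 0.6745 rescales the MAD so that, for normally distributed data, it
estimates the standard deviation.)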

I realize reading that conjures nothing concrete in your mind, so let me
walk you through it quickly. Take the median. Now, think about all the data
points and how far away each one is from the median. Then take the median of
those distances. Still with me? Good. Finally, using that value as a
benchmark, you can identify outliers by how much they deviate from the
median: if a given value is too far from the median compared to the
benchmark, it's an outlier. Here's a simple implementation as an example:

from statistics import median

def hampel_identifier(data, threshold=3):
    # Centre of the data: the median
    median_value = median(data)

    # Median absolute deviation (MAD), rescaled by 0.6745 so that it
    # estimates the standard deviation for normally distributed data
    mad = median([abs(x - median_value) for x in data]) / 0.6745

    # Flag anything further than `threshold` rescaled MADs from the median
    outliers = []
    for i, value in enumerate(data):
        if abs(value - median_value) > threshold * mad:
            outliers.append((i, value))  # store index and value of outlier

    return outliers
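
To see it in action, here it is on a made-up batch of simultaneous current
readings from seven panels (every number invented for illustration):

readings = [8.1, 8.3, 7.9, 8.2, 0.2, 8.0, 8.4]  # amps; the 0.2 plays the faulty panel
print(hampel_identifier(readings))  # [(4, 0.2)]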

The Hampel Identifier is robust, simple to calculate, and requires no prior
data, so it was perfect for our use case. Bob and I quickly implemented it
and identified a good number of panels that were anomalously underperforming.
My internship ended while we were still coordinating with the engineers in
the field, so I don't know how many of those panels were really on their way
out; I can only hope that the system is doing Deloitte proud.
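
In case you're wondering how that scales to a whole plant, here's a minimal
sketch of the wiring, assuming readings arrive as (timestamp, panel_id,
current) tuples; the data shape and every name here are my assumptions, not
the production system:

from collections import defaultdict

def flag_suspect_panels(readings, threshold=3):
    # Group readings into per-timestamp snapshots across all panels
    by_time = defaultdict(list)
    for timestamp, panel_id, current in readings:
        by_time[timestamp].append((panel_id, current))

    # Count how often each panel is flagged as an outlier among its peers
    flag_counts = defaultdict(int)
    for snapshot in by_time.values():
        currents = [current for _, current in snapshot]
        for index, _ in hampel_identifier(currents, threshold):
            flag_counts[snapshot[index][0]] += 1

    # The panels flagged most often are the prime suspects
    return sorted(flag_counts.items(), key=lambda item: item[1], reverse=True)

The identifier flags deviations in both directions, so for underperformance
specifically you'd also want to check that a flagged reading sits below the
median.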

(P.S. I lied about there only being machine learning in the grimdark future.
It was all statistics. Always has been.)

References

Zhao, Y., Lehman, B., Ball, R., Mosesian, J., & De Palma, J.-F. (2013). Outlier detection rules for fault detection in solar photovoltaic arrays. 2013 Twenty-Eighth Annual IEEE Applied Power Electronics Conference and Exposition (APEC), 2913–2920.