Data Janitor Work: What No One Tells You About

April 24, 2014 | Retail

As excited as we are about all the new applications in energy and facility management, we know first-hand that the big barrier to analyzing data is getting clean, workable data. The hidden cost of analytics is getting and maintaining good data. The rules of big data apply in our industry too: before you hand data to the data scientists, you need the data janitors. Accenture estimates that 80% of the effort in analytics goes to data cleaning, which is what it takes to have data you can trust. Riptide IO’s team has learned this lesson the hard way.

So how can you cost-effectively roll out an analytics project, whether portfolio-wide energy analysis or HVAC fault detection, and really trust the data? First, start with a solid data model. The data in your buildings today likely comes from a mix of vendor equipment and is formatted in a variety of ways. Our most successful engagements begin with identifying the data you will collect and standardizing on a data model. (Anyone want to see our data model? Just send us an email.)
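
To make "standardizing on a data model" concrete, here is a minimal Python sketch of the idea, not our actual data model. The two vendor formats, the field names, and the choice of UTC timestamps and degrees Celsius as the common form are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    """One sample in the common model (field names are illustrative)."""
    point_id: str
    timestamp: datetime  # always stored in UTC
    value: float
    unit: str            # canonical unit for this sketch: "degC"


def fahrenheit_to_celsius(deg_f: float) -> float:
    return (deg_f - 32.0) * 5.0 / 9.0


def from_vendor_a(row: dict) -> Reading:
    # Hypothetical vendor A: epoch seconds, temperature in Fahrenheit.
    return Reading(
        point_id=row["id"],
        timestamp=datetime.fromtimestamp(row["ts"], tz=timezone.utc),
        value=fahrenheit_to_celsius(row["temp_f"]),
        unit="degC",
    )


def from_vendor_b(row: dict) -> Reading:
    # Hypothetical vendor B: ISO 8601 timestamp with offset, Celsius as text.
    return Reading(
        point_id=row["PointName"],
        timestamp=datetime.fromisoformat(row["Time"]).astimezone(timezone.utc),
        value=float(row["Value"]),
        unit="degC",
    )


readings = [
    from_vendor_a({"id": "ahu1-dat", "ts": 1398297600, "temp_f": 59.0}),
    from_vendor_b({"PointName": "ahu2-dat",
                   "Time": "2014-04-24T00:00:00+00:00",
                   "Value": "15.0"}),
]
```

Once every source passes through an adapter like these, downstream analytics only ever sees one record shape, one time zone, and one unit per measurement type.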

Tagging is the backbone of a data model. The Project Haystack data modeling standard for building and equipment systems uses a simple meta-model based on the broadly accepted concept of “tags”. Investing time up front in the data model and tagging schema is the first step to good data. Stay tuned: more posts are coming on other best practices for getting and keeping good data.
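
As a rough illustration of why tags help, the Python sketch below stores each entity as a dictionary of tags and finds points by the tags they carry rather than by vendor-specific names. It is loosely modeled on Haystack-style tagging and is not a Haystack client library; the entity ids, display names, and the `find` helper are hypothetical.

```python
# Marker tags only say "this entity is a ..."; we store them as True.
# Value tags (id, dis, unit, equipRef) carry a string.
MARKER = True

entities = [
    {"id": "ahu1", "dis": "AHU-1", "equip": MARKER, "ahu": MARKER},
    {"id": "ahu1-dat", "dis": "AHU-1 Discharge Air Temp",
     "point": MARKER, "sensor": MARKER,
     "discharge": MARKER, "air": MARKER, "temp": MARKER,
     "unit": "°F", "equipRef": "ahu1"},
    {"id": "ahu1-zat", "dis": "AHU-1 Zone Air Temp",
     "point": MARKER, "sensor": MARKER,
     "zone": MARKER, "air": MARKER, "temp": MARKER,
     "unit": "°F", "equipRef": "ahu1"},
]


def find(entities, *markers, **values):
    """Return entities that have every marker tag and every tag=value pair."""
    return [
        e for e in entities
        if all(e.get(m) is True for m in markers)
        and all(e.get(k) == v for k, v in values.items())
    ]


# Every discharge air temperature sensor, whatever the vendor called it.
discharge_temps = find(entities, "discharge", "air", "temp", "sensor")

# Every point that belongs to AHU-1.
ahu1_points = find(entities, "point", equipRef="ahu1")
```

The same query works across every building in a portfolio as long as points are tagged consistently, which is why the tagging schema is worth settling before any analytics are built on top of it.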
