Data Analytics: From Smart Meters to Smart Decisions
February 26, 2018
Today, the power grid consists of over 9,200 generating units with more than 1 million megawatts of generating capacity, connected by more than 300,000 miles of transmission lines (US Department of Energy). Energy consumption in the United States increased from 39,109 trillion British Thermal Units (BTUs) in 2012 to 40,394 trillion BTUs in 2016 (US Energy Information Administration, 2016). To meet increasing demand and intelligently incorporate new technology, while also preventing blackouts and power losses, the power grid must evolve. To adapt to increasing complexity and actively manage the needs of consumers in real time without constant human intervention, the grid has to become smarter. Technological changes being implemented throughout the grid are facilitating its evolution toward meeting the needs of the 21st century (US Department of Energy).


Smart grids use digital technology that allows for two-way communication between the utility and its customers using sensors (e.g., pervasive sensors, smart meters, and data networks) (US Department of Energy). This two-way communication allows the smart grid to better predict loads and potential outages, and gives utilities greater insight into the overall health of the grid (US Department of Energy). Smart meters, like the one pictured to the left, monitor real-time power consumption and communicate with the utility. These meters can also receive signals from the utility to offer customers incentives for reducing loads in times of high demand (Simmhan et al., 2013a).

The power grid is expected to become more dynamic and require constant decisions based on data streaming in from various sources at any given time (Yin, Kulkarni, Purohit, Gorton, & Akyol, 2011). Going from one meter reading each month (or even less often) to a meter reading every 15 minutes, which amounts to 96 million reads per day for every 1 million meters, results in a roughly 3,000-fold increase in data, which can be overwhelming if not properly managed and processed (IBM, 2012). Utilities require more powerful data analytics solutions to handle this extreme increase in data volume, variety, and velocity. This newsletter describes how smart meter data can be used for customer segmentation and the potential benefits of this segmentation for forecasting and demand response (DR) programs.
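The arithmetic behind these figures can be checked directly. The short sketch below assumes a 30-day month for the monthly baseline (an assumption, not a figure from IBM, 2012); under that assumption the increase works out to 2,880-fold, which is consistent with the rounded 3,000-fold figure.

```python
# Reads per meter per day at one reading every 15 minutes.
MINUTES_PER_DAY = 24 * 60
reads_per_meter_per_day = MINUTES_PER_DAY // 15      # 96 reads

# Fleet-wide daily reads for 1 million meters.
meters = 1_000_000
reads_per_day = reads_per_meter_per_day * meters     # 96 million reads

# Baseline of one reading per meter per month.
# DAYS_PER_MONTH = 30 is an assumption made for illustration.
DAYS_PER_MONTH = 30
reads_per_meter_per_month = reads_per_meter_per_day * DAYS_PER_MONTH
fold_increase = reads_per_meter_per_month / 1        # versus 1 read per month

print(reads_per_meter_per_day, reads_per_day, fold_increase)
```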

Customer Segmentation

Customer segmentation is the practice of dividing a customer base into groups based on their energy usage patterns. Smart meters are being deployed at consumer premises to monitor real-time power consumption and communicate it back to the utility. The data gathered from smart meters can provide a better understanding of customer behavior and, hence, can facilitate customer segmentation.

Unsupervised clustering of daily electricity consumption data can provide insight into consumption patterns of customers in a microgrid. For example, Ward’s method of hierarchical clustering was used on daily average electricity consumption data from smart meters to group households in a Southwest European city into distinct types of annual consumption profiles (Gouveia & Seixas, 2016). Similarly, adaptive k-means clustering was used to isolate energy use profiles among more than 200K Pacific Gas and Electric Company (PG&E) customers with smart meters (Kwac, Flora, & Rajagopal, 2014). In addition to shape-based analysis of usage profiles, households can also be grouped based on the entropy (variability) of their usage patterns, which can help improve the potential success of targeting and recommendation design. For example, high-entropy households, whose usage indicates variable occupancy and energy-using activities, may benefit more from energy reduction programs, such as appliance rebates, than from DR programs. The underlying idea behind unsupervised clustering of energy consumption data is to identify households with repeatable or similar load traces. However, these approaches do not allow household properties to be derived explicitly; instead, human interpretation of the obtained patterns is needed.
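As an illustration of shape-based clustering and entropy-based grouping, the sketch below runs a plain k-means (not the adaptive variant used by Kwac et al., and not Ward's method) on normalized daily profiles, then measures each household's variability as the Shannon entropy of its daily cluster labels. The four-period profiles, the household data, the farthest-point seeding, and the choice of entropy measure are all assumptions made for this example.

```python
import math

def normalize(profile):
    """Scale a daily profile so its values sum to 1 (keeps shape, drops magnitude)."""
    total = sum(profile)
    return [v / total for v in profile]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, iterations=20):
    """Plain k-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in profiles]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def label_entropy(labels):
    """Shannon entropy (bits) of a household's daily cluster labels."""
    n = len(labels)
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# Hypothetical kWh per period (night, morning, afternoon, evening), four days each.
household_a = [[1, 2, 2, 5], [1, 2, 3, 6], [1, 1, 2, 4], [2, 2, 2, 6]]  # always evening-peaked
household_b = [[1, 5, 2, 2], [1, 2, 2, 5], [1, 6, 2, 1], [1, 2, 3, 6]]  # alternates shapes

profiles = [normalize(p) for p in household_a + household_b]
labels, centroids = kmeans(profiles, k=2)

entropy_a = label_entropy(labels[:4])
entropy_b = label_entropy(labels[4:])
print(labels, entropy_a, entropy_b)
```

In this toy data, the household whose days all fall into one load shape has zero entropy, while the household that alternates between two shapes has higher entropy. Interpreting what each cluster means still requires a human reading of the centroids.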

Supervised learning methods can be applied to electricity consumption data to predict specific properties of private households, such as their floor area, the age of the building, or the number of persons living in them (Beckel, Sadamori, Staake, & Santini, 2014). Raw smart meter traces for a defined period and granularity can be converted into informative features including consumption figures, ratios, temporal properties, and statistical properties. Consumption figures capture the electricity consumption at different periods of the day and allow comparison of households with respect to their average electricity consumption during these periods. Ratios are features calculated as the ratio of two different consumption figures and are informative of lifestyle patterns, such as whether cooking typically takes place over lunchtime, in the evening, or both. Temporal features capture the time instant of the first occurrence of an event, such as the daily electricity consumption exceeding a specific threshold. Finally, statistical properties include features such as the variance or the total number of peaks of the consumption trace over a day. Such classification models were found to be successful in estimating the majority of the considered household properties with high accuracy (Beckel, Sadamori, & Santini, 2013). Models can be further improved by applying feature filtering and selection to reduce the dimensionality and improve the robustness of the model (Hopf, Sodenkamp, Kozlovskiy, & Staake, 2016).
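The four feature families described above can be sketched for a single day of hourly readings. The period boundaries, the 1.0 kWh threshold, the peak definition (a strict local maximum), and the sample trace are illustrative assumptions, not the definitions used in the cited studies.

```python
from statistics import mean, pvariance

def extract_features(trace, threshold=1.0):
    """Turn 24 hourly kWh readings (index = hour of day) into a feature dict."""
    # Consumption figures: average use in assumed periods of the day.
    morning = mean(trace[6:10])
    noon = mean(trace[10:14])
    evening = mean(trace[18:22])
    # Temporal feature: first hour whose reading exceeds the threshold.
    first_above = next((h for h, v in enumerate(trace) if v > threshold), None)
    # Statistical feature: count of strict local maxima in the trace.
    peaks = sum(
        1 for h in range(1, len(trace) - 1)
        if trace[h - 1] < trace[h] > trace[h + 1]
    )
    return {
        "mean_morning": morning,
        "mean_noon": noon,
        "mean_evening": evening,
        "ratio_noon_evening": noon / evening,   # ratio of two consumption figures
        "first_hour_above_threshold": first_above,
        "variance": pvariance(trace),
        "num_peaks": peaks,
    }

# Hypothetical hourly trace for one household and one day.
trace = (
    [0.2] * 6                  # 00:00-05:59
    + [0.5, 0.8, 0.6, 0.4]     # 06:00-09:59
    + [0.4, 0.6, 1.2, 0.6]     # 10:00-13:59
    + [0.4] * 4                # 14:00-17:59
    + [1.0, 1.6, 1.4, 0.8]     # 18:00-21:59
    + [0.3, 0.2]               # 22:00-23:59
)
features = extract_features(trace)
print(features)
```

A feature dictionary like this, computed per household, would form one row of the training table for a classifier that predicts a household property.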

Benefits of Smart Meters and Customer Segmentation

Effective customer segmentation can enable utilities to accurately forecast load requirements and prices, accounting for available resources, and to design successful DR programs.

Load Forecasting

Load forecasting is a technique used by power utilities to predict the power or energy needed to balance supply and demand. The accuracy of these predictions is of great importance to a utility's operation and load management. Correct customer segmentation allows utilities to group individual households into segments and accurately model their short-term and long-term load needs. The driving factors depend on the forecasting horizon. Short-term forecasts (traditionally defined as under a week) are based on variation in electricity load and generation dispatch, whereas in long-term load forecasting (traditionally defined as any forecasting beyond one week), other factors become more dominant. For example, in forecasts with midterm to one-year horizons, the major market forces affecting loads and forecasts are seasonal weather, annual generation, and planned outages (Chan, Tsui, Wu, Hou, Wu, & Wu, 2012). With customer segmentation, the potential exists to forecast load at the granularity of each household.
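One way segmentation can feed a forecast is to predict a typical profile per segment and scale it by the number of households in that segment. The sketch below uses a deliberately simple baseline, the per-period average of past days, which is an assumption chosen for illustration and not a method from the cited literature. Segment names, household counts, and kWh values are hypothetical.

```python
def forecast_segment(history):
    """Average past daily profiles period by period (simple baseline forecast)."""
    return [sum(col) / len(history) for col in zip(*history)]

def forecast_system(segments):
    """Sum segment forecasts, each scaled by its household count.

    segments maps a segment name to (household_count, history), where
    history is a list of past daily profiles for a typical household.
    """
    total = None
    for count, history in segments.values():
        scaled = [count * v for v in forecast_segment(history)]
        total = scaled if total is None else [a + b for a, b in zip(total, scaled)]
    return total

# Hypothetical segments with three periods per day (morning, midday, evening), in kWh.
segments = {
    "evening_peak": (100, [[1, 2, 4], [1, 2, 6]]),
    "daytime_peak": (50, [[1, 4, 2], [1, 6, 2]]),
}
system_forecast = forecast_system(segments)
print(system_forecast)
```

A production forecast would add the drivers named above, such as weather and planned outages, but the aggregation step from segments to system load would look similar.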

Price Forecasting

Accurate customer segmentation leads to accurate load forecasting, which in turn is crucial for accurate price forecasting. Price forecasting is the process of using mathematical models to predict what electricity prices will be in the future. Price forecasting techniques can be broadly divided into two types: simulation-based methods and statistical and artificial intelligence (AI)-based methods (Hong, 2014). Simulation-based methods create a mathematical model of the electricity market that incorporates load forecasts, outage information, and bids from market participants (Chan, Tsui, Wu, Hou, Wu, & Wu, 2012). Simulation methods yield the optimal economic dispatch for the system by solving the unit commitment problem, setting the locational marginal prices (LMPs), and solving the optimal power flow (OPF). Price forecasting accuracy using simulation methods is highly dependent on the quality of the input data. Statistical and AI methods, conversely, do not require a comprehensive knowledge of market procedures. Statistical methods use historical prices, weather, outages, and load to forecast future prices. They cost less to implement but are less accurate in analyzing price spikes caused by congestion in the transmission network (Hong, 2014).
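To show why the load forecast matters so much to a simulation-based price forecast, the sketch below implements a single-bus merit-order dispatch: bids are sorted by price and dispatched until the forecast load is met, and the price of the last unit dispatched sets the clearing price. This is a heavy simplification of the methods described above, since it ignores unit commitment, transmission constraints, and locational prices. The unit names, capacities, and bid prices are hypothetical.

```python
def merit_order_dispatch(bids, load_mw):
    """Dispatch the cheapest bids first until load is met.

    bids: list of (unit_name, capacity_mw, price_per_mwh).
    Returns (dispatch dict in MW, clearing price set by the marginal unit).
    """
    remaining = load_mw
    dispatch = {}
    clearing_price = None
    for unit, capacity, price in sorted(bids, key=lambda b: b[2]):
        if remaining <= 0:
            break
        output = min(capacity, remaining)
        dispatch[unit] = output
        remaining -= output
        clearing_price = price   # last unit dispatched is the marginal unit
    if remaining > 0:
        raise ValueError("forecast load exceeds total offered capacity")
    return dispatch, clearing_price

# Hypothetical bids: (unit, capacity in MW, price in $/MWh).
bids = [
    ("coal", 400, 30.0),
    ("wind", 200, 0.0),
    ("gas_peaker", 150, 90.0),
    ("gas_cc", 300, 45.0),
]

off_peak_dispatch, off_peak_price = merit_order_dispatch(bids, 700)
peak_dispatch, peak_price = merit_order_dispatch(bids, 950)
print(off_peak_price, peak_price)
```

In this toy market, a higher forecast load pulls the expensive peaking unit into the dispatch and raises the clearing price, so an error in the load forecast translates directly into an error in the price forecast.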

Targeted DR Programs

It is estimated that the capacity to meet demand during the top 100 hours in one year accounts for between 10% and 20% of electricity costs (Wells, 2004), as generation and transmission capacity is provided to meet peak demand that occurs infrequently (Arnold, 2011). Demand response (DR) programs aim to change customers’ electric usage through changes in market electricity prices. The goal of DR programs is to balance electricity demand on the grid by reducing demand during peak periods and increasing usage during low-demand periods (Federal Energy Regulatory Commission, 2009). To implement successful DR programs, it is essential for utilities to understand the usage patterns of customer segments and design programs accordingly. A classic example of DR is to price electricity lower at night in the summer to incentivize a certain customer segment to run washers and dryers at night rather than during the daytime, when demand for electricity is higher due to additional cooling needs. By enabling bi-directional flow of information in the smart grid, smart meters further encourage users’ participation in energy savings and cooperation through DR mechanisms (Diamantoulakis, Kapinas, & Karagiannidis, 2015).
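The washer-and-dryer example can be made concrete with a small time-of-use calculation. All numbers below (the rates in cents per kWh, the household's usage, and the 3 kWh of laundry load) are hypothetical and chosen only to show the mechanism: shifting load to the cheaper night period lowers both the customer's bill and the daytime demand the utility must serve.

```python
def daily_cost_cents(usage_kwh, price_cents_per_kwh):
    """Total daily cost in cents for usage split across pricing periods."""
    return sum(usage_kwh[period] * price_cents_per_kwh[period] for period in usage_kwh)

# Hypothetical summer time-of-use rates, in cents per kWh.
tou_rates = {"day": 30, "night": 10}

# Hypothetical household usage before responding to the price signal.
usage_before = {"day": 20, "night": 5}

# Assume 3 kWh of washer and dryer use moves from day to night.
shifted_kwh = 3
usage_after = {
    "day": usage_before["day"] - shifted_kwh,
    "night": usage_before["night"] + shifted_kwh,
}

cost_before = daily_cost_cents(usage_before, tou_rates)
cost_after = daily_cost_cents(usage_after, tou_rates)
savings = cost_before - cost_after
daytime_reduction = usage_before["day"] - usage_after["day"]

print(cost_before, cost_after, savings, daytime_reduction)
```

Total consumption is unchanged in this example; only its timing moves, which is why knowing which customer segments have shiftable load is central to program design.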

Future Trends

As the century progresses, smart grids will continue to grow and will need to cope with larger-scale integration of distributed energy resources (such as renewables), cyber-physical components, and devices. With this growth, the volumes of data collected by smart grids will also continuously increase. Data analytics methods including customer segmentation, load forecasting, and price forecasting will play key roles in increasing the efficiency and the profitability of the utility industry.

For more information on the power grid, check out our recently published book.

Works Cited

US Energy Information Administration. (2016, 08 26). eia. Retrieved 09 01, 2016, from Consumption and Efficiency:

US-Canada Power System Outage Task Force. (2004). Final Report on the August 14, 2003 Blackout in the United States and Canada: Causes and Recommendations. US-Canada Power System Outage Task Force.

Arnold, G. (2011). Challenges and opportunities in smart grid: A position article. Proceedings of the IEEE, 99(6), 922-927.

Beckel, C., Sadamori, L., & Santini, S. (2013). Automatic socio-economic classification of households using electricity consumption data. Proceedings of the fourth international conference on Future energy systems (pp. 75-86). Berkeley, California, USA: ACM.

Beckel, C., Sadamori, L., Staake, T., & Santini, S. (2014). Revealing household characteristics from smart meter data. Energy, 397-410.

Chan, S., Tsui, K., Wu, H., Hou, Y., Wu, Y., & Wu, F. (2012). Load/Price Forecasting and Management of Demand Response for Smart Grids: Methodologies and Challenges. IEEE Signal Processing Magazine, Special Issue on Signal Processing Techniques for Smart Grid, 68-85.

Commission, E. (2015, 06 22). Retrieved 04 11, 2017, from

Diamantoulakis, P. D., Kapinas, V. M., & Karagiannidis, G. K. (2015). Big data analytics for dynamic energy management in smart grids. Big Data Research, 2(3), 94-101.

Federal Energy Regulatory Commission. (2009). The National Assessment of Demand Response Potential. Washington, DC: Department of Energy.

Gouveia, J. P., & Seixas, J. (2016). Unraveling electricity consumption profiles in households through clusters: Combining smart meters and door-to-door surveys. Energy and Buildings, 116, 666-676.

Hong, T. (2014). Energy forecasting: Past, present, and future. Foresight: the International Journal of Applied Forecasting, 43-48.

Hopf, K., Sodenkamp, M., Kozlovskiy, I., & Staake, T. (2016). Feature extraction and filtering for household classification based on smart electricity meter data. Computer Science-Research and Development, 31(3), 141-148.

IBM. (2012). Managing big data for smart grids and smart meters. Somers, NY: IBM.

Kwac, J., Flora, J., & Rajagopal, R. (2014). Household energy consumption segmentation using hourly data. IEEE Transactions on Smart Grid, 420-430.

Simmhan, Y., Aman, S., Kumbhare, A., Liu, R., Stevens, S., Zhou, Q., et al. (2013a). Cloud-based software platform for big data analytics in smart grids. Computing in Science and Engineering, 15(4), 38-47.

Soliman, S. A.-h., & Al-Kandari, A. M. (2010). Electrical Load Forecasting: Modeling and Model Construction. Burlington, MA: Elsevier.

Strasser, T., Siano, P., & Vyatkin, V. (2015). Guest editorial - New Trends in intelligent energy systems-An industrial informatics point of view. IEEE Transactions on Industrial Informatics, 11(1), 207-209.

US Department of Energy. (n.d.). What is the Smart Grid? Retrieved 3 21, 2016, from

Wells, J. (2004). Electricity Markets-Consumers Could Benefit From Demand Programs, But Challenges Remain. Washington, DC: Government Accountability Office.

Willis, H. (2002). Spatial Electric Load Forecasting. New York: Marcel Dekker.

Woodie, A. (2015, 10 13). Retrieved 04 11, 2017, from Datanami:

Yin, J., Kulkarni, A., Purohit, S., Gorton, I., & Akyol, B. (2011). Scalable Real Time Data Management for Smart Grid. Proceedings of the Middleware 2011 Industry Track Workshop (pp. 1-6). Lisbon, Portugal: ACM.