Factors Associated with Electricity Losses in Colombia

The purpose of this research is to explore the factors associated with transmission and distribution electricity losses in Colombia, measured through the proxy variable non-technical losses. A literature review is carried out to identify the variables that have been significant in past studies. Once the issue is contextualized in the Colombian case, a statistical and econometric analysis is developed with the available variables. Transmission and distribution electricity losses were found to have a significant positive association with crime, unemployment, and income, and a negative association with urbanization and population density. The results are of interest to practitioners, academics and policy makers.


INTRODUCTION
The reduction of electricity losses is a challenge for energy utilities. These losses can be technical or non-technical. Technical losses occur naturally due to power dissipation in systems (Depuru et al., 2011), whereas non-technical losses are associated with lack of measurement, theft, illegal connections or alteration of meters (Obafemi and Ifere, 2013). According to Jamil (2018), electricity theft is an important component of non-technical losses and occurs due to the dishonesty of users, often with the complicity of corrupt officials.
Losses can also be divided into generation, transmission and distribution losses. The last two are generally considered together as Transmission and Distribution Losses (T and D losses). This indicator is often used to measure inefficiencies in systems, including electricity theft, which is very difficult to estimate directly. For this reason, T and D losses are considered a good approximation for the measurement of electricity theft (Gaur and Gupta, 2016; Razavi and Fleury, 2019; Smith, 2004). According to Smith (2004), losses in generation are 2-6%, while T and D losses are <6% in highly efficient systems, 9-12% in less efficient systems, and above 15% in inefficient systems.
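As an illustration only, the efficiency bands attributed to Smith (2004) above can be written as a small lookup. The function name `classify_td_losses` is hypothetical, and because the cited bands do not cover the 6-9% and 12-15% ranges, this sketch leaves those values unclassified rather than guessing.

```python
def classify_td_losses(loss_pct: float) -> str:
    """Map a T and D loss percentage to the bands cited from Smith (2004)."""
    if loss_pct < 0:
        raise ValueError("loss percentage cannot be negative")
    if loss_pct < 6:
        return "highly efficient"
    if 9 <= loss_pct <= 12:
        return "less efficient"
    if loss_pct > 15:
        return "inefficient"
    # The cited bands leave 6-9% and 12-15% unassigned (assumption: do not guess).
    return "unclassified"


# Example values chosen for illustration, one per branch.
labels = {pct: classify_td_losses(pct) for pct in (4.0, 7.5, 10.0, 16.6)}
```

Under these bands, a system losing more than 15% of its electricity in transmission and distribution falls into the inefficient group.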
Losses of electricity have costs, and avoiding them has several benefits. One of the most important costs is the reduction in income from the sale of energy, which forces operators or utilities to increase rates for other consumers (Smith, 2004). Chirwa (2016) gives evidence of a relationship between electricity losses and the level of electricity rates in Malawi. This situation generates not only financial but also ecological problems, since electricity losses are associated with higher CO2 emissions (Daví-Arderius et al., 2017). Conversely, reducing electricity losses helps improve the financial situation of energy companies, reduces emissions and avoids the need for additional infrastructure to generate electricity (Averbukh et al., 2019).
Losses in Colombia are close to 16.6% (ASOCODIS¹, 2019), costing energy companies around 115 billion Colombian pesos (29 million US dollars) annually (CREG, 2019). The objective of this article is to explore the factors that are associated with electricity losses in Colombia. To achieve this, the main scientific literature is reviewed to determine which factors are associated, positively or negatively, with energy losses. The study is then contextualized in the Colombian case. Models are developed to assess the statistical significance of some explanatory variables of energy losses, together with the respective statistical analysis of the data and the validation tests. Finally, some conclusions and public policy implications are offered.

DRIVERS OF ELECTRICITY LOSSES
To measure electricity losses, studies mainly use the transmission and distribution losses indicator (T and D losses) (Briseño and Rojas, 2020a; Gaur and Gupta, 2016; Razavi and Fleury, 2019; Smith, 2004). However, other research seeks to measure electricity theft directly (Briseño and Rojas, 2020b; Yurtseven, 2015), which is an important part of non-technical losses.
The methodologies used in electricity losses research are varied. The following stand out: theoretical model analysis, questionnaires, correlations, ordinary least squares (OLS) models, panel data regression, random forest models, the generalized method of moments (GMM), and feasible generalized least squares (FGLS).
Generally, the variables that are significantly and positively associated with electricity losses are price (Yakubu et al., 2018; Yurtseven, 2015), poverty (Gaur and Gupta, 2016), corruption (Gaur and Gupta, 2016; Yakubu et al., 2018), unemployment (Briseño and Rojas, 2020a), government inefficiency (Briseño and Rojas, 2020b), and crime (Briseño and Rojas, 2020a, 2020b; Razavi and Fleury, 2019). On the other hand, some variables have a significant negative influence on electricity losses, such as good governance (Smith, 2004), education (Briseño and Rojas, 2020a; Yurtseven, 2015), literacy (Gaur and Gupta, 2016; Razavi and Fleury, 2019) and monitoring (Jamil, 2018), among others. Some variables, such as urbanization, show mixed results across previous studies. Table 1 shows the main studies, the methodologies applied, and the variables that have been significant in a positive and negative sense.
As evidenced in studies, electricity losses depend on social, economic and contextual factors that can vary from one region to another. In this study, the focus will be on the Colombian case, so the following section will broadly develop the context in which the country's electricity system operates.

COLOMBIAN ELECTRICITY CONTEXT
The national electricity system has a coverage >97% and is composed of four types of agents:
(1) Power generators, usually on a large scale (García-Sierra and Zerda-Sarmiento, 2016), such as large hydroelectric and thermal plants.
(2) Energy transmitters, responsible for transporting large volumes of energy from generation nodes to consumption nodes in large cities.
(3) Energy distributors, in charge of delivering energy to end-users in individual homes, businesses and industries.
(4) Electricity marketers, in charge of the relationship processes with the end-customer, such as billing and collection for the entire electricity chain.
Since Law 142 of 1994 ("Law of Domiciliary Public Services") and Law 143 of 1994 ("Electricity Law"), the agents of the Colombian electricity system can be public, private or mixed (Congreso de Colombia, 1994a, 1994b). Currently, the Colombian electricity market is made up of 112 marketers, 42 distributors, 16 transmitters, and 74 generators. Private agents focus mainly on the generation and commercialization of energy, because these are the most liberalized sectors, and to a lesser extent on the distribution and transmission sectors, where the majority are public and mixed agents (ASOCODIS, 2019).
¹ Asociación Colombiana de Distribuidores de Energía.
The total electricity generation of the system in 2018 amounted to 68,947 GWh (UPME, 2019). According to official figures, the generation capacity supplied by hydroelectric plants is 11,834.57 megawatts (MW), out of a total net effective capacity of 17,319.59 MW for all types of energy generation (UPME, 2019). In 2019, 68% of the country's energy supply came from hydroelectric generation. Today, 28 centrally operated and 115 non-centrally operated hydroelectric plants are in operation. The net capacity of the centrally operated plants amounts to 10,974 MW, while that of the non-centrally operated plants reaches 860.57 MW.
The most important electricity companies by participation in the energy generation and commercialization market are Empresas Públicas de Medellín, ISAGEN, ENEL, and CELSIA. The country's total consumption is 68,754 GWh, of which the regulated market (low electricity consumption customers) consumes 46,956 GWh and the unregulated market (large consumers of electrical energy at the official, industrial and commercial level) consumes 21,798 GWh (UPME, 2019).
The total number of users in 2018 was 14,807,399, of which 13,525,323 correspond to the residential sector (91.3%) and 1,282,076 to the remaining users (8.7%) (UPME, 2019). The percentage of energy losses is very high for companies. For this reason, the monitoring and control of loss reduction plans require a great effort on the part of the regulators (Romero-López and Vargas-Rojas, 2010).
The main challenges for the sector include improving service quality (interruptions still amount to several hours a year) and reducing electrical energy losses (García et al., 2020).

DESCRIPTION OF DATA
To find the factors that influence electricity losses in Colombia, a database was built whose observations are the departments (political and administrative divisions) of the country during the years 2017, 2018 and 2019. The departments of Amazonas, Guainía, Vaupés and Vichada are isolated from the national transmission system, and their measurements are carried out directly by the local companies themselves. As a result, measurements are not available for them: their jungle setting complicates access to reliable and auditable information, making it difficult to identify the users served by these companies. Unlike interconnected companies, whose information is digitized by automatic mechanisms, these remote regions use less reliable indirect measurement methods. For this reason, only 28 of Colombia's 32 departments were considered.
The dependent variable is the percentage of transmission and distribution losses (T and D LOSSES), measured through the proxy variable non-technical losses. The explanatory variables were selected according to the following criteria: (1) information available for the three sample years, (2) relevance according to the literature review, and (3) non-redundant data. Given these criteria, it was possible to obtain information on variables related to crime (homicides and kidnappings), unemployment, urbanization, income and population density. All of them proved to be significant in at least some of the studies cited in the literature review. Table 2 shows the variables, their explanation, their units of measurement and the source of information. In the correlation matrix, positive associations are observed between T and D LOSSES and the variables CRIME, UNEMPLOYMENT, INCOME and POP_DENS. Likewise, a negative correlation with URBANIZATION is observed.
Regarding the relationship between the explanatory variables, a correlation >0.5 is observed between the CRIME and UNEMPLOYMENT variables, as well as a correlation close to 0.5 between URBANIZATION and INCOME. It is important to analyze the correlations between the independent variables because multicollinearity may arise. In the next section, econometric models are estimated together with their respective validation tests.
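The screening of pairwise correlations described above can be sketched as follows. This is a minimal illustration on synthetic data, not the study's dataset; the helper name `flag_correlated_pairs`, the sample size and the simulated values are all assumptions made for the example, and only the 0.5 threshold comes from the text.

```python
import numpy as np


def flag_correlated_pairs(data, names, threshold=0.5):
    """Return (name_i, name_j, r) for each column pair with |r| above threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                flagged.append((names[i], names[j], round(float(corr[i, j]), 3)))
    return flagged


# Synthetic stand-in data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 84  # hypothetical number of department-year observations
crime = rng.normal(size=n)
unemployment = 0.8 * crime + 0.6 * rng.normal(size=n)  # built to correlate with crime
urbanization = rng.normal(size=n)  # independent of the other two by construction

data = np.column_stack([crime, unemployment, urbanization])
names = ["CRIME", "UNEMPLOYMENT", "URBANIZATION"]
pairs = flag_correlated_pairs(data, names)
```

A pair flagged this way is only a warning sign. Whether it causes problems in a regression is better judged with the variance inflation factors and condition indices.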

EMPIRICAL RESULTS
To explore the statistical relationship between electricity transmission and distribution losses (T and D LOSSES) in Colombia and their hypothesized determinants, several econometric models were estimated. Since there is information on both cross-sectional units (departments) and time units (years 2017, 2018 and 2019), it is possible to build a data panel. However, it was not possible to explain energy losses with unobserved fixed or random effects: either the variables were not significant or the models failed their assumption tests. The models with the best fit are presented in Table 5. Both are ordinary least squares (OLS) panels.
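To make the distinction between a pooled OLS panel and a fixed-effects specification concrete, the following minimal sketch contrasts the two estimators on a toy panel. It is not the estimation code used in the study, whose software is not stated; the function names and the toy numbers are assumptions for illustration.

```python
import numpy as np


def pooled_ols(y, X):
    """OLS with an intercept on the stacked panel; returns [intercept, slopes...]."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return beta


def within_estimator(y, X, groups):
    """Fixed-effects slopes: demean y and X inside each group, then run OLS."""
    y_d = y.astype(float)  # astype returns a copy, so inputs are not modified
    X_d = X.astype(float)
    for g in np.unique(groups):
        mask = groups == g
        y_d[mask] -= y_d[mask].mean()
        X_d[mask] -= X_d[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_d, y_d, rcond=None)
    return beta


# Toy panel: two units observed three times, y = unit effect + 2 * x.
groups = np.array([0, 0, 0, 1, 1, 1])
x = np.array([[1.0], [2.0], [3.0], [11.0], [12.0], [13.0]])
unit_effect = np.array([0.0, 0.0, 0.0, -30.0, -30.0, -30.0])
y = unit_effect + 2.0 * x[:, 0]

slope_pooled = pooled_ols(y, x)[1]
slope_within = within_estimator(y, x, groups)[0]  # recovers the true slope of 2.0
```

In this toy case the unit effects are correlated with the regressor, so the pooled slope is distorted while the within slope is not. With only three years per department, however, the within transformation removes most of the between-department variation, which is one possible reason why effects models can lose significance in short panels.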
The first model estimated to explain T and D LOSSES was an ordinary least squares panel including the variables CRIME, UNEMPLOYMENT, URBANIZATION, INCOME and POP_DENS. The constant and POP_DENS were significant at 5%, while the other variables were significant at 1%. The model complied with normality of errors (P = 0.45), correct specification (P = 0.68) and homoscedasticity (P = 0.09). It is important to note that the observations for the Putumayo department were removed beforehand for all 3 years due to their high squared errors. Regarding multicollinearity, there is a correlation slightly >0.5 between the explanatory variables CRIME and UNEMPLOYMENT (a sign of possible multicollinearity). The variance inflation factors (VIF) are <10, which suggests that there are no collinearity problems. Nevertheless, the Belsley-Kuh-Welsch (BKW) collinearity diagnostics point to a possible moderately strong collinearity, since there is a condition index above 10 associated with a variance proportion >0.5 (URBANIZATION variable). The coefficient of determination R² of the first model was 0.57, which suggests that 57% of the variation in T and D LOSSES is explained by the variables CRIME, UNEMPLOYMENT, URBANIZATION, INCOME and POP_DENS.
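The collinearity checks named above can be sketched with NumPy alone. This is an illustrative implementation on synthetic data, assuming the usual textbook definitions: the VIF as the diagonal of the inverse correlation matrix, and the BKW condition indices and variance-decomposition proportions from the singular value decomposition of the column-scaled design matrix including the intercept. Function names, sample size and coefficients are hypothetical and do not reproduce the study's results.

```python
import numpy as np


def ols_fit(y, X):
    """OLS with an intercept; returns the coefficients and R squared."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2


def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    return np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False)))


def bkw_diagnostics(X):
    """BKW condition indices and variance-decomposition proportions."""
    Xc = np.column_stack([np.ones(X.shape[0]), X])
    Xs = Xc / np.linalg.norm(Xc, axis=0)  # unit-length columns, not centered
    _, s, vt = np.linalg.svd(Xs, full_matrices=False)
    cond_idx = s.max() / s
    phi = (vt.T ** 2) / s ** 2  # phi[k, j]: coefficient k, component j
    props = phi / phi.sum(axis=1, keepdims=True)  # each row sums to 1
    return cond_idx, props


# Synthetic stand-in for five regressors (hypothetical values).
rng = np.random.default_rng(1)
n = 80  # hypothetical sample size
X = rng.normal(size=(n, 5))
X[:, 1] += 0.8 * X[:, 0]  # induce correlation between the first two columns
y = 10 + X @ np.array([1.0, 0.5, -0.7, 0.3, -0.2]) + rng.normal(size=n)

beta, r2 = ols_fit(y, X)
vifs = vif(X)
cond_idx, props = bkw_diagnostics(X)
# Rule of thumb from the text: a condition index above 10 paired with a
# variance proportion above 0.5 points to possible collinearity.
flagged = [(j, k) for j in range(len(cond_idx)) for k in range(props.shape[0])
           if cond_idx[j] > 10 and props[k, j] > 0.5]
```

The two diagnostics can disagree, as they do for the first model: the VIF only looks at each regressor against the others, while the BKW indices also involve the intercept and the scale of the uncentered columns.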
To eliminate the possibility of moderate multicollinearity, several models were estimated. The one that met all the assumptions was model 2, which includes only the variables CRIME and UNEMPLOYMENT. Although both variables correlate slightly >0.5 (a sign of possible multicollinearity), the variance inflation factors are <10 and no condition index exceeds 10, let alone 30. Therefore, it is considered that there is no evidence of excessive or problematic collinearity. Likewise, the model complies with normality (P = 0.41), correct specification (P = 0.26) and homoscedasticity (P = 0.49). The coefficient of determination R² was 0.21. Despite this lower R², it is considered a better model because it satisfies the validation tests more clearly. As mentioned above, fixed and random effects panels were also estimated, but none yielded a model with significant variables, the expected signs and fulfilled statistical assumptions.

POLICY IMPLICATIONS AND CONCLUSION
The results may provide public policy with guidance on elements to consider in the incentive models of loss reduction plans. In the regressions carried out, it was observed that crime and unemployment are the major factors that explain energy losses. In this sense, the current models used in Colombian regulation may rely too heavily on factors internal to the electricity sector, a view that is too endogenous. Our study indicates that exogenous variables better represent the economics underlying energy loss reduction efforts. As observed in this article, targeting programs based on crime and unemployment rates may point to sources of energy losses better than variables related to consumption or geographic location.
In contrast to other studies, this research finds that urbanization and population density are associated with lower energy losses. In the Colombian case, it is necessary to recognize that urban areas, and areas undergoing densification or hyper-urbanization, have stronger loss control processes than less urbanized areas. The study suggests an important reorientation of loss reduction efforts toward less densely populated areas.
It is important to consider the unemployment and crime variables in energy loss reduction projects. These indicators can help focus efforts on specific geographic areas for a more effective result. The study is limited in its understanding of the magnitude and the specific channels through which unemployment and crime influence electricity losses. However, it does show a statistically significant relationship.