RegData India: A quantitative analysis of national laws in India

RegData, an initiative of the Mercatus Center, is an effort to quantify various aspects of regulation. The Mercatus Center created RegData in 2012 with the aim to “introduce an objective, replicable, and transparent methodology for measuring regulation.” It uses custom-made text analysis and a machine-learning algorithm to measure different features of law, including volume, restrictiveness and linguistic complexity. Together, these metrics indicate the regulatory burden that a particular law, department or ministry imposes. Some variables, such as restrictive terms or ‘binding words’, show associations with economic growth and productivity (Mercatus Center 2019).

In collaboration with the Mercatus Center, we obtained quantitative metrics for all 876 national laws of India. For this purpose, we used the list of laws made available on the official portal of the Government of India. This empirical analysis, along with our categorisation of laws by Ministry and Department, will help open the way for further research on the burden imposed by laws.

We study laws on the following three metrics:

  • Volume: This metric quantifies the number of words in a law. We also use this metric to study the average number of words per law in a particular year, and the average number of words per law in each Ministry.
  • Binding words: RegData uses a text analysis program to count the number of binding words, or “restrictions”, in a law: terms that create an obligation to comply or that limit choice sets. Longer laws are likely to contain more restrictive terms. To estimate the density of restrictiveness, we also calculate the ‘normalised binding words’ for each law. This metric is the average number of words after which a binding word appears, that is, the total word count of the law divided by its number of binding words. For instance, if the normalised binding words for a law is 300, a restrictive term appears on average once every 300 words. A lower value implies a more restrictive law.

  • Linguistic complexity: This metric measures the complexity of a given law. Complexity is assessed on four sub-metrics: Shannon entropy, sentence length, conditional count and Flesch Reading Ease score. Together, these indicate how easy or difficult a law is to comprehend. A law that is hard to comprehend may also increase compliance costs for regulated entities in terms of effort, time and money (Mercatus Center 2020).
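To make the arithmetic behind these metrics concrete, the following is a minimal Python sketch of how volume, binding words, normalised binding words and a word-level Shannon entropy could be computed for a single text. It is an illustration under stated assumptions, not the RegData implementation: the term list in BINDING_TERMS, the simple tokeniser and all function names are hypothetical choices made for this example, and the actual pipeline may define and count these quantities differently.

```python
import math
import re
from collections import Counter

# Illustrative list of restrictive terms. This is an assumption made for
# the example; the list used in the study may differ.
BINDING_TERMS = ("shall", "must", "may not", "required", "prohibited")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def volume(text):
    """Volume: the number of words in the text."""
    return len(tokenize(text))


def binding_word_count(text):
    """Count whole-word occurrences of each binding term."""
    # Re-join the tokens so multi-word terms such as "may not" match
    # regardless of the original spacing or punctuation.
    normalised = " ".join(tokenize(text))
    return sum(
        len(re.findall(rf"\b{re.escape(term)}\b", normalised))
        for term in BINDING_TERMS
    )


def normalised_binding_words(text):
    """Average number of words per binding word (lower = more restrictive)."""
    count = binding_word_count(text)
    # A text with no binding words has no finite spacing between them.
    return volume(text) / count if count else math.inf


def shannon_entropy(text):
    """Word-level Shannon entropy in bits (one common formulation)."""
    words = tokenize(text)
    if not words:
        return 0.0
    total = len(words)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(words).values()
    )


if __name__ == "__main__":
    sample = (
        "Every employer shall maintain a register. "
        "The employer must not employ any child. "
        "A licence is required for each factory."
    )
    print("volume:", volume(sample))
    print("binding words:", binding_word_count(sample))
    print("normalised binding words:", round(normalised_binding_words(sample), 2))
    print("entropy (bits):", round(shannon_entropy(sample), 2))
```

In this sketch, a law with many words but few binding terms gets a high normalised value, which matches the reading above that a lower value implies a more restrictive law. Sentence length, conditional count and the Flesch Reading Ease score are left out because they depend on sentence and syllable segmentation choices that the text does not specify.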
