It was during the formulation of the 10th Five-Year Plan in 2002 that the government announced its resolve to be S.M.A.R.T. (Simple, Moral, Accountable, Responsible and Transparent). In the following years, the government upheld its commitment. The landmark RTI legislation was passed in 2005, and the launch of the UIDAI and Aadhaar in 2009 made India digitally ready. A lesser-known development, however, was the National Data Sharing and Accessibility Policy (NDSAP), drafted in 2012. Recently, the government initiated a process to revise the NDSAP and make it more relevant to present times.
The NDSAP provides guidelines on how government data can be shared with public and private entities in open file formats. It encourages government ministries to set up cells within their jurisdiction that upload data to a website (data.gov.in), where people can access it across different media. The portal opens up information held in government systems and provides accurate, reliable statistical data. This data can be used to keep the government accountable and can further lead to public engagement. The guidelines of the NDSAP govern how open government data (OGD) is used and accessed.
With more than 27,000 resources across 101 departments, viewed more than 7.8 million times, the government’s open data platform seems to be a great success.
Quality of data takes a hit
There are many discrepancies in the datasets uploaded to the website. The primary problem is standardization. The NDSAP simply provides a framework for government offices and their subordinates on how to upload data files: naming conventions and basic dos and don’ts for data contribution and approval are set out in its implementation guidelines. What is missing is any specification of the kind of data that should be made public, that is, the variables and parameters each ministry is expected to publish. The policy allows ministries and departments to prepare a list of datasets that they will release and also to create a ‘negative list’ of datasets that they will not make public. No common thread governs these datasets, so the kinds of data uploaded to the portal vary widely. Even though the ministries are mandated to upload a certain number of datasets, they aren’t told what the nature of such data should be.
To solve this problem, the government relies on the people accessing the portal. The website has a tool through which a user can request a dataset; if the request gets more than 100 votes, the respective ministry has to make the data public, provided it is not on the negative list. At a policy level, however, certain data points should be made compulsory for upload to the OGD platform. Many datasets are already available on the ministries’ own websites; they just need to be collated so that users do not have to spend time scouting the websites of individual ministries.
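The request rule described above can be written down as a small check. This is only an illustrative sketch in Python, not the portal’s actual implementation: the names `DatasetRequest`, `must_release` and `negative_list` are hypothetical, and the only facts taken from the policy as described here are the "more than 100 votes" condition and the negative-list exemption.

```python
from dataclasses import dataclass

# Threshold as described in the article: a request needs MORE than 100 votes.
VOTE_THRESHOLD = 100


@dataclass
class DatasetRequest:
    """A user's request for a dataset (hypothetical structure)."""
    title: str
    ministry: str
    votes: int


def must_release(request, negative_list):
    """Return True if the ministry would be obliged to publish the dataset.

    `negative_list` is an assumed mapping from ministry name to the set of
    dataset titles that the ministry has declared it will not make public.
    """
    blocked = negative_list.get(request.ministry, set())
    return request.votes > VOTE_THRESHOLD and request.title not in blocked
```

The sketch also shows the weakness of the rule: what gets published depends on which requests happen to attract votes, not on any list of data points that every ministry must release.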
Many datasets are incomplete, which further degrades the quality of data on the OGD platform. Some have gaps in the time periods covered, others in the variables measured. The government has not mandated any minimum ‘threshold’ of completeness that would make the data more useful. An independent researcher, Natasha Agarwal, points out in one of her papers that she searched for ‘Issuance of visas to various foreign nationals across various categories’ and found that the data is available only for some countries. In some columns, the data stops in 2014, and no information is available under certain categories. For example, ‘total number of visas issued’ is not available for each category under each country. Such a casual attitude clearly doesn’t support the cause of opening up government data.
Duplication of datasets is another worry. It undermines the quality of the search results tremendously. I did a simple search for ‘balance of payments’ on the website and, among other results, two stood out. The first was titled ‘Overall Balance of Payments upto 2012-13’ and the second ‘Overall Balance of Payments upto 2013-14’. Clearly, the datasets repeat each other. The BoP dataset up to 2012-13 could simply be removed and replaced with the newer one. The duplication problem exists even across portals. Many datasets available on the OGD platform can also be accessed from the data banks maintained on the websites of the respective ministries, which makes uploading that data to the OGD database redundant.
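Duplicates of this kind are easy to flag automatically. The following is a minimal sketch in Python, assuming that titles follow the "<name> upto YYYY-YY" pattern seen in the two results above and that a later dataset with the same name fully contains the earlier one. The function name `find_superseded` is hypothetical, and the approach would not catch duplicates whose titles are worded differently.

```python
import re
from collections import defaultdict

# Matches titles such as "Overall Balance of Payments upto 2013-14".
# The "upto YYYY-YY" suffix is an assumption based on the two titles quoted above.
_TITLE_RE = re.compile(
    r"^(?P<base>.+?)\s+upto\s+(?P<start>\d{4})-(?P<end>\d{2})$",
    re.IGNORECASE,
)


def find_superseded(titles):
    """Return titles that have a newer 'upto' edition with the same base name."""
    groups = defaultdict(list)
    for title in titles:
        match = _TITLE_RE.match(title.strip())
        if match:
            # Group by base name, remembering the starting year of the final period.
            groups[match.group("base").lower()].append(
                (int(match.group("start")), title)
            )

    superseded = []
    for entries in groups.values():
        entries.sort()  # oldest edition first
        # Every edition except the most recent one is a removal candidate.
        superseded.extend(title for _, title in entries[:-1])
    return superseded


catalogue = [
    "Overall Balance of Payments upto 2012-13",
    "Overall Balance of Payments upto 2013-14",
]
print(find_superseded(catalogue))
```

A check like this only produces candidates for removal; someone would still need to confirm that the newer file really covers everything in the older one.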
Immediate Changes Required
As much as the commitment to making government data freely accessible to the public should be lauded, a few changes need to come into effect immediately. Most importantly, the government needs to come up with a ‘priority list’ of datasets that should be released in the coming year. This would maintain parity across ministries in the facts and figures they upload, and would help sustain public confidence.
The government should also establish a dedicated cell at the National Informatics Centre to cross-check and cross-reference data. Often the metadata file isn’t available with the datasets, and items are not correctly referenced to their original source. There is an urgent need to do away with such discrepancies. Setting up such an authority would also help resolve the duplication of datasets.
Many researchers suggest that the National e-Governance Plan be merged with the NDSAP to improve interoperability and ‘co-ordination amongst data producers and data collection exercises’. However, this is a long-term objective.
The current government is committed to making data accessible to all, and there is reason to hope that the publishing and use of open government data will be refined in times to come.