The Fault in Our Data: How Sloppy Logic Increased Mob Lynching in India!

Apoorv Purwar
Jun 5, 2019

There are three kinds of lies: lies, damned lies, and statistics.

The rising wave of fake news has been a cause of concern for netizens and citizens alike, but masquerading under it is an even bigger threat, ‘Faulty News’, which divides our society even more deeply. It gives readers the conviction to form views because the news is backed by data, while hiding that the data might very well be incorrect, incomplete or manipulated.

The case under consideration here is the reporting on an ‘astronomic increase in cow vigilantism in India’ by international and Indian media houses of ‘repute’.
A quick web search throws up news articles like the following, with headlines scary enough to convince you never to visit a violent country like India, and to fill you with rage against the ruling government:

I don’t intend to make a case here for whether hate crimes have actually gone up, or whether this is an issue that always existed but was artificially inflated for political gain. What I do want you to see is what hides in the fine print of ALL these articles!

All these articles quote a study by an organization called IndiaSpend.
Now, if you dig deeper and analyze how IndiaSpend collected this data, you will find that two IndiaSpend interns, Delna Abraham and Ojaswi Rao, collected “Online News Report in English Media from year 2010 to 2019” to build the database of cow-related hate crimes. That is the ONLY source backing the claim by global media outlets that a country of 1.25 billion has turned intolerant after BJP came to power!

To understand how deeply flawed this data and comparison mechanism are, consider the following:

  • In 2010, only 7.5% of the Indian population had access to the internet; in 2019 that number is up to ~50% of the population.
    With more than 500 million additional people online, the appetite for news also shoots up, so you will see far more news articles being published ‘online’ in 2018 or 2019 than in 2010 or 2011.
  • With the issue of cow vigilantism gaining more traction after 2014, it is only natural that more people and media agencies are talking about it now than ever before, and hence there are far more reports in ‘Online English Media’, which is the source of this data.
    On the other hand, a lot of old web pages are pulled down with time. Just as your Orkut profile no longer exists, a lot of old news articles do not either!
  • In a lot of cases, what was initially regarded as cow vigilantism turns out to have been a personal feud that was incorrectly reported as cow-related violence. I wonder how many news agencies go back and pull down the initial erroneous article, and whether the two interns at IndiaSpend were careful enough to exclude those cases. What is more, the data doesn’t even take into account the actual police complaints filed from 2010–2019, which would have been a much more reliable source of data.
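
To make the first point concrete, here is a toy calculation in Python. It is only a sketch under loud assumptions: the number of actual incidents per year is invented and deliberately held constant, and I assume (crudely) that the chance of an incident getting an online report scales with internet penetration. Only the 7.5% and ~50% figures come from the discussion above.

```python
# Illustrative only: every number except the two internet-penetration
# figures quoted above (7.5% and ~50%) is a made-up assumption.

def expected_online_reports(true_incidents, coverage):
    """Expected number of incidents that end up in an online report."""
    return true_incidents * coverage

# Hypothetical, and held constant across years on purpose.
TRUE_INCIDENTS_PER_YEAR = 100

# Assumption: the chance that an incident gets an online report scales
# with internet penetration. This is a crude proxy, not a measured value.
coverage_by_year = {2010: 0.075, 2019: 0.50}

reports = {
    year: expected_online_reports(TRUE_INCIDENTS_PER_YEAR, coverage)
    for year, coverage in coverage_by_year.items()
}

growth = reports[2019] / reports[2010]
print(reports)
print(f"Online reports grew {growth:.1f}x with zero change in actual incidents")
```

Under these assumptions the count of online reports multiplies several times over even though nothing changed on the ground, which is exactly why a raw count of online reports cannot be compared across years.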

How can anyone even compare the count of online media reports on a certain topic in 2010 with the count in 2019, when the landscape of online media itself has transformed in an unparalleled way in this decade, from nothing to everything?

If you collect online media reports on ‘rotten apples’ in India in 2010 vs in 2019, I bet you will again find 97% more online reports on rotten apples now compared to 2010. That doesn’t mean that India has started producing 97% more rotten apples since BJP came to power!

While it is totally understandable that two interns did a sloppy analysis and came up with this data to have something to write in their intern report, without realizing the fault in their data collection mechanism, what is outrageous is that media giants like The Washington Post, Bloomberg, etc., who have an army of analysts to vet data and reports, didn’t perform even rudimentary checks on the data they used to paint a country of 1.25 billion as intolerant lynchers!
It raises the question of how trustworthy and reliable any of these media houses are today.

I condemn and strongly oppose any kind of intolerance against any individual, but I think it is high time to be intolerant towards faulty and manipulative data and news, which are dividing our society and increasing hatred!

Update 1 (06/06/2019):

After reading this article, some of you shared this data, which says that there have been 167 hate-crime incidents in BJP-ruled states, but only 39 in Congress-ruled states and 10 in states ruled by AAP.
Based on these numbers, they claim that BJP is the most communal party! But are they right?

Well, they clearly didn’t read the very first line of this post!

Do you see how manipulative these numbers are?
Let us assume that these numbers are correct. But do you know how many states BJP rules in India compared with Congress or AAP? And how many people (population) each of these parties rules over?
If yes, then why are these “fact-checker” agencies quoting the overall number here, which is so misleading? Why not come up with something like ‘the number of hate crimes per state (ruled by each party)’ or, even better, the number of hate crimes per 1000 people in states ruled by each party?
Now that would be a real comparison!
A rough per-state estimate (before Dec 2018, when this data was collected) looks like this:

BJP: 167/20 = 8.35
Congress: 39/2 = 19.5
AAP: 10/1 = 10

By this logic, Congress-ruled states are more than twice as intolerant as BJP-ruled ones and witness more than double the hate-crime incidents per state!
Why are we even questioning BJP then, when they are the least violent of the three? Shouldn’t Congress be blamed for hate crimes instead?
Now divide that by the number of people (population) in the states ruled by each party and the numbers would further favor BJP!
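
The per-state arithmetic can be reproduced in a few lines of Python. This is a minimal sketch: the incident counts and the number of states per party are simply the figures quoted in this update, not independently verified ones. The helper `rate` is my own illustrative name. Since I have not listed population figures here, the per-capita call uses placeholder numbers only to show the shape of the calculation.

```python
# Incident counts and state counts are the figures quoted in this update
# (data collected before Dec 2018); nothing here is independently verified.
incidents = {"BJP": 167, "Congress": 39, "AAP": 10}
states_ruled = {"BJP": 20, "Congress": 2, "AAP": 1}

def rate(count, denominator, per=1):
    """Incidents per `per` units of the denominator (states, people, ...)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return count / denominator * per

per_state = {party: rate(incidents[party], states_ruled[party]) for party in incidents}

for party, value in per_state.items():
    print(f"{party}: {value:.2f} incidents per state")

# Per-capita version. No population figures are given in this post, so the
# numbers below are placeholders for a hypothetical region, not real data.
example = rate(count=50, denominator=10_000_000, per=1000)
print(f"Hypothetical region: {example:.4f} incidents per 1000 people")
```

The same `rate` function works for either denominator, which is the whole point: the ranking of the parties depends entirely on what you choose to divide by.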

Note that I am not making a case for BJP here; even one hate crime in any state is WRONG. I am just showing you how deceptive these numbers are and how media houses are manipulating people’s views with wrong news and manipulated data.
The people doing this should realize it, because it has serious consequences. This is way more than a sloppy intern project now: it is misleading and polarizing a whole nation, and it should STOP.
