Large media organizations and academic institutions have created giant datasets by combining all sorts of information. Some of it is public; some, such as voter data, is given out only to select parties; and some is purchased from data brokers. As a result, many journalists have more information about people than the police do. This is why journalists and others were often able to track down January 6th participants and provide lists of their names to law enforcement.
For the media, these giant datasets make it possible to produce a story on any individual quickly. The minute they have a name or other identifying detail, they can pull together all sorts of information to create a richer story: whether the person has criminal or civil court cases, whether they voted and which party they are registered with, their contact information, and where they live. They also have large collections of imagery that can be searched for facial recognition matches. The data can help them track down possible witnesses to interview. It allows organizations to produce stories more quickly than the competition.
One use of these large datasets came during the condo collapse in Surfside, Florida. Some of the residents were first told of the tragedy by reporters looking for comments. The reporters were able to identify many of the condo’s residents and call the phone numbers listed on their voter registrations, which are often cell phone numbers. This means that news of the loss of their possessions, and possibly of family members, was broken by reporters seeking a quote. Many journalists are not equipped with the sensitivity to break this kind of news.
The courts and the Constitution prevent the FBI and other federal law enforcement agencies from keeping such large sets of data. They would need to go to court and get a warrant to request access to this data, which means they have to know what they are looking for and be able to explain why they suspect a particular person. However, if they are given the information, or even pay for it, they can avoid the need to provide these specifics.
The obvious danger of this warehousing of information is that it infringes on individuals’ right to privacy. It could also be used to punish people by sifting through all their data in search of a crime. This type of behavior is referred to as lawfare.
Another danger of allowing this accumulation of information is that the data may be used in a partisan manner. In 2007, ABC News exclusively obtained the DC Madam list, which was a list of phone numbers and call logs of clients of a prostitution ring in Washington, D.C. ABC News head of investigative reporting Brian Ross then publicly stated he was only releasing the names of prominent Republicans from the list because Democrats didn’t take a stance on morality, so they weren’t hypocrites. All the people on the list had committed crimes, but only a few names, including a Republican Senator and a Republican State Department official, were released.
Yet another danger of these giant datasets is what happens if they are stolen, and they will be stolen. Hackers and state actors will then have a wealth of information to use in manipulating victims.
For these reasons, we need to have a conversation about the datasets that are allowed to be created. How much data should be allowed to accumulate? What information should not be allowed to be combined? Should law enforcement be allowed to receive data that it is not allowed to collect in its own datasets?
Rocco Maglio is the co-publisher of The Hernando Sun and earns a living as a software engineer and cybersecurity specialist. He has been a software engineer for 30 years and has a master’s degree in cybersecurity.