On Day One of the Pandemic, They Should Have Named a "Data Czar"

At the start of the pandemic, we had poor information. Many of the choices made in many countries were wrong. I took heart, though, that because good data about the virus would be worth billions, even trillions, we would quickly move to get it, and have a coordinated plan to get it. We didn't. Even a year later, there are so many things we don't know about how the virus is transmitted, what the best steps are to stop transmission and many other questions. We know a bit more about how to treat it, but we've learned that very slowly and still are far behind where we could be.

The consequences are dire. Not just in lives lost, but in economic ruin because we aren't able to more finely tune our policies on things like masks, lockdowns, workplace activity and more. Some people have lost their livelihood who perhaps didn't need to, while other businesses have kept operating when it was dangerous to do so.

In addition, we've developed tremendous pandemic fatigue, where the public is no longer willing to follow good policies, and our crazy political times made medical decisions about pandemic fighting into political issues and badges of tribal affiliation. While at the beginning, when we know little, it is necessary to take an overly strict, risk-averse approach, the goal should be to switch as quickly as possible to a precise approach that generates the least economic hardship and fatigue with the most prevention of transmission.

I will formalize something I hoped would happen from the beginning: The establishment of a Pandemic Data Czar's office. This office would be given an immense budget, and certain special temporary powers, with the goal of giving us good, actionable data on which to make personal decisions and to set policies.

When a pandemic starts, nobody knows how dangerous it is or how it spreads. As soon as it shows signs of being dangerous, there is no choice but to be conservative in your choices, taking a "just in case" approach. If you guess, you might guess wrong and end up killing vast numbers. Guessing is not an option. You work based on general common sense. Immediately, though, the goal should be to get hard scientific data and fine-tune policies to the right mix of disruption of daily life and mitigation of spread of the virus.

The Data Czar would get an immense budget because the alternative is trillions in economic damage and millions of deaths. Getting it right is important. Almost any reasonable research proposal which would produce actionable information about the right steps to slow the virus should get a chance, and quickly. If something is not done, it should be because there's not a case for it, not because there's no money.

Data Czars should exist in many countries and share information to avoid too much duplication and for the benefit of all.

Controversially, the Data Czar would have powers to temporarily alter some of our usual procedures regarding privacy and experimental design. They would be concerned with privacy, to be sure, and have a dedicated privacy team that understands privacy issues well, understands how difficult it is to anonymize data and is concerned with the risks. At the same time, they would take the view that certain privacy risks that would never be tolerated in ordinary times should get temporary exemptions when the benefit is clear. As such, the Data Czar would have the power to suspend certain privacy regulations and practices for a limited time with appropriate justification, and a plan to minimize privacy risk.

One goal would be the creation of a large database of information on every known case and every known transmission. Much more effort should go into contact tracing transmissions, including, if appropriate, the promotion of mobile phone based contact tracing protocols that are shown to be effective. Efforts should be made to minimize, but not necessarily eliminate, their privacy risks, with only a limited time available to get them out.

Data with privacy exposure should become less available over time, so that one must jump through extra hoops to get it once its value has been extracted, then after a further time it should be destroyed, and all such data destroyed post-pandemic. Scientists would not actually get the data, but rather be able to run programs to do analysis of the data on the servers where it resides, to extract aggregate information. All their code and outputs would be preserved to find attempts to violate privacy, even in the past.
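The decaying-availability idea can be sketched in a few lines of Python. This is only an illustration, not a proposal for actual policy: the tier names, the 90-day and 365-day thresholds, and the `access_tier` function are all hypothetical, and using a record's age as a stand-in for "its value has been extracted" is an assumption made to keep the example simple.

```python
from datetime import date, timedelta
from enum import Enum


class AccessTier(Enum):
    STANDARD = "standard"      # vetted researchers may run analyses
    RESTRICTED = "restricted"  # extra justification ("hoops") required
    DESTROYED = "destroyed"    # record must no longer exist


# Hypothetical thresholds; real values would be set by the privacy team.
RESTRICT_AFTER = timedelta(days=90)
DESTROY_AFTER = timedelta(days=365)


def access_tier(collected_on: date, today: date, pandemic_over: bool = False) -> AccessTier:
    """Return how available a privacy-sensitive record should be today."""
    if pandemic_over:
        # All such data is destroyed once the pandemic ends.
        return AccessTier.DESTROYED
    age = today - collected_on
    if age >= DESTROY_AFTER:
        return AccessTier.DESTROYED
    if age >= RESTRICT_AFTER:
        return AccessTier.RESTRICTED
    return AccessTier.STANDARD
```

A real system would pair a rule like this with the logging of every analysis program and its output, so that past attempts to violate privacy can still be found later.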

For every case, data should be collected on all the circumstances of the case -- genotype, phenotype and living situation information on the patient, including all drugs and supplements they take, and their activities around the time of infection -- plus of course their outcome. Data collection of this sort would ideally be universal, but portions could be made opt-out as needed, under the guidance of the privacy officers, who also will know when trying to collect too much discourages participation.

The Data Czar would understand that you break privacy rules with care. For example, you don't want to make people scared to tell things to their doctors or researchers.

Such a database could not be made public, but authorized researchers would be able to (with vetting) do research queries on it, asking questions like, "Compare fatality rate for patients with hypertension taking Amlodipine to those not taking it" and similar (and much more complex) ones. Doing complex queries on such a database can de-anonymize data if one is not careful, but part of the budget pays for people to oversee the queries looking for risks, and for rules binding the (non-anonymous) people making them. While retrospective studies have risks of error much higher than randomized controlled trials, they can quickly show where to look.
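To make the example query concrete, here is a minimal Python sketch of how such an aggregate comparison might look. Everything in it is an assumption for illustration: the `CaseRecord` fields, the `compare_by_drug` function, and the minimum cohort size of 10 (a common style of safeguard in which results for very small groups are suppressed because they could identify individuals). It is not a description of any real system.

```python
from dataclasses import dataclass

# Hypothetical safeguard: refuse to report on cohorts smaller than this.
MIN_CELL_SIZE = 10


@dataclass(frozen=True)
class CaseRecord:
    conditions: frozenset  # e.g. frozenset({"hypertension"})
    drugs: frozenset       # drugs and supplements the patient takes
    died: bool             # outcome of the case


def fatality_rate(records, min_cell_size=MIN_CELL_SIZE):
    """Return the fraction of records with a fatal outcome, or None if suppressed."""
    records = list(records)
    if len(records) < min_cell_size:
        return None  # too few cases to release safely
    return sum(r.died for r in records) / len(records)


def compare_by_drug(records, condition, drug, min_cell_size=MIN_CELL_SIZE):
    """Compare fatality rates among patients with `condition`, split by `drug` use."""
    cohort = [r for r in records if condition in r.conditions]
    taking = [r for r in cohort if drug in r.drugs]
    not_taking = [r for r in cohort if drug not in r.drugs]
    return {
        "taking": fatality_rate(taking, min_cell_size),
        "not_taking": fatality_rate(not_taking, min_cell_size),
    }
```

A researcher would call something like `compare_by_drug(records, "hypertension", "amlodipine")` on the server and see only the two rates. A gap between them is a correlation, not proof of causation; it shows where to look.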

Similar principles may also apply in a more limited way to allow suspension of other experimental rules, such as some which are the province of IRBs at universities and research labs. I'm not naming any particular principle, just talking about a streamlined process to make temporary exemptions that clearly make sense, and make them fast. Errors would be made, but the cost of not having the data is a much greater error.

With such an office in place, many months ago we would have learned what we need to lock down and what we don't. We could even predict how much lockdown would be needed based on the current prevalence of the virus. And we would have been doing a lot more testing to know what that prevalence was -- testing patients, testing sewage and more. We would have more info, sooner, on whether spreading took place by touch, through aerosols or other means. We would know if outdoor dining was safe or dangerous, or about walking together outside, or about hugging family while holding your breath. We would know what level of exception can be made for "essential" services and know what's really essential. We would know if closing borders is effective or overkill or needs to be even stricter. We would know what types of masks work best and why. All of these decisions would be backed up with data, data and data -- and maybe could avoid becoming political sometimes because of that.

We would also have data to help decide who to vaccinate first, combining the data on who is most vulnerable with data on who is most likely to be a vector, be it grocery store cashiers, police or others.

Things we might learn, and learn fast:

  • How much spread does outdoor dining cause, and how much does indoor dining?
  • How much actual risk is there from fomites on surfaces, and how much cleaning is needed?
  • How dangerous are taxi rides and public transit?
  • What levels of viral load are delivered by various exposures and how do they affect disease outcome?
  • What types and durations of quarantine, combined with testing, are effective?
  • How much transmission is caused by travel and open borders vs. different levels of border closure?
  • Which "essential jobs" trigger what levels of risk? What about the "less essential jobs?"
  • How much compliance is there with government advised or legally mandated policies and how does that affect transmission?
  • What are the true causes of higher incidences of the virus among minorities and lower income populations?
  • What are the risks of opening schools, and different practices of distancing in the schools?
  • How safe are outdoor gatherings?
  • How safe is a hug if you both hold your breath?

The list goes on and on and on.

Treatment Data

The Data Czar would focus on transmission and prevention. A different Data Czar would focus on treatment, though of course data from both offices would be used. The main reason to have two offices is that the rules for research on therapies are too well established to easily overturn. They are often wrong, and that should be fixed, but these are two different fights. Certain research would fall within both realms, so naturally there should be coordination and sharing where needed.

Not that we don't need to overturn a lot of the rules on medical research in a situation like Covid. It is now clear that if we had different rules around the world, most of the two million dead need not have died. It's hard to figure out how we'll get the political will to change them.

Challenge trials and ethics

When it comes to vaccines and other prophylactics, the controversial topic is challenge trials -- testing by deliberately infecting low-risk volunteers. We would not go that far to get data on transmission, but we might do some similar things, such as:

  • Finding groups of people who already want and plan to ignore protective advice about things like distancing, masks, indoor dining, cleaning surfaces.
  • Finding people who reside with an infected person and can't move elsewhere during that person's infection, and getting them to document all they do.

There is a fine line. The most desired studies randomly put people into groups, so that they can show any correlation is causal. If you just examine natural behaviour retrospectively, you may find correlations that are not causal. You don't want to deliberately encourage or pay people to take behaviour that may be risky, so there are limits but we should get data up to those limits.
