2023/09/01
Michael Samsom

This post briefly describes where the idea for the Silx Digital Health Assistant product came from. It also introduces the scientific method and some unifying concepts for making decisions based on incomplete information. Many concepts are glossed over here and will be explored in more detail in later blog posts. Further reading links are also provided.
Working as a medical doctor
While working as a resident Neuropathologist, co-founder Christopher Newell needed to know whether two drugs that his patient was taking interact with each other in a way that could cause unintended problems for the patient. To determine this, Christopher consulted a prepared data source, which indicated that the drugs “may interact”. This led to two obvious questions: what does “may interact” mean, and how do we know? Interactions between drugs are determined by performing scientific experiments in which the drugs are administered to human or “human-like” entities and the outcomes are measured. A statement like “may interact” could mean that some parameters were measured but outcomes didn’t change for the duration of the experiment, or that outcomes changed for some entities but not others, or a number of other things. These findings are published and then taught or provided to doctors through a variety of media. In the same way, relationships between drugs, diseases, patient demographics, behaviors and many other things are studied to create a giant web of knowledge. All of these findings are compiled into a body of knowledge known as the biomedical scientific literature.
If we zoom out, a doctor’s job is to recall the scientific and experiential data that are relevant to improving patient health, integrate them, and apply them toward the best possible outcome for the patient. This way of practicing medicine is sometimes known as “evidence-based medicine”, and the process of obtaining that evidence is known as the “scientific process”.
Science
The modern scientific body of knowledge is arguably one of the greatest achievements in recorded history. It has allowed humanity to make amazing developments in many fields. One high-level description of science is as follows: “the process of gathering information about natural phenomena, with the purpose of explaining, predicting, and controlling these phenomena”. These three related tasks (explain, predict, and control) are central to understanding how to do and apply science. Explaining a given phenomenon is done by creating a model (sometimes called a hypothesis) of the system containing that phenomenon. The model serves to compartmentalize the phenomenon into parts that we can understand: specifically, the things, and the relationships between them, that we believe cause that phenomenon. The notion of causality needs to be well understood to do this, and will be described in detail in a future post. In the meantime, interested readers can refer to The Book of Why for an introductory treatment. The quality of a model is judged by its ability to predict the occurrence of the studied phenomenon given the expected causal conditions. How well those predictions agree with experiments or observations is used as evidence for the quality of the model. Armed with a good model, we can then modify the inputs of the phenomenon to obtain desirable expected outcomes. A simple example of this is the germ theory of disease. For a given phenomenon (a disease), a model was constructed that claimed that a germ causes that disease. If the model is good, the presence of that germ should predict the presence of the disease, and controlling the germ should in turn control the disease. When everything is done, the results are published in scientific journals.
Making decisions with incomplete data
In the real world, these causal connections cannot be made with certainty. We can never have complete data about any system, and the biological systems that interest doctors are notoriously variable and difficult to study. To deal with this, we use tools from probability and statistics to express the degree of belief that we have in the validity of our models. Each scientific paper involving experimental or observational data will provide its own statistical analysis. The claim made by the paper’s model will be caveated with some statistical statement meant to convey the uncertainty in that claim. Different analysis techniques are not always compatible with each other, so there is generally no simple way to preserve information about uncertainty when integrating multiple studies. To simplify this, decision makers will generally use a “truth” threshold, sometimes referred to as statistical significance. This can be effective when applied correctly. However, there also exists a mathematical framework for updating uncertainty, expressed as probabilities, from data; see Caticha 2012 for a great treatment of the subject.
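As a rough illustration of what “updating a degree of belief from data” can look like, the sketch below uses a Beta-Binomial update, one of the simplest textbook cases. It is not taken from Caticha 2012 and is not a description of how the product works. The class name BetaBelief and the study counts are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Degree of belief about an unknown event rate, held as a Beta distribution."""

    alpha: float = 1.0  # 1 + number of events seen so far (uniform prior)
    beta: float = 1.0   # 1 + number of non-events seen so far

    def update(self, events: int, non_events: int) -> "BetaBelief":
        # Conjugate update: the observed counts are added to the parameters.
        return BetaBelief(self.alpha + events, self.beta + non_events)

    @property
    def mean(self) -> float:
        # Expected event rate under the current belief.
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        # Shrinks as more data are integrated, i.e. uncertainty goes down.
        total = self.alpha + self.beta
        return self.alpha * self.beta / (total ** 2 * (total + 1))


# Hypothetical counts from two made-up studies: (adverse events, no adverse events).
studies = [(3, 17), (5, 45)]

belief = BetaBelief()
for events, non_events in studies:
    belief = belief.update(events, non_events)

print(f"estimated rate: {belief.mean:.3f}, variance: {belief.variance:.5f}")
```

Unlike a yes/no significance threshold, the belief here keeps its uncertainty after each study, so a third study can be folded in the same way. Real studies rarely share a design this cleanly, which is exactly the compatibility problem described above.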
Putting it all together
A doctor’s job requires them to integrate an incredible amount of information, in an environment with incomplete data, to improve patient health and save lives. The studies they draw on are themselves attempts to make sense of incomplete data. Modern information technology and mathematical tools can help with this. Scientific knowledge is all about relationships and is structured like a series of webs, with learned “causes” connecting different entities. If we can leverage language AI, we can represent large amounts of structured scientific knowledge in a computer system. This connected data can be recalled for specific purposes, and the uncertainty can be integrated on the fly to support doctors’ decision making.
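To make the “web” idea concrete, here is a minimal sketch of relationships stored as a graph, each carrying a degree of belief, and recalled for a specific question. Everything in it is an assumption made for illustration: the names Relation and KnowledgeWeb, the entities, and the belief values are invented, and multiplying beliefs along a chain assumes the links are independent, which is a strong simplification. Extracting relationships from the literature with language AI is out of scope here.

```python
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    source: str
    kind: str
    target: str
    belief: float  # degree of belief in [0, 1]; values below are invented


class KnowledgeWeb:
    def __init__(self) -> None:
        # Maps each entity to the list of relations that start from it.
        self._outgoing = defaultdict(list)

    def add(self, relation: Relation) -> None:
        self._outgoing[relation.source].append(relation)

    def recall(self, entity: str) -> list:
        """Return every stored relation that starts from the given entity."""
        return list(self._outgoing.get(entity, []))

    def chains(self, start: str, goal: str, max_hops: int = 3):
        """Yield (path, belief) for each acyclic chain from start to goal."""
        stack = [(start, [], 1.0)]
        while stack:
            node, path, belief = stack.pop()
            if node == goal and path:
                yield path, belief
                continue
            if len(path) >= max_hops:
                continue
            visited = {start} | {rel.target for rel in path}
            for rel in self._outgoing.get(node, []):
                if rel.target not in visited:
                    # Independence assumption: beliefs multiply along the chain.
                    stack.append((rel.target, path + [rel], belief * rel.belief))


web = KnowledgeWeb()
web.add(Relation("drug_a", "inhibits", "enzyme_x", 0.9))
web.add(Relation("enzyme_x", "metabolizes", "drug_b", 0.8))
web.add(Relation("drug_a", "treats", "condition_y", 0.95))

results = list(web.chains("drug_a", "drug_b"))
for path, belief in results:
    steps = ", then ".join(f"{r.source} {r.kind} {r.target}" for r in path)
    print(f"{steps} (combined belief {belief:.2f})")
```

In this toy web, the question “might drug_a affect drug_b?” is answered by a two-step chain through enzyme_x, along with a combined belief that is lower than either link on its own.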