Monday, August 22, 2016

EBM, precision medicine: literal but more

The metaphorical meaning of a word is often used more than the literal one.

Medicine is a technical domain, but lots of terms in it are metaphorical.

EBM, or evidence-based medicine, is a label for the desire to literally base medical practice on evidence rather than 'what you've always done'. Does extra sugar intake cause hyperactivity in children? It seems pretty obvious that it does. Except when studies are done, there is no appreciable difference in activity afterwards between children who ingest more sugar and those who don't.

We have preconceived plausible notions, but it's always good to check more scientifically. Isn't medicine always working on evidence, and equally dismissive of unscientific, non-evidence-based things like homeopathy, or ingrained myths like 'feed a cold, starve a fever'? Of course, but there are still reasonable, plausible things that just may not have an actual effect. EBM came to refer to a trend toward RCTs, randomized controlled trials, which means a trend in a particular kind of government funding. It also became associated with expensive measures to confirm really idiotically obvious things, and was parodied by the idea of an RCT for the efficacy of parachutes. So sometimes EBM sounds like a good thing, and sometimes it sounds like a dumb thing. But it mostly means 'be skeptical, do an RCT', whatever the nuances of funding and sample size are.

Personalized medicine, as a term, is also problematic. Literally it's saying medicine should be directed towards individual differences. But that's so obvious: you're not going to treat someone for a broken left arm when it's their right that's broken. The way people use the term 'personalized medicine' nowadays is for cases where the patient's gene variations are known. That's it. PM, the non-literal version, is for the handful of medical situations where a different gene variation (on a small set of genes) suggests a slightly different therapy. The expectation is that the science will expand to include lots of gene variations and problems. Yes, doctors have been using medicine personalized to a patient's family history, environment, social situation, problem itself (duh!), etc. forever. PM currently refers to doing that same thing but with some gene knowledge.

These two terms, EBM and personalized medicine, are not incorrect, but they have a much more specific meaning than you would think if you've never heard them before. If you use them all the time, then you (implicitly or not) know their narrow usage. Luckily people, whether they know it or not, don't use these terms in the broader situations.

Tuesday, August 2, 2016

SQL JOIN Venn diagrams are only sort of Venn diagrams

SQL is a standard for querying databases. Despite questionable pronouncements that SQL is Turing complete, I hesitate to call it a language because its power is in using boolean logic in dealing with tables of data whose columns point to each other.

And often Venn diagrams, the go-to visualization for set operations, are used to help explain the process of table JOINs.

The interesting thing is that set operations and table joins are not really the same thing. They're related but just not the same. Set operations, which are pretty much the same as boolean/logical operations, are simple to visualize. The picture is the universe of elements, a circle surrounds a group (a set) of elements with a property, and a set operation does something to one or more sets to make a new set.
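To make the link between set operations and boolean operations concrete, here is a minimal Python sketch. The universe and the two properties (`evens`, `small`) are made up for illustration; the point is only that each set operation is the matching boolean operation applied to membership.

```python
# A small universe of elements and two sets, each defined by a property.
# (The universe and both properties are illustrative assumptions.)
universe = set(range(1, 11))
evens = {x for x in universe if x % 2 == 0}
small = {x for x in universe if x <= 4}

# Each set operation is a boolean operation on "is x in the set?".
union = {x for x in universe if (x in evens) or (x in small)}
intersection = {x for x in universe if (x in evens) and (x in small)}
difference = {x for x in universe if (x in evens) and not (x in small)}

# Python's built-in set operators give the same results.
assert union == evens | small
assert intersection == evens & small
assert difference == evens - small
```

Every element of the universe is tested against the same properties, which is exactly what a Venn diagram draws.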

(from Modern Dilettante)

SQL also has set operations that combine tables as though they were sets: UNION, INTERSECT, and EXCEPT (union, intersection, and difference; some databases spell the last one MINUS). They simply do the same as the set operations; two tables with matching columns have their rows combined into a single new table (UNION means all rows in either, INTERSECT means the rows whose entries match in value in both, etc).
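A rough sketch of these three operations, using Python's built-in `sqlite3` module with an in-memory database. The tables `a` and `b`, their single `name` column, and the helper `names` are invented for the example.

```python
import sqlite3

# Two hypothetical single-column tables in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT);
    CREATE TABLE b (name TEXT);
    INSERT INTO a VALUES ('ann'), ('bob'), ('cy');
    INSERT INTO b VALUES ('bob'), ('cy'), ('dee');
""")

def names(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# Rows are treated as set elements: whole rows are compared by value.
union_rows = names("SELECT name FROM a UNION SELECT name FROM b")
intersect_rows = names("SELECT name FROM a INTERSECT SELECT name FROM b")
except_rows = names("SELECT name FROM a EXCEPT SELECT name FROM b")

print(union_rows)      # rows in either table
print(intersect_rows)  # rows in both tables
print(except_rows)     # rows in a but not in b
```

Here the universe really is just the rows of both tables, so an ordinary Venn diagram describes the result.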

But this is not how Venn diagrams are usually presented to explain SQL. UNION, INTERSECT, etc, are not the most useful of operations (the WHERE clause of a SELECT is where the booleans are most commonly used). Venn diagrams are most often used to explain JOINs. A SQL JOIN first matches on a field from one table and a field from another (presumably a field of the same type or kind).


(source Codeproject)

These Venn diagrams explain the difference between inner, outer, left and right joins perfectly...except they are just from a different world than the traditional set operations. A JOIN is intended to merge the information appropriately in the n-by-m relation (where the size of A is n and the size of B is m). The universe isn't the set of rows of both A and B together. The universe is the product of rows in both. And the difference between inner, outer, etc, is purely in how the JOIN deals with NULL/missing elements in A or B.

An INNER JOIN keeps rows of AxB only where both A and B rows exist. A LEFT JOIN keeps a row whenever the A part exists (B may or may not), and similarly for a RIGHT JOIN. A FULL OUTER JOIN doesn't care whether a corresponding A or B exists. So the boolean idea does apply but in a strange way, only with respect to the NULL condition of the matching field. If the value of the field from A has no matching value in the field for B, then the B part of the row is NULL or missing (and vice versa).
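The description above can be sketched in plain Python. This is only an illustration of the idea, not how a database actually executes a join: the tables `A` and `B`, the `join` function, and the use of `None` to stand in for NULL are all assumptions made for the example.

```python
from itertools import product

# Hypothetical tables of (id, value) rows; None plays the role of SQL NULL.
A = [(1, "a1"), (2, "a2"), (3, "a3")]
B = [(2, "b2"), (3, "b3"), (4, "b4")]

def join(a_rows, b_rows, how):
    """Join two tables on their first field; how is inner/left/right/full."""
    # The universe: every pairing of an A row with a B row (n * m pairs),
    # filtered down to the pairs whose matching fields agree.
    matched = [(a, b) for a, b in product(a_rows, b_rows) if a[0] == b[0]]
    matched_a = {a for a, _ in matched}
    matched_b = {b for _, b in matched}
    # Rows with no partner get None (NULL) on the other side.
    only_a = [(a, None) for a in a_rows if a not in matched_a]
    only_b = [(None, b) for b in b_rows if b not in matched_b]
    if how == "inner":
        return matched                    # both parts exist
    if how == "left":
        return matched + only_a           # the A part always exists
    if how == "right":
        return matched + only_b           # the B part always exists
    if how == "full":
        return matched + only_a + only_b  # either part may be missing
    raise ValueError(f"unknown join type: {how}")

for how in ("inner", "left", "right", "full"):
    print(how, join(A, B, how))
```

The four join types differ only in which NULL-padded rows they keep, which is the property the JOIN "Venn diagrams" are really shading.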

So I can't really say the Venn diagrams for SQL JOINs are true Venn diagrams; they don't show the state of a consistent property over all elements of the universe. Or rather the universe is a bit more complicated (it depends on A and B, their cross product) and the property being booleanized is whether the element from one table is NULL. You can't just take an arbitrary universe of elements (with properties). With JOINs, you have to create the universe, the product, first before examining the elements (and whether the A part or B part of the new row is NULL or not).