Least Uninteresting Number: August 2015

Sunday, August 30, 2015

Literally vs figuratively

Kleptomaniacs never get puns because they're always taking things literally.

Cannibals have a lively social life because they're always having people for dinner.

Haddocks' Eyes or What things are called.

"The name of the song is called 'Haddocks' Eyes.'"
"Oh, that's the name of the song, is it?" Alice said, trying to feel interested.
"No, you don't understand," the Knight said, looking a little vexed. "That's what the name is called. The name really is 'The Aged Aged Man.'"
"Then I ought to have said 'That's what the song is called'?" Alice corrected herself.
"No, you oughtn't: that's quite another thing! The song is called 'Ways And Means': but that's only what it's called, you know!"
"Well, what is the song, then?" said Alice, who was by this time completely bewildered.
"I was coming to that," the Knight said. "The song really is 'A-sitting On A Gate': and the tune's my own invention."

(from Lewis Carroll, Through the Looking Glass)

Once you have the ability to name something, even names can have names and then it's turtles all the way down. Once you have the ability to point, you can point at the pointer. Like the mirror test, understanding pointing doesn't guarantee you the ability to grasp all the nth-order pointing.

Friday, August 28, 2015

Just Stop It. Website complaints

A little constructive feedback to web site designers.

Stop it. Please just stop it.

Website designers, stop adding crazy stuff and stop changing my defaults 'for' me:

Stop changing things that I can set up locally. Allow me to set the font size rather than fixing it to what you think is best. Don't change the scroll speed on me, I set it already the way that's easiest for me. I don't want to swipe to move down a paragraph but then you make it skip a page or two.
Stop it with all the moving images. Just a few are OK. Well no not really. One is already almost too much.
Stop it with the audio. I'm listening to something else. Also, with multiple tabs that I move around, your audio is randomly starting when I'm not on your tab. Then there's a frantic search for your goddam tab to kill with a vengeance and remember never to visit anything of yours again.

Tech rationalization: All these things also take up lots of memory and processing time on the local computer running them. Also, they waste my time.

So stop it.

PS IMDB, you're the worst. I love wasting my time on your site. But I don't want to waste away my time-wasting time on waiting for your candy crap ads to load. I want to see them immediately or move on to finding out what the movie was with the thing that that actor (who was in with the actress from that TV show (no not that one, the comedy, no the other more serious comedy)) who had that thing happen to him. It was a couple years ago. I think it was a remake?

Thursday, August 27, 2015

Another Design Pattern: Layers

Though I usually talk with software as the underlying example, I call this a general design pattern since it works in many fields.

Layers or layering is a way of separating a large goal into smaller pieces. It is not necessarily by sequence of execution or design but by amount of detail of domain expertise. That is, the

The 'layers' pattern is really a pattern using another pattern, the interface. When the terms 'vertical application' or 'horizontal design' are used, horizontal means a general purpose set of tools, vertical means using a ladder of tools, one depending on the next to solve a niche problem.

horizontal vs vertical scaling

The canonical example is the OSI model for networking (computer communication) popularized by Tanenbaum (fig 1-20, Tanenbaum, Wetherall, Computer Networks (2011)):

Note - level 7: mail

Actual communication over a cable is through the physical layer. There are rules and restrictions on the physical layer that the layer above it conforms to in order to get certain behavior, but it is easier to think in the higher abstraction.

It is a stacking of individual components: each higher layer depends on the one below it and is specified in the language of that lower layer.

Layers of science - mathematics, physics, chemistry, biology, psychology. Here one may notice that the interactions of the layers are not necessarily ones of interfaces: though understanding of physics ostensibly underlies chemistry, understanding the former doesn't guarantee an understanding of the latter. Physics may give insight but won't necessarily determine how chemistry works. This just shows that layers is a general strategy for dealing with a large domain (here science is about as large as it gets).

Purity from xkcd

The benefit of the 'layers' pattern is that it cuts up a larger monolithic area into many smaller manageable pieces. Instead of trying to understand the meaning of life, these 'smaller' areas are more manageable. Also, each individual piece can be worked on with a limited vocabulary (that of the lower piece). You don't have to understand all the complexities of the lower layer.

This is another pattern that encourages modularity: the higher layer (the abstract layer) has its own language and ways of doing things implemented in the interface given by the lower layer. The lower layer (and so also layers below it) could be entirely replaced by a different implementation. Or one layer could be replaced given that it respects the interface of its higher and lower layers.
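A minimal sketch of that replaceability, in Python. All the names here (Transport, InMemoryTransport, ReversingTransport, MessageLayer) are hypothetical, invented for the illustration and not taken from any library. The higher layer talks in text messages and knows only the interface; either lower layer can be dropped in underneath without the higher one noticing.

```python
from collections import deque
from typing import Protocol


class Transport(Protocol):
    """The interface the lower layer offers to the layer above it."""

    def send(self, payload: bytes) -> None: ...

    def receive(self) -> bytes: ...


class InMemoryTransport:
    """One possible lower layer: a plain in-process queue of byte strings."""

    def __init__(self) -> None:
        self._queue = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload)

    def receive(self) -> bytes:
        return self._queue.popleft()


class ReversingTransport:
    """A different lower layer: stores bytes reversed, restores them on receive."""

    def __init__(self) -> None:
        self._queue = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload[::-1])

    def receive(self) -> bytes:
        return self._queue.popleft()[::-1]


class MessageLayer:
    """The higher layer: thinks in text messages, sees only the Transport interface."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def send_message(self, text: str) -> None:
        self._transport.send(text.encode("utf-8"))

    def receive_message(self) -> str:
        return self._transport.receive().decode("utf-8")


# Same higher-layer code, two interchangeable lower layers.
for lower in (InMemoryTransport(), ReversingTransport()):
    layer = MessageLayer(lower)
    layer.send_message("hello")
    print(layer.receive_message())
```

The higher layer never looks inside the queue, which is the whole point: how the lower layer stores things is its own business.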

Another example from computers is architecture: electronics, microprogramming, machine language, assembly language, higher level programming language, specification language. A new layer in design can come from either direction: higher level programming languages like C or Fortran came from wanting to abstract away common patterns in assembly language (like subroutines), and microprogramming came from trying to implement machine language more easily in the electronics.

A higher layer is often an abstraction from a lower one. The common patterns can be set in (parameterized) stone in lower ones, with enough freedom to connect those things in the higher layer to get applications done. That way, the higher layer can worry about its problems without having to worry about some implementation details, and the lower layer can worry about its own problems. This is a way of explaining the benefits of encapsulation/modularity/data hiding: reduce the knowledge requirements at each level while maintaining functionality.

One difficulty with layering is making sure the knowledge of the layers is sufficiently compartmentalized, that the user/designer in one layer needs to know as little as possible of other layers. A leaky abstraction occurs when too much is required of those outside a layer.

Without these layers, there can easily be efficiencies that are possible by thinking fast, by the designer of the monolith using fine local information to get benefits in the larger architecture. But this can lead to spaghetti code (spaghetti design), where small changes in a small but common/overused element can cause large changes (or rather problems/bugs/disasters).

The layers may not be a deliberate separation of affairs, but reached organically. One layer is formed naturally, and then others see that the items at that layer can be used to create a layer on top. Or the implementation of a bottom layer may be rethought in order to create a more useful layer below.

A monolith is a great big ball of cleverness. Every small action depends on efficiencies gained by side effects of other small actions far away. This often leads to very brittle designs, but think of the efficiency gain! Using layers may give up such efficiencies in exchange for ease of modification.

Or it may turn out that a hidden layer implements things in such a way that the patterns of a higher layer can take advantage of the much lower layer if only that information was exposed, either to use directly (dancing links which reuse presumably freed pointers more efficiently) or implicitly.
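For the dancing links mention, here is a rough sketch of the core trick, assuming the usual circular doubly linked list formulation. The names Node, insert_after, unlink, relink, and values are made up for this illustration. A node that has been unlinked still holds its own left and right pointers, so it can be put back in place in constant time. That only works if the code doing the unlinking can see the pointers, which is exactly the lower-layer detail a tidy list abstraction would normally hide.

```python
class Node:
    """A node in a circular doubly linked list; starts out linked to itself."""

    def __init__(self, value):
        self.value = value
        self.left = self
        self.right = self


def insert_after(node, new):
    """Splice `new` into the list immediately to the right of `node`."""
    new.left = node
    new.right = node.right
    node.right.left = new
    node.right = new


def unlink(node):
    """Remove `node` from the list; `node` keeps its own left/right pointers."""
    node.left.right = node.right
    node.right.left = node.left


def relink(node):
    """Undo `unlink`, using the pointers the removed node still remembers."""
    node.left.right = node
    node.right.left = node


def values(head):
    """Collect the values to the right of the sentinel `head`, in order."""
    out = []
    cur = head.right
    while cur is not head:
        out.append(cur.value)
        cur = cur.right
    return out


head = Node(None)  # sentinel, not part of the data
a, b, c = Node("a"), Node("b"), Node("c")
insert_after(head, a)
insert_after(a, b)
insert_after(b, c)

unlink(b)
print(values(head))  # ['a', 'c']
relink(b)
print(values(head))  # ['a', 'b', 'c']
```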

Wednesday, August 19, 2015

Even docs replaced by robots? Only for boring operations

Will technology replace us with robots? (us = 'billion year DNA-developed flesh-covered endoskeletal devices')

A new automated anesthesiology device has recently made the news: Automated anesthesiology for colonoscopies. There's the obvious fear of high-priced docs losing their jobs: "How dare they assume a machine could replace a physician with years of education and knowledge?"

But for the moment, what's the situation? Colonoscopies for polyp screening and removal are very routine procedures. For the colonoscopy part, only 5% of patients have a polyp removed. So most of the time the GI doc is doing boring work, looking for polyps that are mostly never there.

And similarly for the anesthesiologist, except more so. Even if the GI doc finds polyps that are removable, that doesn't change the sedation. If something is found that needs more than just the colo tool, then hey, we ain't doing that here, we're backing out anyway, no need for more anesthesia. All they are doing is conscious sedation over and over and over again.

Every patient needs oversight. Things go wrong. "I didn't know the patient would have a seizure, allergic reaction, is used to the sedation drugs." These things need tweaking. For the most part, the everyday stuff and these few weird things are extremely well-known (there's been a high tech assembly line of patients getting colonoscopies forever!). So this is the perfect place for automation to reduce cost and time and effort. And the machines are going to have extra sensitive alarms, a good buffer to stay away from the bad situations.

There'll still be a need for lots and lots of physicians, don't worry about it, freshly graduated MD. Hopefully family practice, where the real medicine happens, will become more respectable = more highly paid, because it is already high in demand but nobody is going into it because it won't pay for med school tuition loans.

---

The whole point to science is to make things repeatable.

The trend then is that if you do something enough times, and what variation there is can be parametrized, then it can be automated and packaged.

We do it for medications: an expert gives very simple instructions on use, and then you do it yourself. Simple first-aid for even life threatening situations doesn't need to be handled by a full physician. Anyone who can read directions and gets a couple hours training can do CPR and use a defibrillator.

Medicine is progressing towards knowledge constantly. Radiology is miniaturizing image taking to the point where soon you really could have a Star Trek tricorder to wave over someone to see and judge any internal problems.

Look, there's already the DaVinci robotic surgeon. Of course it doesn't do everything and needs to be operated by a full surgeon.

(from Medical Devices)

But, soon enough you'll be able to go to your local drugstore and go down the pain-relief aisle, turn at the cough and cold section, then come to the Surgeon-in-a-box aisle:

Wart-Removal-In-A-Box - wait, don't they have these already, some freezing solution?
Stitches-In-A-Box - for non-serious cuts that are too deep to heal themselves, place the box opening over the wound and the sensors will be able to see where to close up. Applies flesh knitting goop reducing scarring (Dermabond, based on superglue, it's real).
Colonoscopy-In-A-Box - you'll still need to take the prep, robots can't see through poop either. Send to the lab any polyps removed in the enclosed vial.
Lasik-In-A-Box - just place against the affected eye for ten seconds and hold your breath.

OK for most of these you'll need a prescription for them. But still you'll be administering them at home yourself.

Yes, I agree, the last three I'm not sure I'll ever be comfortable with. But none of them exist so I'm off the hook for now.

Was I played?

I don't naturally do this, but I filled out an online comment card for an internet service. And they called me back the next day. Holy crap, they actually listen to those things?

So, to remove content (fill in an example of content yourself):

I had made an appointment, by phone, with a service (where I had to go to the service). Let's call this S (for service).
A week before the appointment, I got an email asking me to visit a website, to fill out an online form, all very relevant info for the appointment, and would save time at the Service (you always have to fill out forms for the first fifteen minutes at the Service and it's info that never changes, so why the hell am I filling out this stupid form by hand? Again?). This was a third party website, directed at lots of these kinds of services. Let's call this WFC (for web form company). I thought the site was well-done or at least better than average, asked only necessary questions, good UI.
I went to my appointment. There were good and bad things about the appointment.
After my appointment, I happened to be web-surfing and I saw an article about the WFC (the company behind the form) that was disparaging about their sales strategy. Or more specifically, the WFC sales people tend to hard sell the S's so much that, despite the efficiency the WFC offers, S's don't like to work with WFC (90% turn away). Also other shady sales practices and culture.
And after that, I happened to get an email from WFC asking me to answer a survey about how my appointment went, to essentially review the Service (so a bit like Angie's List or TripAdvisor).
Because of the negative news about WFC, I decided to go fill out the survey. I said both the good and bad things about the Service, but there were also some survey questions about the WFC site/experience itself which were mostly good, but because of the news I had read, I also mentioned some qualms I had (I used that phrasing) about their privacy and sharing of user info, giving the negative sales behavior as reason for my qualms. My review was copious about the Service, and only a one liner about my qualms about WFC.
The next day, I got a call. Yes, a direct cold call. From customer service at WFC. Very breathlessly concerned about my ... concerns. They weren't wondering about my review of Service. They were addressing my concerns. We spent about 15 minutes discussing the issue, me giving more detail about my qualms, and they being somehow apologetic about the situation saying that they wouldn't do what I feared (on the internet, you can say anything, and it may even be true!).
They gave me an Amazon gift card (online).

So I don't know who's right, who's telling the truth, everybody could be telling the truth in the small narrow view of things, or everybody could just be doing their job the best way possible and others are misinterpreting or interpreting correctly for their own small narrow perspective (everybody is an adversary if it's a game).

The point of this story... I want to know if I was played. Getting a gift card (no idea what the value is) out of the blue like that, I'm paranoid why would they do that unless they want to assuage me. I'm not the one giving bad reviews of WFC. ... maybe it's just what people do in customer relations whenever a call back is made. Maybe all they superficially see is 'negative about WFC' and they jump on that to plug the hole/stop the gap/nip the bud/unmix the metaphor. Because once you accept a gift, there's a psychological balance. You always owe whether you're giving or receiving.

Tuesday, August 18, 2015

Optimization design: Must you trade off between two things you want?

A common trope in presenting solutions to a complicated problem is the tradeoff: you can have this at the expense of that (and vice versa), or of three desirables, you can have only two (the negative of the third is what you'll get).

For example, you can have low inflation or low unemployment, but since one can cause increase in the other (if everybody is employed, most people will have more expendable income and so prices will rise to take advantage), you can't have both at the same time.

(the Phillips curve, from SparkNotes)

Many algorithmic problems show solutions with this feature, which is usually the tradeoff between runtime and memory. If you have very little extra room (little available data manipulation space) you might have to make many more passes over the data than if you could just have another full copy of the data. One strategy is to build a big table (often in linear time), in contrast to an algorithm that uses only a couple of registers and takes a long time to compute.
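One small illustration of the table-versus-registers choice, sketched in Python. Counting the set bits of an integer is an example picked for this sketch, not one from the discussion above, and the names popcount_loop, BYTE_TABLE, and popcount_table are made up. The first version needs almost no memory and looks at one bit per step; the second spends 256 table entries up front and then handles eight bits per step.

```python
def popcount_loop(x):
    """Count set bits of a non-negative int one bit at a time: tiny memory, more steps."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


# Spend memory once: the bit count of every possible byte value.
BYTE_TABLE = [popcount_loop(b) for b in range(256)]


def popcount_table(x):
    """Count set bits of a non-negative int a byte at a time via the table: more memory, fewer steps."""
    count = 0
    while x:
        count += BYTE_TABLE[x & 0xFF]
        x >>= 8
    return count


print(popcount_loop(0b1011), popcount_table(0b1011))  # 3 3
```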

Or for a given technology, you want three things: quick production, low cost, and high quality, but any two of these means the third will be bad. See also the CAP 'theorem' in distributed database design: you supposedly can have only two of the following, the third being prevented.

Consistency (all nodes see the same data at the same time)
Availability (a guarantee that every request receives a response about whether it succeeded or failed)
Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)

(from CAP theorem-Why does it matter)

Supposedly in distributed systems you can only have two of these, because any two denies the third, where a single database on a single machine guarantees all three (the same as the ACID properties).

As rules of thumb, these are all interesting and useful observations to be made about a system. They give you constraints so that you shouldn't feel you have to worry about overcoming them.

But this is where design and cleverness come in. The space-time tradeoff or the pick any two of three restriction is just another hurdle to overcome. To solve a problem from the start, you used cleverness to solve it. But it has some drawbacks. In trying to remove the drawbacks, you introduce another drawback, and so you think there is an intimate relationship between the two drawbacks.

Don't stop there! Your next step of cleverness is to overcome both at the same time, and, as problems go, unless you can prove otherwise, there's no reason you can't make both good at the same time.

For example, take computing Fibonacci numbers: F(n) = F(n-1) + F(n-2), F(1) = F(2) = 1.

The first answer is to use recursion and simply compute F(n-1) and F(n-2) directly. That has the drawback that, if left to do it all automatically in the recursion, F(n-k) is computed many times over.

(from UCR CS 141 course notes)

Notice there's lots of repeats.
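To put numbers on the repeats, here is a small Python sketch of the direct recursion with a counter bolted on. The name fib_naive and the calls counter are just for this illustration.

```python
from collections import Counter


def fib_naive(n, calls):
    """Direct recursion on F(n) = F(n-1) + F(n-2), with F(1) = F(2) = 1.

    `calls` records how many times each argument is evaluated.
    """
    calls[n] += 1
    if n <= 2:
        return 1
    return fib_naive(n - 1, calls) + fib_naive(n - 2, calls)


calls = Counter()
print(fib_naive(10, calls))  # 55
print(calls[3])              # 21: F(3) alone is recomputed 21 times
print(sum(calls.values()))   # 109 calls in total, just to get F(10)
```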

Instead you can be clever once you notice how you compute it by hand: start from F(3), using the known values F(1) and F(2). Then compute F(4), because you now have F(3) and F(2), and similarly F(5), and so on till you reach F(n). That is linear time but takes linear space:

(from Archimedes Lab)
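A sketch of that by-hand table method in Python (fib_table is a name chosen for this example). The list grows by one entry per step, so both the time and the memory are linear in n.

```python
def fib_table(n):
    """Bottom-up Fibonacci that keeps every value computed so far."""
    table = [0, 1, 1]  # table[k] holds F(k); table[0] is only padding
    for k in range(3, n + 1):
        table.append(table[k - 1] + table[k - 2])
    return table[n]


print(fib_table(10))  # 55
```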

So we're stuck right, with a tradeoff between time and space, the faster the algorithm the more memory it will use?

Of course not. With all these 'rules', they are just patterns that may or may not follow. It's not necessary that there's a constraint.

For Fibonacci, another observation cuts through both. Instead of going from F(1) to F(n), computing every single item, you really only ever use 2 values at a time (the two previous ones). You don't need to maintain the entire previously computed values in the list. By judicious swapping (using a single temp variable), you can run through the entire table quickly without having to save the whole table, ratcheting upwards quickly.
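A minimal Python sketch of the two-value version (fib_two_values is a name chosen for this example). Python's tuple assignment stands in for the swap through a temp variable. The loop still runs about n times, but the memory used is just the two running values.

```python
def fib_two_values(n):
    """Fibonacci keeping only the two most recent values: linear time, constant space."""
    prev, curr = 1, 1  # F(1), F(2)
    for _ in range(n - 2):
        # Ratchet forward one step: (F(k-1), F(k)) becomes (F(k), F(k+1)).
        prev, curr = curr, prev + curr
    return curr


print(fib_two_values(10))  # 55
```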

And for two-out-of-three, there's the example of car manufacturing. Making a car by hand in the late 1800s and early 1900s was a painstakingly slow process, with high cost and questionable quality. But the assembly line process made all three better: produced cars much faster, much cheaper, and quality much better because every car was made the same way.

Note that with design there's also possibly a tradeoff with simplicity: the more constraints a design tries to handle, the less simple it tends to be. But with the assembly line method, a simple design, all three are made better. There's no guarantee that more constraints means more complex.

Sure, sometimes the constraints are intertwined, one push must mean another gets pulled. But with appropriate design there may be a way around it.

Saturday, August 15, 2015

There -are- realistic moon base plans

I was wrong. Lack of evidence is not evidence of lack.

I lamented the lack of moon base plans recently, but it was an error of not looking around enough.

Recently the European Space Agency got a new director, Johann-Dietrich Woerner, starting July 1.

But even before he started, he had stated his plans for what to do next on the way to other space plans:

"the moon station can be an important stepping stone for any further exploration in deep space,"

He states this in the context of ESA's targets after the ISS project finishes.

"In any case, the space community should rapidly discuss post-ISS proposals inside and with the general public, to be prepared,"

I can't tell yet how these plans relate to NASA's stated plans for a manned mission to Mars.

Friday, August 14, 2015

Turing and Kahneman believe wrong things! Sort of, not really.

Andrew Gelman in his blog post Turing and Kahneman and statistical evidence seems to trash the two giants. But really he isn't.

It looks like Alan Turing (AT) supports ESP:

I assume that the reader is familiar with the idea of extra-sensory perception, and the meaning of the four items of it, viz. telepathy, clairvoyance, precognition and psycho-kinesis. These disturbing phenomena seem to deny all our usual scientific ideas. How we should like to discredit them! Unfortunately the statistical evidence, at least for telepathy, is overwhelming.

And of course ESP is wrong! (By 'wrong' I mean of course that there's lots of evidence against it, despite AT's statement to the contrary).

It also looks like Daniel Kahneman (DK) supports ~~general priming effects~~ age-word priming on walking speed.

When I describe priming studies to audiences, the reaction is often disbelief . . . The idea you should focus on, however, is that disbelief is not an option. The results are not made up, nor are they statistical flukes. You have no choice but to accept that the major conclusions of these studies are true.

But wait! In their defense! Priming in general is a well established phenomenon, just not in this particular instance. (It just turns out that the age-related priming on gait was a statistical fluke.)

The two experts are really just saying that as far as they know, from the data, one might need to believe in implausible things. Science is filled with such implausibilities. After all, it doesn't feel like the Earth is turning, the Sun obviously is just moving across the sky. Things are complicated, you have to actually see additional data to be more accurate (Copernicus's model was not as accurate as Ptolemy's when first proposed).

Tuesday, August 11, 2015

My colonoscopy!

What really happens in a colonoscopy.

I went in for a 50-year-old's screening colonoscopy. I got the usual medical euphemising in the patient education stuff they give you. Here is what really happens:

- you have a diet the week before ('low fiber' which means mostly food that's not usually considered healthy, no fruit or vegetables, yay BBQ and mashed potatoes!) and the night before (and morning of) you drink about a gallon (4 liters) of saline water (w/ PEG) in order to cause extreme emptying diarrhea (the good kind?) to clean your digestive tract of everything, everything, mostly poop. Nope, not mostly. All poop. You'll spend a lot of time on the toilet with a grumbling abdomen wondering if maybe you're gonna... oh... oh... just went. Don't try to judge if it's a fart or poop. It's gonna be poop. Drinking the prep is not bad at first, doesn't taste bad at all. Just nearing the finishing mark on the humongous jug, you just really don't want the next swallow, twice as bad as the last.

And then they put a tube up your butt.

(from UVA Health, there's a camera and light to see, and tool to cut off a polyp. How they get it back out for a pathology analysis I'm not sure, I don't see a grabber)

- The instructions and descriptions say things that are recognizably true from a technical point of view or afterwards (oh... that's what they meant) but are confusing, misleading, or incomprehensible without that experiential knowledge. So here goes for full disclosure:

the procedure is they put a long metal snakey tube up your butt, slowly, for about 6 feet (roughly the length of your large colon up to the cecal valve to the small intestine). The tube has a camera on the end to look for abnormal growths (polyps). If they find one, they have a tool at the end (next to the camera) to cut off or burn off the growth.
the 'prep' is to clean out your intestines of shit. All those instructions are to make sure that you can get rid of any shit. When they say 'low fiber' in the diet and on food packages and all that? Fiber just means 'isn't digested' or more literally 'comes out in your poop'. You think your shit don't smell? You'll realize it does.
Make sure you do the prep right. Because if there's shit in your colon, and the doc can't see the walls of your colon well, then they'll say 'fuck it, bad prep, I'm taking a smoke break' (ha ha, they don't take a smoke break, doctors as a whole don't smoke). And you have to do that prep all over again.
But don't freak out about the prep. (I'm using 'the prep' for the few days of preparation by eating a low fiber diet and the gallon of liquid to drink the night before and morning of). Not eating, and only drinking the day before wasn't bad at all. I had jello for meals and Italian ice for dessert. But you kind of forget about hunger. People would be polite and say "Oh this pulled pork BBQ sandwich with hot sauce and a side of collard greens and butter smothered garlic mashed potatoes isn't very good." I knew it's great, but I can take a day off. Also, despite it not having a taste at all, you'll get a bit sick of the liquid by about quart two. But that's why you take it in stages, half the night before and half the day of.

- the process and procedure at the hospital I went to was frankly luxurious, I felt like I was at a Marriott. OK, I probably have low standards. You get a gown (that's the embarrassing butt-exposing one) and a robe (that's the one that makes you feel like a king even though it is essentially the same material and color as the gown, just opens the other way). Full disclosure, I spent a lot of mental time trying to distinguish the gown and the robe, both the words and the objects. I agree with what the nurses claimed, that the prep was the worst except for maybe putting the IV on my wrist (it hurt, and people kept coming up to shake my hand and I really didn't want to use that right hand). Oh also, I didn't care for the vital stats monitor which, though accurate (says I) on my blood O2 and pulse (both excellent!), gave me HBP (I don't think I was nervous about the procedure), and the respiratory rate monitor whose alarm kept going off as though I was inhaling at 6 times the normal rate. I believe the monitoring that makes me look good.

- the forgetting medication (Versed) really worked. Before the drug was administered, I vaguely remember the nurses all talking real fast, when giving me info or even small talk (so the drug was not retroactive, erasing events before). At some point in the procedure room, I was lying on my side joking with the nurses. The joke was... dammit, I can't remember! Was it about how I said I had practiced subtracting by 7 starting from a hundred and they said "oh, you won't have to do that", and I blinked and I was a bit woozy on my back with a nurse telling me it was all over I could go now. Presumably my hour in the recovery room after the procedure (I mean extremely invasive butt tube exploration) was already over. As an experiment, I had my son (who drove me back) ask me (I asked him beforehand) three random words for me to try to recall later in the afternoon without internal repetition techniques. I remember asking him to tell me the words in the car. But later that afternoon, I could only remember one. I remember asking him to ask me this (on the drive back, and I also remember most of our conversation then). So essentially I remember most things about after I 'woke' up, except maybe a few details. I'm writing this only a couple hours afterwards (in case the forgetting drug is actually working and at some point the whole day will be erased (or never fully stored (or whatever the current metaphor is))).

The words he came up with were 'big giant possum', and I could only come up with "the second word is 'big', the third word is weird, and the first is sort of boring like the second one". It seems strange that I wouldn't remember the one word that really stands out. Maybe it stood out that the first two words weren't weird (I know... weird right?) But for the most part, I am fairly confident I remember the car ride back home with him, telling him which lane to be in to be prepared for the exit, not this one but the next.

Anyway, the whole process is a great mix of low-tech and rocket science (drink salt water? camera on a tube? where brain surgery is all rocket science), and it saves people's lives. If you have polyps, which are what become things that are cancer (technically: polyps are precursors to adenomatous neoplasms), then the polyps are removed, and that's that. You may have a predisposition to them, so you'll probably be scheduled for a follow-up colonoscopy much sooner if you have polyps (in three years rather than ten). But the magic (sorry, the science) is that it's sort of .. cured. Like skin cancer, if you remove it early (and that's the whole point of the colonoscopy), you're removing the cancer before it has spread so it is 'taken care of'.

Monday, August 10, 2015

Women told 'just' not to use it.

Words have meanings but we don't always know how to say exactly what they mean. We can use words themselves, but using words to describe words is so much harder.

The latest in workplace advice, the difference between male and female speaking styles, says that there are a few key words that women should just not use.

The word is 'just'. Women tend to use it more than men. And they are advised that using it makes them look weak. Not confident.

I'm here to mansplain that they're doing it wrong. That they're explaining the use of the word wrong. They have a point, that 'just' is a kind of 'weasel' word, one that weakens impact, that deflects confrontation, that pulls its punches, that

It's just that in my testosterone fueled autistic-spectrum inspired pedanticism, I must point out (as opposed to 'I just want to say') exactly which kind of 'just' is weak mousy wall-flower, and which is the manly man's cudgel.

Some examples

- I just wanted to ask you

Weasel. Just ask. Ha ha. 'Ask'. No that doesn't sound as good. 'Just ask' is just the right thing. Argh. Is the right thing.

- I just finished the report

Not weasel. It's more precise timing. 'I finished' is not precise. Of course you probably should have finished it at 3am, not last night because that shows procrastination, but 2 weeks ago because that shows you will work non-stop.

- I just happened to notice you've been coming in later.

Weasel ('happened to' is somewhat weak too). Say "I noticed you've been...". No, say "You've been...". No. Say "Come in earlier". Oh, this is to your boss? Say "You've been working so hard lately".

- I just came by to ...

Weasel. This is classic weak deflection. I agree that this is not confident. For this I think they are right. Use "I'd like to talk to you about..." not too intrusive but not so easy to dismiss (nobody wants to be annoyingly interruptive, but sometimes you have to).

- I know just where those files are.

Not weasel. Means 'exactly'. Perfectly appropriate. Be confident and stick with this usage instead of blindly removing all uses of 'just'.

- I am just trying to relay this information to you.

Weasel. Translated "I don't have enough time to explain. Take your hand out of the blender before you turn it on".

- A just tuning does not allow the freedom of changing key as a well-tempered one does.

Not weasel. This is a technical term in music. I had to throw that one in. Words are complicated.

- Just a minute.

Weasel. Instead use "No, Thursday's out. How about never - is never good for you?"

- My decision to fire half our firemen was well thought out and just.

Not weasel. No one thinks this is a bad use of 'just' (except for the firemen). I just wanted... (grits teeth)... I want to give an example of where blind elimination is catching perfectly strong uses. 'Just' as a synonym of 'fair' is slightly manly because it comes from 'justice', even though an eye for an eye is a little more macho.

And since we're thinking of false positives and false negatives, why aren't dudes just told to use 'just' more?

Executive summary: don't use 'just' as an adverb, all other uses are fine. Or just use it all you want and embrace cooperation and nuance.

Relevant links (where this started for me):

"Just" say no
This word can damage your credibility

Examples of bad^H^H^Hfemale^H^H^H^H^H^Hdeprecated usage from "Just" say no:

"I just wanted to check in on …"

"Just wondering if you'd decided between …"

"If you can just give me an answer, then …"

"I'm just following up on …"

Friday, August 7, 2015

What is wrong, terribly wrong, with wordles

I love wordles! They're so cool, like making artwork out of a big long text that I don't want to bother reading! I can see what's really important in a text by what's most common! And I get that in a flash!

I can't stand wordles. They're so mindless and dumbing down. Any good text will have a variety of vocabulary. Frequency is misleading; texts are not just dumb bags of words.

These are extremely tendentious. I believe them both. But what I'll explain is what is problematic with them as data visualization.

What's a wordle? Also known as a tag cloud or word cloud, it's a graphic design method that takes a document, determines the frequencies of the unique words in that document, and mooshes the text of the words into an image, some vertical, the size of the word text in proportion to its frequency in the document. So from some document we get the dry list of individual word frequencies:

Wordle 127
word 35
words 30
cloud 28
students 22
clouds 22
Day 18
lessons 12
fused 12
adjectives 6
historical 5
classroom 5
even 4
see 4
...
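For the curious, here is a minimal sketch of how a list like that could be produced. It is not how any particular wordle tool does it; the function name `word_frequencies` and the rule for what counts as a word (a run of letters or apostrophes, case left alone) are my own assumptions for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each distinct word occurs in text.

    Assumption: a 'word' is a run of ASCII letters or apostrophes, and
    case is kept as-is, so 'Word' and 'word' are counted separately.
    """
    words = re.findall(r"[A-Za-z']+", text)
    return Counter(words)


# A tiny made-up document, just to exercise the function.
sample = "word cloud word clouds Wordle word cloud"
freqs = word_frequencies(sample)

# most_common() yields (word, count) pairs, highest count first.
for word, count in freqs.most_common():
    print(word, count)
```

Real tools typically also drop very common words like 'the' and 'of' before counting; that step is left out here.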

This can be converted into a barchart:

which is the Zipf curve of the document.

Now comes the cool graphic: the wordle. Instead of boring bars, make the word itself and its size tell you how important it is, mushing them all together and letting the natural instinct of readability draw your eye to what's important:

It is certainly esthetically pleasing, a bit Mondrian, with a jazzy visual rhythm. The algorithm to lay out the words is clever in its simplicity, and the resulting image allows some simple inference about a text.

But what is the point of a wordle, and how successful is it for whatever points it might have?
If the point is that it is a piece of art, then I've made a case for it already. A new wordle for each new document is a bit derivative though, with too many barely distinguishable varieties. One here or there is great, but a number of them is numbing.

How is it as a data visualization? How well does it relate the data?

The ostensible purpose of a wordle is to show you the relative frequency of words in a document. What is actually done is to show you the obvious top two or three most frequent words. All other words are essentially ignored.

That may very well be the best part of the wordle, that it presents essential information (the two or three most frequent) in an esthetically pleasing manner. The size of a word pulls your eye towards it because it is easier to read, and if it is readable, there's no unreading it (it forces its meaning on you).

- The eye is encouraged to dance around. This may account for the esthetics, but it is an annoyance for comparison.
- Vertical presentation of a word almost guarantees that you can't read it.
- Comparison of size is even more difficult than in a pie chart. Two words not exactly next to each other are difficult to compare (the word length itself is not the frequency, but it accounts for the relative noticeability).
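The last point, that word length leaks into noticeability, can be put into rough numbers. This is only a back-of-the-envelope sketch: the function `rough_ink_area`, the assumption that font size is exactly proportional to count, and the guess that a character is about 0.6 times as wide as it is tall are all mine, not taken from any real wordle implementation.

```python
def rough_ink_area(word, count, points_per_count=1.0, char_aspect=0.6):
    """Very rough screen area taken up by a word in a word cloud.

    Assumptions (illustrative only): font size grows linearly with the
    word's count, and each character is char_aspect times as wide as
    the font is tall.
    """
    font_size = points_per_count * count
    width = len(word) * char_aspect * font_size
    return width * font_size


# 'students' and 'clouds' both occur 22 times in the list above,
# so they get the same font size, but not the same area.
students = rough_ink_area("students", 22)
clouds = rough_ink_area("clouds", 22)
print(students / clouds)  # ratio of letter counts, 8 to 6
```

Same frequency, but the longer word covers about a third more of the picture, which is exactly the kind of thing the eye picks up on.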

So really the information that can be pulled out of a wordle is: the most frequent word (which does usually outweigh all others in most documents), the second and third most frequent, but you're not sure which is which, and maybe one or two in the top ten but maybe you missed some.

Under this analysis, this is a Type V error in Fung's Visualization Trifecta Checkup, where the data and questions are well defined, but the visualization (the V) just isn't right.

So instead of complaining, what would be a better method, one that would actually address the stated purpose of showing relative frequencies?

The simplest (and least graphically pleasing) is the source list of stats: a text list, one word per line followed by its count in the document. Because numbers themselves are hard to judge easily in a list (but lengths are easy), a barchart sorted by frequency, cut off at about 10 or so, may be better. The screen space taken up by the frequency list is about the same as the wordle image itself and allows extraction of a lot more information. All the information is in this list, and it is all readable, and all comparisons can be made very easily. Surely there are frequency questions that can be asked that are not easily answered by the list, but what might be slightly difficult for the list is impossible for the wordle.
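A sorted barchart like that needs nothing fancier than text. Here is a minimal sketch using the counts from the list earlier in this post; the function name `text_barchart` and its `top` and `width` parameters are made up for the example.

```python
def text_barchart(counts, top=10, width=40):
    """Return the lines of a text bar chart, most frequent word first.

    counts maps word -> count; only the `top` most frequent words are
    kept, and the longest bar is `width` characters wide.
    """
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:top]
    if not ranked:
        return []
    biggest = ranked[0][1]
    label_width = max(len(word) for word, _ in ranked)
    lines = []
    for word, count in ranked:
        # Bar length is proportional to count; never shorter than one mark.
        bar = "#" * max(1, round(width * count / biggest))
        lines.append(f"{word:<{label_width}} {count:>4} {bar}")
    return lines


# Counts copied from the frequency list shown earlier in this post.
counts = {
    "Wordle": 127, "word": 35, "words": 30, "cloud": 28,
    "students": 22, "clouds": 22, "Day": 18, "lessons": 12,
    "fused": 12, "adjectives": 6, "historical": 5, "classroom": 5,
    "even": 4, "see": 4,
}
lines = text_barchart(counts)
print("\n".join(lines))
```

Every word in the top ten sits on its own horizontal line with its count and a bar next to it, so any two of them can be compared at a glance.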

What this says is that wordles are really good at showing you the top couple of words in an esthetically pleasing manner; what a wordle puts in your head is mostly 'X is the most common, and Y is maybe a little less common', and that's the extent of its specificity.

But if you want comparisons that are even minimally less vague, or that cover more than 2 words, a wordle does not do it that well.

Or to put it more bluntly, a wordle is popular because it is beautiful, not true.

TL;DR: A wordle is esthetically pleasing but is not even as good as a piechart for transmitting information.
