It has been years since Tim Allen starred in the sitcom Home Improvement. It’s been even longer since he first started his stand-up comedy act that provided superlative insight into the mind of a wanna-be handyman. In both his stand-up act and the sitcom, we learned that, according to Tim, the most important factor in choosing anything from autos to cordless drills to vacuum cleaners was power; the more the better.
While I would like to think of myself as highly evolved, when push comes to shove, I fear I am a “more power”, “go hard or go home”, “drive it like you stole it”, “get a bigger hammer” kind of girl.
Every car I have ever owned has had a manual transmission, and, while I am happy to cruise along most days and get as many miles to the gallon as possible, I like being able to downshift and accelerate when I need to. In my sailing days, I would grow impatient with days spent on the water carefully trimming the boat to get the best speed in light winds. I greatly prefer a day where the crew has to batten down the hatches, put a third reef in the mainsail and hang on for dear, friggin’ life.
I am the same way with work stuff, too. I can appreciate finesse and subtle persuasion. I am all for careful planning and thoughtful discussion. I am even capable of employing those techniques when appropriate, but, it must be said, I am more likely to work harder, stay longer or argue more vehemently when faced with unforeseen challenges.
I’m definitely not saying that I do all the work myself. I am saying that when circumstances have changed, and the plan you started with no longer is going to work, and you are 700 miles out to sea, keeping the 400 people in my department all motivated and pulling in the same direction is hard work.
Some situations simply require you to lower your center of gravity and push. More power! ErrrHHH. ErrrHHH. Errrh.
I am a “more power” girl with data analysis as well.
Power studies in statistics are used to determine how large a sample a study needs in order to have a good chance of detecting an effect, if one really exists. Because sample size can drive the cost of the study, power studies are frequently used to determine the minimum possible number of test subjects required. Where cost is not an issue, though, I always say more data is better.
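To make the idea concrete, here is a minimal sketch of a power calculation in Python. It is not taken from any particular study: it assumes two groups of equal size compared on their means, a common standard deviation, and the textbook normal approximation. The function name sample_size_per_group and its parameters are made up for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate number of subjects needed in EACH of two groups.

    delta : smallest difference in means worth detecting (assumed known)
    sigma : standard deviation, assumed equal in both groups
    alpha : two-sided significance level
    power : desired chance of detecting a true difference of size delta
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the test
    z_power = z.inv_cdf(power)            # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)                   # round up to whole subjects


# Detect a difference of half a standard deviation with 80% power
print(sample_size_per_group(delta=0.5, sigma=1.0))   # 63
# Halving the detectable difference roughly quadruples the sample
print(sample_size_per_group(delta=0.25, sigma=1.0))  # 252
```

Under these assumptions, the smaller the difference you want to be able to see, the more subjects you need, and the number climbs quickly. An exact calculation based on the t-distribution would give slightly larger numbers for small samples.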
Although I grumble about the one dataset that I have with 680,000 tests, I cannot argue that it has “powerrRRR” (enough to crash my computer periodically).
In today’s NYTimes, Gina Kolata’s article, “Vast Gene Study Yields Insight on Alzheimer’s,” previews medical news being published today outlining the discovery of five additional genes common to Alzheimer’s sufferers. There is hope that greater understanding of the genetic components of the disease will shed light on the biology of the disease as well.
What I found more interesting than the identification of genes was the reference, once again, to the revolution in research that has fueled many of the latest developments. A number of years ago, Dr. Gerard Schellenberg of the University of Pennsylvania argued that the medical world’s fundamental approach to genome research needed to stop. With support from the National Institutes of Health and the National Institute on Aging, small genome studies were curtailed in favor of a massive collaborative effort. Researchers around the world were convinced to conform their data collection to required information and share their data across institutions.
Recent gains are the result of a dataset of more than 50,000 subjects.
More power!!
I am stuck.
On several occasions in the last 72 hours I have been this close to a solution, only to fall some indeterminable distance short of success. I was explaining my issue to a friend of mine the other day. She looked on sympathetically, nodded sagely, and confessed that she had no idea what I just said.
But she felt my pain.
Which may be all I can ask for, I suppose. In any case, it was nice of her to listen while I went over the options I had tried and where I thought I needed to head next.
It’s a spreadsheet problem, and, no, I don’t really think anyone else can help me either. I just wanted you to know that I have been typing a lot lately; just not, you know, words…
Although I come here every day (or almost every day) to write, and despite my addiction to crossword puzzles, I am, at my very core, a numbers girl. My living these days depends on sorting out what, if anything, the numbers are telling researchers. On seeing the same WordPress announcement of a milestone, a friend of mine observed that it is an interesting social phenomenon; I calculated that my share of the milestone was 0.000195%.
Each day that I write, I also look at the statistics for my blog. In the information that is provided, I can see how many people click on my homepage each day. I am not told who is reading, unless that person chooses to comment, so, despite all the numbers, I know little about my readers. I know that Mondays usually have more hits than any other day of the week and that there will be a slump on Saturday and Sunday. From this, I am guessing that many people read from work (because it is too depressing to think that reading what I post is work and therefore is put off until Monday). Although I only have two years of data, it seems that fewer people read in July, which is good. We should all take a little vacation or at least get out and enjoy the sunshine, I think.
In addition to the raw number of hits each day, WordPress also provides a list of search terms used to find my blog and a list of referrers (other websites with links to this page). Oftentimes folks search on my name; occasionally they are searching under some of the topics that fill these pages; some searches are more obscure. My most frequently read post in the last three months was written more than a year ago, but comes up several times a week in what I would definitely term one of those obscure searches.
It fascinates me.
As I said, I look at this information every day. Most days it is just idle curiosity: my usual fascination with numbers.
Occasionally, I am mystified. Yesterday, for example, my blog had more than twice my usual number of hits; very good, even for a Monday. The reason??? A large number of folks were referred here from a website that (a) I can’t imagine could have any common link to my writing, and (b) I really don’t want to visit to check.
So it will remain a mystery… That’s life in the cyberworld, I suppose.
I have been doing some statistics work for a group of medical professionals lately. I really enjoy the work, but as my background is in quality control and production, I am less well versed in biostatistics. I was asked recently about equivalence testing.
Hmmmm…. I’ve read about it, but I’ve never done it. In my previous work, I’ve generally been looking for differences in populations or treatments. There are simple tests in statistics for proving that there is a difference between two sets of data. I’ve used those forever. The downfall with these tests is that there exist only two possible outcomes:
- Either your data demonstrate sufficiently different behaviors in the two samples as to prove that the two populations from which those two samples were taken would also behave differently
- Or your data do not prove that the two populations are different.
You should note that the second option says nothing about the two populations being the same; only that you can’t say they are different. If your conclusion is that there is no statistically significant difference, you generally need to get more data. With more data you can make more accurate predictions for the general population; with that you may possibly prove the difference you were trying to show.
Why can’t you just say the two populations are the same??? Well, because all you have really shown is that you’re not sure they’re different. In other words, with too little data you can fail to prove a lot of things. Make no mistake; while ignorance may be bliss, it is still ignorance. Better to get more data.
What do you do then if you really want to show that two different groups are behaving the same way? I can guarantee that any time you take a sample you will get some variability, so even if you are sampling two groups that are behaving identically, you will get slightly different results in your samples. So what we really do is attempt to prove that the differences are small enough.
How small is small enough? It depends on what you are testing… If you are attempting to estimate finances, certainly the nearest penny should be close enough for even the most miserly. If you are estimating grains of sand on a beach, certainly you should have a little more leeway. If you are dealing with life and death decisions, you want to be pretty damn close.
The trick, no matter what the subject matter is, is to determine how large a difference you can tolerate before it starts to make a practical difference.
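One common way to put this into practice is the “two one-sided tests” (TOST) procedure. The sketch below is an illustration under stated assumptions, not a recipe for any real study: it uses a large-sample normal approximation where a t-distribution would be more usual for small samples, and the function name tost_equivalence and the parameter margin (the largest difference that makes no practical difference) are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def tost_equivalence(a, b, margin, alpha=0.05):
    """Two one-sided tests for equivalence of two group means.

    a, b   : lists of measurements from the two groups
    margin : largest difference in means considered practically irrelevant
    Returns (p_value, equivalent), where equivalent is True only if the
    data rule out a difference of `margin` or more in BOTH directions.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference in means (unequal variances allowed)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist()
    # Test 1: null hypothesis is diff <= -margin
    p_lower = 1 - z.cdf((diff + margin) / se)
    # Test 2: null hypothesis is diff >= +margin
    p_upper = z.cdf((diff - margin) / se)
    # Both tests must reject, so the larger p-value decides
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Made-up measurements: two groups with the same average, some scatter
group_a = [10.0, 10.2, 9.8, 10.1, 9.9] * 20
group_b = [10.1, 9.9, 10.0, 10.2, 9.8] * 20

print(tost_equivalence(group_a, group_b, margin=0.5))    # generous margin
print(tost_equivalence(group_a, group_b, margin=0.001))  # very strict margin
```

The same data can pass with a generous margin and fail with a strict one. The margin has to come from knowledge of the subject matter, and it should be chosen before looking at the data.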
Still with me? The whole reason for my punishing you with a statistics lesson is this: the width of this band of acceptable differences is called the Zone of Indifference.
It is my new favorite mathematical term. It may be my favorite mathematical term ever (replacing erf, for “error function”). I see the Zone of Indifference having both statistical and personal significance; as in I’d like to help you but I am afraid that falls in the zone…
I have days where I am in the zone for hours and hours.
How about you?

