Odd twists

Mar. 28th, 2007 08:58 pm
[personal profile] hrj
So today's fire to put out was, "Put together a memo addressing the question of whether extended frozen storage time of component X correlates with failures of parameter Y, suggesting that we need to reduce the maximum allowed storage time. It shouldn't take you more than 10 minutes." Well, ok, the "ten minutes" thing is a standing joke between my boss and me. It took me pretty much all day just to put the data together because in all our several massive data systems, there is no straightforward way to identify all instances of component X nor all failures of parameter Y. And then just as I'm explaining how the data show that there's no correlation between frozen storage time and parameter failure, I realize that there is, in fact, an extremely strong correlation between the two -- it's just a negative one. Store X frozen for a short period and it fails Y; store X frozen for a long period and it doesn't fail Y. Counter-intuitive, but extremely fascinating. (Use X directly without freezing it and it may pass or fail. So things are a bit more complicated than that.) It was almost fascinating enough to have been worth staying late for an hour and a half to finish. I expect that first thing in the morning I'll be assigned some new aspect of the question. (I can think of a few ....) This is why I love my job.

Date: 2007-03-29 04:49 pm (UTC)
From: [identity profile] albionwood.livejournal.com
What analysis did you perform? What program(s) do you use for statistics?

I'm still new enough to statistical analysis to find it interesting...

Date: 2007-03-30 05:23 am (UTC)
From: [identity profile] hrj.livejournal.com
This wasn't a particularly sophisticated analysis! There were 17 data points, each with a numeric value (days frozen) and a binary judgement (pass or fail parameter Y). When I sorted them out by the numeric value, the first half of the list was (almost) all "fails" and the second half was all "passes". But I'm also getting to do some linear regressions to smooth out the results of assay variability. I'm using the "Minitab" statistics package, which is what I got for my class last fall. But for most basic statistics I just use the functions in Excel.
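
For anyone who wants to try the same sort-and-look approach outside Minitab or Excel, here is a rough Python sketch. The 17 records are made-up illustration values, not the real data, and the function names (`split_tally`, `point_biserial`) are just hypothetical helpers. It sorts by days frozen, counts passes in each half, and then computes a point-biserial correlation (an ordinary Pearson correlation with pass/fail coded as 1/0) as one possible way to put a number on the pattern.

```python
from math import sqrt

# Hypothetical illustration data: (days frozen, passed parameter Y?).
# These are NOT the measurements described in the post.
records = [
    (3, False), (5, False), (7, False), (10, True), (12, False),
    (14, False), (18, False), (21, False), (30, True), (35, True),
    (42, True), (50, True), (60, True), (75, True), (90, True),
    (120, True), (150, True),
]


def split_tally(records):
    """Sort by days frozen, split at the midpoint, and return
    (passes, count) for the short-storage and long-storage halves."""
    ordered = sorted(records, key=lambda r: r[0])
    mid = len(ordered) // 2
    first, second = ordered[:mid], ordered[mid:]
    return (
        (sum(passed for _, passed in first), len(first)),
        (sum(passed for _, passed in second), len(second)),
    )


def point_biserial(records):
    """Pearson correlation between days frozen and pass (1) / fail (0)."""
    xs = [days for days, _ in records]
    ys = [1.0 if passed else 0.0 for _, passed in records]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    short_half, long_half = split_tally(records)
    print("short storage: %d of %d passed" % short_half)
    print("long storage:  %d of %d passed" % long_half)
    print("correlation of days frozen with passing: %.2f" % point_biserial(records))
```

A positive correlation with passing is the same thing as a negative correlation with failing, which is the direction described in the post. With only 17 points this is a quick look rather than a formal test.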

Date: 2007-03-30 03:50 pm (UTC)
From: [identity profile] albionwood.livejournal.com
Since I got Minitab and partly learned to use it, I've almost stopped using Excel for statistics. I can't trust it, for one thing - it makes undocumented assumptions sometimes. For another, it doesn't handle censored data very well, if at all. The two programs are complementary - I often send data back and forth between them for certain tasks.

I'm guessing one of the next steps will be correlating X with Y instead of the P/F category. If Y is a continuous variable that isn't measured beyond some P/F criterion, then it's amenable to some of the censored-data techniques in Minitab. I've been doing similar things with water-quality data, which are usually left-censored (minimum values constrained by detection limits). There are some Minitab macros written by Dennis Helsel of the USGS that extend common statistical techniques to censored data sets (boxplots, probability plots, Kendall's tau, Kruskal-Wallis, etc). Massively useful - let me know if you are interested and I'll point you to them.
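
To give a flavor of what "extending Kendall's tau to censored data" can mean, here is a small Python sketch. It is not the Helsel macros and makes no claim to match them; it only illustrates one common convention, which is to score a pair as a tie whenever the censoring makes its order impossible to determine. The data values and the names `compare_censored` and `kendall_s` are hypothetical. Each y value is a pair `(number, censored)`, where `censored=True` means "less than this detection limit".

```python
def compare_censored(a, b):
    """Return 1 if a > b, -1 if a < b, 0 if equal or indeterminate.

    Each argument is (value, censored); censored=True means the true
    value is only known to be below `value` (left-censored).
    """
    (va, ca), (vb, cb) = a, b
    if not ca and not cb:
        return (va > vb) - (va < vb)
    if ca and not cb:
        # a lies somewhere below va, so a < b is certain only if va <= vb.
        return -1 if va <= vb else 0
    if cb and not ca:
        return 1 if vb <= va else 0
    return 0  # both censored: the order cannot be determined


def kendall_s(xs, ys):
    """Kendall's S statistic (concordant minus discordant pairs), where
    xs are plain numbers and ys are (value, censored) pairs."""
    s = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            sign_x = (xs[j] > xs[i]) - (xs[j] < xs[i])
            sign_y = compare_censored(ys[j], ys[i])
            s += sign_x * sign_y
    return s


if __name__ == "__main__":
    # Hypothetical example: two results below a detection limit of 1.0.
    xs = [1, 2, 3, 4, 5]
    ys = [(1.0, True), (1.0, True), (1.5, False), (2.0, False), (3.0, False)]
    s = kendall_s(xs, ys)
    n_pairs = len(xs) * (len(xs) - 1) // 2
    print("S =", s, "out of", n_pairs, "pairs")
```

Dividing S by the number of pairs gives a tau-a style value; a proper tau-b would also correct the denominator for ties, which this sketch leaves out.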
