So, our warehouse server is sick. VERY complicated story short…the warehouse database got pooched. Like, went red and was inaccessible for TWO WEEKS pooched.
That was over a month ago. And we’re still not back. We’re limping along on partial processors, blah blah blah. Everything is taking, like, four to five times longer to run than it used to…which, when you’re talking about things that “normally” take anywhere from five to fifteen hours to run…is a big deal.
WHICH LEADS ME TO MY ACTUAL STORY. (Which is rather boring, actually, if you’re not…um…me…)
I’m doing some load jobs this weekend, getting some of the stuff that has been ‘on hold’ since this all erupted into a fiery ball of death last month finally moving again.
One of those things is an allocation process I wrote. It takes a certain dollar figure that in all reality has no direct relationship to sales, and “allocates” those dollars out to the appropriate customer sales according to the weight of said sales. In other words: if Record 1 has $90 and Record 2 has $10, and the amount to allocate is $200, then Record 1 gets 90% of the $200, or $180, and Record 2 gets the remaining $20.
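(The actual process is way more involved than this, but for the curious: the weighted-allocation idea itself is just a windowed sum. Something like the following sketch — table, column, and variable names here are made up for illustration, not the real ones:)

```sql
-- Hypothetical names: dbo.CustomerSales(sale_id, sale_amount),
-- and @pool is the dollar figure being allocated out.
DECLARE @pool money = 200.00;

SELECT
    s.sale_id,
    s.sale_amount,
    -- each row's share of the pool = its weight within total sales
    @pool * s.sale_amount / SUM(s.sale_amount) OVER () AS allocated_amount
FROM dbo.CustomerSales AS s;
-- With two rows of $90 and $10, this yields $180 and $20.
```

(In the real thing, the `SUM(...) OVER ()` would presumably be partitioned by customer/supplier group rather than run over the whole table — which matters when the whole table is 1.7 billion rows.)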
Only, it’s picking apart over 1.7 billion sales records, couple hundred thousand customers, thousands of suppliers, etc. etc. etc.
It took a lot of fine-tuning to get it working fast enough to be actually usable. And I was rather pleased with it, really, because yeah, I’m THAT level of bad-ass, thank you for noticing.
And then, the box broke. “Tables” became “unioned views across multiple tables on different boxes(!!!),” the bandwidth has the server equivalent of strep-frickin-throat, we have a fraction of the tempdb space, blah blah blah a bunch of stuff that only fellow developers/DBAs are going, “Ohmygah, you poor THINGS!” over right now.
I was sitting there last night glumly watching records that used to process in, eh, six to twelve seconds max…now taking…wow, really, 380 seconds…? And I’ve got 11,000 of these things to go?
…that’s 48 days before this will be done…and that’s tooooooo loooooong…
So I did one of those “stare at the ceiling thinking” things for a while. Debating with myself about pros and cons. Risks and rewards.
Then I changed the part that grabs the sales data so that the destination table is uncompressed before loading, then recompressed to page level afterward. Data and indexes both. And then the allocation process runs against it.
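(The uncompress-load-recompress dance is just a couple of rebuilds, if you were wondering. Roughly like this — staging-table name invented for the example:)

```sql
-- Drop compression so the load writes plain, fast pages
ALTER TABLE dbo.SalesStage REBUILD WITH (DATA_COMPRESSION = NONE);

-- ...bulk-load the sales data here...

-- Then squeeze it back down to page level: data AND indexes
ALTER TABLE dbo.SalesStage REBUILD WITH (DATA_COMPRESSION = PAGE);
ALTER INDEX ALL ON dbo.SalesStage REBUILD WITH (DATA_COMPRESSION = PAGE);
```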
I wasn’t sure it would help, but, I thought it just might. And it did. I’m clicking at 4-10 seconds per item again. Whew.
The reason I wasn’t sure it would work is because this is actually a trade-off proposition. (And this is where it gets really boring and confusing, so, feel free to drop off…but I just hafta share for the, like, two people in the whole world who might find it interesting.) (Plus, I’m just jazzed. Because. IT WORKED. Woot!!!)
So, keeping it way simplistic and not at all tech-textbook (read: for more technically-accurate and complete information on This Kind Of Stuff, I’d recommend hitting up the MSDN books online) …the way a database stores things is kind of like this: Imagine the biggest, ugliest textbook you ever had to haul around with you in school. The words are like the fields in your table. The sentences are the rows in the database. The pages are, well, the pages. The chapters are ‘extents’ – a collection of pages.
Now. Each page can only hold so many words. And in the database world, each chapter has a limit too – an extent holds a fixed number of pages (eight, in SQL Server) – so at some point, things will spill over into new chapters for lack of room.
Whenever you’re running a query, the server will find the information you want by drilling through all this information. It will look at what you’re asking for, look at the server-equivalent of the table of contents for the extent and the back-of-book index for the page, and then it will begin to scan through the page looking for the specific information you’re after. (Table-level indexes would therefore be…um…hmm…like the bold words on the page? I guess?)
Just like a textbook, if you’ve got good indexes that pinpoint information as precisely as possible, you can find things faster.
So. When you compress a table or partition to page level, you’re actually doing two things: First, the rows get compressed (more words per page).
Then, the pages themselves get compressed – repeated values and common prefixes get stored once per page instead of over and over – so each page holds even more, and the whole table needs fewer pages (and fewer chapters) overall.
NOW. Here’s where the tradeoff is: When you run a query against a compressed table, you will frequently find that it burns more CPU. The data has to be uncompressed before it can be used, and that takes a bit of server-effort.
HOWEVER. The finding of the data itself takes less input-output (I/O) and happens faster. Instead of having to read thirty pages to get everything it needs to return, maybe it only has to read five.
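(And if you’d rather preview which way the tradeoff will go before committing to a giant rebuild, SQL Server ships a stored procedure that estimates the space savings from a sample. Table name made up here, same as before:)

```sql
-- Estimates compressed vs. current size from a sample of the table
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'SalesStage',
    @index_id         = NULL,   -- NULL = all indexes
    @partition_number = NULL,   -- NULL = all partitions
    @data_compression = 'PAGE';
```

(It only tells you about space, mind you – whether the CPU-versus-I/O tradeoff pays off for *your* queries is still a “try it and see.”)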
This time, it looks like the tradeoff is in my favor.
I think this kind of thing is what makes me love what I do so much; what makes me sometimes sit back and chuckle about being paid to romp around figuring out how to shave a few seconds here or there off a process, or get data that is stubbornly refusing to budge to zip across to its new home.
There’s always more to learn.
There’s always more that I don’t know yet.
There’s always a better, faster, more efficient, less painful way to do whatever-it-is.
There’s science. There’s art. It’s frustrating, infuriating, confusing, complex, simple. Things that take only five minutes to fix take five days to find.
The answer to the question “will this help this process go faster” is just about always “it depends…hmmmm…probably…let’s try it and see!”
And for me, all the frustration, irritation, and hours-spent-scouring-through-white-papers-trying-to-connect-the-dots is well worth it for the moment when I can sit back, smirk smugly to myself and mutter, “That’s right, baby. Whose kung fu is best? Mine is.”
Keeps me coming back for more, even when everything else in the average working day is enough to make me scream…