BPuhl’s Blog

A little bit of everything without actually being much of anything

Archive for January, 2010

Forget your password?

Posted by BPuhl on January 20, 2010

Read an interesting article at http://redtape.msnbc.com/2008/08/almost-everyone.html on the issues/weaknesses of password recovery schemes. 

Most everyone remembers when Sarah Palin’s Yahoo mail account was hacked, because her password recovery questions were easily discoverable.  One thing that I thought was interesting in the article, though, was the idea of a “black market” for personal information – let me go buy a profile to find out the name of your dog, your favorite restaurant, etc…  How would people come up with this information in the first place?  Are there secret spies in black trench coats following everyone around, taking notes on everything they do?  I have no idea…

…in other seemingly unrelated news – has anyone else taken all those funny Facebook quizzes where you answer questions about yourself, and they tell you how long you’re going to live, what your zodiac sign means, and things about your shopping habits and sexuality that you never even realized you knew?  There’s got to be a thousand of those things out there…  I’m sure glad that Facebook is much safer than the dangerous “internet”!

hmmmmm……

Posted in cloud, Digital Identity, Identity and Access, Random Tecnical Stuff, Randomness | 1 Comment »

Posted by BPuhl on January 19, 2010

“If you were any more short-sighted, you’d be looking into the past!”
-Unknown

Posted in Quotes | 2 Comments »

Invitations…

Posted by BPuhl on January 8, 2010

More fun in the federated cloud world.  Traditionally, with EASI IDs, “Email As Sign In” meant that your user name was your email address.  However, with federated IDs, we’re sending a user’s UPN as their login ID, which may or may not map to a valid email address (in many cases, it doesn’t).

So what do you do, then, if you have an application where a user can invite another person to access a resource?  This is pretty common: I want to share a file on my SkyDrive, so I allow foo@foobar.com access to the file.  That also triggers a mail to foo@foobar.com inviting them to sign up for a new Live ID if they don’t have one already; if they do have one, they can log in with it and access the file.

Unfortunately, the person ACL’ing the file now knows the user’s email address, but NOT their login name.

The answer will likely be some form of “click here” key in the invitation which will allow the application to associate an email address with an ID, but because this hasn’t been required in the past, it’s going to take some time for applications to adjust.
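As a rough illustration of that “click here” idea, here is a minimal Python sketch.  Everything in it is an assumption made up for illustration (the in-memory dictionaries, the function names, the single-use token); it is not how Live ID or any real application does it, just one way the email-to-login association could be captured at redemption time.

```python
import secrets

# Hypothetical in-memory stores; a real application would persist these.
pending_invites = {}  # token -> (invited email address, resource)
email_to_login = {}   # invited email address -> login ID (e.g. a UPN)
acl = {}              # resource -> set of login IDs allowed to access it


def create_invitation(email, resource):
    """Create a single-use key to embed in the invitation mail's link."""
    token = secrets.token_urlsafe(32)
    pending_invites[token] = (email, resource)
    return token


def redeem_invitation(token, login_id):
    """Called after the invitee clicks the link and signs in.

    Associates the invited email address with whatever login ID the user
    actually authenticated with, then grants that ID access.
    Raises KeyError if the token is unknown or was already used.
    """
    email, resource = pending_invites.pop(token)
    email_to_login[email] = login_id
    acl.setdefault(resource, set()).add(login_id)
    return resource
```

The point of the sketch is that the sharer only ever needs the email address; the login name is learned when the key comes back.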

Posted in ADFS, cloud | 1 Comment »

Personal Limits

Posted by BPuhl on January 6, 2010

We all have limits, we usually just don’t think about them as such.  Or when we do think about them, it’s usually because somebody else is pushing them and it’s making us mad.  But this is more about personal limits, especially with respect to the way that we run our servers in MSIT.  (Important to note that MSIT is probably unlike any other environment out there, so your limits may be different, but the idea is probably the same)

On Sunday, Rey Diaz included some of the following thoughts in a conversation about Wisdom versus “right”:

Do you try to come as close to breaking the law as you can, without breaking it?
Do you try to push your morals as far as allowed, without being immoral?
Do you try to move as close to disaster as you can, without actually feeling the consequences of disaster?

This evening, I was reading one of my flying magazines, and Rod Machado had an article where he talks about one of my favorite factoids in aviation:

On page 6-26 of the FAA’s Pilot’s Handbook of Aeronautical Knowledge, we find that, “Aircraft certification rules require accuracy in fuel gauges only when they read empty.”

In fact, if you look up the actual Federal Air Regulation, it says:

FAR 23.1337(b)(1): Each fuel quantity indicator must be calibrated to read “zero” during level flight when the quantity of fuel remaining in the tank is equal to the unusable fuel supply determined under part 23.959(a).

Is there anyone around that actually wants to be flying in an airplane when its fuel gauges read the only accurate, calibrated measurement that they have to?  I like flying gliders, but not like that!

Ok – but this is a geek blog – so what does this have to do with anything remotely interesting to you?

Well, in MSIT, the question often comes up about what our DC performance looks like.  Sure, you can go graph hundreds of counters and things, but see my earlier post about situational awareness and then tell me how easy it is to keep yourself aware of all that flaff.

Instead, we’ve got some personal limits when it comes to our DC performance that have worked pretty well for us over the past few years.  They are:

20-40% Target sustained CPU utilization
40-50% CPU utilization and we start checking for unusual causes of load, but if this is just normal trend growth then we either bring it back down with hardware replacement, or additional servers
> 50% CPU utilization – evaluate the trends, this is indicative of a potential problem – may need to start budgeting process for new servers
> 60% CPU utilization – we consider ourselves “broken”, and we go into break/fix mode to either reduce load or increase capacity
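If you wanted to bake limits like these into a script, a minimal Python sketch might look like the following.  The function name and the labels are made up for illustration, and the “below target” case is my assumption, since the list above doesn’t say anything about utilization under 20%.

```python
def classify_cpu(sustained_pct):
    """Map a sustained CPU utilization percentage to one of the limits above."""
    if sustained_pct > 60:
        return "broken"        # break/fix mode: reduce load or add capacity
    if sustained_pct > 50:
        return "evaluate"      # look at trends, maybe start budgeting
    if sustained_pct >= 40:
        return "investigate"   # check for unusual causes of load
    if sustained_pct >= 20:
        return "target"        # where we want sustained utilization to be
    return "below target"      # assumption: not covered by the list above
```

For example, `classify_cpu(45)` lands in the “investigate” band and `classify_cpu(61)` is “broken”.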

Of course, nobody’s saying that AD is broken at 60% CPU; these are just our personal limits.  After all, if I wanted to wait for AD to break before I did anything, I might as well spend my free time polishing up my resume.  The idea, of course, is that you want to decide when it’s broken for you, and think about what you’re going to do ahead of time – this way, you’re not thrashing around, and management isn’t surprised when you have an out of band budget request.

A few other tidbits while I’m thinking about this. 

- These numbers are purely based on our experience in our environment.  We know that when we run over 75-80% CPU, we’re running very hot and some sensitive applications can be impacted by latency.  We also know that our standard operating procedure is to have 3-5 DC’s offline at any given time.  We have to account for the fact that when servers are offline (dogfooding, debugging, etc…) we need the headroom for the load on the other boxes.   

- We consider “sustained utilization” to be averages over 15 minutes, across all DC’s, but we’re also applying the human element to the data.  We’re looking for the trends, not the spikes…we know that spikes happen…

-  In a perfect world, you’d at least know where the load came from.  More often than not, there isn’t a single smoking gun, it’s just increased utilization as other systems in the environment are leveraging AD.  At the moment, I can only think of one time when we could trace the increased utilization back to a single project, and that was our IPSEC deployment – of course, we couldn’t roll that back anyway, we still had to increase capacity – so it’s not like it really mattered, but it was nice to know what caused a 20%+ jump across the board.
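To put some arithmetic behind the “sustained utilization” and headroom points above, here is a small Python sketch.  It assumes one CPU sample per minute per DC, and it assumes that load from offline DC’s spreads evenly over the remaining ones and that CPU scales linearly with load; both are simplifications, and the function names are made up for illustration.

```python
from statistics import mean


def sustained_utilization(samples_by_dc, window=15):
    """Average each DC's last `window` samples, then average across all DC's."""
    return mean(mean(samples[-window:]) for samples in samples_by_dc.values())


def utilization_with_offline(avg_pct, total_dcs, offline_dcs):
    """Estimate per-DC CPU when the same total load runs on fewer DC's.

    Assumes load redistributes evenly and CPU scales linearly with load.
    """
    remaining = total_dcs - offline_dcs
    if remaining <= 0:
        raise ValueError("at least one DC must stay online")
    return avg_pct * total_dcs / remaining
```

With hypothetical numbers: 20 DC’s averaging 40% with 4 of them offline works out to roughly 50% on the 16 that remain, which is why the target band sits well below the point where things actually run hot.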

Personal limits, a good thing to bring to work with you.

Posted in Random Tecnical Stuff | Leave a Comment »

Passive Safety & Situational Awareness

Posted by BPuhl on January 5, 2010

I find that there are a lot of concepts which I bring to my job in MSIT, from my hobby as a private pilot.  In this case, I am “borrowing” the title for this post from an article by Bruce Landsberg in one of the magazines I subscribe to.  He starts:

… Compared to machines, the homo sapiens’ conceit of being masters of the universe shows us to be consistently unreliable when it comes to repetitive tasks.  We do excel, however, in thinking up ways to get out of mindless chores to refocus our short attention spans on really important stuff…

This hit me, because earlier today I was talking to some of our engineers about our “team server” – which is the box that we use to run all of our recurring scripts from, collect data to, store utilities/tools/scripts on, and generally dump stuff.  Appropriately named Dumpster (does that make us dumpster divers?  probably).  We run a lot of scripts to collect a lot of data for our own use.  Although we’ve got the full blown monitoring infrastructure in place and we own all the settings for alerting, etc…  SCOM is owned by one team, the alerts go to our 24×7 operations center (who resolves the bulk of them), etc…  So if the administrators are abstracted from most of the chaff, how do they maintain situational awareness?

Situational awareness, another one of those terms I picked up flying.  Basically, the understanding of what’s going on around you.  Easily demonstrated with the following question to your administrator:  “How’s AD doing today?” – by default, the answer will be, “AD’s running great” (that is their job after all…) – the follow-up question though, “How do you know?” is usually the zinger.  If the answer is, because nobody from Help Desk is screaming at us, then that’s probably not a good sign.  If the answer is, because there are no trouble tickets, that’s probably also not a good sign… 

Lack of bad doesn’t necessarily equal good.

When I have a chance to talk to the MS Directory Masters classes, I usually try to work in the following story:

In 2002, I was one of a small group of AD administrators for MSIT.  We were knee-deep in dogfooding Whistler, which shipped as Windows Server 2003, when one day my GM walked by, stuck his head in the door (never a good sign), and asked, “How’s AD doing today?”  The default response at the time was something like, “Looks good, couple of DC’s being upgraded, so far so good… why do you ask?”  It’s at this point that he said, “Because I just got a call saying that our Extranet is offline, nobody can authenticate to any applications, our partners aren’t able to do business with us, and I was wondering what you were doing about it?”  If I remember correctly, it was about that time that he looked a little worried about his hiring decision, turned and walked away…

Quickly logging (or trying to log) onto the domain controllers, we found all 6 of the DC’s running at 100% CPU utilization.  Perfmon, SPA traces, expensive/inefficient query logging – nada/zero/zip – we were in trouble.

Within a couple of hours, we’re all in a big room – techies around the table, managers looking over our shoulders, and we had the AD rock stars from the product group (the developers) sitting in the room, taking apart the DC’s in the debuggers.  They were all shaking their heads, when someone mumbled under their breath, “this almost looks like normal load…just a lot of it”

That’s when we decided to pull some perf data for the past 6 months, which looked something like this:

[Figure: hand-drawn MSPAINT graph of DC perf over time]

Sure enough, we had been growing load for the past year or so, until all the DC’s were running at 100% CPU.  We stole 4 servers which were racked & built for some other application and DCPromo’d them, and CPU utilization dropped back down to a reasonable level…

oops…

As you can see from my MSPAINT representation – WE ACTUALLY HAD THE PERF DATA!  The problem was, that we had lost situational awareness of what was going on in our other environments, because we were so focused on dogfooding.

The moral of the story then, being that it’s good to HAVE data, but it’s much better to LOOK AT the data occasionally…
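“Looking at the data” can be as simple as fitting a straight line through the monthly averages and asking when it crosses a limit you care about.  Here is a minimal Python sketch of that idea; the function name, the 60% default limit, and the sample numbers in the usage note are all made up for illustration, and a straight-line fit is only an assumption about how load grows.

```python
def months_until_limit(monthly_cpu_pct, limit_pct=60.0):
    """Project when a least-squares trend line reaches `limit_pct`.

    `monthly_cpu_pct` is a list of average CPU percentages, oldest first.
    Returns the number of months from the latest sample, 0.0 if the trend
    line is already at or past the limit, or None if the trend is flat or
    falling.
    """
    n = len(monthly_cpu_pct)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(monthly_cpu_pct) / n
    # Ordinary least-squares slope and intercept, with x = month index.
    slope = sum(
        (x - x_mean) * (y - y_mean) for x, y in enumerate(monthly_cpu_pct)
    ) / sum((x - x_mean) ** 2 for x in range(n))
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    fitted_now = intercept + slope * (n - 1)
    return max(0.0, (limit_pct - fitted_now) / slope)
```

With six hypothetical months of `[30, 35, 40, 45, 50, 55]`, the line climbs 5 points a month and reaches 60% one month out, which is the kind of answer you’d like to have before the GM sticks his head in the door.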

Posted in Active Directory, Random Tecnical Stuff | 1 Comment »

Bad Combination…

Posted by BPuhl on January 5, 2010

(Non-technical rant in progress…)

Interesting trend happening around Redmond lately.  Over the past few months, there have been 5 different traffic circles built: 3 feeding into one another in Woodinville, and 2 on East Lake Sammamish drive in Sammamish.  Both cities border Redmond, and both happen to be roads that I drive frequently.  In fact, I have to go through the 3 circles in Woodinville each morning after dropping my daughter off at school.

So what.  Isn’t it a good thing when a city takes out half a dozen consecutive stop lights, and replaces them with a slow but smooth flowing traffic circle?  Well, if this were in California where I grew up and learned to drive, sure, it would be great.  But this is Washington, and if you’ve ever had the chance to hop on a freeway in or around Seattle, then you’ve probably noticed:  The stop & merge.

Yeah, it seems that drivers in Washington don’t actually know how to merge with traffic, instead, they stop…sit…wait for the orderly flow to slow to a crawl and somebody to wave/honk at them, and then they gun it to try to catch up to speed… 

So kudos to the traffic engineers (who I suspect live out of state) for picking a control device which allows for a nice, orderly flow of traffic through these intersections.  It’s too bad they have to get screwed up by the stop and merge.

Oh yeah – and to the beige Toyota Highlander on East Lake Samm this afternoon – Although the sign is red & white – it’s an upside down triangle that says YIELD, it’s not an octagon that says STOP – there is a difference!

Posted in Randomness, Rants | 2 Comments »

Happy New Year!

Posted by BPuhl on January 1, 2010

Happy New Year everybody.  Let’s make 2010 a year of peace and prosperity, where “us & we” takes precedence over “me, myself, & I”

Posted in Nuggets | Leave a Comment »

 