Wednesday, November 26, 2014

Active Directory lockouts with Citrix Receiver on HP ThinPro

Workaround found - see very bottom for how!

While deploying new thin clients I encountered an issue where a single "bad password" would cause the account to become locked out.  That shouldn't occur since the domain is set to lock out after 3 failed attempts.

Background:
Windows 2012 Active Directory
Citrix XenApp 6.5
Citrix Web Interface 5.4 (in this case hitting a services site aka PNAgent)
HP T520 Thin Client with ThinPro 5.1.0 build 07
The Citrix client installed was Receiver / icaclient 13.0.3
Active Directory set to lockout after 3 attempts

Issue:
A user attempts to log in from the thin client, and even a single mistyped password causes the user's account to be locked out, ignoring the AD three-attempt policy.

Details:
The thin clients are replacing various flavors of thin clients (Wyse C30LE, HP t5530, HP t5540) all running Windows CE 5.0 and 6.0. 

My understanding is that these older clients use the old style Program Neighborhood which enumerates the applications via XML.  Typically out of the box hitting the dns record "ica" which was normally setup with round robin dns to multiple XenApp / Presentation Servers.  This gave a list of possible applications to the end user prior to authentication. 

The newer style thin clients based on Citrix receiver are a little different in that they authenticate at the thin client through Web Interface or Storefront and then present the application list to the user.

The problem with the new Receiver method is that the user can authenticate, enumerating the apps available, and then walk away.  Then another user can walk up, launch the desktop or app they want, and they just got access as the wrong user.  The old style prevented this because the user would launch the app and then have to authenticate.  The auto launch feature that some of these new thin clients have helps with this, alongside the "logout on last application close" options that many of the good ones are including.

HP took this a step farther and made it so that you could have multiple "connection profiles" in their connection manager!  So now we can make receiver profiles for various apps / desktops with their respective auto launch options we want based on the target user.  So, user walks up, clicks the familiar app / desktop they want and it prompts for credentials, they enter them and their desktop starts launching.  When they are done they logout and it automatically logs them out of the thin client once the app closes.  It mimics the old style, no need to train 50 - 65 yr old users how to do it differently! WIN

Problem:
Issue #1
The issue is that when the "auto start resource" field is populated in ThinPro 5.1.0, it will attempt to auto start the resource regardless of an authentication failure.  This results in 3 consecutive login attempts with the bad password and, depending on the domain lockout threshold, causes a lockout. 


It looks to me based on the thin clients logs that the following is occurring. 
  1. Attempts to use credentials - strike one
  2. Attempts to auto launch resource even though credentials failed - strike two
  3. Attempts to auto launch resource a second time! - strike three you're locked out
Connection starting
2014-12-09 09:12:33.256109510: XEN_WRAPPER: Starting xen_wrapper
2014-12-09 09:12:33.259483293: XEN_WRAPPER: Setting global vars
2014-12-09 09:12:33.390955220: XEN_WRAPPER: --UUID: {23285ceb-40f5-45f2-a09b-022148aa6608}
2014-12-09 09:12:33.394073686: XEN_WRAPPER: --ADDRESS: http://pna/Citrix/PNAgentTest/config.xml
2014-12-09 09:12:33.397168672: XEN_WRAPPER: --AUOSTARTRESOURCE: Desktop
2014-12-09 09:12:33.400411749: XEN_WRAPPER: --FORCE_HTTPS: 0
2014-12-09 09:12:33.403555042: XEN_WRAPPER: Finished setting global vars
2014-12-09 09:12:33.418721583: XEN_WRAPPER: Current XEN_CONN_METHOD: pnagent
2014-12-09 09:12:33.422322255: XEN_WRAPPER: Xen_wrapper_lock started
2014-12-09 09:12:33.433700476: XEN_WRAPPER: Xen_wrapper_lock finished (lock obtained)
2014-12-09 09:12:33.437209663: XEN_WRAPPER: startConnection started
2014-12-09 09:12:33.440670478: XEN_WRAPPER: clearOldData started
2014-12-09 09:12:33.565426467: XEN_WRAPPER: clearOldData ended
2014-12-09 09:12:33.568579181: XEN_WRAPPER: verifyPrereqs started
2014-12-09 09:12:33.584631374: XEN_WRAPPER: Skipping server connectivity check
2014-12-09 09:12:33.588346773: XEN_WRAPPER: Getting credentials
2014-12-09 09:12:33.604419882: XEN_WRAPPER: Attempting to use credentials from SSO manager
2014-12-09 09:12:33.685276288: XEN_WRAPPER: Saving the credentials
2014-12-09 09:12:33.737677029: XEN_WRAPPER: Finished saving credentials
2014-12-09 09:12:33.741336400: XEN_WRAPPER: Finished getting credentials
2014-12-09 09:12:33.747522104: XEN_WRAPPER: verifyPrereqs finished
2014-12-09 09:12:33.750782250: CONFIGURATION: setting up config files
WARNING /etc/templates/xen/appsrv.in/64: Could not find regkey root/ConnectionType/xen/general/type
WARNING /etc/templates/xen/appsrv.in/66: Could not find regkey root/ConnectionType/xen/general/application
WARNING /etc/templates/xen/appsrv.in/69: Could not find regkey root/ConnectionType/xen/general/directory
lpstat: No destinations added.
lpstat: No destinations added.
lpstat: No destinations added.
lpstat: No destinations added.
2014-12-09 09:12:34.200041676: CONFIGURATION: finished setting up config files
2014-12-09 09:12:34.203440412: SETUPUSBR: Setting up USBR
2014-12-09 09:12:34.504068780: SETUPUSBR: Finished setting up USBR
2014-12-09 09:12:34.508365874: CONNECTIVITY: Autolaunchresource started
2014-12-09 09:12:34.568366574: XEN_WRAPPER: Calling: hptc-citrix-connect -g 'CitrixReceiver Linux HP ThinPro' -f /tmp/citrix/{23285ceb-40f5-45f2-a09b-022148aa6608} -c /tmp/{23285ceb-40f5-45f2-a09b-022148aa6608}.credentials '-L' 'Desktop' '-a' 'pnagent' 'http://pna/Citrix/PNAgentTest/config.xml'
/etc/xen/helperscripts//xen_err: line 98: 19959 Terminated              nice xmsg -pixmap /usr/share/icons/hptc-icons/48x48/hourglass.png -message "$msg" -caption "$caption" > /dev/null 2>&1
2014-12-09 09:12:36.918657058: XEN_WRAPPER: Processing Citrix connect error output in file /tmp/citrix/{23285ceb-40f5-45f2-a09b-022148aa6608}/error.log
2014-12-09 09:12:36.922922526: XEN_WRAPPER: Error info: Exit Code 2 ERR_CRE_BAD_CREDENTIALS ERR_INFO_URL: http://pna/Citrix/PNAgentTest/launch.aspx ERR_INFO_HTTP_CODE_ERROR: 500 ERR_INFO_DP_ERROR_ID: CharlotteErrorBadCredentials (V1.0.3-26636-19972-C.138-C.351-L.166-M.611)
2014-12-09 09:12:36.929542118: PNAGENT CONNECTION: pnagent launchapp function ended
2014-12-09 09:12:36.933493717: CONNECTIVITY: Failed to autolaunch resource: Desktop
2014-12-09 09:12:36.936785230: CONNECTIVITY: We will try again later after obtaining the full resource list
2014-12-09 09:12:36.940211790: CONNECTIVITY: Autolaunchresource finished
2014-12-09 09:12:36.943772291: CONNECTIVITY: Getresourcelist started
2014-12-09 09:12:36.947373640: PNAGENT CONNECTION: PNAgent list function started
2014-12-09 09:12:36.960696717: XEN_WRAPPER: Calling: hptc-citrix-connect -g 'CitrixReceiver Linux HP ThinPro' -f /tmp/citrix/{23285ceb-40f5-45f2-a09b-022148aa6608} -c /tmp/{23285ceb-40f5-45f2-a09b-022148aa6608}.credentials '-E' '-a' 'pnagent' '-i48x32' 'http://pna/Citrix/PNAgentTest/config.xml'
2014-12-09 09:12:37.140691247: XEN_WRAPPER: Processing Citrix connect error output in file /tmp/citrix/{23285ceb-40f5-45f2-a09b-022148aa6608}/error.log
2014-12-09 09:12:37.144597435: XEN_WRAPPER: Error info: Exit Code 2 ERR_CRE_BAD_CREDENTIALS ERR_INFO_URL: http://pna/Citrix/PNAgentTest/enum.aspx ERR_INFO_HTTP_CODE_ERROR: 500 ERR_INFO_DP_ERROR_ID: CharlotteErrorBadCredentials (V1.0.3-26636-20050-C.138-C.351-E.425-M.607)
2014-12-09 09:12:38.685672448: XEN_WRAPPER: Xen_wrapper_unlock started
2014-12-09 09:12:38.695702911: XEN_WRAPPER: Xen_wrapper_unlock finished
Connection stopped
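To see how many requests a single typed password generates, the log can be scanned for the bad-credentials marker. Below is a minimal Python sketch, assuming the log format shown in the excerpt above; the function name and the abbreviated sample lines are mine, not part of ThinPro. Each hit is presumably one failed logon that AD counts toward the lockout threshold.

```python
import re

# Marker reported when Web Interface rejects the password (taken from the excerpt above).
BAD_CREDENTIALS = "ERR_CRE_BAD_CREDENTIALS"

# Pulls the requested page (launch.aspx, enum.aspx, ...) out of an "Error info" line.
ERR_URL = re.compile(r"ERR_INFO_URL:\s*(\S+)")


def failed_auth_requests(log_text):
    """Return the URL of every request in the log that failed with bad credentials."""
    urls = []
    for line in log_text.splitlines():
        if BAD_CREDENTIALS not in line:
            continue
        match = ERR_URL.search(line)
        urls.append(match.group(1) if match else "(unknown URL)")
    return urls


# Lines abbreviated from the excerpt above (trailing error IDs dropped).
sample_log = """\
2014-12-09 09:12:36.922922526: XEN_WRAPPER: Error info: Exit Code 2 ERR_CRE_BAD_CREDENTIALS ERR_INFO_URL: http://pna/Citrix/PNAgentTest/launch.aspx ERR_INFO_HTTP_CODE_ERROR: 500
2014-12-09 09:12:36.933493717: CONNECTIVITY: Failed to autolaunch resource: Desktop
2014-12-09 09:12:37.144597435: XEN_WRAPPER: Error info: Exit Code 2 ERR_CRE_BAD_CREDENTIALS ERR_INFO_URL: http://pna/Citrix/PNAgentTest/enum.aspx ERR_INFO_HTTP_CODE_ERROR: 500
"""

for url in failed_auth_requests(sample_log):
    print("bad-credential request:", url)
```

On this excerpt it reports the launch.aspx request (the auto launch) and the enum.aspx request (the resource list), both sent with the same bad password.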


Issue #2
Regardless of whether "auto start single application" checkbox is marked or not it will attempt to auto start the resource.  According to HP support, you should have to populate the "auto start resource" AND check mark the "Auto start single application".  In the below image attempting to launch the connection will auto launch Desktop even though the box is not checked.

I see this as a "so what" issue since you can simply blank the resource field to fix it.





















HP Support: 
Working with HP support has been... challenging.  This is my typical experience with HP support.  In fact, some 4 or 5 years ago we had been deploying HP t5530 / 5540 units and we had a horrid, no good, very bad experience which led us to start buying Wyse C30LE's instead.  Currently there is an open ticket, and after much back and forth to sort out what the issue really is, we finally have a call in a few days to talk directly with what they call "3ls" techs regarding the issue.

Update: 12/11/2014
Our call today went extremely well!  The 3ls techs looked at the issue and acknowledged that this is not intended and is not correct functionality.  They are reviewing the Receiver launching scripts and debugging.  These techs were wonderful to work with.

In addition, it helped that I had just received more of these units in the mail yesterday with ThinPro 5.0.0 build 34 installed (Receiver 13.0.1) and they do NOT have this issue.

So, hopefully we should see a fix for this very soon.

Update: 12/15/2014
I received a potential fix from our tech.  After replacing one of the xen scripts it now does 2 login attempts on a failed password.  So, closer, but still a little ways to go.  I've let the tech know, but have not gotten a response back yet.  I'm very impressed with the amount of time it took HP 3ls from our call until getting a potential fix back to me, only 4 days (2 working days)!




Update: 1/27/2015
As much as I wish I could say that my experience with HP support only went uphill since the 12/11 remote meeting I can't.  After that meeting the communication with HP Level 1 support was no good, horrible, very bad.  In that communication they said that due to the way Citrix enumerates they believed this would be a difficult issue to resolve.  They then said that I could work around the issue by enabling Auto reconnect under the Xen Connection General Settings Manager.  I enabled the setting there and still no go.

I of course responded regarding the poor communication and slow speed on the issue and copied our reseller stating we would be looking at our "options".

Never have I had such a quick and excellent response to a "nasty gram" that I've sent.  Next day we now have a workaround to the issue.  3LS got in touch with me directly and let me just say that working with them (twice now) has been a treat.  I just wish I could get straight to them faster / easier as this issue probably would have been resolved within the week.  They are the support that I would expect from HP.  As for level 1, communication skills are severely lacking.  Half the responses I couldn't even translate into something that made sense!

It turns out that the "enable auto reconnect" was actually a pretty close call, but level 1 communicated the incorrect location to enable this setting to me.  There are actually 2 locations where this is set.  The one that resolves the multiple lockout issue is located under each individual connection's settings!  Just ensure that "Auto reconnect applications on logon" is enabled for each connection and bingo, all is well again.

Friday, November 21, 2014

Domain Controller High CPU - Service Host

In an earlier post I talked about how XenApp 6.5 sessions would start and then disappear.  In the end I had determined this was due to our Domain Controllers having their CPU's pegged out, at least partially due to insufficient RAM.


http://didyourestart.blogspot.com/2014/09/xenapp-65-session-starts-then-disappears.html


Doing this absolutely solved the XenApp issue, but the DC's continued to have high CPU usage.  Basically the pattern was that CPU would sit at 50% for 10 - 15 seconds, then drop to 2%, then back to 50% and the pattern continued. 


Under processes you could see that the issue was with the Service Host: Local Service which wraps TCP/IP NetBIOS Helper, Windows Event Log, and DHCP Client.  Jump over to the Performance tab and click Open Resource Monitor and click the CPU tab. 


Here we see three processes using high CPU in my case:
  • svchost.exe
  • WmiPrvSE.exe
  • perfmon.exe
Under Services the primary eye catcher listed:
  • EventLog
So really two things caught my eye here.  The WMIPrvSE.exe (perhaps some WMI monitor?) and EventLog.  My first suspicion was WMI so I turned off several monitoring applications we have with no effect. 


Next I looked at the EventLog clue.  This led me to two posts online which nailed it.


Jump into the Eventvwr and look at security log and sure enough it's full.  Clear events and instantly the issue resolves...  Jump over to the other DC with same issue and clear security log with same result. 


Appears that this occurs when the log is full and set to overwrite.  I'm still researching if this is caused by some service doing excessive logging which I highly suspect.
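The same check and cleanup can be done from an elevated prompt with wevtutil instead of the Event Viewer. A small Python sketch that only builds the two commands; the backup path is a made-up example, and saving a copy before clearing is my own addition rather than something I did at the time.

```python
def security_log_commands(backup_path):
    """Build the wevtutil calls to inspect, then back up and clear, the Security log."""
    # gl = get-log: prints the log configuration, including maxSize and retention.
    inspect = ["wevtutil", "gl", "Security"]
    # cl = clear-log: /bu: writes the events to a backup file before clearing.
    clear = ["wevtutil", "cl", "Security", "/bu:" + backup_path]
    return [inspect, clear]


# Hypothetical backup location; adjust to somewhere with enough free space.
for command in security_log_commands(r"C:\Temp\Security-backup.evtx"):
    print(" ".join(command))
```

Nothing is executed here. To actually run a command on the DC, pass the list to `subprocess.run(command, check=True)`.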

Tuesday, October 28, 2014

Exchange 2010 - The certificate is invalid for Exchange Server usage

After attempting to open OWA I received a lovely message about the certificate being invalid today.  Huh?  That can't be right.  Unfortunately we don't utilize OWA very often, so the error had gone unnoticed for a long period of time.


First things first, look at the cert. 
  • Certificate path is fine
  • Still within the valid date timeframe
  • SAN cert and all the DNS names look fine
  • As far as the certificates MMC all is swell.
But Exchange still shows "The certificate is invalid for Exchange Server usage"
After some browsing on the old google I find lots about this when the cert path is wrong.  So I play around with the intermediate / roots, but feel pretty confident that it's correct (and the cert is showing the path valid).


Finally, I assign the Exchange roles to the self signed cert, delete the third party cert, and reimport it.  Same error, but now I of course can't assign the roles back to it because it's invalid.  So, of course after a few minutes people get a popup about the self signed cert.  Doh.  No problem though.  We can force that with the shell.
  • Get-ExchangeCertificate | fl
  • Find the cert wanted and get its Thumbprint
  • Enable-ExchangeCertificate -Thumbprint [thumbprintfromabove] -Services "SMTP,IIS"  (we don't use POP or IMAP)
Okay, at least now we're back where we had been, but what's going on.


Opening up the shell I do a Get-ExchangeCertificate -Thumbprint thumbprint## | fl.  It shows a RootCAType of unknown.  Eh?  That's definitely not right.


I pull up https://www.digicert.com/help/ and do a cert check.  Uhm, pretty sure it shouldn't say "SSL Certificate is revoked".  Yikes!


After some more head scratching I recall that with the latest project that I'm working with in my off hours (Exchange 2013) I had rekeyed the cert.  Of course when I rekeyed the cert I did import the new cert onto the old Exchange 2010 box, so that shouldn't be the issue.


So, I look at the new Exchange 2013 box cert and compare its Serial Number to the one on the Exchange 2010.  They should be the same, but what the heck, they are not!  Somewhere in the process I messed up the import into the 2010 box. (and I know I did the import, I logged it in our tickets with the steps)


Export the cert again from Exchange 2013, quick import into 2010, reassign the roles and all is happy!


So:
  1. Exchange doesn't specifically complain that the cert is revoked.  It just states it's invalid.
  2. If I had paid more attention to the OWA error I would have seen that it specifically said "The organizations certificate has been revoked" and it was correct.
  3. The certificates snap-in mmc doesn't, as far as I can tell, show when a cert has been revoked.
  4. Certificates can be dang confusing, double check that you've got the right one (serial number seems to be a good way).
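Serial numbers are easy to misread by eye because each tool formats them differently (spaces, colons, upper or lower case). A minimal Python sketch for comparing two of them; the serial values are invented for illustration, and it assumes both certs come from the same CA, where the serial number identifies the cert.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def normalize_serial(value):
    """Reduce a serial number to bare upper-case hex digits.

    Pass only the value itself, not a label such as "Serial Number:".
    Separators and invisible characters picked up by copy/paste are
    dropped, and leading zeros are ignored.
    """
    digits = "".join(ch for ch in value if ch in HEX_DIGITS)
    return digits.upper().lstrip("0") or "0"


def same_certificate(serial_a, serial_b):
    """True when both serial numbers refer to the same certificate."""
    return normalize_serial(serial_a) == normalize_serial(serial_b)


# Made-up serials: one copied from the certificates MMC, one from the shell.
exchange_2013_serial = "0a 1b 2c 3d 4e 5f"
exchange_2010_serial = "0A1B2C3D4E60"

if same_certificate(exchange_2013_serial, exchange_2010_serial):
    print("Same certificate on both servers")
else:
    print("Different certificates - re-export and import again")
```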

Wednesday, October 8, 2014

Java Security Prompts not saving

With a recent upgrade to one of our software packages we needed to update to a newer version of Java, in particular Java 1.7.0_67.  It was quickly brought to my attention that users would be prompted with the Java security warning every time they logged into the application.  So, clearly it wasn't saving the "never ask again" setting.


After a huge amount of googling I found a lot of information, but nothing ever pointed me at this new location.  (note: after I found the location on my own and did a google search of appdata\locallow\sun\java\deployment\cache\6.0 I got THIS site in my results, unfortunately I had already essentially resolved the problem at that point in order to know to google that... DOH).




We're using Citrix XenApp 6.5 with Profile Manager.  Of course we've set our profiles up so that they do not store junk temp data located in AppData\LocalLow or AppData\Local locations (after all that is the purpose of AppData\Roaming, if it's important store it there).  So, I immediately assumed that Oracle must be storing something in those locations when they should be in Roaming in order for the settings to be retained. 




After some trial and error I had my answer as to the location of the files in question. 


AppData\LocalLow\Sun\Java\Deployment\cache\6.0\34\ (the .lap file)
(note: the \34\ directory may change for different applications)
and
AppData\LocalLow\Sun\Java\Deployment\security\trusted.certs



Both files must be retained to save the "Do not show this again for apps from the publisher and location above" and also "Do not show this again for this app and web site" options.
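To check whether a given profile actually kept both files after a logoff, something like the following Python sketch can help. It assumes the folder layout listed above; the function name is mine, and the numbered cache folder is matched with a wildcard since it varies per application.

```python
from pathlib import Path


def java_prompt_files(profile_root):
    """Return the .lap files and trusted.certs found under a user profile."""
    deployment = (
        Path(profile_root) / "AppData" / "LocalLow" / "Sun" / "Java" / "Deployment"
    )
    found = []

    # The .lap file lives in a numbered subfolder (\34\ in my case) of cache\6.0.
    cache = deployment / "cache" / "6.0"
    if cache.is_dir():
        found.extend(sorted(cache.glob("*/*.lap")))

    trusted = deployment / "security" / "trusted.certs"
    if trusted.is_file():
        found.append(trusted)
    return found


# Example: inspect the profile of whoever runs the script.
for path in java_prompt_files(Path.home()):
    print(path)
```

If the list comes back empty after the user has ticked the "Do not show this again" boxes and logged off, the profile solution is still discarding those locations.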








Before this issue I greatly disliked Oracle's Java, this just pounds in my belief even more that it will be a grand day when I no longer have to install Java RTE on my servers.  Getting very close at this point.  Netscaler (being moved away from Java is my understanding), ProCurve switches (which I rarely have to manage and a single workstation is sufficient anyways), one other application (the one this came up with which is moving away from Java).  I do believe I will throw a party the day these three are done with Java. :)   Then hopefully Crystal Reports death will be next! Haha









Friday, September 26, 2014

XenApp 6.5 Session starts then disappears

We started experiencing issues where a user would launch their XenApp 6.5 session, but the application never showed on screen.  When watching in the AppCenter (using quick refreshes) you can see the application start, drop into a disconnected state, and then logoff.


After some very fast Google searches and looking at Citrix forums there is a lot of information on this.  Unfortunately, I found that none of them hit on the root cause in my environment.


Possible causes found from various posts and Citrix KB articles:


Possibly caused by VMware EVC - not in my case
In my case EVC was turned on, but it had always been turned on and I haven't seen this issue before or rather it had been extremely rare and usually resolved by rebooting a XenApp server in error.  So why would EVC suddenly cause me an issue when we've been running for a year with no issue?


Possibly caused by Citrix EUEM (Edgesight)
Not in my environment.  At one time I attempted to use Edgesight, but quickly found that the C++ libraries conflicted with our main application.  It's been disabled since and I haven't gotten back around to turning it on since the fix was released.


Default 1 minute time-out exceeded for long logons
This one doesn't actually really hit on a cause imo, rather the effect.  The effect here is that logons are taking a long time and so it hits the timeout.  The cause is unknown, that's where we need to dig in and find out why in order to make logins more timely.  Just increasing the "timeout" while helpful in the short term doesn't fix the root cause.


The root issue in my case
After having looked at all the above my partner in crime says to me "AD users and computers seems slower to load lately".  That's strange.


After a quick look at all the Domain Controllers in the environment the issue becomes very obvious.  The DC's CPU's are pegged out, as is their RAM...  We can all do the math there, run out of RAM and it's going to start paging and burn up CPU.


After a quick addition of RAM (and an additional CPU for kicks) all is wonderful again.  XenApp launches significantly faster and the logon timeout is no longer reached.




Don't just increase the timeout!  See if there is an underlying issue first.
















Friday, August 15, 2014

Adobe Reader XI Freezes

Yesterday out of the blue 2 users call me complaining that when they open Adobe Reader 11.0.7 it freezes after 5 seconds and then goes to "not responding". 




Looking in eventvwr you see:
Log Name:      Application
Event ID:      1002
Level:         Error
Description:
The program AcroRd32.exe version 11.0.7.79 stopped interacting with Windows and was closed. To see if more information about the problem is available, check the problem history in the Action Center control panel.
 Application Path: C:\Program Files (x86)\Adobe\Reader 11.0\Reader\AcroRd32.exe


This would occur regardless of if I just opened Reader without a document, or with a document.  I quickly found that if I unplugged the network cord and then opened it then it wouldn't crash.  So clearly it was attempting to find something on the network, likely a recent item.


I tried a number of fixes, but in the end blasting the user's Adobe settings is what fixed it for me
  1. open regedit
  2. Navigate to HKCU\Software\Adobe  (ensure you are under Current User key!!)
  3. Delete the whole blasted Adobe key and subkeys.
Of course the best fix is to give Adobe the boot and use an alternative, but as many of us know that isn't always a valid option.
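The registry cleanup above can also be scripted with the built-in reg command, which makes it easy to keep a backup first. A minimal Python sketch that only builds the commands; the backup file name is a made-up example, and the export step is my own precaution rather than part of the original fix. Run the commands as the affected user, since HKCU is per user.

```python
ADOBE_KEY = r"HKCU\Software\Adobe"


def adobe_reset_commands(backup_file):
    """Build the reg.exe calls that back up and then delete the user's Adobe key."""
    # /y overwrites an existing backup file without prompting.
    export = ["reg", "export", ADOBE_KEY, backup_file, "/y"]
    # /f deletes the key and all subkeys without a confirmation prompt.
    delete = ["reg", "delete", ADOBE_KEY, "/f"]
    return [export, delete]


# Hypothetical backup location.
for command in adobe_reset_commands(r"C:\Temp\adobe-hkcu-backup.reg"):
    print(" ".join(command))
```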





 

Tuesday, August 5, 2014

Exchange 2010 Search and Restore deleted email

Story goes something like this...  (we'll use a fictitious name of Mary Lost to protect the innocent here)
Mary: "I never got super important email".
Me: "Are you sure they sent it?"
Mary: "Definitely"
Me: "Type the sender or subject in the search and click 'all outlook items'"
Mary: "Can't, I never got it at all"
Me: "Okay, let me see what I can find, who sent it"


And so my quest began to find the "missing" email that was never received but definitely had been sent.


To start, I do a simple search of the user's mailbox.  Attach mailbox, search, nothing...


Now we can look to see if the system ever received it.  In my case we use MXLogic (now McAfee) so to start with I could run a message audit at that level.  Yep, mxlogic shows it being delivered to the Exchange server.  Don't use MXLogic?  No problem, just go straight to the Exchange server to look.
Check Exchange for receipt:
  1. Open the EMC
  2. Toolbox
  3. Tracking Log Explorer
  4. Enter recipient, Subject, dates and look for the email.
  5. Note: I found it best to do this from the Exchange server itself.  Otherwise you have to properly populate the "server" field, and even then I got mixed results.
Once you've found the message you can see the EventID which will likely be "RECEIVE".  If you can't find it here then likely the message was never sent.


Great, now I KNOW it was delivered, so perhaps it was deleted?  After some research I found that it may not be easy to tell if it was deleted, etc. unless audit logging is turned on beforehand.  Bleh, we don't need to know that badly.  But can we restore it...?  That's what is really important here. 


So we know:
  1. It was delivered to the mailbox
  2. It's no longer there
  3. Thus it was likely deleted
  4. Since it's not in the deleted items folder, it was either SHIFT Deleted or also deleted from deleted items or the computer monster ate it. 
  5. All that really matters is it's gone and we want it back.


Fire up the EMS (powershell)!
In this case I'm going to search Mary Lost's mailbox for the SearchQuery string (subject, from, etc) and then give it a TargetMailbox that is my mailbox so that when it's restored it goes into my email instead under a TargetFolder named "recovery".  (Targetfolder will be autocreated if it doesn't already exist)


Search-Mailbox mlost -TargetMailbox Me -TargetFolder "Recovery" -SearchQuery "from:important@dude.com" -LogLevel Full


Now Mary's missing email is in my mailbox under a folder called "Recovery".  Copy it back to her inbox and all done :)
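If the sender alone returns too many hits, the SearchQuery can be narrowed by adding a subject term. Here is a rough Python sketch that assembles the command text to paste into the EMS; the helper names and the subject value are invented, and it assumes the usual keyword syntax where terms are joined with AND and a multi-word value is wrapped in double quotes.

```python
def build_search_query(sender=None, subject=None):
    """Combine the known details of the message into one SearchQuery string."""
    terms = []
    if sender:
        terms.append(f"from:{sender}")
    if subject:
        if '"' in subject:
            raise ValueError("double quotes in the subject are not handled here")
        terms.append(f'subject:"{subject}"')
    if not terms:
        raise ValueError("give at least a sender or a subject")
    return " AND ".join(terms)


def search_mailbox_command(source, target, folder, query):
    """Return the Search-Mailbox command line as text (nothing is executed)."""
    # Single quotes around the query keep the inner double quotes intact in PowerShell;
    # a literal single quote inside the query is escaped by doubling it.
    escaped = query.replace("'", "''")
    return (
        f'Search-Mailbox {source} -TargetMailbox {target} '
        f'-TargetFolder "{folder}" -SearchQuery \'{escaped}\' -LogLevel Full'
    )


query = build_search_query(sender="important@dude.com", subject="super important")
print(search_mailbox_command("mlost", "Me", "Recovery", query))
```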




http://blogs.technet.com/b/exchange/archive/2010/04/26/item-recovery-in-exchange-2010.aspx


Tuesday, March 25, 2014

Outlook 2010 Cannot Send this item

Occasionally I have a very select set of users that would call in unable to send emails.  When they click send they get "Cannot send this item".  Simply copy pasting the contents into a new email would fix.

The common response on the internet is "switch to rich text", which I view as a workaround and at that one I don't like.

Thomas Vuylsteke posted this and has a great solution...  Look at your network and ensure that you're not having network issues.

http://setspn.blogspot.com/2011/10/outlook-cannot-send-this-item.html

In my case it appears to have been due to Linksys devices at a few select desktops.

Sharepoint 2013 slow - Distributed Cache issue

After a fresh install of Sharepoint 2013 I found that navigation was very slow.  It would take 10 - 30 seconds between pages depending on scenario and page.

Looking online I found that some had found that stopping the distributed cache would fix, but after starting again it would slow down.

I finally found this link which helped: http://microsofttouch.fr/default/b/vincent/archive/2012/12/22/service-de-cache-distribu-233-de-sharepoint-2013-spdistributedcacheservice-comment-ne-pas-se-manger.aspx

Translated version:
http://translate.google.com/translate?hl=en&sl=fr&u=http://microsofttouch.fr/default/b/vincent/archive/2012/12/22/service-de-cache-distribu-233-de-sharepoint-2013-spdistributedcacheservice-comment-ne-pas-se-manger.aspx&prev=/search%3Fq%3Dhttp://microsofttouch.fr/default/b/vincent/archive/2012/12/22/service-de-cache-distribu-233-de-sharepoint-2013-spdistributedcacheservice-comment-ne-pas-se-manger.aspx%26biw%3D1280%26bih%3D894

Opened the Sharepoint 2013 Management Shell (run as administrator).
Stop-SPDistributedCacheServiceInstance -Graceful
Remove-SPDistributedCacheServiceInstance
Add-SPDistributedCacheServiceInstance

Then start the distributed cache service.  Now it runs lightning fast.

Wednesday, March 12, 2014

Nagvis - The must value "host_name" is not set

When working in Nagvis you get an error

The must value "host_name" is not set in an object of type "service" in map "mapname" 
(note: can be service, host, etc)

This is caused when you accidentally modify an object and it removes required fields in the config file.

To fix, go to your maps config file.  In opsview it's located at /usr/local/nagios/nagvis/etc/map

nano mapname.cfg
remove the offending entry.  In my case you can see that it defines a service with no contents.
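On a large map it can be tedious to find the empty entry by eye. A small Python sketch that reports host and service blocks with no host_name; it assumes the map file uses `define <type> {` blocks with `key=value` lines, as in my map files, and it only checks those two object types.

```python
import re

# Matches the opening line of an object block, e.g. "define service {".
DEFINE = re.compile(r"^\s*define\s+(\w+)\s*\{")


def objects_missing_host_name(cfg_text):
    """Return (line_number, object_type) for host/service blocks without host_name."""
    problems = []
    block_type = None
    block_start = 0
    keys = set()
    for number, line in enumerate(cfg_text.splitlines(), start=1):
        stripped = line.strip()
        opening = DEFINE.match(line)
        if opening:
            block_type, block_start, keys = opening.group(1), number, set()
        elif stripped.startswith("}") and block_type is not None:
            if block_type in ("host", "service") and "host_name" not in keys:
                problems.append((block_start, block_type))
            block_type = None
        elif block_type is not None and "=" in stripped:
            keys.add(stripped.split("=", 1)[0].strip())
    return problems


# Made-up map contents with one empty service definition at the end.
sample_map = """\
define global {
alias=Demo map
}

define host {
host_name=dc01
x=100
y=50
}

define service {
}
"""

for line_number, object_type in objects_missing_host_name(sample_map):
    print(f"line {line_number}: {object_type} has no host_name")
```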



Tuesday, March 11, 2014

WSUS 100% CPU by sqlservr.exe

Recently my WSUS 3.2 server pegged the CPU out at 100% by the SQL server. 

With a small amount of research I found the following:
http://technet.microsoft.com/en-us/library/dd939795(v=ws.10)



In my case I was running Server 2008 R2, so download and install the prereqs.
Server 2008 R2 = http://www.microsoft.com/en-us/download/details.aspx?id=16978
Don't click the download link as that just gives you a worthless txt file.  Instead scroll down to "install instructions" and expand.  Here you'll see a full list of download sites.  Grab the following:
  • Microsoft SQL Server 2008 R2 Native Client x64 Package
  • Microsoft SQL Server 2008 R2 Command Line Utilities x64 Package
Then create the scripts as needed (note: you need to include the full path to the sqlcmd.exe)
"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqlcmd" -S np:\\.\pipe\MSSQL$MICROSOFT##SSEE\sql\query -i C:\Scripts\WsusDBMaintenance.sql
Ensure you "run as administrator" or you'll get an access denied error.

Issue resolved.

Remote WMI monitoring Windows Service permissions with non-admin account

I've been trying to get remote WMI to check if a service is running or not and I want to use a non-admin account to do it. 

I found a lot online about setting up permissions for Remote WMI and it "mostly" gets you everything, but in the end I found that a lot of services still didn't show properly.  http://community.zenoss.org/thread/12048

Using this Excellent post https://msmvps.com/blogs/erikr/archive/2007/09/26/set-permissions-on-a-specific-service-windows.aspx we can begin to understand what the different options mean and run the sc config with the proper permissions (rather than running the Zenoss or MS cmd blindly).

  1. Open cmd prompt on the server in question
  2. type sc sdshow scmanager
  3. Take note of the existing permissions.  Notice that they are different for each OS version.
    1. Windows 2012 = D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)(A;;CC;;;AC)S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)
    2. Windows 2008 R2 = D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)
We also know that AU (Authenticated users) has limited permissions compared to pre Windows 2003 SP1.  Zenoss and MS articles give AU access to all (with a big caveat of it's not really all services which I talk about in a minute).  So, rather than doing this we can add our own account / group in instead of just blasting AU if desired.
  1. use pstools to get the SID of the account you want to use
    1. psgetsid username
      1. This gives you the SID for your username
    2. sc sdshow scmanager
      1. gives you the existing permissions
    3. Merge the permissions together for your new command (example on 2012 Server).  Note that you ALWAYS want to APPEND what already exists.
      1. sc sdset scmanager D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)(A;;CC;;;AC)(A;;CCLCRPRC;;;YOURSIDHERE)S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)
      2. Note that this is appended prior to the S: section!
    4. Now if we test using that user account we see that we get back results, but wait surely I have more than 14 services set to auto... Yes, yes I do.
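Hand-editing these strings is error prone, so here is a minimal Python sketch of the merge step: it inserts one extra ACE at the end of the D: section, just before S:. The function names are mine; the rights string CCLCRPRC and the YOURSIDHERE placeholder are the ones used above.

```python
def monitoring_ace(sid):
    """ACE with the rights string used in this post, granted to one SID."""
    return f"(A;;CCLCRPRC;;;{sid})"


def add_ace_to_dacl(sddl, ace):
    """Insert an ACE at the end of the D: section, before any S: section."""
    if not sddl.startswith("D:"):
        raise ValueError("expected an SDDL string that starts with D:")
    # The S: section starts right after the closing parenthesis of the last D: entry.
    sacl_at = sddl.find(")S:")
    if sacl_at == -1:
        return sddl + ace  # no S: section, so the end of the string is the end of D:
    split_at = sacl_at + 1
    return sddl[:split_at] + ace + sddl[split_at:]


# Output of "sc sdshow scmanager" on Windows 2012, as listed above.
existing = (
    "D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)"
    "(A;;KA;;;BA)(A;;CC;;;AC)S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)"
)

merged = add_ace_to_dacl(existing, monitoring_ace("YOURSIDHERE"))
print("sc sdset scmanager " + merged)
```

The same function works for the output of sc sdshow on an individual service. Always feed it the string from the server in question, since the defaults differ per OS version.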

I don't know why, but this doesn't show all the services by far!
But, I found that adding permissions to the services in particular that you do want to monitor will fix.

For instance, query dfsr and you get the following (Found 0 Services)


So, let's get the permissions for the specific services and modify. 
  1. sc sdshow DFSR
    1. D:(A;;CCLCSWRPWPDTLOCRRC;;;SY)(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;BA)(A;;CCLCSWLOCRRC;;;IU)(A;;CCLCSWLOCRRC;;;SU)(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;SO)S:(AU;FA;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;WD)
  2. Merge in your permissions
    1. sc sdset dfsr D:(A;;CCLCSWRPWPDTLOCRRC;;;SY)(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;BA)(A;;CCLCSWLOCRRC;;;IU)(A;;CCLCSWLOCRRC;;;SU)(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;SO)(A;;CCLCRPRC;;;YOURSIDHERE)S:(AU;FA;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;WD)
  3. Test :)
 
 
Either I'm missing something dumb or this is ridiculous imo.  I would have never thought that on my journey to set up a non-admin account for remote monitoring I would be messing with permissions like this.
 
 

Friday, February 28, 2014

Opsview Email "From" / Reply To Address

One of the first things I noticed when setting up alerts from Opsview was that there isn't a quick easy method for changing who the emails come from.  (seems so basic to me)

  1. Change the user info
    1. chfn nagios
    2. in particular you want the Full Name.
  2. Setting the @domain.local took me, as a non-Linux guy, a while to find.  I did find a LOT of conflicting / wrong info on this though.
    1. The appliance at least has "postfix" out of the box
    2. cd /etc/postfix/
    3. nano main.cf
    4. scroll down to the bottom
    5. modify "myhostname" with what you want the @domain.local to say
    6. if you didn't do so earlier, change the relayhost to your SMTP server
    7. Ctrl X to end, Y to save, just hit Enter to keep the same file name.
    8. postfix reload (re-reads the configuration without a full restart)
    9. Mine comes back as "Alerts" from nagios@mydomain.com  (I couldn't find where to change it from nagios, and didn't care to spend the time on it)

Note: I also just found this which "helped", but wasn't completely correct as I found that myorigin and mydomain did not do what I would have expected.  http://www.opsview.com/technology/guides-help/opsview-configuration/emailsmtp

Remote WMI security via GPO

I recently wanted to create a limited access user account for accessing WMI remotely on servers. 
I came across this blog post http://blogs.msdn.com/b/spatdsg/archive/2007/11/21/set-wmi-namespace-security-via-gpo-script.aspx for deploying the WMI security via GPO and a script.

Unfortunately this wasn't the entire picture for me with either Server 2008 R2 or 2012.  (In addition, I found that it's important to ensure that propagation is set properly before deploying.)

To get it to work for me I had to do the following extra steps:
  1. When setting the security, in order to get propagation, I had to add the permissions via the following steps
    1. Do this before you retrieve the security descriptor
    2. Click the Security tab
    3. select the level (ie root)
    4. click Security
    5. click Advanced
    6. Click Add
    7. ensure that "Apply to:" is set to "This namespace and subnamespaces"
  2. I also had to put the user in the "Performance Log Users" security group.  This can be done in GPO or at the local level.  For GPO:
    1. Open GPO and select the policy that you want this in
    2. Under Computer Configuration - Policies - Windows Settings - Security Settings - Restricted Groups
    3. Right click and add
    4. "Performance Log Users"
    5. In members of this group add your WMI user
    6. Run gpupdate /target:computer on a server that the GPO is linked to.

Performance Log Users
http://technet.microsoft.com/en-us/library/cc749154.aspx

Note: Performance Log Users have more permissions than Performance Monitor Users.  I tried using just the Performance Monitor Users group without success.

Wednesday, February 26, 2014

Opsview Core Agentless WMI Setup

Recently I set about building an Opsview Core setup to monitor my network.  I had found this really nifty setup here: http://community.spiceworks.com/how_to/show/2832-create-an-lcd-network-monitor-using-opsview-nagios-and-nagvis written by awesome spicer Jamin289.

Unfortunately along the way I found that for a non-Linux person like me it left some of the installation steps for me to stumble through.  I don't know linux at all, so some of the below may be obvious, but it wasn't to me so I included everything I could.  This includes adding in WMI for agentless checks :)

Very important to follow the steps to the T, as a lot of this has prerequisites.  Remember everything is CaSE SensiTiVE (if you don't use the right case on some of the setup then an error can occur that requires digging in config files to fix)

NOTE: For logging out of opsview I found that I couldn't with IE v9, the screen was flaky.  But, I could with Firefox.  On the flip side of that I found that editing the Nagvis maps SUCKED in Firefox, but worked great in IE v9.  HAHA, have fun.

I did this with the Opsview Core Appliance v4.4
  1. Import the appliance
  2. Log into the appliance with conf / conf
  3. 'sudo su' to get to root access
  4. Change the IP address with netconf
  5. use 'passwd' to change the password of conf
  6. Download the gadgets you want and install (download from the link in the spiceworks guide)
    1. extract the contents
    2. Use winscp.exe and login with the conf user
    3. copy all the gadget files to the conf users directory
    4. go to the opsview appliance console and ensure you're at root still
    5. run the following:
      1. cp /home/conf/scale_thermometer.php /usr/local/nagios/nagvis/nagvis/gadgets/
      2. cp /home/conf/rawWords.php /usr/local/nagios/nagvis/nagvis/gadgets/
      3. cp -r /home/conf/rawWords /usr/local/nagios/nagvis/nagvis/gadgets/
  7. Skip the section about installing the Opsview agent, remember we're going agentless with WMI!
  8. Now we need to install WMI options
    1. First we need autoconf, Type 'apt-get install autoconf'
    2. Next we need C Compiler, Type 'apt-get install gcc'
    3. Now we need WMI (http://www.edcint.co.nz/checkwmiplus/InstallationTerminalSession)
      1. type 'cd /tmp/'
      2. 'wget www.edcint.co.nz/checkwmiplus/wmi-1.3.14.tar.gz'
      3. 'tar xzvf wmi-1.3.14.tar.gz'
      4. cd wmi-1.3.14
      5. make
    4. Now we test WMI
      1. wmic -U computername/administrator%adminpassword //computername "select * from Win32_ComputerSystem"
      2. You should get WMI info back on that system.
    5. Now we'll install check_wmi_plus.pl to the nagvis location (so that it shows in the dropdown list)
      1. cd /usr/local/nagios/libexec
      2. wget http://edcint.co.nz/checkwmiplus/sites/default/files/check_wmi_plus.v1.54.tar.gz
      3. tar xzvf check_wmi_plus.v1.54.tar.gz
      4. Reset permissions: (remember, I'm not that great at linux, so probably a better way to do this)
        1. chmod -R 555 check_wmi_plus*
        2. chown -R nagios check_wmi_plus*
        3. chgrp -R nagios check_wmi_plus*
        4. ls -la check_wmi_plus.* (shows the permissions)
      5. Now we need some extra CPAN modules (Perl) otherwise we'll see "Can't locate Number/Format.pm" with the nagios plugins and other like errors. This may not all be required, idk. 
        1. cpan Statistics::Basic
        2. cpan Config::IniFiles
        3. yes anytime it prompts
        4. cpan Module::Build
        5. yes anytime it prompts
        6. cpan (to get to cpan shell)
        7. force install DateTime
        8. yes anytime it prompts
        9. q (to quit cpan shell)
        10. cpan Getopt::Long
        11. cpan Data::Dumper
        12. cpan Scalar::Util
        13. cpan Number::Format
        14. cpan ExtUtils::Config
        15. cpan ExtUtils::Helpers
        16. cpan ExtUtils::InstallPaths
        17. cpan TAP::Harness::Env
        18. cpan Module::Build::Tiny
        19. cpan Package::Stash
        20. yes anytime it prompts
        21. cpan Class::Load
        22. cpan Storable
      6. Now we can test :)
        1. /usr/local/nagios/libexec/check_wmi_plus.pl -m checkcpu -H computername -U computername/administrator -P password
        2. Run it again.  Should get cpu average on second run.
      7. Cleanup!
        1. rm check_wmi_plus.v1.54.tar.gz
        2. cd /
        3. cd /tmp/
        4. rm cwpss_checkcpu_SMonitor___.state
  9. Open opsview
  10. login to opsview with admin / initial
  11. In the top right corner click admin
  12. Change the Admin password  (note: I found that Firefox works best for opsview)
  13. Go to Settings - Service Checks
    1. Click + to add
    2. We'll do a test for Average CPU Utilization
    3. Name: Average CPU Utilization
    4. Service Group = enter a new one called "OS - Windows Agentless WMI" or whatever you want to group your WMI checks by.
    5. Check Period: 24x7
    6. Plugin: check_wmi_plus.pl
    7. Arguments: -H $HOSTNAME$ -m checkcpu -u %WINCRED:1% -p %WINCRED:2%
    8. Submit
  14. Go to Settings - Attributes
    1. Click + to add
    2. Name: WINCRED
    3. Default Value: leave blank
    4. Default Arg1: USERNAME
    5. Default Arg2: PASSWORD
    6. Submit
  15. Go to Settings - Host Templates
    1. Click + to add
    2. Name: Windows Agentless
    3. Monitors tab
    4. Drill into "OS - Windows Agentless WMI"
    5. Select the Average CPU Utilization (green +)
    6. Submit
  16. Go to Settings - Hosts
    1. Add Host
    2. Enter hostname, title
    3. Change Icon
    4. Host Templates: Windows Agentless and click the arrow to add it.
    5. Go to Attributes tab
    6. Click the grey +
    7. Select WINCRED
    8. Click the Eye icon
    9. Value: none
    10. Check arg1: Enter domain/username (example computername/administrator or domainname/username)
    11. Check arg2: Enter the password (remember no special meta characters unless you escape them, ie !)
    12. Submit
  17. Settings - Apply Changes
  18. Reload Configuration
  19. Monitoring - Hosts
    1. Click your new host
    2. Click the Mass Re-Checks icon in top left
    3. Toggle all checkboxes - Submit
    4. Do the mass re-checks again.
    5. Should show your cpucheck (or error if you missed a step / got your username or password wrong)
  20. Now you can add the rest of your WMI checks!
    1. http://mastermonsvr.smartmon.com.au/mp-bin/public/public.cgi?mode=public&mode2=showplugindetails&plugin=check_wmi_plus.pl
    2. http://www.edcint.co.nz/checkwmiplus/?q=MakePerfRawDataClassCheck
    3. https://wmie.codeplex.com/
  21. Now you can do your other steps with ease since you know a little Linux :)
  22. Copy your nagvis map as described in Spiceworks post (and icons)
    1. Back to Winscp.exe
    2. Once again copy your jpg to the conf home
    3. While you're at it, unzip the icons that Jamin289 posted and drop them into the conf home
      1. I unzipped to a folder named blocks and thin_blocks and copied those folders to the conf home
    4. Back to Putty to copy them to the proper location (doing them to these locations allows them to populate the opsview dropdown lists)
      1. cp /home/conf/NagvisLayout.jpg /usr/local/nagios/nagvis/nagvis/images/maps/
      2. cp -r /home/conf/blocks/* /usr/local/nagios/nagvis/nagvis/images/iconsets/
      3. cp -r /home/conf/thin_blocks/* /usr/local/nagios/nagvis/nagvis/images/iconsets/
    5. Apply Changes
    6. Reload Configuration
  23. Load the Nagvis map
    1. Modules - Nagvis
    2. Edit Current Map
    3. Right click in the text area and select Manage - Maps
    4. Under the "Create Map" area
      1. Name Map
      2. User with read Permissions: EVERYONE
      3. User with write Permissions: EVERYONE
      4. Map Iconset: Select Blocks or Thin_Blocks
      5. BackGround: Select NagvisLayout.jpg
    5. Click on your new map and begin loading the iconsets
      1. Edit Current Map
      2. Right click - Add Object - Icon - Service
      3. Select Host_name and service_description
      4. Place it where you want it.
      5. Save
    6. Continue as you desire with the guide from Jamin289
Good Luck!!

Don't forget to donate if you like what they've done to get you WMI.   http://www.edcint.co.nz/checkwmiplus/?q=donations_and_sponsorship

Tuesday, January 14, 2014

Citrix Receiver Progress Bar / Application open in background

In the newer versions of Citrix Receiver the connection progress bar / status bar launches in the background by default.  In addition, once the application opens, it too opens in the background.

Application opens in background:
There is a seamless flag that can be set that allows the application to open again in the foreground.  This is set at the XenApp server level.
HKLM\SYSTEM\CurrentControlSet\Control\Citrix\wfshell\TWI
dword=SeamlessFlags
value = 0x4


You can also set this at the Receiver client level.
http://support.citrix.com/article/CTX131977
HKLM\Software\Citrix\ICA Client\Engine\Configuration\Advanced\Modules\WFClient
HKLM\Software\Wow6432Node\Citrix\ICA Client\Engine\Configuration\Advanced\Modules\WFClient
Reg_SZ = TWISeamlessFlag
Value = 1


Progress Bar launches in background:
With Citrix Receiver 4.1 there is now a registry key that can be added to force the progress bar to the foreground.  This is set on the client.
http://support.citrix.com/article/CTX138197
HKLM\Software\Citrix\ICA Client
HKLM\Software\Wow6432Node\Citrix\ICA Client
dword = ForegroundProgressBar
Value = 1

Tuesday, January 7, 2014

CAS Array Object / RpcClientAccessServer

About a year and a half ago (somehow I forgot to post this) I needed to decommission an old Exchange 2010 server and move all the mailboxes to a new one due to a Hypervisor switch.  I learned a very good lesson then that I wish I'd known when I originally set up my first CAS server...

I quickly found that even though all the mailboxes were moved and all clients had connected to the new box, turning off the old Exchange server caused Outlook to lose connection.  After looking for a few brief moments I found that they were still connecting through the old CAS box. ACK

http://blogs.technet.com/b/exchange/archive/2012/03/23/demystifying-the-cas-array-object-part-1.aspx

I always thought of the CAS Array by what its name sort of indicates, more than one CAS, but I was wrong and I paid for it.

You want to set up the CAS array object so that Outlook is populated with an FQDN that isn't server specific, for instance outlook.domain.com.  You then set up DNS to point outlook.domain.com at the right server (or load balancer).  Thus if you migrate to a new server you just update DNS.

Failure to do this results in having to touch each and every outlook install or using a prf to update (or some other method).

Do yourself a favor: set up a CAS array from the beginning, or if you already missed this step, go ahead and set up the CAS array and then begin slowly changing all your Outlook installs to point to the array.


also read
http://blogs.technet.com/b/exchange/archive/2012/03/28/demystifying-the-cas-array-object-part-2.aspx

Monday, January 6, 2014

NETLogon not replicating - Replication service stopped replication on volume C

After a dirty shutdown of a Windows Server 2012 DC I found that my NETLogon was no longer replicating

Event log had event ID 2213 listed under DFS Replication:

The DFS Replication service stopped replication on volume C:. This occurs when a DFSR JET database is not shut down cleanly and Auto Recovery is disabled. To resolve this issue, back up the files in the affected replicated folders, and then use the ResumeReplication WMI method to resume replication.

Additional Information:
Volume: C:
GUID: guidofvolume

Recovery Steps
1. Back up the files in all replicated folders on the volume. Failure to do so may result in data loss due to unexpected conflict resolution during the recovery of the replicated folders.
2. To resume the replication for this volume, use the WMI method ResumeReplication of the DfsrVolumeConfig class. For example, from an elevated command prompt, type the following command:
wmic /namespace:\\root\microsoftdfs path dfsrVolumeConfig where volumeGuid="GUIDofvolume" call ResumeReplication


In Server 2012 the default behavior has changed to a manual recovery from dirty shutdown.

http://blogs.technet.com/b/filecab/archive/2012/07/23/understanding-dfsr-dirty-unexpected-shutdown-recovery.aspx


In my case just executing the wmic command resolved it.
wmic /namespace:\\root\microsoftdfs path dfsrVolumeConfig where volumeGuid="GUIDofvolume" call ResumeReplication
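The GUID has to be copied out of the event text into the command by hand, so here is a small Python sketch that does it for you. The function name resume_replication_command is hypothetical, and it assumes the event text carries a "GUID: ..." line like the one quoted above. It only builds the command string; you still run it yourself from an elevated prompt after backing up the replicated folders.

```python
import re


def resume_replication_command(event_text: str) -> str:
    """Build the ResumeReplication wmic command from the text of event 2213."""
    # Look for the "GUID: <value>" line in the Additional Information block.
    match = re.search(r"^GUID:\s*(\S+)\s*$", event_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'GUID:' line found in the event text")
    guid = match.group(1)
    return (
        r"wmic /namespace:\\root\microsoftdfs path dfsrVolumeConfig "
        f'where volumeGuid="{guid}" call ResumeReplication'
    )


# Made-up GUID, for illustration only.
sample_event = """Additional Information:
Volume: C:
GUID: 01234567-89AB-CDEF-0123-456789ABCDEF
"""
print(resume_replication_command(sample_event))
```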


At that point you can either start monitoring your eventvwr on your DC :)  or set this back to autorecovery
wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set StopReplicationOnAutoRecovery=FALSE

Thursday, January 2, 2014

DHCP options for SIP server and SIP port

In a recent deployment of a VOIP system (NEC sv8100) I wanted to setup DHCP to hand out the SIP options.

This is pretty simple although some of the references I looked at made it look confusing.
For the system I was working on only two options are necessary, SIP Server IP address and SIP server port (if different than the default).

  1. Open up DHCP, right click IPv4 and choose "Set Predefined Options"
  2. Note that "option name" for 120 doesn't exist (unless added previously).  Click "Add"
  3. Put in a name of SIP Server IP Address, Data type should be binary, code equals 120, and a description as you see fit.
  4. Click OK
  5. Go to your server options (or scope options depending on what you want). 
  6. Click "Configure options"
  7. Check mark option 043 Vendor Specific Info
    1. This option specifies the port to use
    2. enter the port as a HEX value under the binary section with A8 02 in front of it. For instance for port 5080 it would look like A8 02 13 D8
      1. A8 = 168 sub option
      2. 02 = length of the value that follows (2 bytes)
      3. 13 D8 = 5080 in HEX
      4. if you wanted port 5060 it would be A8 02 13 C4
  8. Check mark option 120 SIP Server IP Address
    1. Here you enter the SIP server IP address in HEX format with a 01 in front of it.
    2. Then the ip, so 192.168.1.2 would be C0 A8 01 02.
    3. Put the 01 in front and get 01 C0 A8 01 02
    4. 01 = encoding byte, meaning that what follows is a list of IPv4 addresses (00 would mean domain names)
    5. C0 = 192
    6. A8 = 168
    7. 01 = 1
    8. 02 = 2
    9. Put the hex value in the binary section.  (note, the ASCII will look like nonsense)
You're all set.  Bounce your phone so it gets DHCP from your server and ensure it finds the SIP server.  If you get an error "SIP Server not found" then you either have the IP address incorrect or the SIP Server port incorrect.
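If you don't feel like doing the hex conversion by hand, the two values above can be worked out with a few lines of Python. This is just a sketch of the arithmetic in the steps above. The function names are my own (hypothetical), and sub option 168 (A8) is simply what this phone system wanted; other vendors may use a different sub option.

```python
import ipaddress


def to_hex_string(data: bytes) -> str:
    """Format bytes the way they are typed into the DHCP binary editor."""
    return " ".join(f"{b:02X}" for b in data)


def option43_sip_port(port: int, sub_option: int = 0xA8) -> str:
    """Option 043 value: sub option code, byte count of the value, then the port."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must be between 0 and 65535")
    value = port.to_bytes(2, "big")  # e.g. 5080 -> 13 D8
    return to_hex_string(bytes([sub_option, len(value)]) + value)


def option120_sip_server(ip: str) -> str:
    """Option 120 value: a leading 01 byte, then the four bytes of the IPv4 address."""
    address = ipaddress.IPv4Address(ip)
    return to_hex_string(bytes([0x01]) + address.packed)


print(option43_sip_port(5080))              # A8 02 13 D8
print(option43_sip_port(5060))              # A8 02 13 C4
print(option120_sip_server("192.168.1.2"))  # 01 C0 A8 01 02
```

The printed strings match the worked examples in the steps above, so you can sanity check the script against them before plugging in your own port and IP address.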

Note: troubleshooting DHCP is very easy with wireshark, just filter for the Bootp.  This way you can see what options it's handing out.  There is also a handy tool out there called DHCPtest
http://blog.thecybershadow.net/2013/01/10/dhcp-test-client/