Okta vs. SharePoint Server On-Premise

The Case of The Missing Manual

Warmest greetings to you! Dr Kinley here again. In this chapter of Dr Kinley’s Facebook, we look at some of the configuration pitfalls with Okta and SharePoint Server in “The Case of the Missing Manual”.

A customer came to us having already embarked upon a project to migrate their SharePoint Server farm, and several thousand existing users, to Okta, the single sign-on cloud application.

If you are considering migrating to Okta for an existing SharePoint on-premise farm, you are doing something that is possibly very, very hazardous to the well-being of your SharePoint farm. You are potentially about to make a C.L.M. (Career Limiting Move).

We managed to avoid a lot of pain for our customer by filling in the gaps left by Okta.

If you are considering such an integration yourself, and haven’t started the process yet, read this first!

The Art of the Possible

Don’t let anyone – especially Okta – tell you otherwise. This is what is definitely achievable with SharePoint and Okta:

  • Existing Windows users can be migrated to Okta users
  • Existing Windows groups can be migrated to Okta groups
  • Permissions can be retained

Here’s what you can’t do:

  • Configure SharePoint services to run as Okta accounts
  • Switch off NTLM entirely

Required Reading

Before you get started, you’ll need to read these:

Missing from the Manual

There seems to be a manual missing. I’ve looked, I’ve Googled, I’ve asked. The configuration steps for SharePoint on-premise and Okta seem to not be adequately documented.

Okay, well… some of these things Okta tells you. But they don’t tell you loud enough. And they don’t tell you in nearly enough detail. Also the professional services teams at Okta don’t seem to know how to do some of these things themselves. Expect to be on your own. And expect their documentation to be out of date and full of holes.

The rest of this blog post describes some of the missing pieces of knowledge you’ll need to configure SharePoint Server on-premise and Okta without losing precious data.

#1: One SharePoint Web Application equals one Okta App equals one Realm

Create one Okta app per SharePoint web application. You get told the Realm from the Okta App. On the Sign-On tab there is a button that takes you to something that looks for all the world like a knowledge base article. Even down to the formatting and styling of the page. There is literally no clue that the strings and values on that page were uniquely generated for your app, and aren’t the generic manual pages they seem to be.

Somewhere halfway down, way below the fold, you’ll find an example piece of code like this:

$ap = Get-SPTrustedIdentityTokenIssuer "Okta"
$uri = new-object System.Uri($sharepoint_app)
$ap.ProviderRealms.Add($uri, "urn:okta:BIGNUMBERHERE")
$ap.Update()

The Realm for your specific web application is the big “urn:okta:” string on line 3 above. You’ll need to copy and run their block of PowerShell for each additional web application you need to support.
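Running that block by hand for every web application gets repetitive. Here is a minimal sketch of the same idea as a loop, assuming you have copied each realm string from its own Okta app page into a lookup table. The function name, URLs and realm values are hypothetical placeholders; only the ProviderRealms and Update() calls come from Okta's sample.

```powershell
# Hypothetical helper: registers one realm per web application URL on a trusted token issuer.
function Add-OktaProviderRealm {
    param(
        $TokenIssuer,            # e.g. the result of Get-SPTrustedIdentityTokenIssuer "Okta"
        [hashtable]$RealmByUrl   # web application URL -> realm copied from that app's Okta page
    )
    foreach ($url in $RealmByUrl.Keys) {
        $uri = New-Object System.Uri($url)
        # Skip URLs that already have a realm so the script can be re-run safely
        if (-not $TokenIssuer.ProviderRealms.ContainsKey($uri)) {
            $TokenIssuer.ProviderRealms.Add($uri, $RealmByUrl[$url])
        }
    }
    $TokenIssuer.Update()
}

# Usage on a SharePoint server (placeholder URLs and realms):
# $ap = Get-SPTrustedIdentityTokenIssuer "Okta"
# Add-OktaProviderRealm -TokenIssuer $ap -RealmByUrl @{
#     "https://contoso.local"    = "urn:okta:REALM_FOR_CONTENT_APP"
#     "https://my.contoso.local" = "urn:okta:REALM_FOR_MYSITES_APP"
# }
```

Taking the token issuer as a parameter keeps the helper free of SharePoint calls, so you can try it against a stand-in object before pointing it at a real farm.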

#2: Users need to be migrated to their Okta equivalent

If you are converting an existing SharePoint farm with Active Directory users, you need to migrate them all to Okta users.

You can migrate a user from one identity provider to another with Move-SPUser. You pass it the current user and the target user name (e.g. DOMAIN\UserName). Because we’re dealing with claims, the current user name is likely to look like: i:0#.w|user@contoso.local, and the target would be something like: “i:0<UNICODE_CLAIM_TYPE_CHARACTER>.t|okta|user@contoso.local” (user@contoso.local is a placeholder here). The “.w” means Windows authentication mode, whereas the “.t” means trusted token issuer.

So, how do you find the correct value of <UNICODE_CLAIM_TYPE_CHARACTER>?

#3: Every SharePoint Farm has a different Claim Type Encoding character for Okta

SharePoint has a bunch of built-in claims that it knows of already. The built in claims encoding characters are listed on TechNet. User Principal Name, Email Address etc.  When you install an additional identity provider, such as Okta, SharePoint creates another entry for its principal claim and assigns it a unique character (unique to your farm). You can retrieve the character with the following PowerShell:

Get-SPClaimTypeEncoding | ? { $_.ClaimType -like '*Okta*' };

Putting this all together, you get:

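# Look up the farm-specific encoding character for the Okta claim type
$oktaClaimChar = (Get-SPClaimTypeEncoding | 
    ? { $_.ClaimType -like '*Okta*' }).EncodingCharacter;

# user@contoso.local is a placeholder; substitute the real login names
Get-SPUser -Web https://contoso.local `
   -Identity "i:0#.w|user@contoso.local" | 
   Move-SPUser -NewAlias ("i:0" + $oktaClaimChar +
      ".t|okta|user@contoso.local");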

Important safety tips:

  1. Migrating the same user twice is bad *
  2. Migrating service accounts is bad **

* …Really bad. SharePoint deletes all instances of the original user and replaces it with a new blank one that owns nothing. You lose all permissions information about the user.

** …you’ll kill the User Profile Service stone dead, for instance.
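Both safety tips can be enforced with a small check before anything is handed to Move-SPUser. This is a minimal sketch, not a complete safeguard: the function name and the service account list are hypothetical, and it assumes users who have not yet been migrated still carry the Windows claims prefix “i:0#.w|”.

```powershell
# Hypothetical guard: decide whether a claims login is safe to hand to Move-SPUser.
function Test-ShouldMigrateLogin {
    param(
        [string]$Login,                   # claims-encoded login, e.g. "i:0#.w|contoso\jane"
        [string[]]$ServiceAccounts = @()  # logins that must stay on Windows authentication
    )
    # Only Windows-claims users ("i:0#.w|") are candidates; anything else,
    # including users already moved to a trusted issuer (".t|"), is skipped.
    if (-not $Login.StartsWith("i:0#.w|", [System.StringComparison]::OrdinalIgnoreCase)) {
        return $false
    }
    # -notcontains compares strings case-insensitively in PowerShell
    return ($ServiceAccounts -notcontains $Login)
}

# Usage (placeholder account names):
# Test-ShouldMigrateLogin "i:0#.w|contoso\jane" -ServiceAccounts @("i:0#.w|contoso\svc_upa")
```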

#4: Groups need to be migrated to their Okta equivalent

The Okta engineer assigned to us didn’t seem to realise there might even be an issue with groups.

“After migrating the users, what should we do about groups?” I asked.

“Oh, I don’t think there should be a problem,” he said.

Obviously, wrong.

The next Okta engineer tried to explain that Okta only used groups to advertise your Okta apps to users, and we should replace the Active Directory groups with SharePoint groups.

“You can select the Okta groups that will be shown the SharePoint icon when they sign in. It works based on the first group you’re a member of. You need to add your Okta users to SharePoint groups.”

Also wrong.

You definitely, positively need to migrate your Active Directory groups in SharePoint to their Okta equivalent. However, there is no PowerShell cmdlet equivalent of Move-SPUser for groups. Instead you need to do some object model work:

$f = Get-SPFarm;
$f.MigrateGroup("contoso\group name", "c:0-.t|okta|group name");

To help, here’s how you can get a list of Windows groups to migrate from a specific site collection:

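$f = Get-SPFarm;
$s = Get-SPSite https://contoso.local;
# Windows (AD) groups show up as domain groups with a SID-based claims login
$s.RootWeb.SiteUsers | ? { 
    $_.IsDomainGroup -and $_.UserLogin.Contains("c:0+.w|s-1-5")
} | % { 
    $group = $_;
    $oldGroupName = $group.UserLogin;
    # Target claim: built-in 'role' claim, issued by Okta, lower-cased group name
    $newGroupName = "c:0-.t|okta|" + $group.DisplayName.ToLower();
    $f.MigrateGroup($oldGroupName, $newGroupName);
}; 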

By the way, in the “c:0-.t|okta|” prefix above, the dash character is the claim type encoding character for the built-in ‘role’ claim, and is not specific to Okta scenarios.

#5: Search configuration with claims Web Applications is poor

There is no satisfactory solution to this one.

SharePoint Search needs to crawl an NTLM zone. It cannot meaningfully crawl documents from a claims-only zone. If you configure Okta and NTLM on the same zone, SharePoint will prompt the user to select an authentication scheme when they sign in. If you have just migrated several thousand users to Okta, that would be a tremendous mistake. If a user chooses “Windows Account” by accident, it effectively creates them a separate user record within SharePoint that seems to have the same name as theirs but with zero permissions on anything.

SharePoint Search also needs the zone it crawls to be the one called “Default”. This is bad behaviour by design.

Why is this bad? Because when SharePoint doesn’t fully recognise the host header mapping used by the user (e.g. they type the machine name or IP address into the address bar instead of the fully qualified domain name) it pretends you accessed it via the “Default” zone. And it applies all the policy and URL-related shenanigans that come along with that decision.

Best practice in SharePoint architecture is to make the Default zone the most secure one – and the one that has the policies and authentication schemes you want people to default to. But any scenario with claims providers plus SharePoint Search turns that on its head.

The takeaway:

  • Make the Default Zone NTLM only.
  • Enable Okta on a different Zone.
  • Be consistent in those zones for all web applications you host in SharePoint.

For extra points, configure a User Policy at the web application level for each Default zone to deny read/write access to any normal Windows (NTLM) user.

#6: Okta app configuration documentation is poor

The “Group filter” setting tells Okta what group membership claims from AD should be passed to SharePoint. In other words, if I am a member of AD groups “JFDI Administrators”, “JFDI Portal Owners” and “Domain Users”, and I would like to be able to authorise access to a specific list or site based upon membership of these groups, I need to get the value of this filter right.

Some important points: It’s a ‘filter in’ rather than ‘filter out’. It’s a regular expression. So if you want *all* AD groups to be available to SharePoint, this needs to hold the value of .* and nothing else. Definitely not * or left blank. In my example above, if I only wanted to allow SharePoint to know if users are members of groups with a name starting with the letters “JFDI”, then my group filter value would need to be “JFDI.*”.
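If you want to preview what a filter value will let through before typing it into Okta, you can try it against a list of group names. This is a rough sketch under one stated assumption: that the filter has to match the whole group name, and is case-sensitive. Okta’s exact matching rules may differ, and the function name is hypothetical.

```powershell
# Hypothetical preview of which group names a "filter in" regular expression lets through.
function Select-GroupByFilter {
    param([string[]]$GroupName, [string]$Filter)
    # Assumption: the filter must match the whole group name, so anchor it.
    $pattern = "^(?:$Filter)$"
    $GroupName | Where-Object { $_ -cmatch $pattern }
}

$groups = "JFDI Administrators", "JFDI Portal Owners", "Domain Users"
Select-GroupByFilter -GroupName $groups -Filter "JFDI.*"   # the two JFDI groups
Select-GroupByFilter -GroupName $groups -Filter ".*"       # all three groups
```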

Okta Group Filter Dialog

#7: Okta is not compatible with SharePoint Publishing Site Collections

OK – as of March 2017 it is.

In November 2016 I uncovered and identified a blocking issue. I found a bug when trying to create a new site collection based upon a vendor’s site definition. Initially, Okta tried to blame the site definition developed by the vendor. Closer inspection of log files showed it was to do with activating the SharePoint Server Publishing Infrastructure feature.

Before we could persuade Okta to bother to recreate the issue themselves, I had to build a brand new farm with just Okta on it and show that any attempt to create a site collection that uses SharePoint publishing – e.g. the built-in Publishing Portal template – would fail, with a ULS entry pointing to Okta’s claim provider. At last they took the issue seriously.

We had to wait 4 months for that fix, but at least it does now work properly. Make sure you get the March 2017 update of their people picker solution.

#8: Migrate All Your SharePoint Web Applications

At the very least, you will have a web application for your content, and a web application for your My Site Host/OneDrive for Business sites. All your Windows web applications – i.e. all the ones which have users in the Active Directory domains you are migrating – and all their users – need to be migrated to Okta.

We wrote a script to iterate over:

  • All web applications
    • All site collections therein
      • All site users with a Windows account

At that innermost loop, we execute Move-SPUser. Doing it in this way ensures that no user is migrated twice; upon migration, a user is converted to Okta everywhere. This takes the user’s Windows account out of the site users collection of sites we haven’t iterated over yet.
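The shape of that script can be sketched as follows. This is not our production script: the function name is hypothetical, and both the mapping from Windows login to Okta login and the move step are passed in as script blocks, because they depend on your farm. The set of already-handled logins is an extra safeguard on top of the behaviour described above.

```powershell
# Hypothetical sketch of the nested loop; $GetNewAlias and $MoveUser are supplied by the caller.
function Invoke-OktaUserMigration {
    param(
        [object[]]$WebApplication,   # e.g. @(Get-SPWebApplication)
        [scriptblock]$GetNewAlias,   # maps a Windows claims login to its Okta claims login
        [scriptblock]$MoveUser       # performs the move, e.g. a wrapper around Move-SPUser
    )
    $done = New-Object 'System.Collections.Generic.HashSet[string]' ([System.StringComparer]::OrdinalIgnoreCase)
    foreach ($wa in $WebApplication) {
        foreach ($site in $wa.Sites) {
            # Snapshot the collection: migrating a user changes SiteUsers underneath us
            $users = @($site.RootWeb.SiteUsers | Where-Object {
                -not $_.IsDomainGroup -and $_.UserLogin -like "i:0#.w|*"
            })
            foreach ($user in $users) {
                # Belt and braces: never hand the same login to the move step twice
                if ($done.Add($user.UserLogin)) {
                    & $MoveUser $user (& $GetNewAlias $user.UserLogin)
                }
            }
        }
    }
}

# Usage on a SharePoint server (the alias mapping is yours to supply):
# Invoke-OktaUserMigration -WebApplication @(Get-SPWebApplication) `
#     -GetNewAlias { param($login) <# return the Okta claims login for $login #> } `
#     -MoveUser    { param($user, $alias) $user | Move-SPUser -NewAlias $alias }
```

Run it against a copy of the farm first, with a move step that only logs what it would do.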

#9: Test, Test and Test Again

I can’t stress this enough. You are about to risk serious harm to the ownership and permissions of everybody and every group across your entire farm, and all of the content that includes – sites, lists, libraries, folders and files. Everything.

You owe it to yourself to get this right:

  • Build a proof of concept environment and test the process.
  • Test all kinds of web application, including My Sites if you have them.
  • Build a carbon copy of live, and test with that.
  • Take full farm backups. Test your backups. No test equals no backup.
  • Seriously test you can restore from your backup. If you do not test your backup, you do not have a backup. You have some carbon on a disk. You have two hopes, Bob Hope and no hope… etc, etc.
  • Did I mention backups yet?

#10: Tidy-up Activities

There will probably be activities to do after the migration. Very soon after.

Here are some things that will break. Try and identify if you will be affected by any of these before you migrate.

RSS Web Parts with SharePoint Lists as data sources

You will have to replace these with something else. Maybe Content Query or Content Search web parts. Or JavaScript and REST or JSOM.

Secure Store Applications

You will see a lot of Access Denied errors. Move-SPUser does not seem to alter user information in the Secure Store. Individual and Group application memberships will need to be updated / replaced with their Okta user and group equivalents. The unattended execution (data access) accounts of Excel Services, Visio and PerformancePoint will be affected.

Reporting Services

Using SharePoint lists as OData sources? You will need to establish a Windows account to read that data from the zone you configured for NTLM.

User Profile pictures

If you’re displaying User Profile photos in search results from any web application other than the My Sites one, users will see broken images until they sign in to the My Sites domain. To avoid this you can enable cross-domain user profile pictures on each web application:

$wa = Get-SPWebApplication "https://contoso.local";
$wa.CrossDomainPhotosEnabled = $true;
$wa.Update();

If you’re unlucky enough to use a 3rd party web part to display profile images, it might not know how to honour the above setting. In that case, you’ll need to bust open your JavaScript skills. You can use JavaScript to latch on to the OnError event of the offending IMG element, and reload with the local version of the user’s photo URL: /_layouts/15/userphoto.aspx?size=M&url=<ORIGINAL_PHOTO_URL>

Here’s an example using jQuery:

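// .one() runs the handler at most once per image, so a failing fallback cannot loop
$('#myContainer > img').one('error', function() {
    // Use the image that actually failed, not the first match of the selector
    var url = "/_layouts/15/userphoto.aspx?size=M&url=" +
        encodeURIComponent($(this).attr('src'));
    $(this).attr('src', url);
});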

Conclusion

If this seems all too much like hard work, that’s because it is. Retrofitting Okta to SharePoint on-premise is full of pitfalls.

Get in touch if you’d like help to plan or perform your Okta and SharePoint integration and avoid a potential disaster.

SharePoint Performance Tuning Emergency

Hello. I’m Dr Kinley. In this chapter of Dr Kinley’s Facebook, we look at the Case of the SharePoint Performance Tuning Emergency. Last week, a new customer got in touch. Their business users were screaming: SharePoint had slowed to a crawl. Their SharePoint server farm, of three servers, used to perform well, but had recently started taking a long time to load pages, frequently returning time-outs and other errors. We managed to get them back on track in a matter of hours.

SharePoint Performance Tuning Triage

The first question we ask under these circumstances is this: what’s changed? As it turned out, there were quite a few recent changes to their environment.

Firstly, they had recently – in the last few months – moved from hosting the farm on VMware to Hyper-V. That’s the virtual equivalent of changing datacentres.

Secondly, they had very recently removed their Search Service Application in an attempt to resolve other errors.

But how would these changes suddenly require SharePoint performance tuning?

Diagnosis: Internet Connectivity

After asking the old, usual chestnuts, “What customisations have you made? What custom software has been deployed?”, it was time for the dog to see the rabbit. We started a screen sharing session using the customer’s favoured sharing app.

Pages were definitely taking a long time to load. 30-60 seconds, usually accompanied by a time-out and an error.

A very cursory glance at the Windows NLB solution they had in place quickly eliminated that from the pool of likely candidates.

Next, we checked Internet connectivity on the servers. Could we launch a browser from the SharePoint servers and reach Google? No, we couldn’t. Anecdotally, though, the servers had this capability when they were hosted on VMware. This is not a limitation of Hyper-V, of course; it was something omitted during the move from one environment to another.

Without investment in SharePoint performance tuning, you will get poor page load times if your servers have no Internet connectivity.

Certificate Revocation List Hell

The problem with not having Internet connectivity on SharePoint servers is twofold, and related to digital certificates.

Firstly, SharePoint uses .NET assemblies (DLL files) signed by Microsoft for security. When an assembly is loaded – pretty much whenever SharePoint does anything at all; load pages, run timer jobs, anything – .NET needs to check that the certificate used to sign the assembly is still valid. To do this, it reaches out across the Internet to crl.microsoft.com, and requests a list of digital certificates that have been revoked by Microsoft. Assuming success, it then checks this list and if the assembly’s certificate is in this list, it will prevent it from loading and give a security error. But, I hear you cry, what happens if there is no Internet connection? Surely it won’t prevent SharePoint from running? What happens is this: it tries to reach crl.microsoft.com; it fails after a 15-30 second time-out; it then shrugs its shoulders and says “what the hell”, and continues to load the assembly anyway.

This problem becomes obvious if you try and run the old STSADM tool from the command line. If it returns within a second or so, you’re good to go. If it waits 15 or so seconds, then you know you have a CRL problem.
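If you would rather measure than count under your breath, you can time the command. This is a small sketch: the function name is hypothetical, and the 10-second threshold is simply an assumption chosen to sit below the 15-second time-out described above.

```powershell
# Hypothetical check: time a command and flag it if it looks like a CRL time-out.
function Test-SlowStartup {
    param(
        [scriptblock]$Command,
        [double]$ThresholdSeconds = 10   # assumption: anything near the 15-second time-out is suspect
    )
    $seconds = (Measure-Command $Command).TotalSeconds
    New-Object PSObject -Property @{
        Seconds           = [math]::Round($seconds, 1)
        LooksLikeCrlDelay = ($seconds -ge $ThresholdSeconds)
    }
}

# Usage on a SharePoint 2013 server (the "15" folder differs on other versions):
# $stsadm = Join-Path $env:CommonProgramFiles "microsoft shared\Web Server Extensions\15\BIN\stsadm.exe"
# Test-SlowStartup { & $stsadm -help | Out-Null }
```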

Secondly, SharePoint uses SSL certificates to encrypt all traffic between servers in the SharePoint farm, for instance when Server A needs to call a service on Server B – perhaps the Search Service, Managed Metadata Service etc. The certificate used for all the servers in the farm is called the SharePoint Root certificate. A default installation of SharePoint does not trust the issuer of this cert, therefore every time it is used, the server needs to confirm the certificate is still valid by reaching out to crl.microsoft.com again. If it gets no response, after a 15 second time-out, it will just error, and abort whatever it was trying to invoke on the other server.

Both these issues are resolvable. To make the problem go away, the servers can be given outbound Internet connectivity. However, not all security teams will sanction this, so it’s not a one-size-fits-all solution. There are also parts of SharePoint that insist on asking the server for a web/HTTP proxy before continuing. Commands like “netsh winhttp set proxy my.proxy.server:portnumber” will help with this, although your security team may wish to limit the Internet access of the SharePoint farm to crl.microsoft.com and nothing else.

But we still need to tell .NET and SharePoint not to try certificate revocation list checking. After all, if there is no Internet access, you can’t perform the check anyway. We can disable some CRL checks outright, and persuade Windows that the SharePoint root certificate is to be trusted without a CRL check by adding it to the Trusted Root Certificate Authorities store in Windows.
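The trusted-root part can be sketched in PowerShell. Treat this as an illustration rather than a tested procedure: the function name is hypothetical, it needs an elevated prompt, and it would have to be run on every server in the farm. It copies only the public part of the certificate into the local machine’s Trusted Root Certification Authorities store.

```powershell
# Hypothetical helper: copy the public part of a certificate into LocalMachine\Root.
function Add-CertificateToTrustedRoot {
    param([System.Security.Cryptography.X509Certificates.X509Certificate2]$Certificate)
    # Re-create the certificate from its exported bytes so no private key tags along
    $publicOnly = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 -ArgumentList (,$Certificate.Export("Cert"))
    $store = New-Object System.Security.Cryptography.X509Certificates.X509Store -ArgumentList "Root", "LocalMachine"
    $store.Open([System.Security.Cryptography.X509Certificates.OpenFlags]::ReadWrite)
    try { $store.Add($publicOnly) } finally { $store.Close() }
}

# Usage, from an elevated SharePoint Management Shell on each server in the farm:
# Add-CertificateToTrustedRoot -Certificate (Get-SPCertificateAuthority).RootCertificate
```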

However, it’s never easy, and there’s no single master switch. Have a look at the references below to help you resolve them yourself.

Further SharePoint Performance Tuning

Having applied these various fixes, the server farm was back to its old self. In fact, faster for some pages. However, there were still pages that seemed to take a very long time to return results, every so often.

There were two further problems:

Data Volume Tipping Points

Their data seemed to have reached a tipping point. Without any SharePoint performance tuning, you will notice poor page load times.

They make heavy use of roll-up queries, and a mix of out-of-the-box and third party web parts that perform recursive scoped queries throughout their site collection. It wasn’t that their data was big in terms of gigabytes, just big in terms of numbers of rows. They had one site collection for most of their data, and it was less than 100GB. Their use of roll-up web parts, however, yielded queries that trawl through thousands of list items and SQL rows. Under default configuration of SharePoint that would trigger query throttling, and errors would be spotted much sooner. Close inspection through Central Admin showed that someone had previously increased the throttling limit from 5,000 list items up to 50,000. This was sufficiently high to effectively switch off throttling altogether.

The problem with this is down to the fundamental SQL table design used by SharePoint. If multiple site collections are in the same content database, all the list items and library items, in all the sites and subsites, in all the site collections – in effect everything – is stored in one, big, wide table called “AllUserData”. Any SQL query that attempts to lock more than 5,000 SQL rows – even for reading – will trigger an escalation to a whole table lock. When that happens, everybody loses. Every query on anything in that table (list/library items, lists/libraries, sites, everything) now has to wait for the query with the table lock to complete.

Setting the threshold back to 5,000 items, and (temporarily, until an alternative strategy can be used) disabling the third party web parts returned stability to the farm. In the medium term, the customer needs to visit each list used in these queries and turn on suitable column indexing for each one. In the long term, consider moving to a search-based strategy.
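The threshold can be checked and reset from PowerShell as well as through Central Admin. A minimal sketch, assuming the List View Threshold is the setting that was raised; the function name is hypothetical, while MaxItemsPerThrottledOperation is the web application property behind that setting.

```powershell
# Hypothetical helper: put the list view threshold back to a given value on each web application.
function Reset-ListViewThreshold {
    param(
        [object[]]$WebApplication,   # e.g. @(Get-SPWebApplication)
        [int]$Threshold = 5000
    )
    foreach ($wa in $WebApplication) {
        # Only touch (and save) web applications whose value actually differs
        if ($wa.MaxItemsPerThrottledOperation -ne $Threshold) {
            $wa.MaxItemsPerThrottledOperation = $Threshold
            $wa.Update()
        }
    }
}

# Usage on a SharePoint server:
# Get-SPWebApplication | Select-Object Url, MaxItemsPerThrottledOperation   # inspect first
# Reset-ListViewThreshold -WebApplication @(Get-SPWebApplication)
```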

Rogue Antivirus Misconfiguration: Check Your Exclusions!

Does antivirus configuration on non-SharePoint servers count as SharePoint performance tuning?

Well, the customer spotted and resolved this problem themselves: they had installed an antivirus program on their database server. Although MDF and LDF (database and log) files had been explicitly excluded from scanning, there is a third kind of database file – NDF (“secondary data files”) – used in SharePoint databases such as the Web Analytics and Logging databases. These files are usually smaller than the rest of your SharePoint content, but when on-access scanning kicks in, it effectively locks out everything else (including SQL Server itself) for the duration of the scan… which was around 30 seconds. You can resolve this by adding NDF files to the exclusion list in your antivirus program. They resolved this by uninstalling the antivirus on their database servers.

Conclusion and Wrap Up

Once their servers were performing again, we were asked to perform a full health check on their various SharePoint farms. We usually take a couple of days for small farms, and up to a week for larger environments. We run tooling to capture the configuration of your farms and bring those back to base for further scrutiny. We then write a report to document all the key settings and highlight areas of concern, and suggest appropriate changes to meet best practices.

SharePoint Performance Tuning References

CRL issues:

http://support.microsoft.com/kb/2625048
http://blogs.msdn.com/b/chaun/archive/2014/05/01/best-practices-for-crl-checking-on-sharepoint-servers.aspx
http://joelblogs.co.uk/2011/09/20/certificate-revocation-list-check-and-sharepoint-2010-without-an-internet-connection

Table lock issues:

http://joelblogs.co.uk/2013/02/15/sharepoint-2013-content-databases-and-the-alluserdata-table
http://msdn.microsoft.com/en-us/library/hh625524.aspx

Antivirus issues:

http://support.microsoft.com/kb/309422

Office 365 Support and SharePoint Online Too!

Office 365 Support

Planning a migration to the cloud? Or perhaps a hybrid solution, mixing on-premise Exchange, SharePoint or Lync with Office 365? Contact us for live Office 365 support.

SharePoint Online

The SharePoint Doctors don’t just know about on-premise SharePoint, running on your own servers in your own company… we know hosted SharePoint too! So if you’ve already made the move into the cloud and you’re completely stuck with the shape of your SharePoint, we can help you make the very best of SharePoint Online on Office 365 too.

Live SharePoint Support

Live, online SharePoint support, consultancy and training, available immediately whenever you need it from the SharePoint Doctors.

The SharePoint Support Surgery is Open

SharePoint Support

Need SharePoint Support? Although it can be embarrassing, we’ll all experience some degree of dysfunction in the SharePoint department at some point in our careers. So don’t be shy.

Live SharePoint Help

Our SharePoint Clinic is open for live SharePoint support. That’s why the SharePoint Doctors are here, ready to help you solve any real-world SharePoint problems. Confidently leading you through the minefield of selecting which SharePoint technology to use, and always administering the appropriate course of treatment no matter what manner of SharePoint issue ails you.

Help Me Now!

If you need the help of a SharePoint Doctor, don’t delay and book now using the link below.

Book a consultation NOW!