Webhook POST fails but endpoint was successfully verified

I have implemented an endpoint to receive Okta webhooks, but I’ve run into a problem that looks like a bug in the Okta platform (at least I can’t think of anything else I can try at this end).

The endpoint is slightly unusual as it utilises a non-standard port (4433) for testing, so the URL looks something like https://server.company.com:4433/hook/

The weird thing is that the GET request made to verify the hook was received and the challenge was answered, so the hook shows as verified in the webhook manager.

The Okta system log simply shows that the event delivery failed (after 2 retries), but the server never receives the POST request. My handler for the POST requests simply logs the raw request and returns a 204 HTTP status within a few ms when I test it with Postman.

On that point, I can POST to my endpoint using Postman with the same Accept and Content-Type headers that the Okta docs say the request will use, so it looks like the actual request is failing on Okta's side rather than being blocked by something within my infrastructure.
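For anyone reproducing this setup, here is a minimal sketch of the two handlers described above, written in Python rather than the IIS stack I actually use. It assumes the verification request carries an `X-Okta-Verification-Challenge` header and expects a JSON body of the form `{"verification": "<value>"}`, which is how I understand the Okta event hook docs; the class name `HookHandler` is made up for illustration, and TLS termination is left out entirely.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HookHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the endpoint described in this post."""

    def do_GET(self):
        # One-time verification: echo the challenge back as JSON.
        # Header name and response shape are assumptions based on the Okta docs.
        challenge = self.headers.get("X-Okta-Verification-Challenge")
        if challenge is None:
            self.send_error(400, "missing verification challenge")
            return
        body = json.dumps({"verification": challenge}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Event delivery: log the raw request body and reply 204 No Content.
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        self.log_message("event body: %s", raw.decode("utf-8", "replace"))
        self.send_response(204)
        self.end_headers()
```

To try it locally, something like `HTTPServer(("127.0.0.1", 8080), HookHandler).serve_forever()` serves it over plain HTTP; a real endpoint would sit behind HTTPS.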

Neil, can you file a support case so someone can investigate this issue further?

Hi @NeilP

Can you please check that the SSL certificate used is valid and has a correct chain added? This can be checked through openssl s_client -connect server.company.com:4433.

If the certificate chain is incorrect, Okta cannot complete the SSL handshake to post the details.

Hi @dragos - Thanks for the tip. I checked and the openssl client can connect and it does retrieve a full cert chain, and ultimately a CONNECTED (00000003) status. If this were the issue, I’d be surprised as I would have presumed the GET would need a valid SSL chain as well. Anyway, that doesn’t look like the problem I’m seeing, but it was worth ruling out!

@andrea I’ll open a support ticket now, thanks :slight_smile:

So, just feeding this back in case anyone else has the same issue. I have been working with Okta support and after going through various debugging steps, I stumbled on the issue which is not as clear-cut as you’d think. I managed to get it working on my home dev PC running a Let’s Encrypt SSL certificate which gave me a clue as to what was going on.

tl;dr - if your SSL certificate yields multiple certification paths, you must ensure that ALL of the intermediary certificates are served by your end-point web server.

The issue stems from how some CAs issue certificates with multiple roots to preserve backwards compatibility and support older web clients. Older clients typically ship with a different set of trusted CA roots than modern clients do.
If you use a modern client (such as Windows 10, or a recent distro of Linux with curl), you will likely not see any problems connecting to your end-point, like I found with my tests. However (and, granted, this is a presumption on my part), it would seem that the Okta webhook client is using an outdated CA bundle, which means it relies on the server it is calling supplying all of the intermediary CA certificates.
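
One way to approximate what such a client sees is to attempt the handshake with a trust store that contains only a single root certificate. The sketch below does that with Python's standard `ssl` module; the function names and the idea of pointing it at one exported root PEM file are my own assumptions for illustration, not anything Okta documents. As far as I know Python's `ssl` does not download missing intermediates on its own, so the handshake only succeeds if the server sends a complete chain up to that root.

```python
import socket
import ssl


def single_root_context(root_pem_path):
    """Build a client context that trusts ONLY the given root (hypothetical helper)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # starts with an empty trust store
    ctx.load_verify_locations(cafile=root_pem_path)
    return ctx


def try_handshake(host, port, ctx, timeout=10.0):
    """Return (True, tls_version) on success, or (False, reason) on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as exc:
        # Chain could not be built to a trusted root (e.g. missing intermediate).
        return False, exc.verify_message
    except OSError as exc:
        # Connection refused, timeout, or some other TLS failure.
        return False, str(exc)
```

For example, `try_handshake("server.company.com", 4433, single_root_context("old-root.pem"))`, where `old-root.pem` is a placeholder for whichever older root you want to test against. A failure here while a normal browser connects fine would point at a missing intermediate.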

In my case, I was serving the bundled intermediaries that my CA gave me when the original certificate was issued. However, the CA seems to assume that the old chains are no longer needed, so by default (at least with COMODO) the bundle doesn’t include the intermediaries for the older certification paths.

In order to fix it, I used the excellent SSL scanning tools by Qualys (https://www.ssllabs.com/ssltest/) to see all of the certification paths that could be tried, and identified the missing CA intermediaries. After I downloaded and installed them on my Windows server, IIS automatically started serving them. If you’re using Apache, you will likely have to register the full certificate bundle in its config.
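
If you maintain the bundle as a PEM file, a small helper like the one below can list what is actually in it, so you can compare against the SHA-256 fingerprints that the SSL Labs report shows for each certificate. This is only a sketch using Python's standard library; the function name `bundle_fingerprints` is made up, and it does not check ordering or validity, only which certificates are present.

```python
import hashlib
import re
import ssl

# Matches each PEM-encoded certificate block inside a bundle.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def bundle_fingerprints(pem_text):
    """Return the SHA-256 fingerprint (hex) of every certificate in a PEM bundle."""
    fingerprints = []
    for block in PEM_BLOCK.findall(pem_text):
        der = ssl.PEM_cert_to_DER_cert(block)  # PEM text -> raw DER bytes
        fingerprints.append(hashlib.sha256(der).hexdigest())
    return fingerprints
```

Reading the bundle with `open("fullchain.pem").read()` (the file name is a placeholder) and passing the text in gives one fingerprint per certificate; if an intermediate listed by SSL Labs is absent from that list, the server is probably not sending it.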

My only gripe is that this is caused (as far as I can tell) by an “older” client that is owned by Okta. Their diagnostic information is clearly lacking in this regard as the UI only shows as a timeout. Any TLS errors are hidden even from the back-end as the support guy didn’t have any more detail to give me when he investigated.

If nothing else, and presuming there’s no improvement to the webhook client forthcoming, I would have loved to see flashy neon lights on the Okta documentation stating the SSL / TLS requirements (such as that the server must accept TLS 1.2, for example), and a warning that some CAs will issue certificates that are not immediately compatible with the webhook system.

I also wonder if this same issue will be present in the SCIM client (something I’m about to embark on).

Thanks for your detailed answer. I will summarize in technical terms what I understood.

Let’s Encrypt supports more than one certificate chain. I was using only ISRG Root X1 and I had this issue, because Okta does not include this root certificate in its trust list for the webhooks.

By adding the DST Root CA X3 chain, which uses an old root certificate, it works. See "Production Chain Changes" (API Announcements, Let's Encrypt Community Support).

How you actually deploy both chains depends on your server. However, when checking the certificate, make sure that there is at least one chain ending in DST Root CA X3. For Okta, you don’t actually need both chains, only the DST Root CA X3 one, as of September 28th 2021.

For future travelers, the DST Root CA X3 is going to expire as of September 29th 2021.