When retrieving all users, what happens if new users are provisioned?

I want to retrieve all the users in my org, which will take a long time. During that time, new users will be automatically provisioned unless I turn that functionality off. Does the List Users pagination sort by the timestamp when the user was added? Do I need to worry about new users disrupting pagination during my retrieval, and should I disable user provisioning while the retrieval is running?

I should note that I don’t necessarily care about picking up new users. I just don’t want to miss existing users because new users were inserted into the middle of the page I was on and bumped older users off it, causing me to skip them.

Hello,

It should not mess up pagination as far as I know. Pagination is based off of id.

You can also add a filter, such as a created date earlier than the current time, so that newly added users are not included.
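
As a minimal sketch of that suggestion, the query parameters could be built like this. It assumes the `search` parameter accepts a `created lt "<timestamp>"` expression with an ISO 8601 UTC timestamp; the helper name `created_before_params` is hypothetical and not part of the Okta SDK.

```python
from datetime import datetime, timezone


def created_before_params(cutoff: datetime) -> dict:
    """Build query params that match only users created before `cutoff`.

    Hypothetical helper: assumes Okta's `search` parameter supports
    `created lt "<ISO 8601 UTC timestamp>"`.
    """
    # Convert to UTC and format with the milliseconds fixed at .000.
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {"search": f'created lt "{stamp}"'}


# Compute the cutoff once, before the first request is sent.
qp = created_before_params(datetime.now(timezone.utc))
```

The resulting `qp` dict would then be passed as the query parameters of the list users call, so users provisioned after the cutoff never match the search.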

Are Okta IDs ordered lexicographically? Will new users get an ID that sorts lexicographically higher? If not, could you explain further whether pagination maintains an order from the earliest user added to Okta to the latest? I will work on testing this as well.

ETA:
Here’s my code. I’m new to Python, so excuse any peculiarities :sweat_smile:

from okta.client import Client as OktaClient
import asyncio
import os
from dotenv import load_dotenv
load_dotenv()

config = {
    'orgUrl': os.getenv("OKTA_ORG_URL_PREP"),
    'token': os.getenv("OKTA_SSWS_TOKEN_PREP")
}

client = OktaClient(config)

async def main():
    try:
        # start from an empty file
        with open("output.txt", "w") as f:
            f.truncate(0)
        users, resp, err = await client.list_users()
        if err is not None:
            raise err
    
        with open("output.txt", "a") as f:
            while True:
                for user in users:
                    f.write(f"{user.id}\n") # Add more properties here.
                if resp.has_next():
                    users, err = await resp.next()
                    # Check each later page for errors too, not only the first one.
                    if err is not None:
                        raise err
                else:
                    break
    except Exception as e:
        print(f"An exception occurred: {e}")

# asyncio.run creates the event loop, runs main() and closes the loop afterwards.
asyncio.run(main())

I ran it once and got a list of customers (28 in my test org).
I noticed that the IDs returned by the script above come back in the same order each time I run it, but that order is not lexicographic.
Then I added another customer and ran the script again. The new customer appeared second to last rather than last, which changed the previous order. Next I will try adding a filter to exclude new customers, as that may be my only solution.

I did some further testing and it looks like the default sorting is first by status and then by id. My last user is LOCKED_OUT, so the new user got added to the end of the “ACTIVE” users and just before the LOCKED_OUT user. I ran it with these query params with the outcome I was looking for:

    qp = {
        "search": "id pr",
        "sortBy": "id",
    }

I found that sortBy does not work unless you also have a search param. The simplest search I could think of that gets every user is “id pr” which just checks that the id value is set, which happens to be the value I am looking to extract from each user.
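
For anyone who wants to reuse this, here is a rough sketch of the pagination loop as a function that takes those query params. It assumes `client.list_users` accepts the query-parameter dict as its first argument and that the response object exposes `has_next()` and `next()` as in my script above; `collect_user_ids` and `QP` are names I made up for illustration.

```python
import asyncio


async def collect_user_ids(client, query_params):
    """Walk every page of list_users and return the user ids in page order.

    `client` is assumed to behave like the Okta SDK client used above:
    list_users returns (users, resp, err) and resp.next() returns (users, err).
    """
    users, resp, err = await client.list_users(query_params)
    if err is not None:
        raise RuntimeError(f"list_users failed: {err}")

    ids = []
    while True:
        ids.extend(user.id for user in users)
        if not resp.has_next():
            return ids
        users, err = await resp.next()
        if err is not None:
            raise RuntimeError(f"fetching the next page failed: {err}")


# Match every user (id is always set) and sort by id.
QP = {"search": "id pr", "sortBy": "id"}

# Usage, with `client` being the OktaClient from the script above:
# ids = asyncio.run(collect_user_ids(client, QP))
```

With these params, the new user I added showed up at the position given by its id rather than being grouped by status first.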
