EOS Online Subsystem: The roadmap planned for 2024

Outlining what we're working on for EOS Online Subsystem for the remainder of 2023 and first half of 2024.
June Rhodes posted on Nov 17, 2023

I don’t usually write up big “here’s what we’re planning on in the future” posts like this. In fact, it’s probably well known by now that I don’t provide ETAs on features or bug fixes because the roadmap for plugins changes frequently.

However, at this point I’ve got a fairly good idea of what the remainder of 2023 and the first half of 2024 is going to look like in terms of EOS Online Subsystem development. It means it will take a little longer to get to the next big feature release and a little longer to address support requests while I do important work on the plugin internals, so I wanted to go through what is being done so you know how this benefits you in the future.

This is a very big post though, so for those of you that want the tl;dr, here’s the summary:

  • Foundational work so that we can support Online Services (OSSv2) in the future
  • A more modern low-level API for accessing the EOS SDK directly
  • An overhaul of sessions and lobbies that will add support for bUseLobbiesIfAvailable and address long-standing bugs and feature requests
  • Improved native platform integration for invites, including on console platforms
On a sidenote...
This is all subject to change, both in design and timeline, but I’m pretty confident that this is what the roadmap looks like at this point.

Where are we right now?

Before we get into what is being worked on, let’s take a look at where EOS Online Subsystem is today.

As of right now, the Marketplace Edition of EOS Online Subsystem is used by over 1,200 Unreal Engine teams worldwide, with an additional 2,000 students and low-income developers leveraging the Free Edition in their projects. That’s a pretty big user base! It means if there’s a bug or an edge case, it usually gets reported to me and goes on the issue tracker (login required).

For some of these bugs, particularly around sessions and lobbies, the underlying code is quite old, dating back to 2020 when Unreal Engine 4.25 was still the latest release. Since then, there have been some big changes both on the Unreal Engine side and on our side:

  • OSSv2, also known as Online Services, was introduced in Unreal Engine 4.27 as a more modern replacement of the Online Subsystem APIs. You will have seen usage of OSSv2 if you’ve opened the Lyra project. EOS Online Subsystem needs to be able to support OSSv2 in the future, and right now the architecture of the plugin code can’t do that.
  • bUseLobbiesIfAvailable on the sessions interface, also added in Unreal Engine 4.27. We still don’t support this because the implementation of IOnlineSession is strongly tied to the session APIs in the EOS SDK, and can’t easily support the lobby APIs.
  • There’s a non-zero number of bugs and feature requests that we can’t easily deal with related to sessions, lobbies and parties, because the changes would be too invasive or unmaintainable in the current architecture.

The pressure from these items has been increasing over time, and with the release of EOS SDK 1.16 regressing the ability to view lobby members in search results (impacting the Matchmaking plugin), it is now at a point where these issues need to be dealt with properly. It means I need to spend a bunch of development time refactoring the internals of the plugin.

I know how important backwards and forwards compatibility is to game developers, so I want to stress: there are no breaking changes here. There might be a few minor migration steps in upcoming plugin releases where you need to update some config/INI files, but nothing that requires you to change your game code. This refactoring effort is purely so that I can address long-standing bugs and feature requests from customers.

Supporting Online Services (OSSv2)

Online Services (OSSv2) is the new online API surface introduced in 4.27, and it cleans up a lot of the poor API design from the Online Subsystem (OSSv1) API.

For example, to create a session in OSSv1, you need to register a global event handler, call the CreateSession method, and then unregister the global event handler when you’re done. This API pattern sucks, and it makes tracking state from “I want to create a session” to “the session has been created” far more clunky than it needs to be:

this->CreateSessionDelegateHandle =
    Session->AddOnCreateSessionCompleteDelegate_Handle(FOnCreateSessionComplete::FDelegate::CreateUObject(
        this,
        &UMyClass::HandleCreateSessionComplete));

TSharedRef<FOnlineSessionSettings> SessionSettings = MakeShared<FOnlineSessionSettings>();
SessionSettings->NumPublicConnections = 4;
SessionSettings->bShouldAdvertise = true;
SessionSettings->bUsesPresence = false;
SessionSettings->Settings.Add(
    FName(TEXT("SessionSetting")),
    FOnlineSessionSetting(FString(TEXT("SettingValue")), EOnlineDataAdvertisementType::ViaOnlineService));

if (!Session->CreateSession(0, FName(TEXT("MyLocalSessionName")), *SessionSettings))
{
    // Call didn't start, return error.
}

// --- Elsewhere in your code ---

void UMyClass::HandleCreateSessionComplete(
    FName SessionName,
    bool bWasSuccessful)
{
    IOnlineSubsystem *Subsystem = Online::GetSubsystem(this->GetWorld());
    IOnlineSessionPtr Session = Subsystem->GetSessionInterface();
    Session->ClearOnCreateSessionCompleteDelegate_Handle(this->CreateSessionDelegateHandle);
    this->CreateSessionDelegateHandle.Reset();
}

Compare this with the OSSv2 API for creating a session:

FCreateSession::Params Params;
Params.LocalAccountId = this->GetWorld()->GetFirstLocalPlayerFromController()->GetPreferredUniqueNetId().GetV2();
Params.SessionName = "MyLocalSessionName";
Params.bPresenceEnabled = false;
Params.SessionSettings.NumMaxConnections = 4;
Params.SessionSettings.JoinPolicy = ESessionJoinPolicy::Public;
Params.SessionSettings.CustomSettings.Add(
    "SessionSetting",
    FCustomSessionSetting{"SettingValue", ESchemaAttributeVisibility::Public});
Sessions->CreateSession(MoveTemp(Params))
    .OnComplete(this, [this](const TOnlineResult<FCreateSession> &CreateResult) {
        // Check CreateResult.IsOk().
    });

By every standard, this is a better API. There’s no global event to register or unregister, passing state is as simple as using lambda captures, and you can perform multiple operations in parallel. Every operation in OSSv2 is like this; not only sessions, but leaderboards, achievements, and so on and so forth.

So how do we support OSSv2? Well, one way would be to implement OnlineServicesRedpointEOS with brand new code, and have it completely separate to the existing OnlineSubsystemRedpointEOS for OSSv1.

This causes a problem however: you’d be able to use one or the other, but they’d fundamentally have different states. Perhaps this is manageable in game code when starting development on a new game, but it is impractical if you are trying to gradually migrate an existing game or if you have multiple plugins each accessing a different API (e.g. one plugin uses OSSv1 and another uses OSSv2).

What we actually need to do is store all the relevant state for EOS in an independent module that our OSSv1 and OSSv2 implementations can both use as a source of truth. That way, regardless of which API you’re accessing, you’ll see the same state: create a session in OSSv1, and you’ll see the session present when using OSSv2. It means turning the current architecture which looks like this:

graph TD
  GameCode[Game code]
  OSSv1[OSSv1 Interfaces]
  EOSSDK["EOS SDK"]
  subgraph EOSOnlineSubsystem["EOS Online Subsystem"]
  subgraph OnlineSubsystemRedpointEOS["OnlineSubsystemRedpointEOS Module"]
  OnlineSessionInterfaceEOS
  end
  end
  GameCode --> OSSv1
  OSSv1 --> EOSOnlineSubsystem
  OnlineSessionInterfaceEOS --> EOSSDK

Into this:

graph TD
  PluginACode[Plugin A code]
  GameCode[Game code]
  PluginBCode[Plugin B code]
  OSSv1[OSSv1 Interfaces]
  OSSv2[OSSv2 Interfaces]
  EOSSDK["EOS SDK"]
  subgraph EOSOnlineSubsystem["EOS Online Subsystem"]
  RedpointEOSCore["RedpointEOSCore Module"]
  subgraph OnlineSubsystemRedpointEOS["OnlineSubsystemRedpointEOS Module"]
  OnlineSessionInterfaceEOS
  end
  subgraph OnlineServicesRedpointEOS["OnlineServicesRedpointEOS Module"]
  SessionsRedpointEOS
  end
  end
  PluginACode --> OSSv1
  GameCode --> OSSv1
  GameCode --> OSSv2
  PluginBCode --> OSSv2
  OSSv1 --> OnlineSubsystemRedpointEOS
  OSSv2 --> OnlineServicesRedpointEOS
  OnlineSessionInterfaceEOS --> RedpointEOSCore
  SessionsRedpointEOS --> RedpointEOSCore
  RedpointEOSCore --> EOSSDK

This means we’ll track things like the current login state and cached data inside a “RedpointEOSCore” module. This module also manages having different EOS SDK platform instances for different play-in-editor windows and ticks the EOS SDK platform as needed. Since we control the API surface of “RedpointEOSCore”, we’re no longer beholden to changes Epic makes in the OSSv1 or OSSv2 interfaces; we can simply update the wrapping code that implements OSSv1 and OSSv2 without having to refactor the plugin code in the future.
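To make the "single source of truth" idea concrete, here is a minimal sketch in plain C++ (no Unreal or EOS SDK types). Every name in it (FCoreSessionStore, FSessionFacadeV1, FSessionFacadeV2) is hypothetical and only illustrates the shape of the design, not the actual RedpointEOSCore API: two thin facades share one core store, so a session created through the v1-style facade is visible through the v2-style one.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical core store: owns the state and knows nothing about the
// OSSv1 or OSSv2 types that sit on top of it.
class FCoreSessionStore
{
public:
    void Create(const std::string &Name, int MaxPlayers)
    {
        MaxPlayersByName[Name] = MaxPlayers;
    }

    // Returns true and fills OutMaxPlayers if the session exists.
    bool TryGetMaxPlayers(const std::string &Name, int &OutMaxPlayers) const
    {
        std::map<std::string, int>::const_iterator It = MaxPlayersByName.find(Name);
        if (It == MaxPlayersByName.end())
        {
            return false;
        }
        OutMaxPlayers = It->second;
        return true;
    }

private:
    std::map<std::string, int> MaxPlayersByName;
};

// Hypothetical OSSv1-shaped facade: translates v1-style calls into core calls.
class FSessionFacadeV1
{
public:
    explicit FSessionFacadeV1(std::shared_ptr<FCoreSessionStore> InCore)
        : Core(std::move(InCore))
    {
    }

    bool CreateSession(const std::string &Name, int NumPublicConnections)
    {
        Core->Create(Name, NumPublicConnections);
        return true;
    }

private:
    std::shared_ptr<FCoreSessionStore> Core;
};

// Hypothetical OSSv2-shaped facade: reads the same core state.
class FSessionFacadeV2
{
public:
    explicit FSessionFacadeV2(std::shared_ptr<FCoreSessionStore> InCore)
        : Core(std::move(InCore))
    {
    }

    bool TryGetNumMaxConnections(const std::string &Name, int &OutValue) const
    {
        return Core->TryGetMaxPlayers(Name, OutValue);
    }

private:
    std::shared_ptr<FCoreSessionStore> Core;
};
```

Because neither facade holds any state of its own, a change in either public API surface only touches the translation code in that facade.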

This alone is not a trivial amount of work. It involves taking the code inside classes such as OnlineSessionInterfaceEOS and moving it to an independent module that does not rely on the types specific to OSSv1 or OSSv2. This has to be done for every implementation inside OnlineSubsystemRedpointEOS, including achievements, avatars, entitlements, friends, identity & authentication, leaderboards, lobbies, parties, presence, purchasing & e-commerce, stats, title file, user cloud and voice chat.

However, once this work is done, you’ll be able to use both Online Subsystem (OSSv1) and Online Services (OSSv2) in your games, and incrementally upgrade to the newer OSSv2 over time as you write new code. Third-party plugin developers can write integrations against either OSSv1 or OSSv2, and EOS Online Subsystem will be able to work with them.

A better low-level API for EOS

The EOS SDK is a low-level C API. This means that when the plugin wants to call an asynchronous API, it needs to do the work of marshalling arguments and managing callbacks. We have a few helper functions in place to make the asynchronous callback management a bit easier, but there’s still quite a bit of marshalling code that needs to be written every time the SDK is called.
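As a rough illustration of what "managing callbacks" means for a C API in general, the sketch below wraps a made-up C-style asynchronous function in a C++ function that accepts a std::function. The Demo_ types and Demo_Query are invented for this example and are not part of the EOS SDK; the fake function completes immediately, whereas a real SDK would normally fire the callback later.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>

// --- Hypothetical C-style async API (NOT the EOS SDK) ---
struct Demo_QueryOptions
{
    int ApiVersion;
    const char *UserId;
};

struct Demo_QueryCallbackInfo
{
    int ResultCode;   // 0 = success, 1 = invalid user in this sketch
    void *ClientData; // opaque pointer handed back to the caller
};

typedef void (*Demo_OnQueryComplete)(const Demo_QueryCallbackInfo *Data);

// The fake API completes synchronously to keep the sketch self-contained.
inline void Demo_Query(const Demo_QueryOptions *Options, void *ClientData, Demo_OnQueryComplete Callback)
{
    Demo_QueryCallbackInfo Info;
    Info.ResultCode = (Options->UserId != nullptr && Options->UserId[0] != '\0') ? 0 : 1;
    Info.ClientData = ClientData;
    Callback(&Info);
}

// --- C++ wrapper that hides the marshalling ---
inline void RunQuery(const std::string &UserId, std::function<void(int)> OnDone)
{
    Demo_QueryOptions Options;
    Options.ApiVersion = 1;
    Options.UserId = UserId.c_str(); // marshal the C++ string to a C string

    // Box the C++ callback on the heap so it can travel through void*.
    std::function<void(int)> *Boxed = new std::function<void(int)>(std::move(OnDone));

    // A captureless lambda converts to the plain C function pointer.
    Demo_Query(&Options, Boxed, [](const Demo_QueryCallbackInfo *Data) {
        // Take ownership back so the box is freed on every path.
        std::unique_ptr<std::function<void(int)>> Owned(
            static_cast<std::function<void(int)> *>(Data->ClientData));
        (*Owned)(Data->ResultCode);
    });
}
```

The caller only ever sees `RunQuery("user-1", [](int Code) { /* ... */ });`, while the heap allocation and the void* round trip stay inside the wrapper.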

Take this condensed example for querying leaderboards:

EOS_Leaderboards_QueryLeaderboardUserScoresOptions QueryOpts = {};
QueryOpts.ApiVersion = EOS_LEADERBOARDS_QUERYLEADERBOARDUSERSCORES_API_LATEST;
QueryOpts.StartTime = EOS_STATS_TIME_UNDEFINED;
QueryOpts.EndTime = EOS_STATS_TIME_UNDEFINED;
QueryOpts.UserIds = nullptr;
QueryOpts.UserIdsCount = 0;
EOSString_ProductUserId::AllocateToIdList(
    ProductUserIds,
    QueryOpts.UserIdsCount,
    (EOS_ProductUserId *&)QueryOpts.UserIds);
QueryOpts.StatInfo = StatInfo;
QueryOpts.StatInfoCount = StatInfoCount;
// ...
EOSRunOperation<
    EOS_HLeaderboards,
    EOS_Leaderboards_QueryLeaderboardUserScoresOptions,
    EOS_Leaderboards_OnQueryLeaderboardUserScoresCompleteCallbackInfo>(
    this->EOSLeaderboards,
    &QueryOpts,
    EOS_Leaderboards_QueryLeaderboardUserScores,
    [WeakThis = GetWeakThis(this), StatInfo, StatInfoCount, QueryOpts, /* ... */](
        const EOS_Leaderboards_OnQueryLeaderboardUserScoresCompleteCallbackInfo *Data) {
        EOSString_ProductUserId::FreeFromIdListConst(
            QueryOpts.UserIdsCount,
            (EOS_ProductUserId *)QueryOpts.UserIds);
        for (uint32_t i = 0; i < StatInfoCount; i++)
        {
            FMemory::Free((void *)StatInfo[i].StatName);
        }
        FMemory::Free(StatInfo);

        // ...

There’s quite a bit going on here, including setup of the StatInfo pointer (not shown). Before every call to this API, we have to allocate an array of product user IDs on the heap, and then when the callback comes back via EOSRunOperation we have to release that heap memory. We also need to wrap the current instance in a weak pointer (with GetWeakThis) and check its validity in the callback with PinWeakThis in case the online subsystem has been shut down by the time the callback happens.
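For readers unfamiliar with the weak pointer guard mentioned above, the following is a minimal sketch of the same idea using only the C++ standard library. It is an analogy rather than the plugin's code: std::weak_ptr stands in for GetWeakThis/PinWeakThis, and FPendingCallbacks and FLeaderboardClient are hypothetical names for "callbacks that fire later" and "the object that started the call".

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical queue of callbacks that will be fired at some later point.
class FPendingCallbacks
{
public:
    void Add(std::function<void()> Callback)
    {
        Callbacks.push_back(std::move(Callback));
    }

    void FireAll()
    {
        // Swap out first so callbacks may safely queue new work.
        std::vector<std::function<void()>> Local;
        Local.swap(Callbacks);
        for (std::function<void()> &Callback : Local)
        {
            Callback();
        }
    }

private:
    std::vector<std::function<void()>> Callbacks;
};

// Hypothetical owner of an asynchronous operation.
class FLeaderboardClient : public std::enable_shared_from_this<FLeaderboardClient>
{
public:
    explicit FLeaderboardClient(int *InCompletedCount) : CompletedCount(InCompletedCount)
    {
    }

    void StartQuery(FPendingCallbacks &Pending)
    {
        // Capture a weak reference instead of a raw 'this' pointer.
        std::weak_ptr<FLeaderboardClient> WeakThis = shared_from_this();
        Pending.Add([WeakThis]() {
            // "Pin" the weak reference; this fails if the owner is gone.
            std::shared_ptr<FLeaderboardClient> This = WeakThis.lock();
            if (!This)
            {
                return; // owner was destroyed before the callback fired
            }
            ++(*This->CompletedCount);
        });
    }

private:
    int *CompletedCount;
};
```

If the client is destroyed between StartQuery and FireAll, the callback simply returns instead of touching freed memory.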

Now thankfully this particularly complex API call is only made in one place, but the pattern is the same for every API call that the plugin makes to the EOS SDK, and it’s rather error-prone. If I don’t initialize a parameter correctly, or miss the call for releasing heap memory upon callback, this impacts downstream customers. And if you’re a customer who needs to access the SDK directly (perhaps you want to query leaderboards between specific times), you also need to do all this marshalling and callback handling in your game code.

Of note, string marshalling is particularly painful. Each API parameter in the EOS SDK can either be UTF-8 or ANSI, and it’s not well documented which parameter is which encoding. There are some internal helpers in the form of the EOSString_ classes, but they’re not comprehensive enough to cover every API parameter.

To address these shortcomings, I’ve been adding a RedpointEOSAPI module, which provides a higher-level one-to-one mapping of the EOS SDK that uses Unreal types and modern C++ for parameters. For example, let’s look at calling EOS_Lobby_CreateLobby using the new API:

using namespace Redpoint::EOS::API;
using namespace Redpoint::EOS::API::Lobby;
using namespace Redpoint::EOS::Core::Id;

// ...

FCreateLobby::Execute(
    this->PlatformHandle,
    FCreateLobby::Options{
        GetProductUserId(Request.LocalUser),
        Request.MaxPlayerCount,
        EOS_ELobbyPermissionLevel::EOS_LPL_INVITEONLY,
        bPresenceEnabled,
        false,
        BucketName,
        bDisableHostMigration,
        bEnableRTCRoom,
        RTCRoomJoinFlags,
        bRTCRoomUseManualAudioInput,
        bRTCRoomUseManualAudioOutput,
        bRTCRoomLocalAudioDeviceInputStartsMuted,
        TEXT(""),
        bEnableJoinById,
        bRejoinAfterKickRequiresInvite,
        AllowedPlatforms,
        bCrossplayOptOut},
    FCreateLobby::CompletionDelegate::CreateSPLambda(this, [](const API::Lobby::FCreateLobby::Result &Result) {
        if (Result.ResultCode == EOS_EResult::EOS_Success)
        {
            // ...
        }
    }));

There’s no memory management required at the call site, we can leverage Unreal’s delegate system to handle the validity of the this pointer, and all parameters for the call are required to be set through the TRequired helper. Everything is namespaced so we avoid polluting the global scope, and through the new FPlatformHandle type, each API call knows whether the underlying EOS_HPlatform instance is still valid before making the SDK call.

Defining each of these API calls is made rather straightforward with macros as well:

namespace Redpoint::EOS::API::Lobby
{

class REDPOINTEOSAPI_API FCreateLobby
{
    REDPOINT_EOSSDK_API_CALL_ASYNC_BEGIN(Lobby, CreateLobby, EOS_LOBBY_CREATELOBBY_API_LATEST)

    class Options
    {
    public:
        const ParamHelpers::TRequired<EOS_ProductUserId> LocalUserId;
        const ParamHelpers::TRequired<uint32> MaxLobbyMembers;
        const ParamHelpers::TRequired<EOS_ELobbyPermissionLevel> PermissionLevel;
        const ParamHelpers::TRequired<bool> bPresenceEnabled;
        const ParamHelpers::TRequired<bool> bAllowInvites;
        const ParamHelpers::TRequired<FString> BucketId;
        const ParamHelpers::TRequired<bool> bDisableHostMigration;
        const ParamHelpers::TRequired<bool> bEnableRTCRoom;
        const ParamHelpers::TRequired<RTC::EJoinRoomOptions::Type> RTCRoomJoinFlags;
        const ParamHelpers::TRequired<bool> bRTCRoomUseManualAudioInput;
        const ParamHelpers::TRequired<bool> bRTCRoomUseManualAudioOutput;
        const ParamHelpers::TRequired<bool> bRTCRoomLocalAudioDeviceInputStartsMuted;
        const ParamHelpers::TRequired<FString> LobbyId;
        const ParamHelpers::TRequired<bool> bEnableJoinById;
        const ParamHelpers::TRequired<bool> bRejoinAfterKickRequiresInvite;
        const ParamHelpers::TRequired<TArray<uint32>> AllowedPlatformIds;
        const ParamHelpers::TRequired<bool> bCrossplayOptOut;
    };

    class Result
    {
    public:
        EOS_EResult ResultCode;
        FString LobbyId;
    };

    REDPOINT_EOSSDK_API_CALL_ASYNC_END()
};

} // namespace Redpoint::EOS::API::Lobby
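One way a "required parameter" wrapper like this can be enforced at compile time is sketched below in standard C++. This is an assumption about the general technique, not the plugin's actual TRequired implementation; TRequiredSketch and FCreateLobbyOptionsSketch are hypothetical names. The wrapper has no default constructor, so an options aggregate cannot be built unless every field is given a value.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: deleting the default constructor means a field of
// this type can never be silently left unset.
template <typename T> class TRequiredSketch
{
public:
    TRequiredSketch() = delete;

    // Intentionally implicit so plain values work in brace initialization.
    TRequiredSketch(T InValue) : Value(std::move(InValue))
    {
    }

    const T &operator*() const
    {
        return Value;
    }

private:
    T Value;
};

// Hypothetical options aggregate with a reduced set of fields.
struct FCreateLobbyOptionsSketch
{
    const TRequiredSketch<unsigned int> MaxLobbyMembers;
    const TRequiredSketch<bool> bPresenceEnabled;
    const TRequiredSketch<std::string> BucketId;
};

// The aggregate as a whole is no longer default-constructible.
static_assert(
    !std::is_default_constructible<FCreateLobbyOptionsSketch>::value,
    "every field must be supplied explicitly");
```

With these definitions, `FCreateLobbyOptionsSketch Options{4u, true, std::string("Bucket")};` compiles, while omitting the last initializer is rejected by the compiler because the missing field would need the deleted default constructor.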

In a future update of EOS Online Subsystem, you’ll be able to use these modern APIs in your own game code, so you don’t need to do manual memory management or string marshalling. As an added benefit, we’ll also be able to handle some types of EOS SDK API changes at this layer (such as the recent RTC changes in 1.16), so you have a consistent API surface regardless of which EOS SDK version you’re targeting.

Overhauling sessions and lobbies

As previously mentioned, the implementation inside OnlineSessionInterfaceEOS is strongly tied to the session APIs in the EOS SDK. The lobby and party interfaces are much the same in that they are tied to the lobby APIs in the EOS SDK.

At a minimum, we need to pull these implementations out of the online subsystem so we can support Online Services (OSSv2) in the future, and this is an opportunity to address architectural issues and bugs that have been around for a while.

At the EOS SDK level, sessions and lobbies have similar interfaces, but they’re still different in a few important ways. For example, sessions support the concept of a host address (known as a connection string in Unreal Engine terminology), but lobbies do not. This means that if we want to support bUseLobbiesIfAvailable we have to “backfill” this functionality for lobbies at the plugin level, so that regardless of whether you’re using EOS sessions or EOS lobbies, you can still get the connection string for a player-hosted listen server.

A trivial solution might be to set the player’s user ID into an “address” attribute when the sessions interface happens to create a lobby instead, but this quickly becomes impractical for a few reasons:

  • EOS Online Subsystem actively tracks whether or not the local game is actually hosting a server, and for the sessions interface, excludes results where the session does not have a listen server running for it. This prevents clients discovering sessions in search results that they can’t actually join. We also need to replicate this behaviour if you’re creating a session backed by EOS lobbies, since it’s quite possible a game might create a lobby-backed session and then start the listen server later.
  • The player that creates the lobby might not be the first local user that is responsible for incoming network connections. In EOS Online Subsystem, the first local user is the one that gets used for P2P network connections for listen servers, but if you’re making a game with split-screen support on consoles, it’s entirely possible to have the second or third local player be the one creating a lobby. In this case, the P2P address of the listen server is not simply the lobby owner’s user ID, but rather a different local user’s ID instead.

Instead of going down this route, we’re pursuing a general-purpose “rooms” concept which can encapsulate the APIs of both sessions and lobbies. Internally when creating a room, we specify what we want to back it with (sessions or lobbies), as well as the requested features of a room.

Any number of room features can be attached to a room, and room features include things such as “server address room feature” and “voice chat room feature”. Then, the session room provider and lobby room provider can implement these features in a way that is suitable for the underlying EOS SDK API. Sometimes this will be a direct mapping; the session room provider can simply use the HostAddress field of the session, and other times it will be more involved, such as the lobby room provider storing and retrieving the host address as a custom lobby attribute.
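The provider split described above could look roughly like the following sketch. All types here are hypothetical stand-ins written in plain C++ (FFakeSession and FFakeLobby imitate SDK-side objects, and the "ServerAddress" attribute key is an invented name); the point is only that one feature interface can map directly onto a session field or be backfilled through a lobby attribute.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for the SDK-side objects.
struct FFakeSession
{
    std::string HostAddress; // sessions have a first-class host address
};

struct FFakeLobby
{
    std::map<std::string, std::string> Attributes; // lobbies only have attributes
};

// Hypothetical "server address" room feature, independent of the backing type.
class IRoomProvider
{
public:
    virtual ~IRoomProvider() {}
    virtual void SetServerAddress(const std::string &Address) = 0;
    virtual std::string GetServerAddress() const = 0;
};

// Direct mapping: the feature is just the session's own field.
class FSessionRoomProvider : public IRoomProvider
{
public:
    void SetServerAddress(const std::string &Address) override
    {
        Session.HostAddress = Address;
    }

    std::string GetServerAddress() const override
    {
        return Session.HostAddress;
    }

    const FFakeSession &GetSession() const
    {
        return Session;
    }

private:
    FFakeSession Session;
};

// Backfilled mapping: the feature is stored as a custom attribute.
class FLobbyRoomProvider : public IRoomProvider
{
public:
    void SetServerAddress(const std::string &Address) override
    {
        Lobby.Attributes["ServerAddress"] = Address;
    }

    std::string GetServerAddress() const override
    {
        std::map<std::string, std::string>::const_iterator It = Lobby.Attributes.find("ServerAddress");
        return It == Lobby.Attributes.end() ? std::string() : It->second;
    }

    const FFakeLobby &GetLobby() const
    {
        return Lobby;
    }

private:
    FFakeLobby Lobby;
};
```

Code that only holds an IRoomProvider pointer can set and read the server address without knowing which of the two backings it is talking to.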

Systems in the rest of the plugin, such as the listen tracking feature which hides sessions if there’s no server running, can then interact with the generalised room API. Those systems don’t need to know the specifics of sessions or lobbies to work.
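As a small illustration of a system that works purely against the generalised room data, here is a hedged sketch of a listen-tracking style filter. FRoomSearchResult and FilterJoinableRooms are hypothetical names, and the sketch assumes that an empty server address means no listen server is running for that room.

```cpp
#include <string>
#include <vector>

// Hypothetical generalised search result.
struct FRoomSearchResult
{
    std::string RoomId;
    std::string BackedBy;      // "session" or "lobby"; never read by the filter
    std::string ServerAddress; // assumed empty when no listen server is running
};

// Keeps only rooms that a client could actually connect to.
inline std::vector<FRoomSearchResult> FilterJoinableRooms(const std::vector<FRoomSearchResult> &Results)
{
    std::vector<FRoomSearchResult> Joinable;
    for (const FRoomSearchResult &Result : Results)
    {
        if (!Result.ServerAddress.empty())
        {
            Joinable.push_back(Result);
        }
    }
    return Joinable;
}
```

The filter never looks at BackedBy, which is the property the rooms abstraction is meant to provide: the same rule applies to session-backed and lobby-backed rooms.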

In addition, unlike the current architecture where OnlineSessionInterfaceEOS directly calls the synthetic session/party manager when sessions change, room providers don’t need to be aware of native platform integrations. Instead, functionality such as “allow players to join the game via Steam’s Join Game button” is wired up through event sinks, which are again abstracted away from the underlying room implementation. It is the hope that this change will allow EOS Online Subsystem to better support native platform invites, such as adding support for representing parties as Steam parties (instead of as Steam sessions) and improving our support for native platform invites on consoles.

On a sidenote...
The timeline of exactly when we’ll implement things such as “EOS parties appearing as parties on Steam” is still to be determined; the important point is that the new rooms API makes it much easier to build these features in the future.
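The event sink wiring mentioned above can be pictured as a simple observer pattern. The sketch below is a guess at the general shape in plain C++, with hypothetical names throughout (FRoomEventBus, ERoomEvent, FFakePlatformPresence); it shows how a native platform integration can react to room changes without the room provider knowing it exists.

```cpp
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class ERoomEvent
{
    Created,
    Destroyed
};

// Hypothetical event bus that room code broadcasts to.
class FRoomEventBus
{
public:
    typedef std::function<void(ERoomEvent, const std::string &)> FSink;

    void AddSink(FSink Sink)
    {
        Sinks.push_back(std::move(Sink));
    }

    void Broadcast(ERoomEvent Event, const std::string &RoomId) const
    {
        for (const FSink &Sink : Sinks)
        {
            Sink(Event, RoomId);
        }
    }

private:
    std::vector<FSink> Sinks;
};

// Hypothetical native platform integration, e.g. something that would
// advertise a room through a platform's "Join Game" button.
class FFakePlatformPresence
{
public:
    // The presence object must outlive the bus, since the sink captures 'this'.
    void Attach(FRoomEventBus &Bus)
    {
        Bus.AddSink([this](ERoomEvent Event, const std::string &RoomId) {
            if (Event == ERoomEvent::Created)
            {
                Joinable.insert(RoomId);
            }
            else
            {
                Joinable.erase(RoomId);
            }
        });
    }

    bool IsJoinable(const std::string &RoomId) const
    {
        return Joinable.count(RoomId) > 0;
    }

private:
    std::set<std::string> Joinable;
};
```

Adding another platform integration then means attaching another sink, with no changes to the code that broadcasts room events.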

While the rooms API is not something you’re likely to interact with directly as a game developer, it means that the session, lobby and party interfaces exposed to you via OSSv1 and OSSv2 will be more consistent and stable, and we’ll be able to address some of those long-standing bugs and feature requests on the issue tracker.

Wrapping up this very long post

So, that’s a lot of work to undertake and a lot of refactoring to do. Hopefully this post gives you insight into why this work is important for the future and why it’s going to take a little longer to get to a feature release for EOS Online Subsystem than usual.

If you have any questions about this work, feel free to ask in the sales-questions channel or the eos-online-subsystem support forum in the Discord server.

All code examples are MIT licensed unless otherwise specified.