DUM Client Outbound Support

From reSIProcate
Revision as of 16:53, 19 November 2014 by Sgodin

Client Outbound (RFC 5626) Support in DUM

In order to enable client outbound support in resip/dum you need to make use of UserProfiles. If the client UA only ever has one identity and/or one outbound/edge proxy, then it is sufficient to place the required settings on the single MasterProfile that will get used if no UserProfiles are explicitly specified in the DialogUsageManager::makeXXXX calls. For more information on Profiles, UserProfiles and MasterProfiles see DUM_Configuration_and_Profiles.

The following UserProfile options must be used to enable client-side outbound support:

profile->addSupportedOptionTag(Token(Symbols::Outbound));  // RFC 5626 - outbound
profile->addSupportedOptionTag(Token(Symbols::Path));      // RFC 3327 - path
profile->setInstanceId(instanceId);  // See RFC5626 section 4.1
profile->clientOutboundEnabled() = true;

You should also install a KeepAliveManager in order to get the client to send CRLFCRLF keepalive pings towards the server:

mDum->setKeepAliveManager(std::auto_ptr<KeepAliveManager>(new KeepAliveManager));

Operation with a Single Outbound Proxy/Edge

Typically a client application will perform a registration at startup. This establishes the outbound flow to the server that will be used for all future communication (or for as long as the flow stays alive), assuming that the first hop proxy supports the proxy procedures in RFC 5626. Note: if desired, a client could establish a subscription dialog before registering (as discussed in RFC 5626); resip/dum supports this scenario as well.

Upon reception of a successful registration response where the first hop proxy has indicated support for RFC 5626, DUM will store the flow key/tuple used to perform the registration in the UserProfile specified when sending the registration (or the MasterProfile if one is not specified). Any messages sent from DUM from this point on that are associated with this same UserProfile will be tagged to use only the stored flow. Any keepalives that result from a registration where the registrar or first hop proxy indicated RFC 5626 support will be tagged as RFC 5626 supported keepalives. For such keepalives, DUM expects the first hop proxy to send CRLF pong responses to its CRLFCRLF keepalive pings, as specified in the RFC. If a pong response is not seen within 10 seconds of sending a keepalive ping, then the flow is terminated. Note: the default value of 10 seconds is recommended by RFC 5626 and can be changed by assigning a new value to the static member KeepAliveManager::mKeepAlivePongTimeoutMs.
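The pong timeout rule described above can be sketched as a small standalone model. This is only an illustration of the timing logic, not the actual KeepAliveManager implementation: the FlowKeepAliveState type and its members are hypothetical names, and the sketch assumes timestamps come from a monotonic millisecond clock.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, standalone model of the pong timeout rule.
// It does not use or mirror the internals of resip's KeepAliveManager.
struct FlowKeepAliveState
{
   std::uint64_t lastPingSentMs = 0;  // time the last CRLFCRLF ping was sent
   bool pongReceived = true;          // false while a ping is outstanding

   void onPingSent(std::uint64_t nowMs)
   {
      lastPingSentMs = nowMs;
      pongReceived = false;
   }

   void onPongReceived()
   {
      pongReceived = true;
   }

   // True when a ping is outstanding and no pong arrived within the timeout.
   // The 10000 ms default matches the 10 second value mentioned in the text.
   bool shouldTerminateFlow(std::uint64_t nowMs,
                            std::uint64_t pongTimeoutMs = 10000) const
   {
      assert(nowMs >= lastPingSentMs);  // assumes a monotonic clock
      return !pongReceived && (nowMs - lastPingSentMs) >= pongTimeoutMs;
   }
};
```

An application would not normally need this logic itself, since DUM performs the check; the sketch is only meant to make the ping/pong deadline concrete.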

If the TCP/TLS connection/flow is terminated for any reason (i.e. remote termination, socket error, or keepalive pong timeout), then DUM will receive a connectionTerminated notification from the stack. DUM will then search through all of its existing DialogSets with the goal of notifying all long term client usages that are using the flow of the termination. It does this by checking the UserProfile associated with each DialogSet. If the UserProfile has clientOutboundEnabled and the stored flow key matches the recently terminated connection, then application notification will proceed. DUM will notify all applicable usages of the following types: ClientRegistration, ClientSubscription, ServerSubscription, and InviteSession. There is a new flowTerminated handler on each of these usages that the application can override to determine how it wants to handle the termination. Once the usages have been notified, the flow key stored in the UserProfile is cleared out, and it is expected/required that the application re-register in order to establish a new flow. The default handling of flowTerminated for the various usages is as follows:

  1. ClientRegistration - an immediate re-registration is attempted
  2. ClientSubscription - an immediate re-Subscription is attempted
  3. ServerSubscription - the subscription is ended
  4. InviteSession (Client or Server) - there is no default handling, and the application must override this - one potential recovery mechanism (assuming a new flow is immediately possible) would be to issue an INVITE with Replaces to the far end, in an attempt to repair/replace the mid-dialog routing with the new flows
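The default handling listed above can be summarized as a simple mapping from usage type to action. The enums and the function below are hypothetical names introduced only for illustration; they are not part of the resip/dum API, where the behavior lives in the flowTerminated handlers of the individual usages.

```cpp
#include <cassert>

// Hypothetical enums; resip/dum does not define these types.
enum class UsageType
{
   ClientRegistration,
   ClientSubscription,
   ServerSubscription,
   InviteSession
};

enum class DefaultAction
{
   ReRegister,            // immediate re-registration attempt
   ReSubscribe,           // immediate re-subscription attempt
   EndSubscription,       // the subscription is ended
   ApplicationMustHandle  // no default; the application overrides flowTerminated
};

// Maps each usage type to the default flowTerminated behavior from the list.
DefaultAction defaultFlowTerminatedAction(UsageType usage)
{
   switch (usage)
   {
      case UsageType::ClientRegistration: return DefaultAction::ReRegister;
      case UsageType::ClientSubscription: return DefaultAction::ReSubscribe;
      case UsageType::ServerSubscription: return DefaultAction::EndSubscription;
      case UsageType::InviteSession:      return DefaultAction::ApplicationMustHandle;
   }
   assert(false && "unknown usage type");
   return DefaultAction::ApplicationMustHandle;
}
```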

It should be noted that the wording in RFC 5626 section 4.5 (Flow Recovery) is unclear as to whether an initial re-registration should be attempted immediately or using the back-off retry time logic. The first paragraph states that if there is an error in forming a new flow, then the UA needs to wait a certain amount of time; it does not place any limitations on the timing of the first attempt to re-connect a flow. Other paragraphs "hint" that even the first attempt should be delayed - see the last paragraph on page 9. The default implementation in DUM is to retry the flow connection immediately. If this initial attempt fails, then it is up to the application to control how/when the registration or subscription is retried via the onRequestRetry handlers or the applicable profile retry settings. Applications should use the back-off retry algorithm discussed in section 4.5 of RFC 5626. Applications that would like to delay the immediate initial attempt can do so by overriding the flowTerminated handlers.
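As an illustration of the back-off algorithm in section 4.5 of RFC 5626, the following standalone sketch computes the upper-bound wait time W = min(max-time, base-time * 2^consecutive-failures) and then picks a random delay between 50% and 100% of W. The function names are hypothetical, and the defaults (30 second base time for the case where all flows have failed, 1800 second maximum) are the values suggested by the RFC as understood here; check them against the RFC before relying on them. An application could return such a value from its onRequestRetry handler.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Upper-bound wait time: W = min(max-time, base-time * 2^consecutive-failures).
// Defaults are assumed from RFC 5626 section 4.5 (base-time when all flows
// have failed: 30 s, max-time: 1800 s).
double upperBoundWaitSeconds(unsigned consecutiveFailures,
                             double baseTimeSeconds = 30.0,
                             double maxTimeSeconds = 1800.0)
{
   assert(baseTimeSeconds > 0.0 && maxTimeSeconds > 0.0);
   // Cap the exponent so the scaled value stays finite; past this point
   // max-time wins for any realistic base time anyway.
   const int exponent = static_cast<int>(std::min(consecutiveFailures, 64u));
   return std::min(maxTimeSeconds, std::ldexp(baseTimeSeconds, exponent));
}

// Actual delay: a uniformly distributed value between 50% and 100% of W.
template <typename Rng>
double retryDelaySeconds(unsigned consecutiveFailures, Rng& rng)
{
   const double upper = upperBoundWaitSeconds(consecutiveFailures);
   std::uniform_real_distribution<double> dist(0.5 * upper, upper);
   return dist(rng);
}
```

For example, after two consecutive failures the upper bound is 120 seconds, so the delay falls between 60 and 120 seconds. When at least one other flow is still working, the RFC suggests a larger base time (90 seconds), which can be passed as the second argument of upperBoundWaitSeconds.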

For ClientRegistrations and ClientSubscriptions that are currently waiting for the re-try timer to expire, it is possible to call either ClientRegistration::refreshRegistration or ClientSubscription::reSubscribe before the retry timeout to force an immediate retry. This can be desirable in situations where an application is able to get notifications from the operating system when a particular network interface has come back on-line.

Establishing Multiple Flows through different Outbound/Edge Proxies

RFC 5626 recommends that multiple flows be simultaneously established to a registrar using multiple edge proxies in order to facilitate faster flow recovery. A registrar can then automatically route a request down an alternate flow when one fails, without waiting for a client to re-register a new flow when a connection error is detected. In order to support this scenario, a DUM client is required to use one UserProfile for each registration. Each UserProfile should use a unique outbound proxy and regId (typically 1 for the first registration and 2 for the second). Any application using this should implement logic to make use of the alternate UserProfile for any requests when there is a failure on the one currently being used. The RFC also talks about using multiple flows even if there are not multiple edge proxies available, if the computer running the client has multiple working network interfaces at its disposal. resip/dum cannot support this scenario at this time, since the stack will choose to use an existing flow if one already exists for a particular destination IP address and port. It might be possible to force the stack to form a new flow by pre-populating the Via header (using the Profile::setForceTransportInterface method), but this is untested and will only work in theory for TCP connections (TLS connections currently bypass the pre-populated Via header mechanism).
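The application-side failover logic mentioned above could look roughly like the following standalone sketch. FlowProfile and selectProfile are hypothetical names used only for illustration; in a real client each entry would correspond to one resip UserProfile, and the flowAlive flag would be maintained by the application (for example, cleared from a flowTerminated handler and set again after a successful re-registration).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-registration settings an application keeps.
struct FlowProfile
{
   std::string outboundProxy;  // edge proxy used by this registration
   int regId;                  // 1 for the first registration, 2 for the second
   bool flowAlive;             // maintained by the application
};

// Returns the index of the profile to use for the next request: the preferred
// one if its flow is alive, otherwise the first other profile with a live
// flow. Returns -1 when no flow is currently usable.
int selectProfile(const std::vector<FlowProfile>& profiles, std::size_t preferred)
{
   assert(preferred < profiles.size());
   if (profiles[preferred].flowAlive)
   {
      return static_cast<int>(preferred);
   }
   for (std::size_t i = 0; i < profiles.size(); ++i)
   {
      if (profiles[i].flowAlive)
      {
         return static_cast<int>(i);
      }
   }
   return -1;
}
```

The selected index would then determine which UserProfile is passed to the DialogUsageManager::makeXXXX call for the new request.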

Flow Timer Support

Server Side

Repro can be configured to add a Flow-Timer header to a registration response if it is either the registrar or an edge proxy supporting outbound. The repro command line option --flow-timer is used to enable this. Using this command line option will also enable a new aggressive garbage collection mode for connections. Normally the stack will only garbage collect TCP connections when the OS indicates that it is out of FDs. With aggressive garbage collection enabled, the stack will look for stale connections every time a new connection is formed. With the repro flow-timer command line option enabled, the stack will clean up connections that are associated with a Flow-Timer if they haven't received a keepalive "ping" message (or any other message) within the configured flow-timer interval plus a configured grace period. It will clean up connections that are not associated with a Flow-Timer after a hard-coded time of 2 hours without receiving any inbound messages on the connection. New settings/methods to support this are:

  • InteropHelper::setFlowTimerSeconds - if set to greater than 0, then a Flow-Timer header is added to a locally generated REGISTER/200 response, or to a proxied one if outbound is being used (this is the edge proxy case). Default is 0 (disabled).
  • InteropHelper::setFlowTimerGracePeriodSeconds - this value is used by the stack to decide when connections that are using a flow-timer have timed out. Default is 30 seconds.
  • ConnectionManager::EnableAgressiveGc static member setting - if enabled, this will cause the connection garbage collection to run every time a new connection is formed, instead of just when we run out of FDs.
  • ConnectionManager::MinimumGcAge - this specifies how many seconds a connection can go without receiving any traffic before it is considered stale/bad.
  • SipStack::enableFlowTimer method on the SipStack - this takes a Tuple parameter and will "tag" the specified connection as one on which flow-timer timeout logic should be applied. Repro calls this method if it adds a Flow-Timer header to a registration response.
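The cleanup rule described in this section can be modeled with a small standalone function. This is a sketch of the decision only, not the stack's ConnectionManager code: ConnectionInfo and isConnectionStale are hypothetical names, and whether the real comparison is strict or inclusive at the boundary is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-connection view; not the stack's real Connection class.
struct ConnectionInfo
{
   std::uint64_t idleSeconds;  // seconds since the last inbound message
   bool flowTimerEnabled;      // true if the connection was "tagged" for flow-timer handling
};

// Decides whether aggressive garbage collection would treat a connection as stale.
// Tagged connections time out after the flow-timer interval plus the grace period
// (default 30 s); untagged connections after 2 hours without inbound traffic.
bool isConnectionStale(const ConnectionInfo& c,
                       unsigned flowTimerSeconds,
                       unsigned gracePeriodSeconds = 30,
                       std::uint64_t untaggedLimitSeconds = 2 * 60 * 60)
{
   if (c.flowTimerEnabled)
   {
      assert(flowTimerSeconds > 0);  // tagging implies a configured flow-timer
      return c.idleSeconds >
             static_cast<std::uint64_t>(flowTimerSeconds) + gracePeriodSeconds;
   }
   return c.idleSeconds > untaggedLimitSeconds;
}
```

With a flow-timer of 90 seconds and the default grace period, a tagged connection that has been idle for 121 seconds would be considered stale, while an untagged one would survive until it has been idle for more than 2 hours.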

Client Side

An application doesn't need to do anything special to enable Flow-Timer support; DUM takes care of all of the handling. If a Flow-Timer header is present in a registration response that also indicates outbound support, then this value is passed to the KeepAliveManager and used in place of the configured keepalive timeout in the UserProfile/Profile. If a keepalive message is for an "outbound" enabled connection, then the keepalives are sent sometime between 80% and 100% of either the Flow-Timer header value (if present) or the configured keepalive time, as specified in RFC 5626.
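The 80% to 100% rule can be illustrated with the following standalone sketch. The function name and parameters are hypothetical and do not correspond to the KeepAliveManager interface; a flowTimerSeconds value of 0 is used here to mean that no Flow-Timer header was received.

```cpp
#include <cassert>
#include <random>

// Picks the delay before the next keepalive on an outbound-enabled connection:
// a random value between 80% and 100% of the Flow-Timer value if one was
// received (non-zero), otherwise of the configured keepalive time.
template <typename Rng>
double nextKeepAliveDelaySeconds(unsigned flowTimerSeconds,
                                 unsigned configuredKeepAliveSeconds,
                                 Rng& rng)
{
   const unsigned base =
      (flowTimerSeconds > 0) ? flowTimerSeconds : configuredKeepAliveSeconds;
   assert(base > 0);  // callers are expected to skip scheduling when keepalives are disabled
   std::uniform_real_distribution<double> dist(0.8 * base, 1.0 * base);
   return dist(rng);
}
```

For a Flow-Timer of 120 seconds this yields a delay between 96 and 120 seconds, regardless of the keepalive time configured in the profile.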

Handling 439 Outbound not supported

According to RFC 5626 section 4.2.1, a client may attempt to restart a request if it receives a 439 error. To handle this with DUM, use the following pseudocode:

void ClientRegistrationHandler::onFailure(ClientRegistrationHandle crh, const SipMessage& msg)
{
	resip::SharedPtr<UserProfile> up(crh->getUserProfile());
	// If you only use MasterProfiles, then modify those instead of up
	// 439 means there is no outbound-capable path to the registrar. Retry without outbound support.
	if(msg.header(h_StatusLine).responseCode() == 439)
	{
		up->clientOutboundEnabled() = false;
		resip::SharedPtr<resip::SipMessage> reg = dum->makeRegistration(...); // Pass in up, the modified UserProfile
	}
}

Unimplemented Features

The following client-side features of RFC 5626 have not been implemented in DUM at this point:

  1. Support for UDP flows and STUN keepalive messaging.
  2. Section 4.3 - for TLS, the stack should use an existing TLS flow if the destination domain matches any of the SANs in the certificate of any existing flow. Currently resip will only choose an existing TLS flow if the domain used to originally establish the flow matches the target domain of the new request - it will not match against any/all of the SANs from the server certificate.