I’ve been wanting to post this for some time. A while back I was involved in a support case that took a long time to pinpoint and eventually resolve. It was an intermittent issue, which I hate. I used the usual method for troubleshooting and asked for specific examples with date and time stamps as well as caller and callee. Then I got stuck into the Lync Monitoring server logs, delving deep into the guts of each call. The most frustrating thing was that Lync reported no failure, expected or unexpected, for any of the calls.
The issue reported was that calls which were put on hold seemed to disappear and couldn’t be retrieved. The strange thing was that the person who had been put on hold was still on hold.
The Lync solution was pretty standard except for the addition of an Asterisk Free PBX which was providing hold music.
I know this is not a “supported” solution before you say anything. But what do you do when there is no other solution for providing consistent hold music and the customer really wants music on hold? And I also know there are better solutions available now with embedded music on hold on the Aries handsets and gateways with dedicated storage for music on hold and even MOH ports. This was an early project and these solutions didn’t exist.
Anyway, the only common characteristic I found across these calls (and, as it turns out, a lot more calls which weren’t reported as problematic) was this diagnostic:
“Call terminated on mid-call media failure where both endpoints are internal”
Lync insisted that it was always the outside party that ended the call, yet the error said that the call terminated where both endpoints were internal. The outside party also insisted that they could still hear hold music, so they assumed they were still on hold. Then the penny dropped: Lync was dropping the calls because, as far as it was concerned, it didn’t have control of the call any more.
Why was this happening? What you should know is that the Asterisk is connected to Lync via a trunk. There are a couple of configurable attributes on a trunk that were key in resolving the issue.
Figure 1
RTCPActiveCalls – This parameter determines whether RTCP packets are sent from the PSTN gateway, IP-PBX, or SBC at the service provider for active calls. An active call in this context is a call where media is allowed to flow in at least one direction. If RTCPActiveCalls is set to True, the Mediation Server or Lync Server client can terminate a call if it does not receive RTCP packets for a period exceeding 30 seconds.
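Before changing anything, it is probably worth recording how the trunk is currently set so the change can be rolled back. A minimal sketch, assuming "trunk name" is replaced with the identity of your own trunk configuration:

```powershell
# "trunk name" is a placeholder, not a real identity; substitute your own trunk configuration.
# Shows only the three settings discussed in this post.
Get-CsTrunkConfiguration -Identity "trunk name" |
    Select-Object Identity, RTCPActiveCalls, RTCPCallsOnHold, EnableSessionTimer
```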
Set-CsTrunkConfiguration -Identity "trunk name" -RTCPActiveCalls $False -RTCPCallsOnHold $False
You’ll get the following warning when you set the configuration.
WARNING: When RTCP active calls or RTCP calls on hold is false, it is recommended that you enable session timer to periodically verify that the call is still active.
So as well as the above you need to enable the session timer.
Set-CsTrunkConfiguration -Identity "trunkname" -EnableSessionTimer $True
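The two changes can also be made in a single call, since all three are parameters of the same cmdlet. This is only a sketch of that variation; the `$trunk` variable and its value are placeholders I have made up for illustration.

```powershell
# Placeholder identity; substitute the trunk configuration you actually want to change.
$trunk = "trunk name"

# Turn off both RTCP checks and turn on the session timer in one step.
Set-CsTrunkConfiguration -Identity $trunk `
    -RTCPActiveCalls $False `
    -RTCPCallsOnHold $False `
    -EnableSessionTimer $True
```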
The strange thing is that all three settings essentially check whether a call is still active. The session timer does it by sending periodic probes to the Mediation Server and waiting for a reply. I’ve already explained what RTCPActiveCalls does, and RTCPCallsOnHold applies the same check to calls that are on hold. The point is that all three settings have the same goal, and here we have disabled two of them and enabled the third. Microsoft says in the parameter descriptions that disabling the RTCP checks is a bad thing and should be done only if necessary. But I wouldn’t have to touch these settings if the gateway didn’t stop sending RTCP packets in the first place. At least there is a setting to change to fix it.
As soon as I changed these parameters the errors stopped and so did the dropped calls. We have also since had to change it on the trunk to the PSTN gateway for the same reason.
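If more than one trunk needs the same treatment, as ours eventually did, the change can be looped over a list of identities. The identities below are hypothetical examples and not the ones from this deployment. The sketch uses -WhatIf so that nothing is changed until you remove it.

```powershell
# Hypothetical trunk identities for illustration only; replace with your own.
$trunks = @(
    "Service:PstnGateway:asterisk.example.com",
    "Service:PstnGateway:gateway.example.com"
)

foreach ($trunk in $trunks) {
    # -WhatIf only reports what would change; remove it to apply the settings.
    Set-CsTrunkConfiguration -Identity $trunk `
        -RTCPActiveCalls $False `
        -RTCPCallsOnHold $False `
        -EnableSessionTimer $True `
        -WhatIf
}
```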
It also turns out that I am not alone with this issue. You’ll probably find more than a few forum threads mentioning the same thing. As I said, I have been wanting to post this for a while. I really hope it helps you if you encounter the issue, and also helps you understand the mechanics of why it happens in the first place. I also hope that gateway and SBC manufacturers and SIP providers take note and ensure their products consistently send RTCP packets.
Thanks for reading.