GroupShare 2020 CU4 – memory leak from Logging Service

Hello,

We are experiencing what appears to be a memory leak in the service sdl.groupshare.logging.service.exe, which is consuming all of the RAM on our application server (the server has the amount of RAM recommended for this installation).

The issue is causing the application server to become unresponsive, and is not resolved by restarting the application/server.

Below are a couple of the related logs written at the time the server becomes unresponsive.

Is this a known issue?

Many thanks,

Logs:

2021-06-11 02:10:39.6771|LOCAL HOST SERVER|Error|THREAD_ID:24|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|EditorService is not registered to the Router. Trying to register again..

2021-06-11 02:50:23.0930|LOCAL HOST SERVER|Warn|THREAD_ID:17|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|Rabbit MQ connection shutdown. queue: es-verification-result_v008 Initiator: Library, Cause: System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

   at RabbitMQ.Client.Impl.InboundFrame.ReadFrom(NetworkBinaryReader reader)

   at RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()

   at RabbitMQ.Client.Framing.Impl.Connection.MainLoop(), Reply Code: 541, Reply Text: Unexpected Exception

2021-06-11 02:50:23.1086|LOCAL HOST SERVER|Warn|THREAD_ID:17|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|Disconnected from RabbitMQ : es-verification-result_v008

2021-06-11 02:51:20.5299|LOCAL HOST SERVER|Error|THREAD_ID:5|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|EditorService is not registered to the Router. Trying to register again..

2021-06-11 03:30:42.0544|LOCAL HOST SERVER|Warn|THREAD_ID:22|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|Rabbit MQ connection shutdown. queue: es-verification-result_v008 Initiator: Library, Cause: System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

   at RabbitMQ.Client.Impl.InboundFrame.ReadFrom(NetworkBinaryReader reader)

   at RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()

   at RabbitMQ.Client.Framing.Impl.Connection.MainLoop(), Reply Code: 541, Reply Text: Unexpected Exception

 

And EditorRouterService.log:

2021-06-11 02:08:15.6754|LOCAL HOST SERVER|Error|THREAD_ID:12|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|EX:System.TimeoutException: The operation has timed out.

   at RabbitMQ.Util.BlockingCell.GetValue(TimeSpan timeout)

   at RabbitMQ.Client.Impl.SimpleBlockingRpcContinuation.GetReply(TimeSpan timeout)

   at RabbitMQ.Client.Impl.ModelBase.ModelRpc(MethodBase method, ContentHeaderBase header, Byte[] body)

   at RabbitMQ.Client.Framing.Impl.Model._Private_ChannelOpen(String outOfBand)

   at RabbitMQ.Client.Framing.Impl.AutorecoveringConnection.CreateNonRecoveringModel()

   at RabbitMQ.Client.Framing.Impl.AutorecoveringConnection.CreateModel()

   at Sdl.Services.Common.MessageQueue.RabbitMQ.RabbitMqConnection.CreateModel()

   at Sdl.Services.Common.MessageQueue.RabbitMQ.RabbitMqMetricMonitor.ReportMetricCallback(Object state)|

And FPSexe.log:

2021-06-11 02:05:27.5180|LOCAL HOST SERVER|Warn|THREAD_ID:15|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|Rabbit MQ connection shutdown. queue: q-convert-to-sdlxliff Initiator: Library, Cause: System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

   at RabbitMQ.Client.Impl.InboundFrame.ReadFrom(NetworkBinaryReader reader)

   at RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()

   at RabbitMQ.Client.Framing.Impl.Connection.MainLoop(), Reply Code: 541, Reply Text: Unexpected Exception

2021-06-11 02:05:27.5493|LOCAL HOST SERVER|Warn|THREAD_ID:15|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|Disconnected from RabbitMQ : q-convert-to-sdlxliff

2021-06-11 02:05:42.1274|LOCAL HOST SERVER|Error|THREAD_ID:8|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|EX:System.IO.IOException: connection.start was never received, likely due to a network timeout

   at RabbitMQ.Client.Framing.Impl.Connection.StartAndTune()

   at RabbitMQ.Client.Framing.Impl.Connection.Open(Boolean insist)

   at RabbitMQ.Client.Framing.Impl.AutorecoveringConnection.RecoverConnectionDelegate()|

2021-06-11 02:05:57.6587|LOCAL HOST SERVER|Error|THREAD_ID:8|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|EX:System.IO.IOException: connection.start was never received, likely due to a network timeout

   at RabbitMQ.Client.Framing.Impl.Connection.StartAndTune()

   at RabbitMQ.Client.Framing.Impl.Connection.Open(Boolean insist)

   at RabbitMQ.Client.Framing.Impl.AutorecoveringConnection.RecoverConnectionDelegate()|

2021-06-11 02:09:18.2854|LOCAL HOST SERVER|Warn|THREAD_ID:6|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|Worker Id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx PID 7064 STOP was requested by the FPS service.

2021-06-11 02:09:28.8792|LOCAL HOST SERVER|Warn|THREAD_ID:1|TR_ID:|The task for Metrics Http Endpoint has not completed. Listener will not be disposed

2021-06-11 02:09:49.6763|LOCAL HOST SERVER|Error|THREAD_ID:1|TR_ID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx|EX:System.TimeoutException: The operation has timed out.

   at RabbitMQ.Util.BlockingCell.GetValue(TimeSpan timeout)
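For anyone triaging similar logs, the entries above appear to follow a pipe-delimited layout (timestamp, host, level, thread ID, transaction ID, message), with stack-trace lines as continuations. The following is a minimal Python sketch, assuming that layout holds for all GroupShare service logs; the names `LogEntry`, `parse_log_line` and `count_by_level` are hypothetical helpers, not part of any SDL tooling.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: datetime
    host: str
    level: str
    thread_id: str
    transaction_id: str
    message: str


def parse_log_line(line: str) -> Optional[LogEntry]:
    """Parse one pipe-delimited log line; return None for continuation lines."""
    parts = line.rstrip("\n").split("|", 5)
    if len(parts) < 6:
        # Stack-trace lines such as "   at RabbitMQ..." carry no header fields.
        return None
    stamp, host, level, thread, transaction, message = parts
    try:
        # %f accepts one to six fractional digits, so ".6771" parses as-is.
        timestamp = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    except ValueError:
        return None
    return LogEntry(
        timestamp=timestamp,
        host=host,
        level=level,
        thread_id=thread.split(":", 1)[-1],
        transaction_id=transaction.split(":", 1)[-1],
        message=message.strip(),
    )


def count_by_level(lines: Iterable[str]) -> Counter:
    """Count parsed entries per log level (for example Error or Warn)."""
    entries = (parse_log_line(line) for line in lines)
    return Counter(entry.level for entry in entries if entry is not None)
```

Feeding the lines of a log file to `count_by_level` gives a quick view of how many errors and warnings were written around the time the server became unresponsive.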

  • Hi James,

    Thank you for your post!

    I have a couple of questions on this matter. If you check the dbo.Entries table in SQL -> SDL Logs database, how many entries does it contain?

    Regarding the issue you describe, we have identified a bug, CRQ-23414, that causes this behavior.

    If the dbo.Entries table in the SDL Logs database contains a large number of entries, it needs to be cleared. Also, if the .ldf file of the SDL Logs database is growing rapidly, it needs to be shrunk. Please create a backup before making any changes to the database.

    Please let us know how many entries you have in the dbo.Entries table of the SDL Logs database, and we will advise on the next steps to solve the issue. Also, kindly note that with an active SMA we can help much faster, and Support could organize a remote session with you to solve the issue ASAP.

    Best regards,

    Ingrid Briscan
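
The row count requested above can be read with a single query. The following is a minimal Python sketch, assuming a DB-API 2.0 connection to the SDL Logs database has already been opened with a suitable SQL Server driver (the choice of driver and the connection details are assumptions and are not shown); `count_log_entries` is a hypothetical helper name.

```python
# Plain row count on the table named in the thread.
COUNT_ENTRIES_SQL = "SELECT COUNT(*) FROM dbo.Entries"


def count_log_entries(connection) -> int:
    """Return the number of rows in dbo.Entries.

    `connection` is assumed to be any DB-API 2.0 connection that is already
    pointed at the SDL Logs database.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(COUNT_ENTRIES_SQL)
        return int(cursor.fetchone()[0])
    finally:
        cursor.close()
```

On a table with tens of millions of rows a full count may take a while, so it is probably best run outside peak hours.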

  • Ingrid,

    Thank you. We have checked the logs on the SQL server: the dbo.Entries table contains 94 million entries.

    Can you give us the procedure to clear these, please? Is there a KB article on this matter?

    Thank you,

    James

  • Hi,

    Before proceeding, please back up the SDL Logs database. You will need to periodically clear the dbo.Entries table in that database with the SQL command "TRUNCATE TABLE dbo.Entries", which removes all rows from the table.

    You could also shrink the .ldf file of the same database.

    According to our development team, this issue will be fixed in GroupShare 2020 CU5.

    BR,

    Ingrid

  • Dear Ingrid,

    Could you provide a bit more information on the fix you mention? I have an installation where we see the same problem: TMContaner.log.ldf grew to 14 GB for a small installation of 10 users and a couple of hundred thousand TUs.

    Could you advise how frequently (at what interval) "TRUNCATE TABLE dbo.Entries" should be executed?

    Would shrinking the .ldf file be your first resort in such a situation, or would you go straight to "TRUNCATE TABLE dbo.Entries"?

    Thank you, Simon

  • Hi Simon,

    Thank you for your question. If you wish, you may log a support case and we can assist you directly with solving this issue. There are a couple of steps to go through.

    I think that the best way to approach this is to set up some scheduled tasks in order to:

    1. Create a backup of the SDL Logs database (a full backup if possible)

    2. Stop the SDL Logging Service

    3. Shrink the .LDF file of the SDL Logs database

    4. Clean up the dbo.Entries table using the TRUNCATE TABLE command (it may be best to do this manually)

    5. Start the SDL Logging Service

    The frequency of the cleanup depends on how quickly entries accumulate in the table.

    BR,

    Ingrid
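
The five steps above can be sketched as an ordered list of commands. This is a minimal Python sketch that only builds the command text and executes nothing; the database name, logical log file name, backup path, service name and target size are all placeholder assumptions that must be replaced with the real values from your installation. `build_cleanup_plan` is a hypothetical helper, not an SDL tool.

```python
from typing import List, Tuple


def quote_identifier(name: str) -> str:
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Quote a Unicode string literal for T-SQL, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def build_cleanup_plan(
    database: str,        # assumed database name, e.g. "SDL Logs"
    backup_path: str,     # assumed backup target on the SQL Server host
    log_file_name: str,   # assumed logical name of the .ldf file
    service_name: str,    # assumed Windows service name
    target_log_mb: int = 1024,  # assumed target size for the shrunk log file
) -> List[Tuple[str, str]]:
    """Return (kind, command) pairs in the order given in the thread."""
    db = quote_identifier(database)
    return [
        # 1. Back up the database before changing anything.
        ("sql", f"BACKUP DATABASE {db} TO DISK = {quote_literal(backup_path)};"),
        # 2. Stop the logging service so it does not write during cleanup.
        ("shell", f'net stop "{service_name}"'),
        # 3. Shrink the transaction log file.
        ("sql", f"USE {db}; DBCC SHRINKFILE ({quote_literal(log_file_name)}, {int(target_log_mb)});"),
        # 4. Remove all rows from the log table.
        ("sql", f"USE {db}; TRUNCATE TABLE dbo.Entries;"),
        # 5. Start the logging service again.
        ("shell", f'net start "{service_name}"'),
    ]


if __name__ == "__main__":
    plan = build_cleanup_plan(
        "SDL Logs", r"D:\Backup\SDLLogs.bak", "SDL Logs_log", "SDL Logging Service"
    )
    for kind, command in plan:
        print(f"{kind}: {command}")
```

Printing the plan first and running each command by hand keeps the TRUNCATE step manual, in line with the suggestion in step 4.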

    Honestly, it is not good that we have to do this ourselves. There should be a built-in procedure within the console.

    When we subscribe to GS, we don't want to have to hire an MS SQL professional on top of it.

    I hope this will be taken care of in the future.

  • Hi Vincent,

    The issue will be fixed in GroupShare 2020 SR1 CU5.

    BR,

    Ingrid

  • Hi Ingrid, 

    Thank you for the above steps.

    The procedure seems clear enough. I'll check the instance to see if and how we can manage this.
    If we anticipate issues, we will also open a support ticket.

    BR, Simon
