Message dropped with reliability set ...
Robert Tseng posted on Wednesday, August 04, 2010 - 03:07 am
My system is finally designed and up and running; however, I am noticing dropped messages.

My system requires synchronous transfer between two components, so I implemented a request/reply pattern in my message scheme on top of DDS so that the two components never run ahead of each other. (Right now everything is on the same machine.)

If one does run ahead, the system seizes up because the publisher is waiting on the subscriber's reply. What I have noticed is that my system seizes up after a random number of messages have been sent. It may run for 3 cycles, 100 cycles, or even 5000 cycles before seizing up because the sender is waiting for a reply from the subscriber. And it is always the same subscriber, for reasons unknown.
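One way to turn a silent stall like this into a visible event is to wait for the reply with a deadline. The following is only a sketch using standard C++ primitives; `ReplyWaiter`, `notify_reply` and `wait_for_reply` are hypothetical names and are not part of DDS or OpenSplice.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper (not a DDS API): blocks the requester until a reply
// is signalled or a deadline passes, so a lost reply shows up as a timeout
// instead of an indefinite hang.
class ReplyWaiter {
public:
    // Called from whatever code path receives the reply.
    void notify_reply() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            reply_seen_ = true;
        }
        cv_.notify_all();
    }

    // Returns true if a reply arrived within `timeout`, false otherwise.
    // The flag is cleared on success so the waiter can be reused each cycle.
    bool wait_for_reply(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        bool got = cv_.wait_for(lock, timeout, [this] { return reply_seen_; });
        if (got) {
            reply_seen_ = false;
        }
        return got;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool reply_seen_ = false;
};
```

When `wait_for_reply` returns false, the requester can log the cycle number and the reader state at that moment, which helps narrow down where the reply was lost.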

I went ahead and set the topic's QoS to RELIABLE_RELIABILITY_QOS (durability is set to volatile), and did the same for both the writer's and the reader's QoS. However, this does not seem to resolve the problem. The reply from the subscriber never makes its way back to the publisher.

Am I misconfiguring the settings again, like forgetting to turn autodispose off so that late joiners receive messages?


When I set durability to TRANSIENT_DURABILITY_QOS, the system consistently sends 300 rounds of messages before memory allocation fails in the datawriter. My hunch is that messages are saved under TRANSIENT, so message loss is mitigated, but somehow memory runs out. Are there settings for resources? It's weird that I get a hard memory failure.
Robert Tseng posted on Wednesday, August 04, 2010 - 06:34 pm
I did some digging and read that the observed message dropping is actually a previous entry being overwritten by a newer entry before the app has a chance to read it. Useful information, thanks to this forum. Setting durability to transient and depth to 10 seems to have resolved the issue.
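The overwrite effect can be illustrated with a small model of a KEEP_LAST history for a single instance. This is a simplified stand-in written for illustration, not OpenSplice code; the class name `KeepLastHistory` and its methods are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Simplified model of a KEEP_LAST history for one instance.
// It only captures the rule "when full, the newest sample pushes out the oldest".
class KeepLastHistory {
public:
    explicit KeepLastHistory(std::size_t depth) : depth_(depth) {
        assert(depth > 0);
    }

    // Stores a sample; returns true if an unread sample had to be discarded.
    bool write(int sample) {
        bool overwritten = false;
        if (samples_.size() == depth_) {
            samples_.pop_front();
            overwritten = true;
        }
        samples_.push_back(sample);
        return overwritten;
    }

    // Hands all stored samples to the application and empties the history.
    std::vector<int> take() {
        std::vector<int> out(samples_.begin(), samples_.end());
        samples_.clear();
        return out;
    }

private:
    std::size_t depth_;
    std::deque<int> samples_;
};
```

With a depth of 1, two writes before a `take()` leave only the second sample, which looks like a dropped message to the application even though delivery was reliable. With a depth of 10, both samples survive.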

I am still interested in learning how to set a larger memory pool for the app/broker. My original depth was 10,000 for the maximum and 1,000 at each pub/sub. Each message should be no bigger than 2-3K, so I am surprised the datawriter ran out of memory.
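A rough worst-case estimate helps put those depths in perspective. The sketch below multiplies depth, sample size and instance count; it counts payload only and ignores per-sample metadata and allocator overhead, which are implementation specific. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case payload bytes retained if every history slot is full.
// Payload only: bookkeeping and allocator overhead are not modelled.
inline std::uint64_t worst_case_history_bytes(std::uint64_t depth,
                                              std::uint64_t max_sample_bytes,
                                              std::uint64_t instances) {
    assert(instances > 0);
    return depth * max_sample_bytes * instances;
}
```

Taking 3 KB as 3 * 1024 bytes and one instance, a depth of 10,000 comes to 30,720,000 bytes (about 30 MB) and a depth of 1,000 to 3,072,000 bytes (about 3 MB) per history. Whether that fits depends on how much memory the middleware has been configured to use.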
Robert Tseng posted on Wednesday, August 04, 2010 - 06:41 pm
Actually, I ran into the same memory problem again, now later in the simulation. Eventually the system runs out of resources and splice.exe/ddsdatabase.dll errors out.

Each sub/pub reuses the same message instance I registered, and the key does not change. I thought a depth of 10 means it will only keep the last 10. I am not sure why memory eventually fails.
Robert Tseng posted on Thursday, August 05, 2010 - 02:59 am
This part may be specific to OpenSplice. I have discovered the location of the memory growth. It may be a memory fragmentation issue, because I ran a memory debugging tool and nothing suggests a constant memory leak.

The issue is that I may have designed an overly complicated message structure that no one has tested yet. What I have is a sequence within a sequence. Here is skeletal code for my message definition. (I have modified the names, so I am not sure the syntax is still valid.) It is basically a variable-length list of sets that are themselves variable-length, i.e. {(1,2,3);(1,2);(2,4,6,6), ... }

struct Hdr
{
/* base types */
};

struct SegmentHdr
{
/* base types */
};
struct Detections
{
/* base types */
};
typedef sequence<Detections> DetectionsList;


struct ReportInfo
{
/* base types */
};

struct ExtendedReport
{
/* base types */
};

struct ASet
{
unsigned short NumElement;
sequence<unsigned short> ElementId;
};
typedef sequence<ASet> ListOfSets;


struct ReportMessage
{
Hdr MessageHeader;
SegmentHdr SegmentHeader;

unsigned short ReporterId;
unsigned long ProcessIndex;
unsigned long ReportIndex;
char ReportType;
ReportInfo Info;
boolean ExtendedInfoExist;
ExtendedReport InfoExtended;
unsigned short NumDetection;
DetectionsList List;
unsigned short NumSet;
ListOfSets SetList;
};
#pragma keylist ReportMessage ReporterId


The field of interest is SetList in the ReportMessage structure. If I abstain from sending this field, the growth disappears and my app does not run out of heap space.


Originally I suspected the DDS_DCPSUVLSeq<t,>::length function in uvl.h, because it allocates a new buffer if the existing one is too small. However, this is not it. (I tested this by providing a maximum length initially.)

The memory allocation fails in the generated XXXXSplDcps.cpp when calling c_bool XXXXX__copyIn(c_base base, struct *from, struct *to):

dest0 = (c_ushort *)c_newArray(c_collectionType(type0),length0);

returns dest0 = 0;

I can work around this with a different messaging design, but hopefully this will help someone.
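One possible alternative design, offered only as a sketch and not necessarily the workaround the poster had in mind, is to flatten the list of sets into two plain sequences: all element ids back to back, plus one length per set. The type and function names below are hypothetical, and the `unsigned short` element type is an assumption based on the `c_ushort` cast in the generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element type is an assumption; adjust to match the real IDL.
using Set = std::vector<unsigned short>;

// Flat layout: every element id in order, plus the length of each set.
struct FlatSets {
    std::vector<unsigned short> element_ids;
    std::vector<unsigned short> set_lengths;
};

// Converts a list of variable-length sets into the flat layout.
FlatSets flatten(const std::vector<Set>& sets) {
    FlatSets flat;
    for (const Set& s : sets) {
        assert(s.size() <= 65535);  // each length must fit in unsigned short
        flat.set_lengths.push_back(static_cast<unsigned short>(s.size()));
        flat.element_ids.insert(flat.element_ids.end(), s.begin(), s.end());
    }
    return flat;
}

// Rebuilds the original list of sets on the receiving side.
std::vector<Set> unflatten(const FlatSets& flat) {
    std::vector<Set> sets;
    std::size_t pos = 0;
    for (unsigned short len : flat.set_lengths) {
        assert(pos + len <= flat.element_ids.size());
        sets.emplace_back(flat.element_ids.begin() + pos,
                          flat.element_ids.begin() + pos + len);
        pos += len;
    }
    return sets;
}
```

For {(1,2,3);(1,2);(2,4,6,6)} this yields the ids 1,2,3,1,2,2,4,6,6 and the lengths 3,2,4, so the message carries two single-level sequences instead of a sequence within a sequence.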
Angelo Corsaro posted on Thursday, August 05, 2010 - 08:44 am
Hello Robert,

Can you re-post your email on the OpenSplice DDS Developer list (developer@opensplice.org)? That is the right forum for this kind of question.

Something for you to check quickly: are you sure that the key on the ReportMessage (namely the ReporterId field) takes a bounded number of values?

Meaning, if you keep incrementing it, you are going to create new instances, which would explain the growth in memory. In that case you are not experiencing a bug in OpenSplice but an issue with your model.
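A quick way to check this is to record the ReporterId values actually written and count the distinct ones, since each distinct key value corresponds to one instance with its own history. The helpers below are a hypothetical sketch for that check and are not part of any DDS API.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Number of distinct key values, and so instances, in a stream of ReporterId values.
std::size_t count_instances(const std::vector<unsigned short>& reporter_ids) {
    std::unordered_set<unsigned short> seen(reporter_ids.begin(), reporter_ids.end());
    return seen.size();
}

// Worst-case samples retained under KEEP_LAST: one history of `depth` per instance.
std::size_t max_retained_samples(const std::vector<unsigned short>& reporter_ids,
                                 std::size_t depth) {
    assert(depth > 0);
    return count_instances(reporter_ids) * depth;
}
```

With a fixed key, 1,000 writes at depth 10 retain at most 10 samples. With a key that increments on every write, the same 1,000 writes create 1,000 instances and can retain up to 10,000 samples.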

Anyway, re-post your questions on the dev-list and we'll pick up from there.

Cheers,
Angelo