TRANSIENT worked as expected, PERSIST...
Vincent David posted on Thursday, March 18, 2010 - 05:55 pm
Hi,

Using OpenSplice DDS V4.3-x86.win32-vs2008-HDE, I can't get PERSISTENT to work as I _expected_, even though TRANSIENT works like a charm (but obviously does not outlive the OpenSplice infrastructure):

- with TRANSIENT, late-joining readers always get the latest sample of every instance as long as the OpenSplice infrastructure is up and running; after restarting the infrastructure, readers don't get anything => OK,

- with PERSISTENT, late-joining readers also get the latest sample of every instance as long as the OpenSplice infrastructure is up and running; but after restarting the infrastructure, readers get the instances with seemingly random samples (almost always the first sample), which is not what I need => I need the latest samples.

My first question is: which samples should I get?

I expect (and need) to get the latest, but as mentioned above I get the first ones.


Next, I'll describe the test procedure below.

Topic QoS: RELIABLE, TRANSIENT (or PERSISTENT), service_cleanup_delay = 60s, max_samples_per_instance = 1

DataReader QoS: RELIABLE, TRANSIENT (or PERSISTENT)

DataWriter QoS: RELIABLE, TRANSIENT (or PERSISTENT)

I always switch the durability for everyone, e.g. when I set the Topic QoS to PERSISTENT, the DataReader and DataWriter QoS durability are also set to PERSISTENT.

Unless otherwise noted, the test is performed within the 60 s time-frame, meaning that the cleanup of instances is NOT the problem in the test.
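In the DCPS C++ API, those settings would be applied roughly as in the sketch below. This is a hand-written illustration, not code from the attached test: the variables `participant`, `publisher` and `subscriber` are assumed to already exist, and the field and constant names follow the standard DCPS C++ PSM.

```cpp
// Sketch of the QoS settings described above (DCPS C++ API).
// 'participant', 'publisher' and 'subscriber' are assumed to exist.
DDS::TopicQos topic_qos;
participant->get_default_topic_qos(topic_qos);
topic_qos.reliability.kind = DDS::RELIABLE_RELIABILITY_QOS;
topic_qos.durability.kind  = DDS::PERSISTENT_DURABILITY_QOS; // or TRANSIENT_DURABILITY_QOS
topic_qos.durability_service.service_cleanup_delay.sec     = 60;
topic_qos.durability_service.service_cleanup_delay.nanosec = 0;
topic_qos.durability_service.max_samples_per_instance      = 1;

DDS::DataWriterQos writer_qos;
publisher->get_default_datawriter_qos(writer_qos);
writer_qos.reliability.kind = DDS::RELIABLE_RELIABILITY_QOS;
writer_qos.durability.kind  = DDS::PERSISTENT_DURABILITY_QOS;

DDS::DataReaderQos reader_qos;
subscriber->get_default_datareader_qos(reader_qos);
reader_qos.reliability.kind = DDS::RELIABLE_RELIABILITY_QOS;
reader_qos.durability.kind  = DDS::PERSISTENT_DURABILITY_QOS;
```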

The .idl looks like this:

module mymodule
{
    struct mystruct
    {
        unsigned long mykey;
        unsigned long myval;
    };
    #pragma keylist mystruct mykey
};

I have a single writer and multiple readers started at specific times.

The readers read instances and display them, and also display when an instance is disposed. The writer does nothing apart from writing (it doesn't even display what it wrote).

I use the following acronyms:
- W: writer
- R1, R2, etc.: 1st reader, 2nd reader, etc.
- D1, D2: 1st set of data, 2nd set of data

Here is the test:
1- start R1,
2- start W,
3- W writes D1, i.e. 5 samples with keys 0 to 4 (thus 5 instances), filled with random values,
4- W writes D2, i.e. updates the 5 instances with random values,
5- start R2,
6- close W,
7- start R3,
8- close all,
9- restart ospl,
10- start R4,
11- wait 60s,
12- start R5.

What I expect to happen (I only mention the relevant steps):
3- upon write of D1 by W : R1 reads D1
4- upon write of D2 by W : R1 reads D2
5- upon start of R2 : R2 reads D2
6- upon close of W : R1 and R2 detect disposal
7- upon start of R3 : R3 reads D2 which is disposed but not cleaned yet (still within the 60 sec time-frame),
10- upon start of R4 : nothing should happen with TRANSIENT, R4 should read D2 with PERSISTENT,
12- upon start of R5 : nothing happens since all instances were cleaned after 60 sec.

What happens:
- everything above happens, apart from:
10- with PERSISTENT: R4 reads D1! (note: with TRANSIENT nothing happened in step 10, as expected)

The weird thing is that both the TRANSIENT and PERSISTENT modes correctly read the latest data when readers join after the write, i.e. in steps 5 and 7, but the PERSISTENT mode magically changes its mind when the OpenSplice infrastructure is restarted.

All data is retrieved by listeners via the on_data_available callback (C++).
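A minimal sketch of such a listener, assuming the type names generated from the IDL above (mymodule::mystructDataReader, mymodule::mystructSeq); error handling and the other mandatory DataReaderListener callbacks are omitted:

```cpp
#include <iostream>

// Sketch of a reader listener for the mystruct topic (DCPS C++ API).
class MyListener : public virtual DDS::DataReaderListener
{
public:
    virtual void on_data_available(DDS::DataReader_ptr reader)
    {
        mymodule::mystructDataReader_var r =
            mymodule::mystructDataReader::_narrow(reader);
        mymodule::mystructSeq samples;
        DDS::SampleInfoSeq infos;
        r->take(samples, infos, DDS::LENGTH_UNLIMITED,
                DDS::ANY_SAMPLE_STATE, DDS::ANY_VIEW_STATE,
                DDS::ANY_INSTANCE_STATE);
        for (DDS::ULong i = 0; i < samples.length(); ++i) {
            if (infos[i].valid_data) {
                std::cout << "key=" << samples[i].mykey
                          << " val=" << samples[i].myval << std::endl;
            } else if (infos[i].instance_state ==
                       DDS::NOT_ALIVE_DISPOSED_INSTANCE_STATE) {
                std::cout << "instance disposed" << std::endl;
            }
        }
        r->return_loan(samples, infos);
    }
    // The remaining DataReaderListener callbacks (on_requested_deadline_missed,
    // on_liveliness_changed, ...) must also be implemented; empty bodies
    // suffice for this test and are omitted here.
};
```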

Just in case it makes any difference, here are the sets of randomly generated data:

D1 :
key=0 val=1
key=1 val=7
key=2 val=4
key=3 val=0
key=4 val=9

D2 :
key=0 val=4
key=1 val=8
key=2 val=8
key=3 val=2
key=4 val=4

I'm either expecting the wrong results or performing the test incorrectly; either way, I would be glad for any help.

Regards,
Vincent
Niels Kortstee posted on Friday, March 19, 2010 - 10:18 am
Hi Vincent,

Did you configure a persistent store directory in the OpenSplice configuration file? By default there is none, which causes PERSISTENT to behave as TRANSIENT.

Regards, Niels
Vincent David posted on Friday, March 19, 2010 - 10:33 am
Hi Niels,

Yes, I did configure one by adding the following lines to ospl.xml under the DurabilityService element:

<persistent>
    <storedirectory>D:/TMP/DDS/Pdata</storedirectory>
</persistent>

The folder "D:/TMP/DDS/Pdata" is writable. I checked its contents while OpenSplice is running and the generated XML files (mystruct_topic.xml, mystruct_topic_meta.xml, ...) seem correct; in particular, the QoS match what I set in the code.

I clean the contents of "D:/TMP/DDS/Pdata" every time I test another QoS, in order to make sure existing recorded files do not conflict with new content.

Thanks,
Vincent
Niels Kortstee posted on Friday, March 19, 2010 - 10:49 am
Hi Vincent,

You were saying:
"They get random samples of the instance (almost always the first sample), which is not what I need => I need the latest samples."

Does this mean you sometimes do get the latest samples? How long after writing D2 do you terminate ospl? Since data is written to disk asynchronously, part of it may not have reached the disk yet when you terminate ospl. If that's the case, you should see the following message in your ospl-info.log (in the ospl install dir):
"Durability service terminating but 'x' persistent messages remain in the queue and are not stored on disk."

Regards, Niels
Vincent David posted on Friday, March 19, 2010 - 03:58 pm
Hi again Niels,

In some old tests I did occasionally not get the first samples, but that happened back when I was toying a lot with QoS settings, which crashed pretty often (especially when I forgot to clean the persistence folder); I did not get the latest samples either, just random ones.

Now that my code is stable, the readers always get the first samples when I restart the OpenSplice infrastructure, no matter how many times I restart it, how many readers/writers I run, or how many times I clean the persistence folder.

After writing D2, ospl stays up for a few seconds (I'd say ~15 s) because I still have time to run a few readers in my test procedure. I checked the log file you mentioned and it does not contain that message. I also opened the data file in the persistence folder: the latest samples are there, which means ospl is not stopped "too soon". However, the file also contains the first samples, and I'm unsure whether they should be there, since I told the Topic QoS to keep at most 1 sample per instance.

In case it helps in understanding my problem, I've attached the source code + .idl that I test. In fact I just modified the C++ PingPong DCPS example and wrote my own writer/reader, even though it is not a 'pingpong' program anymore. The writer is Ping and the reader is Pong.

Assuming one already has the DCPS examples extracted on disk, testing this code is as simple as overwriting ping.cpp, pong.cpp and pingpong.idl located in examples\dcps\standalone\C++\PingPong and rebuilding the project. Then run Pong.exe for the reader and Ping.exe for the writer.

Attachment: pingpong.zip (3.9 k), source code and .idl based on the PingPong DCPS example


Thanks,
Vincent
Niels Kortstee posted on Monday, March 22, 2010 - 05:15 pm
Hi Vincent,

Just been looking at your code and found what the issue is. You've set resource_limits.max_samples_per_instance to 1. This causes the 2nd sample you write for an instance to be 'rejected', and therefore you get the 1st sample for every instance when you restart. You probably only want the last sample for every instance, and the QoS policy to use for that is the history QoS policy. By setting it to KEEP_LAST with a depth of 1 (this is the default, btw), the service will only keep the last sample per instance.

So simply leave the resource_limits and history in the Topic QoS set to their defaults (respectively unlimited and KEEP_LAST with depth 1), and you'll get your desired behaviour.
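The difference between the two policies can be illustrated without DDS at all: under KEEP_ALL semantics with max_samples_per_instance = 1, the second write for an instance is rejected and the first value survives, while under KEEP_LAST with depth 1 the new value overwrites the old. A self-contained sketch (plain C++, no OpenSplice; the map is a toy stand-in for the durability store):

```cpp
#include <cassert>
#include <map>

// Toy model of a per-instance durability store: each write carries a
// (key, value) pair; the store keeps at most one sample per key.
typedef std::map<unsigned long, unsigned long> Store;

// KEEP_ALL + max_samples_per_instance = 1: a second sample for an
// existing key is rejected, so the FIRST sample survives.
void write_keep_all_limit1(Store& store, unsigned long key, unsigned long val)
{
    if (store.find(key) == store.end())
        store[key] = val;   // accepted: instance not yet present
    // else: rejected, resource limit for this instance reached
}

// KEEP_LAST, depth 1: a new sample replaces the old one, so the
// LAST sample survives.
void write_keep_last_depth1(Store& store, unsigned long key, unsigned long val)
{
    store[key] = val;       // overwrite previous sample
}
```

With Vincent's data for key 0 (D1 writes val=1, D2 writes val=4), the first store ends up holding 1 and the second 4, which matches R4 reading D1 under the limit-of-1 setting.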

Best regards, Niels
Vincent David posted on Tuesday, March 23, 2010 - 11:07 am
Hi Niels,

I looked closely at your solution; unfortunately, it did not solve the problem.

The max_samples_per_instance that I set to 1 is the one from durability_service, not the one from resource_limits. In any case, whether I set max_samples_per_instance to 1 on durability_service, on resource_limits, on both, or on neither, I get the same results: always the last samples while the OpenSplice infrastructure is up, and always the first samples after restarting it.

What remains a mystery (yet a misery) to me is why a given configuration, whatever it is, would behave differently while the OpenSplice infrastructure is up and after restarting it. As far as I know, the PERSISTENT durability kind exists solely to provide the same behavior in both cases, granted that hard-disk usage (speed and size) can support it.

Thanks,
Vincent
Vincent David posted on Tuesday, March 23, 2010 - 04:15 pm
I finally managed to find the problem. In fact, durability_service.history_kind is not set by default. The solution is as simple as adding the following line to the Topic QoS:

qos.durability_service.history_kind = KEEP_LAST_HISTORY_QOS;
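For completeness, a sketch of what the full Topic QoS setup looks like with this workaround applied (my own illustration, not the attached code; `participant` is assumed to exist, and history_depth is set explicitly alongside the kind):

```cpp
// Topic QoS with the workaround applied (DCPS C++ API sketch).
DDS::TopicQos qos;
participant->get_default_topic_qos(qos);
qos.reliability.kind = DDS::RELIABLE_RELIABILITY_QOS;
qos.durability.kind  = DDS::PERSISTENT_DURABILITY_QOS;
qos.durability_service.service_cleanup_delay.sec = 60;
// The workaround: make the durability service keep only the last
// sample per instance, instead of relying on the unset default.
qos.durability_service.history_kind  = DDS::KEEP_LAST_HISTORY_QOS;
qos.durability_service.history_depth = 1;
```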

I'm pretty sure I checked this value in my debugger some time ago, but maybe that was back when I toyed a lot with QoSes.

I'm also pretty sure that I should not have to initialize this value by myself.

Well, now it works as expected, both in TRANSIENT and PERSISTENT modes.

Many thanks Niels for your support.

Best regards,
Vincent
Hans van 't Hag posted on Tuesday, March 23, 2010 - 05:16 pm
Hi Vincent,

May I suggest using the product-specific mailing lists (see http://www.opensplice.org/cgi-bin/twiki/view/Community/MailingLists#developer_opensplice_org) for your technical questions.

W.r.t. your issue, I suspect that your problem is more related to not properly initializing your QoS structures, which is actually required; otherwise you can get random values for the parameters you don't explicitly set (such as the history_kind).

Regards,
Hans
Vincent David posted on Wednesday, March 24, 2010 - 10:00 am
Hi Hans,

Thank you for the mailing list link. I'll make sure to subscribe ASAP.

As for the QoS initialization, I always call functions such as get_default_topic_qos to fill the structures before using them, but for some unknown reason the history_kind is not set.

Regards,
Vincent
Lucia posted on Wednesday, July 14, 2010 - 01:04 pm
Hi Vincent,

I have the same problem with my application, in Java...

My data are persistent. I would like to read the latest samples when I restart the OpenSplice services... but it doesn't work as I expect... the subscriber gets the first samples.

Could you please summarize all "your" policies for the topic, reader and writer, so that I can check whether I set mine correctly?

Thank you in advance.

Regards,
Lucia