This is a question on best practices for ensuring clients can always find new/modified records.
I see that when a row is modified, the Kinvey metadata shows the last modified time to second (not millisecond) accuracy. That’s OK.
1) Is there a chance that, because of a rollover during the milliseconds the code takes to run, the metadata time could be shown as 00:02:59 while the record was actually inserted into the database at 00:03:00?
2) Is there a chance, due to load balancing, that a record might have its metadata timestamped at 2:59 but not show up in our local instance until later, say 3:00?
These questions revolve around ensuring that a client can accurately ask “What’s new since I last checked?” and never miss anything.
For instance, I’m trying to avoid this scenario:
2:50 Client checks for updates.
2:59 UPDATE requested on a row. (Sets last Modified to 2:59)
3:00 Client checks for updates since 2:50. Nothing
3:00 Row update becomes visible on DB. (Last Modified still at 2:59)
3:10 Client checks for updates since 3:00. Nothing because last update was recorded at 2:59
Is there a better way to check for what’s new than a last modified time?
The chances are pretty low, but I suppose this could theoretically happen. What kind of update/insert frequency do you expect on your app?
If your app's concurrency is high, one workaround is for the client to check for new items since "last_I_checked - Xseconds" and to take care of deduping. It is obviously a little more work for you. In the meantime, we are looking at how we can make timestamp updates truly atomic.
Stephen Dodd said over 9 years ago
I understand that the chances are low. However, this will end up being one of those bugs that hits clients randomly that we'll never be able to track down in production.
I don't mind taking care of de-duping, particularly if it were sporadic. It would be preferable not to get the same update twice each time.
We'll have tens of thousands, possibly hundreds of thousands, of users with unknown but likely high concurrency, meaning that any possibility of failure will eventually be hit.
Can you explain further the "last_I_checked - Xseconds"?
Ivan Stoyanov said over 9 years ago
The idea is to add some overlap between the intervals so that requests in progress will have flushed through the system. The bigger the overlap, the more dupes you'll have; the smaller it is, the higher the chance of your scenario happening. With an overlap of 2s:
2:50 Client checks for updates the first time
3:00 Client checks for updates since 2:48
3:10 Client checks for updates since 2:58
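The overlap-plus-dedupe idea above could be sketched roughly as follows. This is only an illustration, not Kinvey API code: `Row`, `OverlapPoller`, and `poll` are hypothetical names, times are plain integer seconds, and `visible_rows` stands in for whatever the backend currently returns. Note that deduping on (id, last modified) would drop a second change to the same row within the same second, so a real client might compare content or a version field instead.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

OVERLAP_SECONDS = 2  # the overlap suggested in the thread; an assumption to tune


@dataclass
class Row:
    id: str
    last_modified: int  # whole seconds, matching the second-level metadata


@dataclass
class OverlapPoller:
    """Hypothetical client-side poller: overlapping windows plus dedupe."""
    last_checked: Optional[int] = None
    # id -> last_modified value already delivered to the caller
    seen: Dict[str, int] = field(default_factory=dict)

    def poll(self, visible_rows: List[Row], now: int) -> List[Row]:
        # Ask for everything since (last check - overlap); first poll takes all.
        since = None if self.last_checked is None else self.last_checked - OVERLAP_SECONDS
        fresh = []
        for row in visible_rows:
            if since is not None and row.last_modified < since:
                continue  # older than the overlapped window
            if self.seen.get(row.id) == row.last_modified:
                continue  # already delivered in an earlier, overlapping poll
            self.seen[row.id] = row.last_modified
            fresh.append(row)
        self.last_checked = now
        return fresh


if __name__ == "__main__":
    # Replay of the scenario from the question, with 2:50 = 170s, 3:00 = 180s.
    poller = OverlapPoller()
    poller.poll([], now=170)               # 2:50, nothing yet
    poller.poll([], now=180)               # 3:00, update not visible yet
    late = Row("r1", last_modified=179)    # stamped 2:59, visible after 3:00
    print([r.id for r in poller.poll([late], now=190)])  # prints ['r1']
```

With the 2s overlap, the 3:10 poll asks for changes since 2:58 and so still picks up the row stamped 2:59. The sketch only protects against delays shorter than the overlap, and `seen` grows without bound unless old entries are pruned.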
Stephen Dodd said over 9 years ago
Having overlap is certainly possible, though not ideal.
My question is, what is the minimum overlap we could have to guarantee that we won't lose data? What's the longest it could take for a row to show up in a collection?
Do Kinvey servers ever get sharded? Is there a potential for a rare long delay on the order of, say, 10s or a minute or longer?
I'm open to suggestions of a more dependable system for clients to ask "What's new?" This must be a common use case.
Ivan Stoyanov said over 9 years ago
Setting the overlap to 2s is the right thing to do in this case.
In the meantime, it looks like MongoDB, our underlying data store, is adding support for atomic timestamping, so as soon as the release is out, we'll look to change our implementation to that.
Stephen Dodd said over 9 years ago
Thanks. It'll sure be helpful to have atomic timestamping.
In the meantime, I'm worried that even 2s might not be enough in rare scenarios, so I'm going with sequence numbers instead of timestamps. That is, a global sequence number is atomically updated and applied to every row during update. Clients ask for anything newer than the last sequence they received.
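For anyone curious, the sequence-number approach might look roughly like the sketch below. It is an in-memory stand-in rather than Kinvey or MongoDB code: `SequencedStore`, `upsert`, and `changes_since` are hypothetical names, and the single lock is an assumption that makes "bump the counter" and "write the row" one atomic step.

```python
import threading
from typing import Dict, List, Tuple


class SequencedStore:
    """Hypothetical in-memory store that stamps each write with a global sequence."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._seq = 0
        self._rows: Dict[str, Tuple[int, dict]] = {}  # id -> (sequence, data)

    def upsert(self, row_id: str, data: dict) -> int:
        # Increment the counter and store the row under the same lock, so no
        # reader can see a higher sequence while a lower one is still pending.
        with self._lock:
            self._seq += 1
            self._rows[row_id] = (self._seq, dict(data))
            return self._seq

    def changes_since(self, last_seq: int) -> List[Tuple[int, str, dict]]:
        # Return (sequence, id, data) for every row newer than last_seq,
        # ordered by sequence so the client can remember the last one it got.
        with self._lock:
            changed = [
                (seq, row_id, data)
                for row_id, (seq, data) in self._rows.items()
                if seq > last_seq
            ]
        changed.sort(key=lambda item: item[0])
        return changed


if __name__ == "__main__":
    store = SequencedStore()
    store.upsert("a", {"v": 1})
    store.upsert("b", {"v": 2})
    last = store.changes_since(0)[-1][0]   # client remembers the last sequence
    store.upsert("a", {"v": 3})
    print(store.changes_since(last))       # prints [(3, 'a', {'v': 3})]
```

In a real backend the counter and the row write would need the same atomicity from the data store itself. If the sequence is assigned before the row becomes visible, the window this thread is about opens up again.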
If anyone else is doing sequence numbers, you may wish to avoid this edge case scenario outlined and solved here: http://stackoverflow.com/questions/21586779/a-client-walks-into-a-server-and-asks-whats-new-problems-with-sequence-num/21624628#21624628
Stephen Dodd said over 9 years ago
Coming back to this... It was recommended that 2s is a reasonable overlap. Do you know the maximum time that could elapse between posting an item and that item being available in the database? Could sharding or some other problem turn that duration into a long time like 10s or even minutes?