Hi gang!  I'm looking for a little advice.  I'm in the concept phase of a new website that I've been wanting to put together.

 

The data used by the site needs to be able to get sync'd to one or more desktop computers (using stand-alone desktop software) and one or more mobile devices (using Android/iOS apps that I want to develop).

 

I can handle having the desktop/mobile apps contact the server, authenticate themselves and retrieve the data.

 

But I'm trying to figure out how to keep track of the latest version of each DB transaction and deal with conflicts such as two devices changing the same transaction. 

 

My thought so far is to create a "Devices" table and have the user add "Devices" to his/her account with unique names for the devices.  Then as a DB transaction gets added/edited/deleted, I'd have a table listing all transactions that need to be sync'd and which devices still need the update.
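A minimal sketch of that per-device sync-tracking idea might look like the following. The table and column names (`devices`, `pending_sync`, etc.) are my own invention for illustration, and sqlite3 stands in for MySQL:

```python
import sqlite3

# Illustrative schema for the "Devices" + pending-sync idea above.
# Names are assumptions; sqlite3 stands in for MySQL.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE devices (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    name    TEXT NOT NULL,
    UNIQUE (user_id, name)             -- unique device names per account
);
CREATE TABLE transactions (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    amount  REAL NOT NULL
);
CREATE TABLE pending_sync (
    txn_id    INTEGER NOT NULL,        -- one row per (change, device) pair
    device_id INTEGER NOT NULL,
    action    TEXT NOT NULL,           -- 'add' / 'edit' / 'delete'
    PRIMARY KEY (txn_id, device_id)
);
""")

def record_change(user_id, txn_id, action, source_device):
    """Queue a change for every device on the account except its origin."""
    db.execute("""
        INSERT OR REPLACE INTO pending_sync (txn_id, device_id, action)
        SELECT ?, id, ? FROM devices
        WHERE user_id = ? AND id != ?
    """, (txn_id, action, user_id, source_device))

def pull_pending(device_id):
    """Called when a device syncs: fetch, then clear, its queue."""
    rows = db.execute(
        "SELECT txn_id, action FROM pending_sync WHERE device_id = ?",
        (device_id,)).fetchall()
    db.execute("DELETE FROM pending_sync WHERE device_id = ?", (device_id,))
    return rows

# Two devices on one account; a change made on device 1
db.executemany("INSERT INTO devices (id, user_id, name) VALUES (?, ?, ?)",
               [(1, 7, "desktop"), (2, 7, "phone")])
db.execute("INSERT INTO transactions (id, user_id, amount) VALUES (1, 7, 9.99)")
record_change(user_id=7, txn_id=1, action="add", source_device=1)
pending = pull_pending(2)   # the phone still needs this change
```

One design consequence worth noting: every change fans out to N-1 rows, so the `pending_sync` table grows with the number of devices, but it shrinks again as each device syncs and clears its queue.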

 

But I'm wondering if anyone here has any better alternatives.  The devices need to be able to download the data and store it locally.  The devices will also need to be able to edit/add/delete transactions locally while offline.

 

Any and all advice would be greatly appreciated!

 

TIA!

I am not sure if this is what you are looking for, but you could try to implement something like Git to handle the transactions etc. It may take a bit of coding to figure it out, but it has methods to handle syncs with conflicts etc.

 

I have never really looked into it, but I saw Git being used by a program called Syncany and it did make sense.

Thanks Premiso.  I just spent some time reading through some of the docs on Git.  It does sound intriguing.  But as best I can tell, it is an actual software package that has to be installed on the server, and it manages changes to files.

 

I'm a user on a shared hosting service and as such couldn't install Git.  And I would need it to allow the user to "checkout" data from the server's MySQL DB and then deal with any changes on commit--from what I've read, I don't know if it could track the DB changes.

 

At the very least though, I now have an idea of how I want to deal with a DIFFERENT project I've been tossing around in the back of my mind.  :)

It really depends upon the granularity of the updating, and how imperative it is that you get all of the data. In any case, this kind of functionality is really something that has to be planned for from the start, as it can have a major impact on how the database and/or its interface is designed.

 

The simplest way, if you really don't care about bandwidth and such, is to simply download a dump of the database every time. This is a very wasteful and inefficient way, even if it's guaranteed to work, so I don't really recommend it. Might work if the database updates are few and far between, with very little data.

Another option is to timestamp the updates, then have the client send the timestamp of its last record. Select all data inserted after this timestamp, and you'll be OK. Requires a bit more processing from the server, and you might lose a few rows of data if you have two rows inserted at the "same" time. It is, however, guaranteed to pull only what you need, and save a lot of bandwidth.
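The timestamp approach above can be sketched like this, with sqlite3 standing in for MySQL and invented column names. The client remembers the newest `updated_at` it has seen and asks only for rows after it:

```python
import sqlite3

# Timestamp-delta sync sketch. Schema and names are assumptions.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE transactions (
        id         INTEGER PRIMARY KEY,
        amount     REAL,
        updated_at INTEGER NOT NULL   -- unix time of last server write
    )
""")

def changes_since(last_seen):
    """Server side of the API: rows written after the client's timestamp."""
    return db.execute(
        "SELECT id, amount, updated_at FROM transactions "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,)).fetchall()

db.executemany("INSERT INTO transactions VALUES (?, ?, ?)",
               [(1, 5.00, 100), (2, 7.50, 200), (3, 1.25, 300)])

# A client that last synced at t=150 pulls only the two newer rows.
delta = changes_since(150)
```

The "two rows at the same timestamp" problem mentioned above is why many sync designs swap the timestamp for a monotonically increasing sequence number; the query shape stays identical.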

There are other options which include third-party systems, like git, or having the interface store a log which is replayed on the clients.

 

The overall requirement for all of these methods, however, is that you need an API for the clients to utilize, so that they can access the data outside of the normal method (the site itself) and have only the pertinent data returned to them.

It really depends upon the granularity of the updating, and how imperative it is that you get all of the data.

 

I'm creating a site that I HOPE to have monthly subscribers using.  Each subscriber could possibly EVENTUALLY have a few thousand rows of data in the main tables (the ones I'm most concerned with).  I can easily imagine each subscriber account having maybe two desktop apps and two mobile apps and any two could conceivably be accessing the data at the same time--although the chances of it happening at the same time aren't huge and I think the chances of two users editing the same exact row of data at the same time are even less.

 

In any case, this kind of functionality is really something that has to be planned for from the start, as it can have a major impact on how the database and/or its interface is designed.

I TOTALLY agree with you here.  That's exactly why I'm asking about it now--I haven't written any code yet.

 

The simplest way, if you really don't care about bandwidth and such, is to simply download a dump of the database every time. This is a very wasteful and inefficient way, even if it's guaranteed to work, so I don't really recommend it. Might work if the database updates are few and far between, with very little data.

That definitely won't work--TOO MUCH bandwidth AND it would take forever.

Another option is to timestamp the updates, then have the client send the timestamp of its last record. Select all data inserted after this timestamp, and you'll be OK. Requires a bit more processing from the server, and you might lose a few rows of data if you have two rows inserted at the "same" time. It is, however, guaranteed to pull only what you need, and save a lot of bandwidth.

That sounds close to what I was leaning towards.

There are other options which include third-party systems, like git, or having the interface store a log which is replayed on the clients.

I'll have to take another look at that option to see how it plays out.

The overall requirement for all of these methods, however, is that you need an API for the clients to utilize, so that they can access the data outside of the normal method (the site itself) and have only the pertinent data returned to them.

 

Yep--I was already planning to do it as an API.

Ah, very good indeed. :) Don't know why I had the impression that you had the system already, or at least parts of it, but good to hear that you're still in the planning phase.

 

Considering the extra information you've posted above, I think I might have considered a log-replay solution for this one. Keep the log for a certain amount of time, and if a client requests something further back in time, do a (limited) DB dump. Alternatively, keep individual logs for each day/week or whatever else is deemed a reasonable amount of time & data, which will allow you to replay only the changed data without the log growing too large to read from.

Might even be that some database systems already have this log feature, and I just don't know about it. For that bit of knowledge you really should ask a DB expert.
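A rough sketch of the log-replay-with-fallback idea described above, under assumed names and an assumed retention rule (sqlite3 again standing in for MySQL):

```python
import sqlite3

# Log-replay sketch: the server keeps a change log for a retention
# window; a client behind the window gets a (limited) full dump instead.
# Names and the retention mechanism are assumptions, not a fixed design.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE change_log (
    seq    INTEGER PRIMARY KEY AUTOINCREMENT,
    txn_id INTEGER,
    action TEXT        -- 'add' / 'edit' / 'delete'
);
""")

def sync(client_seq):
    """Return ('replay', entries), or ('dump', all rows) if too far behind."""
    oldest = db.execute("SELECT MIN(seq) FROM change_log").fetchone()[0]
    if oldest is not None and client_seq < oldest - 1:
        # The client's cursor fell out of the retained log window.
        return ("dump", db.execute("SELECT * FROM transactions").fetchall())
    entries = db.execute(
        "SELECT seq, txn_id, action FROM change_log WHERE seq > ?",
        (client_seq,)).fetchall()
    return ("replay", entries)

for i in (1, 2, 3):
    db.execute("INSERT INTO transactions VALUES (?, ?)", (i, float(i)))
    db.execute("INSERT INTO change_log (txn_id, action) VALUES (?, 'add')", (i,))
db.execute("DELETE FROM change_log WHERE seq <= 1")  # simulate log pruning

mode_behind, _ = sync(0)          # fell out of the window -> full dump
mode_recent, entries = sync(2)    # still inside it -> replay the tail
```

This is essentially what database replication logs do; MySQL's binary log is the built-in version of the "log feature" speculated about above, though it's not normally exposed to clients on shared hosting.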

  • 2 weeks later...

It would be better if you explained the kind of data and transactions you're dealing with. You mentioned each subscriber may have a few thousand rows; that's not a massive amount to keep synced on a phone if need be. Of course, that's if they only need access to their own data, and not all of it. Personally, I don't understand why it needs to be supported offline anyway; I doubt most of the apps on my phone would even start up without an internet connection.

 

To me it just sounds like you need to have a central API that each device can request/push data from/to. If you're worried about two devices updating the same row in a short period of time, just add in a checksum. The last person to save will be alerted that the data has been updated, shown the new data in comparison, and can then decide what to do.
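The checksum idea above is a form of optimistic concurrency control, and could be sketched like this (schema, function names, and the SHA-256 choice are all illustrative assumptions):

```python
import hashlib
import sqlite3

# Optimistic-concurrency sketch: the client submits the checksum of the
# row as it last saw it; a mismatch means someone else saved first.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE transactions "
           "(id INTEGER PRIMARY KEY, payee TEXT, amount REAL)")
db.execute("INSERT INTO transactions VALUES (1, 'Grocer', 42.00)")

def row_checksum(txn_id):
    """Checksum of the row's current contents."""
    row = db.execute("SELECT payee, amount FROM transactions WHERE id = ?",
                     (txn_id,)).fetchone()
    return hashlib.sha256(repr(row).encode()).hexdigest()

def save(txn_id, payee, amount, expected_checksum):
    """Apply the edit only if the row is unchanged since the client read it."""
    if row_checksum(txn_id) != expected_checksum:
        return False   # conflict: alert the user, show both versions
    db.execute("UPDATE transactions SET payee = ?, amount = ? WHERE id = ?",
               (payee, amount, txn_id))
    return True

# Device A and device B both read the row, then both try to save.
seen_by_a = seen_by_b = row_checksum(1)
ok_a = save(1, "Grocer", 45.00, seen_by_a)   # first save wins
ok_b = save(1, "Grocer", 40.00, seen_by_b)   # second is told of the conflict
```

An incrementing version column would work just as well as a checksum; the point is that the second writer is detected rather than silently overwriting the first.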

Ah, brain storming, I love brain storming ...

 

As someone said, without a description of the data and its uses our suggestions could well be just wasted words ...

 

As it isn't mentioned I assume that you've already chosen SQL as the most suitable sort of storage and that your tables are optimized to the best of MySQL's ability...

 

Another pivoting point of this solution is scalability; even selecting one timestamp in a very well indexed table is going to incur overhead when you reach into the millions of rows. A more scalable approach might be an event table: you basically want to implement a kind of rwlock or mutex, so define events as such. If someone wants to write (insert a write event), they either have to wait to insert, or your logic can decipher whether it will have adverse effects based on the other events in front of it in the queue ... this way the table is only as busy (and by proxy as large, since fired events can be erased) as the website up to the second or so ... depending on use, a read could "block" or not; that's entirely dependent on what the writes are doing and what they are doing it to ... but I think you'll get the gist of what I'm saying ...

 

Also, also, consider the best place for such short-lived data; personally I wouldn't go with SQL for the event table I describe, because at a point that would have trouble scaling too ... I'd go for APC, or if the app has a distributed infrastructure of any kind I'd go with memcache/Couchbase ...

Edited by krakjoe

[Sorry mods, it cut me off; I was scratching my head ...]

 

Better that all this checking go on by moving around blocks of memory than seeking a disk or holding up other connections. It fits with Web2.0, see: having pages change in response to actions rather than reload/redirect. Even if the write were to block, the user can have a pretty animation and something colourful to look at while your solution is working; that's all they want, a bit of action ...

Another pivoting point of this solution is scalability; even selecting one timestamp in a very well indexed table is going to incur overhead when you reach into the millions of rows. A more scalable approach might be an event table: you basically want to implement a kind of rwlock or mutex, so define events as such. If someone wants to write (insert a write event), they either have to wait to insert, or your logic can decipher whether it will have adverse effects based on the other events in front of it in the queue ... this way the table is only as busy (and by proxy as large, since fired events can be erased) as the website up to the second or so ... depending on use, a read could "block" or not; that's entirely dependent on what the writes are doing and what they are doing it to ... but I think you'll get the gist of what I'm saying ...

 

Millions of rows really isn't that much for a well-indexed table. MySQL can handle this no problem. The OP doesn't need to worry about scaling write operations right now; that situation is so far out of sight it's insane to be suggesting it at this stage.

 

Also, also, consider the best place for such short-lived data; personally I wouldn't go with SQL for the event table I describe, because at a point that would have trouble scaling too ... I'd go for APC, or if the app has a distributed infrastructure of any kind I'd go with memcache/Couchbase ...

 

Who said it's short-lived data? And why are you suggesting caching servers WITHOUT a database? That's madness. What if the caching server crashed? There's a reason data is stored on a hard drive and backed up regularly. Caching servers are meant to act as a middleman to prevent repeat queries for the same data, not be the source of the data.

 

Better that all this checking go on by moving around blocks of memory than seeking a disk or holding up other connections. It fits with Web2.0, see: having pages change in response to actions rather than reload/redirect. Even if the write were to block, the user can have a pretty animation and something colourful to look at while your solution is working; that's all they want, a bit of action ...

 

It fits with "Web2.0"? I hate that term, and it doesn't have anything to do with hardware or data access, or animations, or JavaScript, or AJAX, or CSS!

Wow! What a flurry of activity! Sorry it's taken so long to respond.

 

I used to use Quicken. Then Pocket Quicken was discontinued so I was no longer interested in Quicken. I've since used a few different replacements. There are quite a few features that I've wished that each of these used.

 

I've finally gotten tired of waiting for someone else to do it and want to create an app that will implement all of the features that I see as missing. I will of course be the first user. But I figured if I'm going to create this, I may as well open it up for others to use.

 

So that's the type of data.

 

Now I've considered REQUIRING the app to be online. But all too often, I'm in a location that has poor signal.

 

Google has figured out how to manage offline edits. I can create/edit/delete contacts or calendar events on my android phone--even if my phone is offline--or via the website.

 

I'd like to do that with my transactions in this app.

 

P.S. I'm envisioning that a subscriber might be a single person or a married couple.

Edited by doni49

If you want the kind of offline support that software offers, you should really be looking at writing a native Android/iOS app. You need to be able to read and write records to a database. HTML5 has introduced "localStorage" in modern browsers, but it's not exactly sophisticated; you just store a blob of JSON-encoded data and then decode it on the next request. It's not sufficient for something like this. Plus you'd need the website itself to be available offline, which would probably require some kind of manual interaction from the user.

 

In terms of syncing the data created offline, just have a table within the database for logging transactions. It could even be a log file. When a connection is next available, check with the server for the last datetime it knows about and push all the changes since then, using the log table/file to locate all the data needed.
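The client side of that offline log might look roughly like this; the `outbox` table and function names are invented for illustration, with sqlite3 as the app's local store:

```python
import sqlite3
import time

# Client-side sketch of the offline change log described above: edits
# made offline are appended to a local log table, then pushed to the
# server when a connection returns. Names are assumptions.
local = sqlite3.connect(":memory:")
local.executescript("""
CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE outbox (
    logged_at REAL NOT NULL,   -- when the offline edit happened
    txn_id    INTEGER,
    action    TEXT             -- 'add' / 'edit' / 'delete'
);
""")

def edit_offline(txn_id, amount):
    """Apply an edit locally and remember it for the next sync."""
    local.execute("INSERT OR REPLACE INTO transactions VALUES (?, ?)",
                  (txn_id, amount))
    local.execute("INSERT INTO outbox VALUES (?, ?, 'edit')",
                  (time.time(), txn_id))

def push_changes(server_last_seen):
    """On reconnect: send everything logged after the server's datetime."""
    pending = local.execute(
        "SELECT txn_id, action FROM outbox WHERE logged_at > ?",
        (server_last_seen,)).fetchall()
    local.execute("DELETE FROM outbox WHERE logged_at > ?",
                  (server_last_seen,))
    return pending

edit_offline(1, 12.50)
edit_offline(2, 3.75)
pending = push_changes(0)   # server knew nothing yet -> both changes go up
```

In a real app you'd only clear the outbox after the server acknowledges the push, so a dropped connection mid-sync doesn't lose edits.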

Edited by Adam

I realize that I need to create a native app. I intended to do that and it will store data locally. But I had hoped to have it sync the data to the server--have it get/post data via HTTPS. When the app connects to the server, it will provide credentials to ID the account and the server will authenticate and then make the appropriate data available as an XML file which the app would then use to update the local database. The app would also upload data in XML format which the server would process as appropriate.
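The XML exchange described above could be sketched with the standard library; the `<sync>`/`<txn>` element and attribute names here are invented, not a proposed wire format:

```python
import xml.etree.ElementTree as ET

# Sketch of the XML exchange: the server serialises changed
# transactions, the app parses them into its local store.
# Element/attribute names are assumptions for illustration.
def to_xml(rows):
    """Serialise (id, payee, amount) tuples as a <sync> document."""
    root = ET.Element("sync")
    for txn_id, payee, amount in rows:
        ET.SubElement(root, "txn", id=str(txn_id),
                      payee=payee, amount=f"{amount:.2f}")
    return ET.tostring(root, encoding="unicode")

def from_xml(payload):
    """Parse a <sync> document back into (id, payee, amount) tuples."""
    return [(int(t.get("id")), t.get("payee"), float(t.get("amount")))
            for t in ET.fromstring(payload).iter("txn")]

rows = [(1, "Grocer", 42.0), (2, "Fuel", 30.5)]
payload = to_xml(rows)
restored = from_xml(payload)
```

JSON would serve equally well here and is lighter to parse on mobile; the sync logic doesn't care which serialisation carries the rows.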

 

I'm just trying to come up with a solution for the server side communications so that the data integrity is maintained.

 

P.S. Any and all comments are welcomed and appreciated. I came here hoping to brainstorm with y'all. :)

Edited by doni49