Distributed Data
- We've been distributing data for a long time.
- The Internet (1.0) itself is just a collection of data.
- It's easy if there's one copy of the data.
- It's easy if there are multiple copies of the data and they are
all exact duplicates.
- Having multiple copies and versions can lead to problems.
- If version 1 says that Gordon Brown is Prime Minister, and version 2
says David Cameron is, which one is it?
- Moreover, versions can get saved in different formats.
- Revision control systems can help here.
- However, problems can continue when data forks and multiple different
copies are kept.
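
The disagreement between copies described above can be made concrete with a small sketch. This is only an illustration, not how any particular revision control system works: the function name `find_conflicts`, the record layout, and the field names are all assumptions made up for the example. It compares two copies of the same record and reports the fields on which they disagree.

```python
# Sentinel used to tell "field is absent" apart from "field is None".
_MISSING = object()


def find_conflicts(copy_a, copy_b):
    """Return the fields on which two copies of a record disagree.

    The result maps each conflicting field name to a pair
    (value in copy_a, value in copy_b). A field that exists in only
    one copy is reported with None on the other side.
    """
    conflicts = {}
    for key in sorted(set(copy_a) | set(copy_b)):
        value_a = copy_a.get(key, _MISSING)
        value_b = copy_b.get(key, _MISSING)
        if value_a is _MISSING or value_b is _MISSING or value_a != value_b:
            conflicts[key] = (
                None if value_a is _MISSING else value_a,
                None if value_b is _MISSING else value_b,
            )
    return conflicts


# Two hypothetical copies of the same record that have drifted apart.
version_1 = {"country": "UK", "prime_minister": "Gordon Brown"}
version_2 = {"country": "UK", "prime_minister": "David Cameron"}

conflicts = find_conflicts(version_1, version_2)
print(conflicts)
# {'prime_minister': ('Gordon Brown', 'David Cameron')}
```

Note that the sketch can only detect that the copies disagree. Deciding which one is right needs extra information, such as which version is newer, and that history is what a revision control system records.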