Information Management

Information management in OSM-GB project is based on managing data and rules (rules and actions are referred as rules here). This information is managed through workflows and data flows for initial loading and for updating process. In this page, the following areas are covered:

=Data and Rules Structure= There are two main database used in the project:
 * Oracle Spatial installed on server 1
 * PostGIS installed on server 3

Oracle Spatial holds two sets of tables:
 * The raw OSM dataset (osm)
 * The detected bugs/fixes discovered in the dataset (osm-bugs)

The PostGIS database holds the result of applying the bug/fixes to the raw data (osm-gb). It also acts as the placeholder for importing raw OSM data and applying the frequent updates before transferring it to Oracle database. osm-gb database initially holds the raw data (before exporting to osm on Server 1) and then it holds the processed data (after updated by osm-bug on Server 1). There will be instances when the changes on osm-gb need to be reverted in an update workflow. Those reversions are done using the raw data stored previously on Server 1.

In any case, the system must make sure that -unless during the workflow runs- the raw data and the is on Server 1 and the processed data is on Server 3. The osm-gb database on Server 3 is continuously used by the Web Services installed on Server 2 in order to deliver to the external world.

The tables structure of osm and osm-gb databases are almost same as standard osm2pgsql schema. The main four tables are called planet_osm_line, planet_osm_point, planet_osm_polygon and planet_osm_road. The group of the tables are also referred as planet_osm_*. Because the osm2pgsql schema lacks unique primary keys for those four tables, the osm tables has an extra column added for this purpose.

Three tables are referred as the osm-bugs on Oracle database: planet_osm_line_bugs, planet_osm_point_bugs and planet_osm_polygon_bugs. The fourth table (planet_osm_road) is not considered in this group because this table in osm is in fact a subset of planet_osm_line, used to render the roads at low zoom levels. The discovered bugs are very unlikely possible to appear at low zoom levels. The structures of these three tables are same as the similar tables in osm and osm-gb, except that they have two extra columns: BUG (the description of the detected bugs) and FIX (the type of the applied fix – or even deletion)

The repository of rules and actions are stored in Server 1’s Radius Studio instance.

=Data Flow=

The following diagram describes the flow of data between the various tools installed in the three servers and how they are delivered through the Web Services:



=Data Download Workflow=

On Server 3 (PostGIS)
 * When database is initiated
 * When OSM data is not updated for more than a week

1- From geofabrik website, download GB dataset

2- Using osm2pgsql, import the data into osmgb database

On Server 1 (RS + Oracle)

3- Use OGR2OGR to import and replace new planet_osm_* tables from Server 3 to Oracle

4- In RS, run Data_Download_Session: apply rule/action on the whole database to produce planet_osm_*_bugs

5- Run the Java tool to apply the fixes in planet_osm_*_bugs records to new planet_osm_* tables on server 3’s PostGIS.



=Data Update Workflow=
 * When a frequent OSM data is available (daily)

On Server 1 (Oracle + RS)

1- Having planet_osm_*_bugs and using a Java tool (RevertChanges: TBD) revert the previous fixes or deletions done on Server 3’s osmgb database.

On Server 3 (PostGIS)

2- Using the frequent updates, use osmosis/osm2pgsql to update osmgb database

3- Using the frequent updates, make a table of newly created/modified IDs called UpdatedIDs on Server 1 (To be developed)

Again on Server 1 (Oracle + RS)

4- Use OGR2OGR to import and replace new planet_osm_* tables from Server 3 to Oracle

5- In RS (Data_Update_Session):


 * Load new planet_osm_*


 * Load old planet_osm_*_bugs


 * Load the table UpdatedIDs


 * Apply rule/action on the following features and make new planet_osm_*_bugs:


 * i)  The newly added features and their adjacent* features; or
 * ii) The newly modified features and their adjacent* features; or
 * iii) The features which are already in planet_osm_*_bugs and their adjacent features

6- Run the Java tool to apply the fixes in planet_osm_*_bugs records to new planet_osm_* tables on server 3’s PostGIS.

Note 1: In the above cases, by adjacent we mean the features which are either


 * In a defined proximity of the feature; or


 * Within the features; or


 * Containing the feature; or


 * Intersecting with the feature.

Note 2: In case iii above, the newly deleted features may already exist in planet_osm_*_bugs table. There is no need to exclude them because they will be filtered out when the rule cannot find the deleted ID in the new planet_osm_* tables.

Note 3: Again in case iii above, checking all the records in planet_osm_*_bugs table may seem redundant. This is in fact useful in the following cases:


 * When the rules/actions are changed


 * When the feature is no longer buggy because another feature is newly deleted.


 * When the effects of the new added/modified features on the old corrections have not been trapped in cases i and ii.



Data Update Scenarios
The above workflow supports implementation of the data update scenarios described in Frequent Update page.

=Rules Update Workflow= On Server 1 (Oracle + RS)
 * When the ruleset is changed

1- Having planet_osm_*_bugs and using a Java tool (RevertChanges: TBD) revert the previous fixes or deletions done on Server 3’s osmgb database.

2- In RS: Apply the changes to Data_Download_Session and Data_Update_Session

3- In RS: Remove records from planet_osm_*_bugs where a changed/deleted rule is involved

4- In RS: Make and run New_Rules_Session: Apply only the newly added or changed rules/action to the whole dataset and add the results to planet_osm_*_bugs

5- Run the Java tool to apply the fixes in planet_osm_*_bugs records to new planet_osm_* tables on server 3’s PostGIS.