Oracle8i Spatial User's Guide and Reference Release 8.1.5 A67295-01 |
|
This chapter describes how to load spatial data into a database, including storing the data in a table and creating a spatial index for it. This chapter refers to the relational Spatial model only.
There are two steps involved in loading raw data into a spatial database such that it can be queried efficiently:
Table 11-1 through Table 11-4 show the format of the tables or views needed to store and index spatial data. Note that these tables show the relational schema.
Table 11-1 <layername>_SDOLAYERSDO_ORDCNT | SDO_LEVEL | SDO_NUMTILES | SDO_MAXLEVEL | SDO_COORDSYS |
---|---|---|---|---|
<number> | <number> | <number> | <number> | <varchar> |
SDO_DIMNUM | SDO_LB | SDO_UB | SDO_TOLERANCE | SDO_DIMNAME |
---|---|---|---|---|
<number> | <number> | <number> | <number> | <varchar> |
SDO_GID | SDO_ESEQ | SDO_ETYPE | SDO_SEQ | SDO_X1 | SDO_Y1 | ... | SDO_Xn | SDO_Yn |
---|---|---|---|---|---|---|---|---|
<number> | <number> | <number> | <number> | <number> | <number> | ... | <number> | <number> |
SDO_GID | SDO_CODE | SDO_MAXCODE | SDO_GROUPCODE | SDO_META |
---|---|---|---|---|
<number> | <raw> | <raw> | <raw> | <raw> |
The process of loading data can be classified into two categories:
This process is used to load large volumes of data into the database and uses SQL*Loader to load the data.
This process is used to insert relatively small amounts of data into the database and is analogous to the INSERT statement in SQL.
Bulk loading can be used to import large amounts of legacy or ASCII data into a spatial database. Bulk loading is accomplished using SQL*Loader1.
Example 11-1 shows the format of the raw data and control file that would be required to load the data into the SDOGEOM table with the layer name ROADS. You can choose any format of ASCII data as long you can write a SQL*Loader control file to load that data into the tables.
Assume that the ASCII data consists of a file with delimited columns, and separate rows fixed by the limits of the table with the format shown in Example 11-1:
geometry rows: GID, ESEQ, ETYPE, SEQ, LON1, LAT1, LON2, LAT2
The coordinates in the geometry rows represent the end points of line segments, which taken together, represent a polygon. Example 11-2 shows the control file for loading the data into the geometry table.
LOAD DATA INFILE * INTO TABLE ROADS_SDOGEOM FIELDS TERMINATED BY WHITESPACE TRAILING NULLCOLS (SDO_GID INTEGER EXTERNAL, SDO_ESEQ INTEGER EXTERNAL, SDO_ETYPE INTEGER EXTERNAL, SDO_SEQ INTEGER EXTERNAL, SDO_X1 FLOAT EXTERNAL, SDO_Y1 FLOAT EXTERNAL, SDO_X2 FLOAT EXTERNAL, SDO_Y2 FLOAT EXTERNAL) BEGINDATA 1 0 3 0 -122.401200 37.805200 -122.401900 37.805200 1 0 3 1 -122.401900 37.805200 -122.402400 37.805500 1 0 3 2 -122.402400 37.805500 -122.403100 37.806000 1 0 3 3 -122.403100 37.806000 -122.404400 37.806800 1 0 3 4 -122.404400 37.806800 -122.401200 37.805200 1 1 3 0 -122.405900 37.806600 -122.407549 37.806394 1 1 3 1 -122.407549 37.806394 -122.408300 37.806300 1 1 3 2 -122.408300 37.806300 -122.409100 37.806200 1 1 3 3 -122.409100 37.806200 -122.405900 37.806600 2 0 2 0 -122.410800 37.806000 -122.412300 37.805800 2 0 2 1 -122.412300 37.805800 -122.414100 37.805600 2 0 2 2 -122.414100 37.805600 -122.412300 37.805800 2 0 2 3 -122.412300 37.805800 -122.410800 37.806000 3 0 1 0 -122.567474 38.643564 3 0 1 1 -126.345345 39.345345
Note that table ROADS_SDOGEOM exists in the schema before attempting the load.
In Example 11-3, the data resides in a single flat file and the data set consists of point, line string, and polygon data. The data uses fixed-position columns and overloaded table rows.
SDO_GID SDO_ESEQ SDO_ETYPE SDO_SEQ SDO_X1 SDO_Y1 SDO_X2 SDO_Y2
The corresponding control file for this format of input data is shown in Example 11-4
LOAD DATA INFILE * INTO TABLE NEW_SDOGEOM (SDO_GID POSITION (1:5) INTEGER EXTERNAL, SDO_ESEQ POSITION (7:10) INTEGER EXTERNAL, SDO_ETYPE POSITION (12:15) INTEGER EXTERNAL, SDO_SEQ POSITION (17:21) INTEGER EXTERNAL, SDO_X1 POSITION (23:35) FLOAT EXTERNAL, SDO_Y1 POSITION (37:48) FLOAT EXTERNAL, SDO_X2 POSITION (50:62) FLOAT EXTERNAL, SDO_Y2 POSITION (64:75) FLOAT EXTERNAL) BEGINDATA 1 0 3 0 -122.401200 37.805200 -122.401900 37.805200 1 0 3 1 -122.401900 37.805200 -122.402400 37.805500 1 0 3 2 -122.402400 37.805500 -122.403100 37.806000 1 0 3 3 -122.403100 37.806000 -122.404400 37.806800 1 0 3 4 -122.404400 37.806800 -122.401200 37.805200 1 1 3 0 -122.405900 37.806600 -122.407549 37.806394 1 1 3 1 -122.407549 37.806394 -122.408300 37.806300 1 1 3 2 -122.408300 37.806300 -122.409100 37.806200 1 1 3 3 -122.409100 37.806200 -122.405900 37.806600 2 0 2 0 -122.410800 37.806000 -122.412300 37.805800 2 0 2 1 -122.412300 37.805800 -122.414100 37.805600 2 0 2 2 -122.414100 37.805600 -122.412300 37.805800 2 0 2 3 -122.412300 37.805800 -122.410800 37.806000 3 0 1 0 -122.567474 38.643564 3 0 1 1 -126.345345 39.345345
Spatial uses standard Oracle8i tables that can be accessed or loaded with standard SQL syntax. Example 11-5 loads data for a geometry (GID 17) consisting of a polygon with four sides that contains both a hole and a point. Notice that the first coordinate of the polygon (5, 20) is repeated at the end to close the polygon.
INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (17, 0, 3, 0, 5, 20, 5, 30, 10, 30, 10, 20, 5, 20); -- hole INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (17, 1, 3, 0, 8, 21, 8, 24, 9, 24, 9, 21, 8, 21); -- point INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1) VALUES (17, 2, 1, 0, 9, 29);
The SQL INSERT statement inserts one row of data per call. In Example 11-5, the table had enough columns to store the polygon in a single row. However, if your table had fewer columns (or your polygon had more points), you would have to perform multiple inserts in order to match the table structure; the data would not wrap automatically to the next row. To load a large geometry, repeat the SDO_GID, SDO_ESEQ, and SDO_ETYPE, and increment the SDO_SEQ for each line as shown in Example 11-6.
INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 0, 1, 15, 1, 16, 2, 17, 3, 17, 4, 18); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 1, 4, 18, 5, 18, 6, 19, 7, 18, 6, 17); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 2, 6, 17, 7, 16, 7, 15, 6, 14, 7, 13); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 3, 7, 13, 6, 12, 5, 13, 4, 13, 3, 14); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_ Y3) VALUES (18, 0, 3, 4, 3, 14, 2, 14, 1, 15);
Once data has been loaded into the spatial tables through either bulk or transactional loading, a spatial index needs to be created on the tables for efficient access to the data.
Create an Oracle table called <layername>_SDOINDEX as follows:
SQL> create table <layername>_SDOINDEX 2 ( 3 SDO_GID number, 4 SDO_CODE raw(255) 5 );
For a bulk load, you can call the SDO_ADMIN.POPULATE_INDEX() procedure once to tessellate the geometry table and add the generated tiles to the spatial index table. The argument to this procedure is simply the name of the layer. The level to which the geometry should be tessellated and whether to use the fixed or the hybrid indexing technique is determined by values in the <layername>_SDOLAYER table.
If data is updated in or deleted from a specific geometry table, you can call SDO_ADMIN.UPDATE_INDEX() to update the index for one SDO_GID. The arguments to this procedure are the name of the layer and the SDO_GID of the designated geometry.
See Chapter 13, "Administrative Functions and Procedures" for a complete description of the SDO_ADMIN.POPULATE_INDEX() and SDO_ADMIN.UPDATE_INDEX() procedures.
Spatial provides two methods for spatial indexing, fixed and hybrid. Fixed indexing is recommended for the relational Spatial model.
Which tessellation algorithm is used by the SDO_ADMIN.POPULATE_INDEX() and SDO_ADMIN.UPDATE_INDEX() procedures is determined by the values of the SDO_LEVEL and SDO_NUMTILES columns in the <layername>_SDOLAYER table as shown inTable 11-5.
Oracle recommends using fixed-size cover tiles for indexing a geometry stored using the relational model.
The fixed-size tile algorithm is expressed as a level referring to the number of tessellations performed. To use fixed-size tile indexing, set the SDO_NUMTILES column in the <layername>_SDOLAYER table to NULL and the SDO_LEVEL column to the desired tiling level. The relationship between the tiling level and the resulting size of the tiles is dependent on the domain of the layer.
The domain used for indexing is defined by the upper and lower boundaries of each dimension stored in the <layername>_SDODIM table. A typical domain in a GIS application could be -90 to 90 degrees for latitude, and -180 to 180 degrees for longitude2, as represented in Figure 11-1.
If the SDO_LEVEL column is set to 1, then the tiles created by the indexing mechanism are the same size as tiles at the first level of tessellation. Each tile would be 180 degrees by 90 degrees as shown in Figure 11-2.
The formula for the number of fixed-size tiles is 4n where n is the number of tessellations, stored in the SDO_LEVEL column. Figure 11-3 shows fixed-size tiling at level 2. In this figure, each tile is 90 degrees by 45 degrees.
The size of a tile can be determined by applying the following formula to each dimension:
length = (upper_bound - lower_bound) / 2 ^ sdo_level
The length refers to the length of the tile along the specified dimension. Applying this formula to the tiling shown in Figure 11-3 yields the following sizes:
length for dimension X = (180 - (-180) ) / 2^2 = (360) / 4 = 90 length for dimension Y = (90 - (-90) ) / 2^2 = (180) / 4 = 45
Thus, at level 2 the tiles are 90x45 degrees in size. As the number of levels increases, the tiles become smaller and smaller. Smaller tiles provide a more precise fit of the tiles over the geometry being indexed. However, because the number of tiles generated is unbounded, you must take into account the performance implications of using higher levels. The SDO_TUNE.ESTIMATE_TILING_LEVEL() function can be used to determine an appropriate level for indexing with fixed-size tiles. See Chapter 14 for a description of this procedure.
Besides the performance aspects related to selecting a fixed-size tile, tessellating the geometry into fixed-size tiles might have benefits related to the type of data being stored, such as using tiles sized to represent 1-acre farm plots, city blocks, or individual pixels on a display. Data modeling is an important part any database design, and is essential in a spatial database where the data often represents actual physical locations.
In the following example, assume that data has been loaded into a layer called ROADS, and you want to create a spatial index on that data. This is accomplished by first creating a table ROADS_SDOINDEX and invoking the following procedure:
sdo_admin.populate_index('ROADS');
The value in the SDO_LEVEL column of the ROADS_SDOLAYER table can be used as a tuning parameter while tessellating objects. Increasing the level increases the number of tiles to provide a more precise fit of the tiles over the object. See the description of the ESTIMATE_TILING_LEVEL()function in Chapter 14 for information on estimating the tiling level in several different ways.
After the SDO_ADMIN.POPULATE_INDEX() procedure has been called to fill the spatial index, you should also create a concatenated index using the SDO_CODE and SDO_GID columns. The concatenated index helps the join to the <layername>_SDOGEOM table during a query. The SDO_GID values from the primary filter will come from the index instead of from the table.
If a geometry with an SDO_GID 5944 has been added to the spatial tables, update the index with the following procedure:
sdo_admin.update_index('ROADS', 5944);
Like the CREATE INDEX statement in SQL, the SDO_ADMIN.POPULATE_INDEX() procedure performs an implicit commit. The SDO_ADMIN.UPDATE_INDEX()
procedure, however, does not. Therefore, SDO_ADMIN.UPDATE_INDEX()
transactions can be rolled back.
The SDO_ADMIN.POPULATE_INDEX() procedure operates as a single transaction. To reduce the amount of rollback space required to execute this procedure, you can write a routine that loops and calls SDO_ADMIN.UPDATE_INDEX(). See Section A.3.1, "cr_spatial_index.sql Script" for more information.
This section describes a variation on the linear quadtree (Morton encoding) scheme that uses both fixed-size and variable-sized tiles as a spatial indexing mechanism. the terms hybrid indexing, hybrid tiling, and hybrid tessellation will be used interchangeably in this section. Spatial indexing with purely variable-sized tiles is not recommended for production systems and is not supported in this release.
To use hybrid tiling, the SDO_LEVEL and SDO_NUMTILES columns in the <layername>_SDOLAYER table must contain valid values. That is, both SDO_LEVEL and SDO_NUMTILES must be greater than one.
The SDO_NUMTILES column determines the number of tiles that will be used to cover a geometry being indexed. Typically, this value is small, such as 4 or 8 tiles. However, the larger the number of tiles, the better the tiles will fit the geometry being covered. This increases the selectivity of the primary filter, but also increases the number of index entries per geometry (See Section 12.3.2 and Section 12.3.3 for a discussion of primary and secondary filters.) The SDO_NUMTILES value should be larger for long linear spatial entities, such as major highways or rivers, than for area-based spatial entities such as county or state boundaries.
The SDO_LEVEL column determines the size of the fixed tiles used in hybrid indexing. Setting the proper SDO_LEVEL value may appear more like art than science. Performing some simple data analysis and testing, however, puts the process back in the realm of science. One approach would be use the SDO_TUNE.ESTIMATE_TILING_LEVEL() function to determine an appropriate starting SDO_LEVEL value, and then compare the performance with slightly higher or lower values. This, and other techniques, are described in Appendix A.
Assume that the ROADS layer has already been loaded. Furthermore, assume that the there is one row with valid values for the ROADS_SDOLAYER.SDO_LEVEL and ROADS_SDOLAYER.SDO_NUMTILES columns. To create the spatial index on ROADS, first create a table ROADS_SDOINDEX with appropriate columns:
SQL> create table <layername>_SDOINDEX 2 ( 3 SDO_GID number, 4 SDO_CODE raw(255), 5 SDO_GROUPCODE raw(255), 6 SDO_MAXCODE raw(20), 7 SDO_META raw(255), 8 );
Then, invoke SDO_ADMIN.POPULATE_INDEX('ROADS')
to build the spatial index.
After the SDO_ADMIN.POPULATE_INDEX() procedure has been called to fill the spatial index, you should also create a concatenated index on the SDO_CODE and SDO_GID columns. The concatenated index helps the join to the
<layername>_SDOGEOM table during a query. The SDO_GID values from the primary filter will come from the index instead of from the table.
If a geometry with an SDO_GID 5944 has been added to the spatial tables, update the index with the following procedure:
sdo_admin.update_index('ROADS', 5944);
Like the CREATE INDEX statement in SQL, the SDO_ADMIN.POPULATE_INDEX() procedure performs an implicit commit. The SDO_ADMIN.UPDATE_INDEX()
procedure, however, does not. Therefore, SDO_ADMIN.UPDATE_INDEX()
transactions can be rolled back.
The SDO_ADMIN.POPULATE_INDEX() procedure operates as a single transaction. To reduce the amount of rollback space required to execute this procedure, you can write a routine that loops and calls SDO_ADMIN.UPDATE_INDEX(). See Section A.3.1, "cr_spatial_index.sql Script" for more information.
1
See the Oracle8i Utilities User's Guide for information on SQL*Loader.
2
The transference of the domain onto a sphere or Mercator projection is left to GIS (or other) application programmers. Spatial treats the domain as a conventional X by Y rectangle.